Re: [Xen-devel] [PATCH v8 02/13] x86: detect and initialize Intel CAT feature

2015-05-29 Thread Jan Beulich
 On 29.05.15 at 04:40, chao.p.p...@linux.intel.com wrote:
 On Thu, May 28, 2015 at 01:54:39PM +0100, Jan Beulich wrote:
  On 21.05.15 at 10:41, chao.p.p...@linux.intel.com wrote:
  +
  +if ( !cpu_has(c, X86_FEATURE_CAT) )
  +return;
  +
  +socket = cpu_to_socket(cpu);
  +if ( test_bit(socket, cat_socket_enable) )
  +return;
  +
  +cpuid_count(PSR_CPUID_LEVEL_CAT, 0, eax, ebx, ecx, edx);
 
 While one would hope that X86_FEATURE_CAT implies the respective
 CPUID leaf being available, I think explicitly checking this should still
 be done just like is the case elsewhere.
 
 Against cpuid_level?

Of course.

  +static void __init init_psr_cat(void)
  +{
  +if ( opt_cos_max  1 )
  +{
  +printk(XENLOG_INFO CAT: disabled, cos_max is too small\n);
  +return;
  +}
 
 Is opt_cos_max == 1 really useful for anything?
 
 That means two COSes are available. cos=0 is reserved and cos=1 can
 still be used anyway.

Ah, sorry, this is _max_, not _count_.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v8 01/13] x86: add socket_cpumask

2015-05-29 Thread Chao Peng
On Fri, May 29, 2015 at 09:01:53AM +0100, Jan Beulich wrote:
  On 29.05.15 at 04:35, chao.p.p...@linux.intel.com wrote:
  On Thu, May 28, 2015 at 01:38:05PM +0100, Jan Beulich wrote:
   On 21.05.15 at 10:41, chao.p.p...@linux.intel.com wrote:
   --- a/xen/arch/x86/mpparse.c
   +++ b/xen/arch/x86/mpparse.c
   @@ -87,6 +87,18 @@ void __init set_nr_cpu_ids(unsigned int max_cpus)
#endif
}

   +void __init set_nr_sockets(void)
   +{
   +unsigned int cpus = bitmap_weight(phys_cpu_present_map.mask,
   +  boot_cpu_data.x86_max_cores *
   +  boot_cpu_data.x86_num_siblings);
  
  How did you come to this expression for the bitmap size? I.e.
  why not simply physids_weight(phys_cpu_present_map)?
  
  physids_weight(phys_cpu_present_map) gives me cpus for all sockets.
  While here the 'cpus' is actually _cpus_per_socket_. I used the max
  possible cpus indicated in cpuid as the upper bound so bitmap_weight()
  returns the actual available cpus on socket 0.
 
 In which case the variable name is badly chosen, or a respective
 comment is missing.
 
   +
   +if ( cpus == 0 )
   +cpus = 1;
   +
   +nr_sockets = DIV_ROUND_UP(num_processors + disabled_cpus, cpus);
   +}
  
  Is there a reason why this can't just be added to the end of the
  immediately preceding set_nr_cpu_ids()?
  
  You mean the declaration or invocation? If the former I have no special
  reason for it (e.g. I can change it).
 
 Neither - I just don't see the need for a new function.

In which case the invocation of set_nr_cpu_ids() should move to the
place where now set_nr_sockets() is invoked, to make sure
boot_cpu_data.x86_max_cores/x86_num_siblings available, which may not be
your expectation.

Chao

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [xen-4.3-testing test] 57458: regressions - trouble: blocked/broken/fail/pass

2015-05-29 Thread osstest service user
flight 57458 xen-4.3-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/57458/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-xend-qemut-winxpsp3 16 guest-stop fail REGR. vs. 53768

Tests which are failing intermittently (not blocking):
 test-armhf-armhf-libvirt  3 host-install(3)   broken pass in 57412
 test-amd64-amd64-xl-qemuu-win7-amd64 12 guest-localmigrate  fail pass in 57412

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-qemuu-win7-amd64 15 guest-localmigrate/x10 fail in 57412 
like 53768

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-rumpuserxen-amd64  1 build-check(1)   blocked n/a
 test-amd64-i386-rumpuserxen-i386  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt  6 xen-boot  fail in 57412 never pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  9 debian-hvm-install  fail never pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64  9 debian-hvm-install fail never pass
 build-amd64-rumpuserxen   6 xen-buildfail   never pass
 build-i386-rumpuserxen6 xen-buildfail   never pass
 test-armhf-armhf-xl-sedf-pin  6 xen-boot fail   never pass
 test-armhf-armhf-xl-sedf  6 xen-boot fail   never pass
 test-armhf-armhf-xl-multivcpu  6 xen-boot fail  never pass
 test-amd64-i386-libvirt  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale   6 xen-boot fail   never pass
 test-armhf-armhf-xl-credit2   6 xen-boot fail   never pass
 test-armhf-armhf-xl   6 xen-boot fail   never pass
 test-armhf-armhf-xl-cubietruck  6 xen-boot fail never pass
 test-amd64-amd64-libvirt 12 migrate-support-checkfail   never pass
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop  fail never pass
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail never pass
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail never pass

version targeted for testing:
 xen  e580a92dd53dbba62518e17a3f4ebd57b626926c
baseline version:
 xen  58db71c5cdd48126b9380c230dc5b61554bad7d8


People who touched revisions under test:
  Ian Jackson ian.jack...@eu.citrix.com
  Petr Matousek pmato...@redhat.com


jobs:
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass
 build-amd64-rumpuserxen  fail
 build-i386-rumpuserxen   fail
 test-amd64-amd64-xl  pass
 test-armhf-armhf-xl  fail
 test-amd64-i386-xl   pass
 test-amd64-i386-qemut-rhel6hvm-amd   pass
 test-amd64-i386-qemuu-rhel6hvm-amd   pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64pass
 test-amd64-i386-xl-qemut-debianhvm-amd64 pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64 pass
 test-amd64-i386-freebsd10-amd64  pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 fail
 test-amd64-i386-xl-qemuu-ovmf-amd64  fail
 test-amd64-amd64-rumpuserxen-amd64   blocked 
 test-amd64-amd64-xl-qemut-win7-amd64 fail
 test-amd64-i386-xl-qemut-win7-amd64  fail
 test-amd64-amd64-xl-qemuu-win7-amd64 fail
 test-amd64-i386-xl-qemuu-win7-amd64  fail
 test-armhf-armhf-xl-arndale  fail
 test-amd64-amd64-xl-credit2  pass
 test-armhf-armhf-xl-credit2  fail
 test-armhf-armhf-xl-cubietruck   fail
 test-amd64-i386-freebsd10-i386   pass
 

[Xen-devel] [PATCH] efi: Reallocate memory map if ExitBootServices() fails

2015-05-29 Thread Ross Lagerwall
If calling ExitBootServices() fails, the memory map size may have
increased, so determine the new size and reallocate the memory map
before calling GetMemoryMap() again.

This was seen on the following machine when using the iscsidxe UEFI
driver. The machine would consistently fail the first call to
ExitBootServices().
System Information
Manufacturer: Supermicro
Product Name: X10SLE-F/HF
BIOS Information
Vendor: American Megatrends Inc.
Version: 2.00
Release Date: 04/24/2014

Signed-off-by: Ross Lagerwall ross.lagerw...@citrix.com
---
 xen/common/efi/boot.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/xen/common/efi/boot.c b/xen/common/efi/boot.c
index ef8476c..078f9b8 100644
--- a/xen/common/efi/boot.c
+++ b/xen/common/efi/boot.c
@@ -1053,14 +1053,14 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE 
*SystemTable)
 efi_arch_video_init(gop, info_size, mode_info);
 }
 
-efi_bs-GetMemoryMap(efi_memmap_size, NULL, map_key,
- efi_mdesc_size, mdesc_ver);
-efi_memmap = efi_arch_allocate_mmap_buffer(efi_memmap_size);
-if ( !efi_memmap )
-blexit(LUnable to allocate memory for EFI memory map);
-
 for ( retry = 0; ; retry = 1 )
 {
+efi_bs-GetMemoryMap(efi_memmap_size, NULL, map_key,
+ efi_mdesc_size, mdesc_ver);
+efi_memmap = efi_arch_allocate_mmap_buffer(efi_memmap_size);
+if ( !efi_memmap )
+blexit(LUnable to allocate memory for EFI memory map);
+
 status = efi_bs-GetMemoryMap(efi_memmap_size, efi_memmap, map_key,
   efi_mdesc_size, mdesc_ver);
 if ( EFI_ERROR(status) )
-- 
2.1.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC][v2][PATCH 00/14] Fix RMRR

2015-05-29 Thread Chen, Tiejun

On 2015/5/28 15:55, Jan Beulich wrote:

On 28.05.15 at 07:48, tiejun.c...@intel.com wrote:

On 2015/5/22 17:46, Jan Beulich wrote:

On 22.05.15 at 11:35, tiejun.c...@intel.com wrote:

As you know all devices are owned by Dom0 firstly before we create any
DomU, right? Do we allow Dom0 still own a group device while assign another
device in the same group?


Clearly not, or - just like anything else putting the security of a system
at risk - only at explicit host admin request.



You're right.

After we discussed internally, we're intending to cover this simply
since the case of shared RMRR is a rare case according to our previous
experiences. Furthermore, Xen doesn't have a good existing API to
directly assign this sort of group devices and even Xen doesn't identify
these devices,  so currently we always assign devices one by one, right?
This means we have to put more efforts to concern a good implementation
to address something like, identification, atomic, hotplug and so on.
Obviously, this would involve hypervisor and tools at the same time so
this has a little bit of difficulty to work along with 4.6.

So could we do this separately?

#1. Phase 1 to 4.6

#1.1. Do a simple implementation

We just prevent from that device assignment if we're assigning this sort
of group devices like this,


Right.


@@ -2291,6 +2291,16 @@ static int intel_iommu_assign_device(
PCI_BUS(bdf) == bus 
PCI_DEVFN2(bdf) == devfn )
   {
+if ( rmrr-scope.devices_cnt  1 )
+{
+reassign_device_ownership(d, hardware_domain, devfn, pdev);


I think if this is really needed here, the check comes too late.



So we can do this at the begging of this function

@@ -2277,13 +2277,37 @@ static int intel_iommu_assign_device(
 if ( list_empty(acpi_drhd_units) )
 return -ENODEV;

+seg = pdev-seg;
+bus = pdev-bus;
+/*
+ * In rare cases one given rmrr is shared by multiple devices but
+ * obviously this would put the security of a system at risk. So
+ * we should prevent from this sort of device assignment.
+ *
+ * TODO: actually we can group these devices which shared rmrr, and
+ * then allow all devices within a group to be assigned to same domain.
+ */
+for_each_rmrr_device( rmrr, bdf, i )
+{
+if ( rmrr-segment == seg 
+ PCI_BUS(bdf) == bus 
+ PCI_DEVFN2(bdf) == devfn )
+{
+if ( rmrr-scope.devices_cnt  1 )
+{
+ret = -EPERM;
+printk(XENLOG_G_ERR VTDPREFIX
+cannot assign this device with shared RMRR for 
Dom%d (%d)\n,

+   d-domain_id, ret);
+return ret;
+}
+}
+}
+
 ret = reassign_device_ownership(hardware_domain, d, devfn, pdev);
 if ( ret )
 return ret;

-seg = pdev-seg;
-bus = pdev-bus;
-
 /* Setup rmrr identity mapping */
 for_each_rmrr_device( rmrr, bdf, i )
 {

Thanks
Tiejun

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 5/9] x86/intel_pstate: relocate the driver register/unregister function

2015-05-29 Thread Jan Beulich
 On 29.05.15 at 04:47, wei.w.w...@intel.com wrote:
 On 26/05/2015 21:06, Jan Beulich wrote
  On 13.05.16 at 09:50, wei.w.w...@intel.com wrote:
  Register/unregister the CPU hotplug notifier when the driver is
  registered, and move the driver register/unregister function to the
  cpufreq.c.
 
 Without saying why I'm afraid I don't even see much reason to review this in
 any detail.
 
  --- a/xen/drivers/cpufreq/cpufreq.c
  +++ b/xen/drivers/cpufreq/cpufreq.c
  @@ -630,12 +630,31 @@ static struct notifier_block cpu_nfb = {
   .notifier_call = cpu_callback
   };
 
  -static int __init cpufreq_presmp_init(void)
  +int cpufreq_register_driver(struct cpufreq_driver *driver_data)
   {
  -void *cpu = (void *)(long)smp_processor_id();
  -cpu_callback(cpu_nfb, CPU_ONLINE, cpu);
 
 Why is this being removed without replacement?
 
 I think they are redundant here. 
 If we go and check the hypercall code path (the bottom of set_px_pminfo()),  
 the cpufreq_add_cpu() is called there (inside cpufreq_cpu_init()), too. Why 
 do we need to initialize this CPU twice?

If this is indeed being done twice, removing it is fine. But in a
separate patch with a proper description.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH] libxl: Don't insert PCI device into xenstore for HVM guests

2015-05-29 Thread Ross Lagerwall
When doing passthrough of a PCI device for an HVM guest, don't insert
the device into xenstore, otherwise pciback attempts to use it which
conflicts with QEMU.

This manifests itself such that the first time a device is passed to a
domain, it succeeds. Subsequent attempts fail unless the device is
unbound from pciback or the machine rebooted.

Signed-off-by: Ross Lagerwall ross.lagerw...@citrix.com
---
 tools/libxl/libxl_pci.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/libxl/libxl_pci.c b/tools/libxl/libxl_pci.c
index e0743f8..2552889 100644
--- a/tools/libxl/libxl_pci.c
+++ b/tools/libxl/libxl_pci.c
@@ -994,7 +994,7 @@ out:
 }
 }
 
-if (!starting)
+if (!starting  type == LIBXL_DOMAIN_TYPE_PV)
 rc = libxl__device_pci_add_xenstore(gc, domid, pcidev, starting);
 else
 rc = 0;
-- 
2.1.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v8 01/13] x86: add socket_cpumask

2015-05-29 Thread Jan Beulich
 On 29.05.15 at 04:35, chao.p.p...@linux.intel.com wrote:
 On Thu, May 28, 2015 at 01:38:05PM +0100, Jan Beulich wrote:
  On 21.05.15 at 10:41, chao.p.p...@linux.intel.com wrote:
  --- a/xen/arch/x86/mpparse.c
  +++ b/xen/arch/x86/mpparse.c
  @@ -87,6 +87,18 @@ void __init set_nr_cpu_ids(unsigned int max_cpus)
   #endif
   }
   
  +void __init set_nr_sockets(void)
  +{
  +unsigned int cpus = bitmap_weight(phys_cpu_present_map.mask,
  +  boot_cpu_data.x86_max_cores *
  +  boot_cpu_data.x86_num_siblings);
 
 How did you come to this expression for the bitmap size? I.e.
 why not simply physids_weight(phys_cpu_present_map)?
 
 physids_weight(phys_cpu_present_map) gives me cpus for all sockets.
 While here the 'cpus' is actually _cpus_per_socket_. I used the max
 possible cpus indicated in cpuid as the upper bound so bitmap_weight()
 returns the actual available cpus on socket 0.

In which case the variable name is badly chosen, or a respective
comment is missing.

  +
  +if ( cpus == 0 )
  +cpus = 1;
  +
  +nr_sockets = DIV_ROUND_UP(num_processors + disabled_cpus, cpus);
  +}
 
 Is there a reason why this can't just be added to the end of the
 immediately preceding set_nr_cpu_ids()?
 
 You mean the declaration or invocation? If the former I have no special
 reason for it (e.g. I can change it).

Neither - I just don't see the need for a new function.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 6/9] x86/intel_pstate: the main boby of the intel_pstate driver

2015-05-29 Thread Wang, Wei W
On 26/05/2015 21:58, Jan Beulich wrote
  On 13.05.16 at 09:50, wei.w.w...@intel.com wrote:
  +static inline int ceiling_fp(int32_t x) {
  +int mask, ret;
 
 Please here and below, consider whether types really need to be signed.
 One exception: If you intend to import the Linux source file with minimal
 modifications, and if that indeed results in only very few differences, then
 keeping the types as they are is probably fine. But in that case you should
 extend the patch description to say that the goal is to have minimal changes.
 All comments below are to be taken with the possible minimal change goal in
 mind.
 
  +static int byt_get_min_pstate(void)
  +{
  +u64 value;
  +
  +rdmsrl(BYT_RATIOS, value);
  +return (value  8)  0x7F;
  +}
  +
  +static int byt_get_max_pstate(void)
  +{
  +u64 value;
  +
  +rdmsrl(BYT_RATIOS, value);
  +return (value  16)  0x7F;
  +}
  +
  +static int byt_get_turbo_pstate(void) {
  +u64 value;
  +
  +rdmsrl(BYT_TURBO_RATIOS, value);
  +return value  0x7F;
  +}
  +
  +static void byt_set_pstate(struct cpudata *cpudata, int pstate) {
  +u64 val;
  +int32_t vid_fp;
  +u32 vid;
  +
  +val = pstate  8;
  +if (limits.no_turbo  !limits.turbo_disabled)
  +val |= (u64)1  32;
 
 All of the literal numbers above (and there are more further down) would
 better become self-documenting manifest constants.
 
  +#define BYT_BCLK_FREQS 5
  +static int byt_freq_table[BYT_BCLK_FREQS] = { 833, 1000, 1333, 1167,
  +800};
 
 This can be both const and local to the only function it's being used by.
 
  +static struct cpu_defaults core_params = {
 
 const? __initconst?
 
  +static struct cpu_defaults byt_params = {
 
 Same here.
 
  +static void intel_pstate_timer_func(void *__data) {
  +struct cpudata *cpu = (struct cpudata *) __data;
 
 Double underscores are completely unnecessary here.
 
  +#define ICPU(model, policy) \
  +{ X86_VENDOR_INTEL, 6, model, X86_FEATURE_APERFMPERF,\
  +(unsigned long)policy }
  +
  +static const struct x86_cpu_id intel_pstate_cpu_ids[] = {
 
 __initconst
 
  +ICPU(0x2a, core_params),
  +ICPU(0x2d, core_params),
  +ICPU(0x37, byt_params),
  +ICPU(0x3a, core_params),
  +ICPU(0x3c, core_params),
  +ICPU(0x3d, core_params),
  +ICPU(0x3e, core_params),
  +ICPU(0x3f, core_params),
  +ICPU(0x45, core_params),
  +ICPU(0x46, core_params),
  +ICPU(0x47, core_params),
  +ICPU(0x4c, byt_params),
  +ICPU(0x4e, core_params),
  +ICPU(0x4f, core_params),
  +ICPU(0x56, core_params),
 
 Please handle the _params name tag inside ICPU().
 
  +static int intel_pstate_set_policy(struct cpufreq_policy *policy) {
  +if (!policy-cpuinfo.max_freq)
  +return -ENODEV;
  +
  +if (policy-policy == CPUFREQ_POLICY_PERFORMANCE) {
 
 switch(policy-policy) please.
 
  +limits.no_turbo = 0;
  +limits.max_perf_pct = 100;
  +limits.max_perf = int_tofp(1);
  +limits.min_perf_pct = 100;
  +limits.min_perf = int_tofp(1);
  +policy-max_perf_pct = 100;
  +policy-min_perf_pct = 100;
  +return 0;
 
 And no need for identical return statement in all four branches.
 
  +} else if (policy-policy == CPUFREQ_POLICY_USERSPACE) {
  +limits.max_perf_pct = max(limits.min_policy_pct, policy-
 max_perf_pct);
  +limits.max_perf_pct = min(limits.max_policy_pct,
  + limits.max_perf_pct);
 
 Why are you not using clamp() here?
 
  +} else {
 
 This should at least have a comment saying CPUFREQ_POLICY_ONDEMAND.
 
  +static int intel_pstate_verify_policy(struct cpufreq_policy *policy)
  +{
  +cpufreq_verify_within_limits(policy, policy-cpuinfo.min_freq,
  + policy-cpuinfo.max_freq);
  +
  +if ( policy-policy != CPUFREQ_POLICY_POWERSAVE   
  + policy-policy != CPUFREQ_POLICY_PERFORMANCE 
  + policy-policy != CPUFREQ_POLICY_USERSPACE   
  + policy-policy != CPUFREQ_POLICY_ONDEMAND )
 
 switch()

How would we use switch() here? 

 
  +static int intel_pstate_cpu_init(struct cpufreq_policy *policy) {
  +struct cpudata *cpu;
  +int rc;
  +
  +rc = intel_pstate_init_cpu(policy-cpu);
 
 Having both intel_pstate_cpu_init() and intel_pstate_init_cpu() is at least
 odd, the more that the latter is being called from only here.

Are you suggesting to change the function name? If so, how about changing 
intel_pstate_cpu_init() to intel_pstate_setup()?

 
  +if (rc)
  +return rc;
  +
  +cpu = all_cpu_data[policy-cpu];
  +if (limits.min_perf_pct == 100  limits.max_perf_pct == 100)
  +policy-policy = CPUFREQ_POLICY_PERFORMANCE;
  +else
  +policy-policy = CPUFREQ_POLICY_ONDEMAND;
  +
  +policy-min = cpu-pstate.min_pstate * cpu-pstate.scaling;
  +policy-max = cpu-pstate.turbo_pstate * cpu-pstate.scaling;
  +policy-min_perf_pct = 0;
  +policy-max_perf_pct = 100;
  

Re: [Xen-devel] [PATCH v8 03/13] x86: maintain COS to CBM mapping for each socket

2015-05-29 Thread Chao Peng
On Fri, May 29, 2015 at 09:06:46AM +0100, Jan Beulich wrote:
  On 29.05.15 at 04:43, chao.p.p...@linux.intel.com wrote:
  On Thu, May 28, 2015 at 02:17:54PM +0100, Jan Beulich wrote:
   On 21.05.15 at 10:41, chao.p.p...@linux.intel.com wrote:
   +static int cat_cpu_init(unsigned int cpu)
   +{
   +int rc;
   +const struct cpuinfo_x86 *c = cpu_data + cpu;
   +
   +if ( !cpu_has(c, X86_FEATURE_CAT) )
   +return 0;
   +
   +if ( test_bit(cpu_to_socket(cpu), cat_socket_enable) )
   +return 0;
   +
   +if ( cpu == smp_processor_id() )
   +do_cat_cpu_init(rc);
   +else
   +on_selected_cpus(cpumask_of(cpu), do_cat_cpu_init, rc, 1);
  
  This now being called in the context of CPU_UP_PREPARE, I can't see
  how this works at all: Neither would the CPU's cpu_data[] instance be
  initialized by that time, nor would you be able to IPI that CPU, nor can I
  see how the if() branch could ever get entered. Was this tested at all?
  
  Ah, yes! So it sounds really a little difficult to move the memory
  allocation from CPU_STARTING to CPU_PREPARA for this case.
 
 Not sure why you talk about memory allocation again. That should
 be done in CPU_UP_PREPARE. But stuff that needs to happen on
 the CPU should happen in CPU_STARTING. The memory allocation's
 size depending on a CPU characteristic of course makes this a little
 problematic, but (I think I said so before) since we're assuming
 symmetry in many other places, I don't see anything wrong with
 you assuming symmetry here too, and hence use e.g. the boot CPU's
 value to determine the allocation size.

No problem, then I can just forget the support for asymmetry in XEN.

Chao

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 6/9] x86/intel_pstate: the main boby of the intel_pstate driver

2015-05-29 Thread Jan Beulich
 On 29.05.15 at 10:19, wei.w.w...@intel.com wrote:
 On 26/05/2015 21:58, Jan Beulich wrote
  On 13.05.16 at 09:50, wei.w.w...@intel.com wrote:
  +static int intel_pstate_verify_policy(struct cpufreq_policy *policy)
  +{
  +cpufreq_verify_within_limits(policy, policy-cpuinfo.min_freq,
  + policy-cpuinfo.max_freq);
  +
  +if ( policy-policy != CPUFREQ_POLICY_POWERSAVE   
  + policy-policy != CPUFREQ_POLICY_PERFORMANCE 
  + policy-policy != CPUFREQ_POLICY_USERSPACE   
  + policy-policy != CPUFREQ_POLICY_ONDEMAND )
 
 switch()
 
 How would we use switch() here? 

switch ( policy-policy )
{
case CPUFREQ_POLICY_POWERSAVE:

etc. But I thought that to be obvious, so I'm not sure I understand
what you don't understand.

  +static int intel_pstate_cpu_init(struct cpufreq_policy *policy) {
  +struct cpudata *cpu;
  +int rc;
  +
  +rc = intel_pstate_init_cpu(policy-cpu);
 
 Having both intel_pstate_cpu_init() and intel_pstate_init_cpu() is at least
 odd, the more that the latter is being called from only here.
 
 Are you suggesting to change the function name? If so, how about changing 
 intel_pstate_cpu_init() to intel_pstate_setup()?

Either that, or fold the called function into the caller.

  +if (rc)
  +return rc;
  +
  +cpu = all_cpu_data[policy-cpu];
  +if (limits.min_perf_pct == 100  limits.max_perf_pct == 100)
  +policy-policy = CPUFREQ_POLICY_PERFORMANCE;
  +else
  +policy-policy = CPUFREQ_POLICY_ONDEMAND;
  +
  +policy-min = cpu-pstate.min_pstate * cpu-pstate.scaling;
  +policy-max = cpu-pstate.turbo_pstate * cpu-pstate.scaling;
  +policy-min_perf_pct = 0;
  +policy-max_perf_pct = 100;
  +policy-turbo_pct = get_turbo_pct();
  +
  +/* cpuinfo and default policy values */
  +policy-cpuinfo.min_freq = cpu-pstate.min_pstate * cpu-
 pstate.scaling;
  +policy-cpuinfo.max_freq =
  +cpu-pstate.turbo_pstate * cpu-pstate.scaling;
  +policy-cpuinfo.transition_latency = CPUFREQ_ETERNAL;
  +cpumask_set_cpu(policy-cpu, policy-cpus);
  +
  +/* the first cpu do the init work for limits.min/max_policy_pct */
  +if (cpu-cpu == 0) {
 
 cpu-cpu == 0 doesn't mean this is the first CPU to come here.
 
 How about this one:
 
 If (limits.min_policy_pct == 0) {
   limits.min_policy_pct = 
   limits.xx = 
 }
 
 limits.min_policy_pct won't be 0 if it is initialized.

If that's guaranteed - fine with me.

  +static int intel_pstate_msrs_not_valid(void) {
  +/* Check that all the msr's we are using are valid. */
  +u64 aperf, mperf, tmp;
  +
  +rdmsrl(MSR_IA32_APERF, aperf);
  +rdmsrl(MSR_IA32_MPERF, mperf);
  +
  +if (!pstate_funcs.get_max() ||
  +!pstate_funcs.get_min() ||
  +!pstate_funcs.get_turbo())
  +return -ENODEV;
  +
  +rdmsrl(MSR_IA32_APERF, tmp);
  +if (!(tmp - aperf))
 
 Why not if(tmp == aperf)? And - is it guaranteed that APERF (and MPERF
 below) incremented in the meantime?
 
 Will change it to if (tmp == aperf).
 According to the SDM, IA32_MPERF MSR (E7H) increments in proportion to a 
 fixed frequency, and IA32_APERF MSR increments in proportion to actual 
 performance. So, as long as the two MSRs are valid, their values will be 
 increased.

The in proportion is what makes me nervous: What if the proportion
is 1 in every 1000 cycles?

  +int __init intel_pstate_init(void)
  +{
  +int cpu, rc = 0;
  +const struct x86_cpu_id *id;
  +struct cpu_defaults *cpu_info;
  +
  +if (cpuid_ecx(6)  0x1)
  +set_bit(X86_FEATURE_APERFMPERF,
  + boot_cpu_data.x86_capability);
 
 This should be consolidated out of the other cpufreq drivers into
 cpu/common.c.
 
 How about moving it to the bottom of init_intel() in cpu/intel.c ?

It's not Intel specific, so it belongs in cpu/common.c.

  --- a/xen/include/asm-x86/acpi.h
  +++ b/xen/include/asm-x86/acpi.h
  @@ -32,6 +32,8 @@
   #define COMPILER_DEPENDENT_INT64   long long
   #define COMPILER_DEPENDENT_UINT64  unsigned long long
 
  +extern int intel_pstate_init(void);
 
 Why in acpi.h?
 
 I was thinking that p-state is part of ACPI. Do you prefer to create a new .h 
 called cpufreq.h, just like the cpuidle.h there?

ACPI is a way to convey information about P- (and C- and T-) states,
but the latter are imo not tied to ACPI. In fact your patch series here
proves the opposite: You add code dealing with P-states without
using information coming from ACPI. I think this should go into what
currently is acpi/cpufreq/cpufreq.h, which is expected to be moved
out of acpi/ anyway (I seem to even recall an ARM side series aiming
at doing that).

Jan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v2] x86: Don't crash when mapping a page using EFI runtime page tables

2015-05-29 Thread Ross Lagerwall
When an interrupt is received during an EFI runtime service call, Xen
may call map_domain_page() while using the EFI runtime page tables.
This fails because, although the EFI runtime page tables are a
copy of the idle domain's page tables, current points at a different
domain's vCPU.

To fix this, return NULL from mapcache_current_vcpu() when using the EFI
runtime page tables which is treated equivalently to running in an idle
vCPU.

This issue can be reproduced by repeatedly calling GetVariable() from
dom0 while using VT-d, since VT-d frequently maps a page from interrupt
context.

Example call trace:
[82d0801615dc] __find_next_zero_bit+0x28/0x60
[82d08016a10e] map_domain_page+0x4c6/0x4eb
[82d080156ae6] map_vtd_domain_page+0xd/0xf
[82d08015533a] msi_msg_read_remap_rte+0xe3/0x1d8
[82d08014e516] iommu_read_msi_from_ire+0x31/0x34
[82d08016ff6c] set_msi_affinity+0x134/0x17a
[82d0801737b5] move_masked_irq+0x5c/0x98
[82d080173816] move_native_irq+0x25/0x36
[82d08016ffcb] ack_nonmaskable_msi_irq+0x19/0x20
[82d08016ffdb] ack_maskable_msi_irq+0x9/0x37
[82d080173e8b] do_IRQ+0x251/0x635
[82d080234502] common_interrupt+0x62/0x70
[df7ed2be] df7ed2be

Signed-off-by: Ross Lagerwall ross.lagerw...@citrix.com
---

Changed in v2:
* Only read CR3 when checking if running on the runtime page tables
  if inside the efi runtime services lock and on the same CPU that
  took the lock.

 xen/arch/x86/domain_page.c | 13 +
 xen/arch/x86/efi/stub.c|  2 +-
 xen/common/efi/runtime.c   | 10 --
 xen/include/xen/efi.h  |  2 +-
 4 files changed, 19 insertions(+), 8 deletions(-)

diff --git a/xen/arch/x86/domain_page.c b/xen/arch/x86/domain_page.c
index 5f6f397..7954998 100644
--- a/xen/arch/x86/domain_page.c
+++ b/xen/arch/x86/domain_page.c
@@ -32,20 +32,25 @@ static inline struct vcpu *mapcache_current_vcpu(void)
 return NULL;
 
 /*
+ * When using efi runtime page tables, we have the equivalent of the idle
+ * domain's page tables but current may point at another domain's VCPU.
+ * Return NULL as though current is not properly set up yet.
+ */
+if ( efi_enabled  efi_rs_using_pgtables() )
+return NULL;
+
+/*
  * If guest_table is NULL, and we are running a paravirtualised guest,
  * then it means we are running on the idle domain's page table and must
  * therefore use its mapcache.
  */
 if ( unlikely(pagetable_is_null(v-arch.guest_table))  is_pv_vcpu(v) )
 {
-unsigned long cr3;
-
 /* If we really are idling, perform lazy context switch now. */
 if ( (v = idle_vcpu[smp_processor_id()]) == current )
 sync_local_execstate();
 /* We must now be running on the idle page table. */
-ASSERT((cr3 = read_cr3()) == __pa(idle_pg_table) ||
-   (efi_enabled  cr3 == efi_rs_page_table()));
+ASSERT(read_cr3() == __pa(idle_pg_table));
 }
 
 return v;
diff --git a/xen/arch/x86/efi/stub.c b/xen/arch/x86/efi/stub.c
index 627009f..07c2bd0 100644
--- a/xen/arch/x86/efi/stub.c
+++ b/xen/arch/x86/efi/stub.c
@@ -12,7 +12,7 @@ void __init efi_init_memory(void) { }
 
 void efi_update_l4_pgtable(unsigned int l4idx, l4_pgentry_t l4e) { }
 
-paddr_t efi_rs_page_table(void)
+bool_t efi_rs_using_pgtables(void)
 {
 BUG();
 return 0;
diff --git a/xen/common/efi/runtime.c b/xen/common/efi/runtime.c
index 5ed8b01..ae87557 100644
--- a/xen/common/efi/runtime.c
+++ b/xen/common/efi/runtime.c
@@ -30,6 +30,7 @@ const CHAR16 *__read_mostly efi_fw_vendor;
 const EFI_RUNTIME_SERVICES *__read_mostly efi_rs;
 #ifndef CONFIG_ARM /* TODO - disabled until implemented on ARM */
 static DEFINE_SPINLOCK(efi_rs_lock);
+static unsigned int efi_rs_on_cpu = NR_CPUS;
 #endif
 
 UINTN __read_mostly efi_memmap_size;
@@ -66,6 +67,8 @@ unsigned long efi_rs_enter(void)
 
 spin_lock(efi_rs_lock);
 
+efi_rs_on_cpu = smp_processor_id();
+
 /* prevent fixup_page_fault() from doing anything */
 irq_enter();
 
@@ -100,13 +103,16 @@ void efi_rs_leave(unsigned long cr3)
 asm volatile ( lgdt %0 : : m (gdt_desc) );
 }
 irq_exit();
+efi_rs_on_cpu = NR_CPUS;
 spin_unlock(efi_rs_lock);
 stts();
 }
 
-paddr_t efi_rs_page_table(void)
+bool_t efi_rs_using_pgtables(void)
 {
-return efi_l4_pgtable ? virt_to_maddr(efi_l4_pgtable) : 0;
+return efi_l4_pgtable 
+   (smp_processor_id() == efi_rs_on_cpu) 
+   (read_cr3() == virt_to_maddr(efi_l4_pgtable));
 }
 
 unsigned long efi_get_time(void)
diff --git a/xen/include/xen/efi.h b/xen/include/xen/efi.h
index 48de8e0..e74dad1 100644
--- a/xen/include/xen/efi.h
+++ b/xen/include/xen/efi.h
@@ -29,7 +29,7 @@ struct xenpf_efi_runtime_call;
 struct compat_pf_efi_runtime_call;
 
 void efi_init_memory(void);
-paddr_t efi_rs_page_table(void);
+bool_t efi_rs_using_pgtables(void);
 unsigned long efi_get_time(void);
 void efi_halt_system(void);
 void 

Re: [Xen-devel] [PATCH v8 03/13] x86: maintain COS to CBM mapping for each socket

2015-05-29 Thread Jan Beulich
 On 29.05.15 at 04:43, chao.p.p...@linux.intel.com wrote:
 On Thu, May 28, 2015 at 02:17:54PM +0100, Jan Beulich wrote:
  On 21.05.15 at 10:41, chao.p.p...@linux.intel.com wrote:
  +static int cat_cpu_init(unsigned int cpu)
  +{
  +int rc;
  +const struct cpuinfo_x86 *c = cpu_data + cpu;
  +
  +if ( !cpu_has(c, X86_FEATURE_CAT) )
  +return 0;
  +
  +if ( test_bit(cpu_to_socket(cpu), cat_socket_enable) )
  +return 0;
  +
  +if ( cpu == smp_processor_id() )
  +do_cat_cpu_init(rc);
  +else
  +on_selected_cpus(cpumask_of(cpu), do_cat_cpu_init, rc, 1);
 
 This now being called in the context of CPU_UP_PREPARE, I can't see
 how this works at all: Neither would the CPU's cpu_data[] instance be
 initialized by that time, nor would you be able to IPI that CPU, nor can I
 see how the if() branch could ever get entered. Was this tested at all?
 
 Ah, yes! So it sounds really a little difficult to move the memory
 allocation from CPU_STARTING to CPU_PREPARA for this case.

Not sure why you talk about memory allocation again. That should
be done in CPU_UP_PREPARE. But stuff that needs to happen on
the CPU should happen in CPU_STARTING. The memory allocation's
size depending on a CPU characteristic of course makes this a little
problematic, but (I think I said so before) since we're assuming
symmetry in many other places, I don't see anything wrong with
you assuming symmetry here too, and hence use e.g. the boot CPU's
value to determine the allocation size.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v8 05/13] x86: expose CBM length and COS number information

2015-05-29 Thread Jan Beulich
 On 29.05.15 at 04:47, chao.p.p...@linux.intel.com wrote:
 On Thu, May 28, 2015 at 02:26:03PM +0100, Jan Beulich wrote:
 
  --- a/xen/include/public/sysctl.h
  +++ b/xen/include/public/sysctl.h
  @@ -694,6 +694,20 @@ struct xen_sysctl_pcitopoinfo {
   typedef struct xen_sysctl_pcitopoinfo xen_sysctl_pcitopoinfo_t;
   DEFINE_XEN_GUEST_HANDLE(xen_sysctl_pcitopoinfo_t);
   
  +#define XEN_SYSCTL_PSR_CAT_get_l3_info   0
  +struct xen_sysctl_psr_cat_op {
  +uint32_t cmd;   /* IN: XEN_SYSCTL_PSR_CAT_* */
  +uint32_t target;/* IN: socket to be operated on */
 
 If this is always the socket number, why would the variable be
 named anything other than socket. If otoh subsequent patches
 use it differently, I think the comment should be omitted now
 rather than being dropped then (or it should be given its final
 wording from the beginning).
 
 Or 'target to be operated on'?

Fine with me. Just not something that may end up being confusing.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCHv10 3/4] gnttab: make the grant table lock a read-write lock

2015-05-29 Thread Jan Beulich
 On 28.05.15 at 18:09, dvra...@cantab.net wrote:
 On 28/05/15 15:55, Jan Beulich wrote:
 On 26.05.15 at 20:00, david.vra...@citrix.com wrote:
 @@ -254,23 +254,23 @@ double_gt_lock(struct grant_table *lgt, struct 
 grant_table *rgt)
  {
  if ( lgt  rgt )
  {
 -spin_lock(lgt-lock);
 -spin_lock(rgt-lock);
 +write_lock(lgt-lock);
 +write_lock(rgt-lock);
  }
  else
  {
  if ( lgt != rgt )
 -spin_lock(rgt-lock);
 -spin_lock(lgt-lock);
 +write_lock(rgt-lock);
 +write_lock(lgt-lock);
  }
  }
 
 So I looked at the two uses of double_gt_lock() again: in both cases
 only a read lock is needed on rgt (which is also the natural thing to
 expect: we aren't supposed to modify the remote domain's grant
 table in any way here). Albeit that's contradicting ...
 
 See comment below.
 
 @@ -568,10 +568,10 @@ static void mapcount(
  *wrc = *rdc = 0;
  
  /*
 - * Must have the remote domain's grant table lock while counting
 - * its active entries.
 + * Must have the remote domain's grant table write lock while
 + * counting its active entries.
   */
 -ASSERT(spin_is_locked(rd-grant_table-lock));
 +ASSERT(rw_is_write_locked(rd-grant_table-lock));
 
 ... this: Why would we need to hold the write lock here? We're
 not changing anything in rd-grant_table.
 
 @@ -837,12 +838,22 @@ __gnttab_map_grant_ref(
  
  TRACE_1D(TRC_MEM_PAGE_GRANT_MAP, op-dom);
  
 +/*
 + * All maptrack entry users check mt-flags first before using the
 + * other fields so just ensure the flags field is stored last.
 + *
 + * However, if gnttab_need_iommu_mapping() then this would race
 + * with a concurrent mapcount() call (on an unmap, for example)
 + * and a lock is required.
 + */
  mt = maptrack_entry(lgt, handle);
  mt-domid = op-dom;
  mt-ref   = op-ref;
 -mt-flags = op-flags;
 +wmb();
 +write_atomic(mt-flags, op-flags);
 Further, why are only races against mapcount()
 a problem, but not such against __gnttab_unmap_common() as a
 whole? I.e. what's the locking around the op-map-flags and
 op-map-domid accesses below good for? Or, alternatively, isn't
 this an indication of a problem with the previous patch splitting off
 the maptrack lock (effectively leaving some map track entry
 accesses without any guard)?
 
 The double_gt_lock() takes both write locks, thus does not race with
 __gnttab_unmap_common clearing the flag on the maptrack entry which is
 done while holding the remote read lock.

The maptrack entries are items of the local domain, i.e. the state
of the remote domain's lock shouldn't matter there at all. Anything
else would be extremely counterintuitive and hence prone to future
breakage. With that the earlier two comments (above) remain un-
addressed too.

 -double_gt_unlock(lgt, rgt);
 +if ( gnttab_need_iommu_mapping(ld) )
 +double_gt_unlock(lgt, rgt);
 
 I think you shouldn't rely on gnttab_need_iommu_mapping() to
 produce the same result upon this and the earlier invocation, i.e.
 you ought to latch the first call's result into a local variable.
 
 Um. Okay. But if this changes during the life time of a domain it's
 going to either leak iommu mappings or fail to create them which sounds
 rather broken to me.

Hotplugging a passed through device into a domain can change the
outcome of iommu_needed(). I wouldn't be surprised if the
combination of mapped grants and passed through devices didn't
correctly deal with all (corner) cases, as it seems likely to me that
such domains aren't of wide spread use (yet). In particular I don't
see arch_iommu_populate_page_table() take care of active grant
mappings at all.

 @@ -2645,7 +2663,7 @@ __gnttab_swap_grant_ref(grant_ref_t ref_a, 
 grant_ref_t ref_b)
  struct active_grant_entry *act_b = NULL;
  s16 rc = GNTST_okay;
  
 -spin_lock(gt-lock);
 +write_lock(gt-lock);
  
  /* Bounds check on the grant refs */
  if ( unlikely(ref_a = nr_grant_entries(d-grant_table)))
 @@ -2689,7 +2707,7 @@ out:
  active_entry_release(act_b);
  if ( act_a != NULL )
  active_entry_release(act_a);
 -spin_unlock(gt-lock);
 +write_unlock(gt-lock);
 
 It would seem to me that these could be dropped when the per-active-
 entry locks get introduced.
 
 I'm not sure what you want dropped here?  We require the write lock here
 because we're taking two active entries at once.

Ah, right. But couldn't the write lock then be dropped as soon as the
two active entries got locked?

Jan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCHv10 4/4] gnttab: use per-VCPU maptrack free lists

2015-05-29 Thread Jan Beulich
 On 28.05.15 at 18:10, dvra...@cantab.net wrote:
 On 28/05/15 16:39, Jan Beulich wrote:
 On 26.05.15 at 20:00, david.vra...@citrix.com wrote:
 --- a/xen/common/grant_table.c
 +++ b/xen/common/grant_table.c
 @@ -57,7 +57,7 @@ integer_param(gnttab_max_frames, max_grant_frames);
   * New options allow to set max_maptrack_frames and
   * map_grant_table_frames independently.
   */
 -#define DEFAULT_MAX_MAPTRACK_FRAMES 256
 +#define DEFAULT_MAX_MAPTRACK_FRAMES 1024
 
 Apart from everything else this again results in ...
 
 @@ -1457,6 +1491,17 @@ gnttab_setup_table(
  gt = d-grant_table;
  write_lock(gt-lock);
  
 +/* Tracking of mapped foreign frames table */
 +gt-maptrack = xzalloc_array(struct grant_mapping *, 
 max_maptrack_frames);
 
 ... this becoming an order-1 runtime allocation on other than 32-bit
 ARM.
 
 I thought we agreed that this was temporary until vzalloc() was added?

Was it this one? And anyway, the vzalloc() addition went in almost
two weeks ago.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v8 01/13] x86: add socket_cpumask

2015-05-29 Thread Jan Beulich
 On 29.05.15 at 10:28, chao.p.p...@linux.intel.com wrote:
 On Fri, May 29, 2015 at 09:01:53AM +0100, Jan Beulich wrote:
  On 29.05.15 at 04:35, chao.p.p...@linux.intel.com wrote:
  On Thu, May 28, 2015 at 01:38:05PM +0100, Jan Beulich wrote:
   On 21.05.15 at 10:41, chao.p.p...@linux.intel.com wrote:
   --- a/xen/arch/x86/mpparse.c
   +++ b/xen/arch/x86/mpparse.c
   @@ -87,6 +87,18 @@ void __init set_nr_cpu_ids(unsigned int max_cpus)
#endif
}

   +void __init set_nr_sockets(void)
   +{
   +unsigned int cpus = bitmap_weight(phys_cpu_present_map.mask,
   +  boot_cpu_data.x86_max_cores *
   +  boot_cpu_data.x86_num_siblings);
   +
   +if ( cpus == 0 )
   +cpus = 1;
   +
   +nr_sockets = DIV_ROUND_UP(num_processors + disabled_cpus, cpus);
   +}
  
  Is there a reason why this can't just be added to the end of the
  immediately preceding set_nr_cpu_ids()?
  
  You mean the declaration or invocation? If the former I have no special
  reason for it (e.g. I can change it).
 
 Neither - I just don't see the need for a new function.
 
 In which case the invocation of set_nr_cpu_ids() should move to the
 place where now set_nr_sockets() is invoked, to make sure
 boot_cpu_data.x86_max_cores/x86_num_siblings available, which may not be
 your expectation.

Ah, in which case this _is_ the explanation, albeit only provided the
use of the two boot_cpu_data fields has to remain (which I had put
under question). And if these have to remain, couldn't this be done
in a presmp initcall instead of an explicitly called function?

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH] x86/paging: remove pointless current domain checks

2015-05-29 Thread Jan Beulich
Checking that the subject domain is not the current one is pointless
when already having paused that domain: domain_pause() already
ASSERT()s this to be the case.

Signed-off-by: Jan Beulich jbeul...@suse.com

--- a/xen/arch/x86/mm/hap/hap.c
+++ b/xen/arch/x86/mm/hap/hap.c
@@ -464,13 +464,6 @@ int hap_enable(struct domain *d, u32 mod
 
 domain_pause(d);
 
-/* error check */
-if ( (d == current-domain) )
-{
-rv = -EINVAL;
-goto out;
-}
-
 old_pages = d-arch.paging.hap.total_pages;
 if ( old_pages == 0 )
 {
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -2972,8 +2972,7 @@ int shadow_enable(struct domain *d, u32 
 domain_pause(d);
 
 /* Sanity check the arguments */
-if ( (d == current-domain) ||
- shadow_mode_enabled(d) ||
+if ( shadow_mode_enabled(d) ||
  ((mode  PG_translate)  !(mode  PG_refcounts)) ||
  ((mode  PG_external)  !(mode  PG_translate)) )
 {



x86/paging: remove pointless current domain checks

Checking that the subject domain is not the current one is pointless
when already having paused that domain: domain_pause() already
ASSERT()s this to be the case.

Signed-off-by: Jan Beulich jbeul...@suse.com

--- a/xen/arch/x86/mm/hap/hap.c
+++ b/xen/arch/x86/mm/hap/hap.c
@@ -464,13 +464,6 @@ int hap_enable(struct domain *d, u32 mod
 
 domain_pause(d);
 
-/* error check */
-if ( (d == current-domain) )
-{
-rv = -EINVAL;
-goto out;
-}
-
 old_pages = d-arch.paging.hap.total_pages;
 if ( old_pages == 0 )
 {
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -2972,8 +2972,7 @@ int shadow_enable(struct domain *d, u32 
 domain_pause(d);
 
 /* Sanity check the arguments */
-if ( (d == current-domain) ||
- shadow_mode_enabled(d) ||
+if ( shadow_mode_enabled(d) ||
  ((mode  PG_translate)  !(mode  PG_refcounts)) ||
  ((mode  PG_external)  !(mode  PG_translate)) )
 {
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC PATCH 00/13] Persistent grant maps for xen net drivers

2015-05-29 Thread Yuzhou (C)
Hi,

About rx zerocopy, I have a question:

If some application make a socket, then listen and accept, the client 
sends packets to it, but it doesn't recv from this socket right now, all 
persistent grant page would be in used.
So other application cannot receive any packets.  Is my guess right or wrong?

YuZhou

-Original Message-
From: xen-devel-boun...@lists.xen.org [mailto:xen-devel-boun...@lists.xen.org] 
On Behalf Of Joao Martins
Sent: Friday, May 22, 2015 6:27 PM
To: Wei Liu
Cc: ian.campb...@citrix.com; net...@vger.kernel.org; david.vra...@citrix.com; 
xen-de...@lists.xenproject.org; boris.ostrov...@oracle.com
Subject: Re: [Xen-devel] [RFC PATCH 00/13] Persistent grant maps for xen net 
drivers


On 19 May 2015, at 17:39, Wei Liu wei.l...@citrix.com wrote:

 On Tue, May 12, 2015 at 07:18:24PM +0200, Joao Martins wrote:
 
 There have been recently[3] some discussions and issues raised on 
 persistent grants for the block layer, though the numbers above show 
 some significant improvements specially on more network intensive 
 workloads and provide a margin for comparison against future 
 map/unmap improvements.
 
 Any comments or suggestions are welcome, Thanks!
 
 Thanks, the numbers certainly look interesting.
 
 I'm just a bit concerned about the complexity of netback. I've 
 commented on individual patches, we can discuss the issues there.

Thanks a lot for the review! It does add more complexity, mainly for the TX 
path, but I also would like to mention that a portion of this changeset is also 
the persistent grants ops that could potentially live outside.

Joao

 [1] http://article.gmane.org/gmane.linux.network/249383
 [2] http://bit.ly/1IhJfXD
 [3] 
 http://lists.xen.org/archives/html/xen-devel/2015-02/msg02292.html
 
 Joao Martins (13):
  xen-netback: add persistent grant tree ops
  xen-netback: xenbus feature persistent support
  xen-netback: implement TX persistent grants
  xen-netback: implement RX persistent grants
  xen-netback: refactor xenvif_rx_action
  xen-netback: copy buffer on xenvif_start_xmit()
  xen-netback: add persistent tree counters to debugfs
  xen-netback: clone skb if skb-xmit_more is set
  xen-netfront: move grant_{ref,page} to struct grant
  xen-netfront: refactor claim/release grant
  xen-netfront: feature-persistent xenbus support
  xen-netfront: implement TX persistent grants
  xen-netfront: implement RX persistent grants
 
 drivers/net/xen-netback/common.h|  79 
 drivers/net/xen-netback/interface.c |  78 +++-
 drivers/net/xen-netback/netback.c   | 873 
 ++--
 drivers/net/xen-netback/xenbus.c|  24 +
 drivers/net/xen-netfront.c  | 362 ---
 5 files changed, 1216 insertions(+), 200 deletions(-)
 
 --
 2.1.3



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Xen BUG at page_alloc.c:1738 (Xen 4.5)

2015-05-29 Thread Jason Fritcher
On Wed, 20 May 2015, Major Hayden wrote:

 On 05/20/2015 05:41 AM, Jan Beulich wrote:
  Considering that no-one else is seeing this - is this perhaps connected
  to you building Xen with pre-release gcc 5.0.1? This is also because in
  order for the above to indeed occur, mmio_ro_do_page_fault()'s
  put_page() would need to drop the last reference of a page, yet
  page_get_owner_and_reference() doesn't obtain a reference when
  a page is unallocated (and hence unowned), i.e. normally a page
  would have a refcount of at least 2 here. Hence this would be
  possible only due to a race, but the exact same race to be observed
  on different hardware _and_ under an emulator is extremely unlikely.

You could try with the xen.gz file from 
https://copr-be.cloud.fedoraproject.org/results/myoung/xentest/fedora-21-x86_64/xen-4.5.1-0.rc1.fc21/xen-hypervisor-4.5.1-0.rc1.fc21.x86_64.rpm
 
https://copr-be.cloud.fedoraproject.org/results/myoung/xentest/fedora-21-x86_64/xen-4.5.1-0.rc1.fc21/xen-hypervisor-4.5.1-0.rc1.fc21.x86_64.rpm
It is roughly the same version of xen but built against Fedora 21 and gcc 
4.9.2. If that works then it probably is gcc 5.
Greetings,

I have run into pretty much the same issue as the original poster.

I am running a recently updated Arch Linux system, with GCC 5.1.0, using UEFI 
and gummiboot to boot. I currently have a build of Xen 4.4.1, built with GCC 
4.9.2 from before my last update, that is functioning correctly on this 
machine. But the builds of Xen 4.5.0, using GCC 5 and mingw64-binutils for the 
EFI binary, are all failing when Xen starts the Linux kernel, with the same 
error mentioned in the subject. Below is the boot log I captured via the serial 
port.

http://pastebin.com/bBC78306

Wondering if my specific toolchain was the issue, I downloaded the Fedora 22 
version of xen-hypervisor and installed its EFI Xen binary over my compiled 
binary and received an identical error message, with slightly different 
addresses in the panic dump. The Fedora version was compiled with GCC 5.0.1. 
Below is the boot log I captured from that binary.

http://pastebin.com/jvg1JazC http://pastebin.com/jvg1JazC

After finding this thread, and specifically, the quoted message above, I 
downloaded that xen-hypervisor package and installed its EFI Xen binary. That 
binary boots successfully, as seen by the captured boot log below.

http://pastebin.com/DKxwaU2U

So, while I’m not familiar enough with Xen to begin to have an idea of what 
could possibly be wrong with Xen or GCC 5 to be causing this bug, I’d like to 
do what I can to track down the issue so I can get a working build of Xen 4.5. 
:)

Thanks!

—
Jason Fritcher



smime.p7s
Description: S/MIME cryptographic signature
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v2 2/4] libxc: print more error messages when failed

2015-05-29 Thread Wei Liu
No functional changes introduced.

Signed-off-by: Wei Liu wei.l...@citrix.com
Cc: Ian Campbell ian.campb...@citrix.com
Cc: Ian Jackson ian.jack...@eu.citrix.com
---
 tools/libxc/xc_hvm_build_x86.c | 30 +++---
 1 file changed, 27 insertions(+), 3 deletions(-)

diff --git a/tools/libxc/xc_hvm_build_x86.c b/tools/libxc/xc_hvm_build_x86.c
index 92422bf..df4b7ed 100644
--- a/tools/libxc/xc_hvm_build_x86.c
+++ b/tools/libxc/xc_hvm_build_x86.c
@@ -259,7 +259,10 @@ static int setup_guest(xc_interface *xch,
 
 memset(elf, 0, sizeof(elf));
 if ( elf_init(elf, image, image_size) != 0 )
+{
+PERROR(Could not initialise ELF image);
 goto error_out;
+}
 
 xc_elf_set_logfile(xch, elf, 1);
 
@@ -522,15 +525,24 @@ static int setup_guest(xc_interface *xch,
 DPRINTF(  1GB PAGES: 0x%016lx\n, stat_1gb_pages);
 
 if ( loadelfimage(xch, elf, dom, page_array) != 0 )
+{
+PERROR(Could not load ELF image);
 goto error_out;
+}
 
 if ( loadmodules(xch, args, m_start, m_end, dom, page_array) != 0 )
-goto error_out;
+{
+PERROR(Could not load ACPI modules);
+goto error_out;
+}
 
 if ( (hvm_info_page = xc_map_foreign_range(
   xch, dom, PAGE_SIZE, PROT_READ | PROT_WRITE,
   HVM_INFO_PFN)) == NULL )
+{
+PERROR(Could not map hvm info page);
 goto error_out;
+}
 build_hvm_info(hvm_info_page, args);
 munmap(hvm_info_page, PAGE_SIZE);
 
@@ -547,7 +559,10 @@ static int setup_guest(xc_interface *xch,
 }
 
 if ( xc_clear_domain_pages(xch, dom, special_pfn(0), NR_SPECIAL_PAGES) )
-goto error_out;
+{
+PERROR(Could not clear special pages);
+goto error_out;
+}
 
 xc_hvm_param_set(xch, dom, HVM_PARAM_STORE_PFN,
  special_pfn(SPECIALPAGE_XENSTORE));
@@ -580,7 +595,10 @@ static int setup_guest(xc_interface *xch,
 }
 
 if ( xc_clear_domain_pages(xch, dom, ioreq_server_pfn(0), 
NR_IOREQ_SERVER_PAGES) )
-goto error_out;
+{
+PERROR(Could not clear ioreq page);
+goto error_out;
+}
 
 /* Tell the domain where the pages are and how many there are */
 xc_hvm_param_set(xch, dom, HVM_PARAM_IOREQ_SERVER_PFN,
@@ -595,7 +613,10 @@ static int setup_guest(xc_interface *xch,
 if ( (ident_pt = xc_map_foreign_range(
   xch, dom, PAGE_SIZE, PROT_READ | PROT_WRITE,
   special_pfn(SPECIALPAGE_IDENT_PT))) == NULL )
+{
+PERROR(Could not map special page ident_pt);
 goto error_out;
+}
 for ( i = 0; i  PAGE_SIZE / sizeof(*ident_pt); i++ )
 ident_pt[i] = ((i  22) | _PAGE_PRESENT | _PAGE_RW | _PAGE_USER |
_PAGE_ACCESSED | _PAGE_DIRTY | _PAGE_PSE);
@@ -610,7 +631,10 @@ static int setup_guest(xc_interface *xch,
 char *page0 = xc_map_foreign_range(
 xch, dom, PAGE_SIZE, PROT_READ | PROT_WRITE, 0);
 if ( page0 == NULL )
+{
+PERROR(Could not map page0);
 goto error_out;
+}
 page0[0] = 0xe9;
 *(uint32_t *)page0[1] = entry_eip - 5;
 munmap(page0, PAGE_SIZE);
-- 
1.9.1


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v2 3/4] libxc: rework vnuma bits in setup_guest

2015-05-29 Thread Wei Liu
Make the setup process similar to PV counterpart. That is, to allocate a
P2M array that covers the whole memory range and start from there. This
is clearer than using an array with no holes in it.

Also the dummy layout should take MMIO hole into consideration. We might
end up having two vmemranges in the dummy layout.

Signed-off-by: Wei Liu wei.l...@citrix.com
Cc: Ian Campbell ian.campb...@citrix.com
Cc: Ian Jackson ian.jack...@eu.citrix.com
---
v2: only generate 1 vnode in dummy layout
---
 tools/libxc/xc_hvm_build_x86.c | 62 --
 1 file changed, 47 insertions(+), 15 deletions(-)

diff --git a/tools/libxc/xc_hvm_build_x86.c b/tools/libxc/xc_hvm_build_x86.c
index df4b7ed..b3855e0 100644
--- a/tools/libxc/xc_hvm_build_x86.c
+++ b/tools/libxc/xc_hvm_build_x86.c
@@ -238,6 +238,7 @@ static int setup_guest(xc_interface *xch,
 {
 xen_pfn_t *page_array = NULL;
 unsigned long i, vmemid, nr_pages = args-mem_size  PAGE_SHIFT;
+unsigned long p2m_size;
 unsigned long target_pages = args-mem_target  PAGE_SHIFT;
 unsigned long entry_eip, cur_pages, cur_pfn;
 void *hvm_info_page;
@@ -254,8 +255,8 @@ static int setup_guest(xc_interface *xch,
 xen_pfn_t special_array[NR_SPECIAL_PAGES];
 xen_pfn_t ioreq_server_array[NR_IOREQ_SERVER_PAGES];
 uint64_t total_pages;
-xen_vmemrange_t dummy_vmemrange;
-unsigned int dummy_vnode_to_pnode;
+xen_vmemrange_t dummy_vmemrange[2];
+unsigned int dummy_vnode_to_pnode[1];
 
 memset(elf, 0, sizeof(elf));
 if ( elf_init(elf, image, image_size) != 0 )
@@ -275,17 +276,35 @@ static int setup_guest(xc_interface *xch,
 
 if ( args-nr_vmemranges == 0 )
 {
-/* Build dummy vnode information */
-dummy_vmemrange.start = 0;
-dummy_vmemrange.end   = args-mem_size;
-dummy_vmemrange.flags = 0;
-dummy_vmemrange.nid   = 0;
+/* Build dummy vnode information
+ *
+ * Guest physical address space layout:
+ * [0, hole_start) [hole_start, 4G) [4G, highmem_end)
+ *
+ * Of course if there is no high memory, the second vmemrange
+ * has no effect on the actual result.
+ */
+
+dummy_vmemrange[0].start = 0;
+dummy_vmemrange[0].end   = args-lowmem_end;
+dummy_vmemrange[0].flags = 0;
+dummy_vmemrange[0].nid   = 0;
 args-nr_vmemranges = 1;
-args-vmemranges = dummy_vmemrange;
 
-dummy_vnode_to_pnode = XC_NUMA_NO_NODE;
+if ( args-highmem_end  (1ULL  32) )
+{
+dummy_vmemrange[1].start = 1ULL  32;
+dummy_vmemrange[1].end   = args-highmem_end;
+dummy_vmemrange[1].flags = 0;
+dummy_vmemrange[1].nid   = 0;
+
+args-nr_vmemranges++;
+}
+
+dummy_vnode_to_pnode[0] = XC_NUMA_NO_NODE;
 args-nr_vnodes = 1;
-args-vnode_to_pnode = dummy_vnode_to_pnode;
+args-vmemranges = dummy_vmemrange;
+args-vnode_to_pnode = dummy_vnode_to_pnode;
 }
 else
 {
@@ -297,9 +316,15 @@ static int setup_guest(xc_interface *xch,
 }
 
 total_pages = 0;
+p2m_size = 0;
 for ( i = 0; i  args-nr_vmemranges; i++ )
+{
 total_pages += ((args-vmemranges[i].end - args-vmemranges[i].start)
  PAGE_SHIFT);
+p2m_size = p2m_size  (args-vmemranges[i].end  PAGE_SHIFT) ?
+p2m_size : (args-vmemranges[i].end  PAGE_SHIFT);
+}
+
 if ( total_pages != (args-mem_size  PAGE_SHIFT) )
 {
 PERROR(vNUMA memory pages mismatch (0x%PRIx64 != 0x%PRIx64),
@@ -325,16 +350,23 @@ static int setup_guest(xc_interface *xch,
 DPRINTF(  TOTAL:%016PRIx64-%016PRIx64\n, v_start, v_end);
 DPRINTF(  ENTRY:%016PRIx64\n, elf_uval(elf, elf.ehdr, e_entry));
 
-if ( (page_array = malloc(nr_pages * sizeof(xen_pfn_t))) == NULL )
+if ( (page_array = malloc(p2m_size * sizeof(xen_pfn_t))) == NULL )
 {
 PERROR(Could not allocate memory.);
 goto error_out;
 }
 
-for ( i = 0; i  nr_pages; i++ )
-page_array[i] = i;
-for ( i = args-mmio_start  PAGE_SHIFT; i  nr_pages; i++ )
-page_array[i] += args-mmio_size  PAGE_SHIFT;
+for ( i = 0; i  p2m_size; i++ )
+page_array[i] = ((xen_pfn_t)-1);
+for ( vmemid = 0; vmemid  args-nr_vmemranges; vmemid++ )
+{
+uint64_t pfn;
+
+for ( pfn = args-vmemranges[vmemid].start  PAGE_SHIFT;
+  pfn  args-vmemranges[vmemid].end  PAGE_SHIFT;
+  pfn++ )
+page_array[pfn] = pfn;
+}
 
 /*
  * Try to claim pages for early warning of insufficient memory available.
-- 
1.9.1


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v2 4/4] libxl: fix HVM vNUMA

2015-05-29 Thread Wei Liu
This patch does two thing:

The original code erroneously fills in xc_hvm_build_args before
generating vmemranges. The effect is that guest memory is populated
without vNUMA information. Move the hunk to right place to fix this.

Move the subtraction of video ram to libxl__vnuma_build_vmemrange_hvm
because it's the central place for generating vmemranges.

Reported-by: Boris Ostrovsky boris.ostrov...@oracle.com
Signed-off-by: Wei Liu wei.l...@citrix.com
Cc: Ian Campbell ian.campb...@citrix.com
Cc: Ian Jackson ian.jack...@eu.citrix.com
Cc: Dario Faggioli dario.faggi...@citrix.com
---
 tools/libxl/libxl_dom.c   | 32 ++--
 tools/libxl/libxl_vnuma.c | 15 ++-
 2 files changed, 24 insertions(+), 23 deletions(-)

diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index dccc9ac..867172a 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -961,6 +961,16 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid,
 if (info-num_vnuma_nodes != 0) {
 int i;
 
+ret = libxl__vnuma_build_vmemrange_hvm(gc, domid, info, state, args);
+if (ret) {
+LOGEV(ERROR, ret, hvm build vmemranges failed);
+goto out;
+}
+ret = libxl__vnuma_config_check(gc, info, state);
+if (ret) goto out;
+ret = set_vnuma_info(gc, domid, info, state);
+if (ret) goto out;
+
 args.nr_vmemranges = state-num_vmemranges;
 args.vmemranges = libxl__malloc(gc, sizeof(*args.vmemranges) *
 args.nr_vmemranges);
@@ -972,17 +982,6 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid,
 args.vmemranges[i].nid   = state-vmemranges[i].nid;
 }
 
-/* Consider video ram belongs to vmemrange 0 -- just shrink it
- * by the size of video ram.
- */
-if (((args.vmemranges[0].end - args.vmemranges[0].start)  10)
- info-video_memkb) {
-LOG(ERROR, vmemrange 0 too small to contain video ram);
-goto out;
-}
-
-args.vmemranges[0].end -= (info-video_memkb  10);
-
 args.nr_vnodes = info-num_vnuma_nodes;
 args.vnode_to_pnode = libxl__malloc(gc, sizeof(*args.vnode_to_pnode) *
 args.nr_vnodes);
@@ -996,17 +995,6 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid,
 goto out;
 }
 
-if (info-num_vnuma_nodes != 0) {
-ret = libxl__vnuma_build_vmemrange_hvm(gc, domid, info, state, args);
-if (ret) {
-LOGEV(ERROR, ret, hvm build vmemranges failed);
-goto out;
-}
-ret = libxl__vnuma_config_check(gc, info, state);
-if (ret) goto out;
-ret = set_vnuma_info(gc, domid, info, state);
-if (ret) goto out;
-}
 ret = hvm_build_set_params(ctx-xch, domid, info, state-store_port,
state-store_mfn, state-console_port,
state-console_mfn, state-store_domid,
diff --git a/tools/libxl/libxl_vnuma.c b/tools/libxl/libxl_vnuma.c
index cac78d7..56856d2 100644
--- a/tools/libxl/libxl_vnuma.c
+++ b/tools/libxl/libxl_vnuma.c
@@ -257,6 +257,7 @@ int libxl__vnuma_build_vmemrange_hvm(libxl__gc *gc,
 uint64_t hole_start, hole_end, next;
 int nid, nr_vmemrange;
 xen_vmemrange_t *vmemranges;
+int rc;
 
 /* Derive vmemranges from vnode size and memory hole.
  *
@@ -277,6 +278,16 @@ int libxl__vnuma_build_vmemrange_hvm(libxl__gc *gc,
 libxl_vnode_info *p = b_info-vnuma_nodes[nid];
 uint64_t remaining_bytes = p-memkb  10;
 
+/* Consider video ram belongs to vnode 0 */
+if (nid == 0) {
+if (p-memkb  b_info-video_memkb) {
+LOG(ERROR, vnode 0 too small to contain video ram);
+rc = ERROR_INVAL;
+goto out;
+}
+remaining_bytes -= (b_info-video_memkb  10);
+}
+
 while (remaining_bytes  0) {
 uint64_t count = remaining_bytes;
 
@@ -300,7 +311,9 @@ int libxl__vnuma_build_vmemrange_hvm(libxl__gc *gc,
 state-vmemranges = vmemranges;
 state-num_vmemranges = nr_vmemrange;
 
-return 0;
+rc = 0;
+out:
+return rc;
 }
 
 /*
-- 
1.9.1


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v2 0/4] Fix HVM vNUMA

2015-05-29 Thread Wei Liu
Boris discovered that HVM vNUMA didn't actually work. This patch series fixes
that.

The first patch is a prerequisite patch for the actual fixes. The second patch
is to help debugging. The fixes are in the last two patches, which can be
squashed into one if necessary.

Patch 3 is updated in this series. Other patches remain the same as last
version.

Wei.

Wei Liu (4):
  libxc/libxl: fill xc_hvm_build_args in libxl
  libxc: print more error messages when failed
  libxc: rework vnuma bits in setup_guest
  libxl: fix HVM vNUMA

 tools/libxc/xc_hvm_build_x86.c | 125 ++---
 tools/libxl/libxl_dom.c|  48 
 tools/libxl/libxl_vnuma.c  |  15 -
 3 files changed, 119 insertions(+), 69 deletions(-)

-- 
1.9.1


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v2 1/4] libxc/libxl: fill xc_hvm_build_args in libxl

2015-05-29 Thread Wei Liu
When building HVM guests, originally some fields of xc_hvm_build_args
are filled in xc_hvm_build (and buried in the wrong function), some are
set in libxl__build_hvm before passing xc_hvm_build_args to
xc_hvm_build. This is fragile.

After examining the code in xc_hvm_build that sets those fields, we can
in fact move setting of mmio_start etc in libxl. This way we consolidate
memory layout setting in libxl.

The setting of firmware data related fields is left in xc_hvm_build
because it depends on parsing ELF image. Those fields only point to
scratch data that doesn't affect memory layout.

There should be no change in the generated guest memory layout.

Signed-off-by: Wei Liu wei.l...@citrix.com
Cc: Ian Campbell ian.campb...@citrix.com
Cc: Ian Jackson ian.jack...@eu.citrix.com
---
Cc: Chen, Tiejun tiejun.c...@intel.com

This might affect your RMRR patch series.

I once said xc_hvm_build would touch various xc_hvm_build_args fields
that would affect guest memory layout. It won't be that case anymore
with this patch.
---
 tools/libxc/xc_hvm_build_x86.c | 37 +++--
 tools/libxl/libxl_dom.c| 16 
 2 files changed, 23 insertions(+), 30 deletions(-)

diff --git a/tools/libxc/xc_hvm_build_x86.c b/tools/libxc/xc_hvm_build_x86.c
index e45ae4a..92422bf 100644
--- a/tools/libxc/xc_hvm_build_x86.c
+++ b/tools/libxc/xc_hvm_build_x86.c
@@ -88,22 +88,14 @@ static int modules_init(struct xc_hvm_build_args *args,
 return 0;
 }
 
-static void build_hvm_info(void *hvm_info_page, uint64_t mem_size,
-   uint64_t mmio_start, uint64_t mmio_size,
+static void build_hvm_info(void *hvm_info_page,
struct xc_hvm_build_args *args)
 {
 struct hvm_info_table *hvm_info = (struct hvm_info_table *)
 (((unsigned char *)hvm_info_page) + HVM_INFO_OFFSET);
-uint64_t lowmem_end = mem_size, highmem_end = 0;
 uint8_t sum;
 int i;
 
-if ( lowmem_end  mmio_start )
-{
-highmem_end = (1ull32) + (lowmem_end - mmio_start);
-lowmem_end = mmio_start;
-}
-
 memset(hvm_info_page, 0, PAGE_SIZE);
 
 /* Fill in the header. */
@@ -116,14 +108,10 @@ static void build_hvm_info(void *hvm_info_page, uint64_t 
mem_size,
 memset(hvm_info-vcpu_online, 0xff, sizeof(hvm_info-vcpu_online));
 
 /* Memory parameters. */
-hvm_info-low_mem_pgend = lowmem_end  PAGE_SHIFT;
-hvm_info-high_mem_pgend = highmem_end  PAGE_SHIFT;
+hvm_info-low_mem_pgend = args-lowmem_end  PAGE_SHIFT;
+hvm_info-high_mem_pgend = args-highmem_end  PAGE_SHIFT;
 hvm_info-reserved_mem_pgstart = ioreq_server_pfn(0);
 
-args-lowmem_end = lowmem_end;
-args-highmem_end = highmem_end;
-args-mmio_start = mmio_start;
-
 /* Finish with the checksum. */
 for ( i = 0, sum = 0; i  hvm_info-length; i++ )
 sum += ((uint8_t *)hvm_info)[i];
@@ -251,8 +239,6 @@ static int setup_guest(xc_interface *xch,
 xen_pfn_t *page_array = NULL;
 unsigned long i, vmemid, nr_pages = args-mem_size  PAGE_SHIFT;
 unsigned long target_pages = args-mem_target  PAGE_SHIFT;
-uint64_t mmio_start = (1ull  32) - args-mmio_size;
-uint64_t mmio_size = args-mmio_size;
 unsigned long entry_eip, cur_pages, cur_pfn;
 void *hvm_info_page;
 uint32_t *ident_pt;
@@ -344,8 +330,8 @@ static int setup_guest(xc_interface *xch,
 
 for ( i = 0; i  nr_pages; i++ )
 page_array[i] = i;
-for ( i = mmio_start  PAGE_SHIFT; i  nr_pages; i++ )
-page_array[i] += mmio_size  PAGE_SHIFT;
+for ( i = args-mmio_start  PAGE_SHIFT; i  nr_pages; i++ )
+page_array[i] += args-mmio_size  PAGE_SHIFT;
 
 /*
  * Try to claim pages for early warning of insufficient memory available.
@@ -446,7 +432,7 @@ static int setup_guest(xc_interface *xch,
   * range */
  !check_mmio_hole(cur_pfn  PAGE_SHIFT,
   SUPERPAGE_1GB_NR_PFNS  PAGE_SHIFT,
-  mmio_start, mmio_size) )
+  args-mmio_start, args-mmio_size) )
 {
 long done;
 unsigned long nr_extents = count  SUPERPAGE_1GB_SHIFT;
@@ -545,7 +531,7 @@ static int setup_guest(xc_interface *xch,
   xch, dom, PAGE_SIZE, PROT_READ | PROT_WRITE,
   HVM_INFO_PFN)) == NULL )
 goto error_out;
-build_hvm_info(hvm_info_page, v_end, mmio_start, mmio_size, args);
+build_hvm_info(hvm_info_page, args);
 munmap(hvm_info_page, PAGE_SIZE);
 
 /* Allocate and clear special pages. */
@@ -661,12 +647,6 @@ int xc_hvm_build(xc_interface *xch, uint32_t domid,
 if ( args.image_file_name == NULL )
 return -1;
 
-if ( args.mem_target == 0 )
-args.mem_target = args.mem_size;
-
-if ( args.mmio_size == 0 )
-args.mmio_size = HVM_BELOW_4G_MMIO_LENGTH;
-
 /* An HVM guest must be initialised with at least 2MB memory. */
 if 

Re: [Xen-devel] [PATCH Remus v7 0/3] Remus support for Migration-v2

2015-05-29 Thread Ian Campbell
On Mon, 2015-05-25 at 17:06 +0800, Yang Hongyang wrote:
 Ping...

Acked + Applied.

Sorry for the delay, for some reason I thought I was waiting for Andrew
to comment even though they all already had Reviewed-by.

Ian.

 
 On 05/18/2015 03:03 PM, Yang Hongyang wrote:
  This patchset implement the Remus support for Migration v2 but without
  memory compressing.
 
  Git tree available at:
   https://github.com/macrosheep/xen/tree/Remus-newmig-v7
 
  v6-v7:
 - clear deffered_pages after suspend and send dirty pages
 - initialise allocated_rec_num
 
  v5-v6:
 - refactor send_domain_memory_live()
 - introduce buffer_record()
 - remove the records buffer size limit
 
  Yang Hongyang (3):
 libxc/save: refactor of send_domain_memory_live()
 libxc/save: implement Remus checkpointed save
 libxc/restore: implement Remus checkpointed restore
 
tools/libxc/xc_sr_common.h  |  15 +++
tools/libxc/xc_sr_restore.c | 134 ---
tools/libxc/xc_sr_save.c| 217 
  
3 files changed, 295 insertions(+), 71 deletions(-)
 
 



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] efi: Reallocate memory map if ExitBootServices() fails

2015-05-29 Thread Andrew Cooper
On 29/05/15 08:48, Ross Lagerwall wrote:
 If calling ExitBootServices() fails, the memory map size may have
 increased, so determine the new size and reallocate the memory map
 before calling GetMemoryMap() again.

 This was seen on the following machine when using the iscsidxe UEFI
 driver. The machine would consistently fail the first call to
 ExitBootServices().
 System Information
 Manufacturer: Supermicro
 Product Name: X10SLE-F/HF
 BIOS Information
 Vendor: American Megatrends Inc.
 Version: 2.00
 Release Date: 04/24/2014

 Signed-off-by: Ross Lagerwall ross.lagerw...@citrix.com

Reviewed-by: Andrew Cooper andrew.coop...@citrix.com

 ---
  xen/common/efi/boot.c | 12 ++--
  1 file changed, 6 insertions(+), 6 deletions(-)

 diff --git a/xen/common/efi/boot.c b/xen/common/efi/boot.c
 index ef8476c..078f9b8 100644
 --- a/xen/common/efi/boot.c
 +++ b/xen/common/efi/boot.c
 @@ -1053,14 +1053,14 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE 
 *SystemTable)
  efi_arch_video_init(gop, info_size, mode_info);
  }
  
 -efi_bs-GetMemoryMap(efi_memmap_size, NULL, map_key,
 - efi_mdesc_size, mdesc_ver);
 -efi_memmap = efi_arch_allocate_mmap_buffer(efi_memmap_size);
 -if ( !efi_memmap )
 -blexit(LUnable to allocate memory for EFI memory map);
 -
  for ( retry = 0; ; retry = 1 )
  {
 +efi_bs-GetMemoryMap(efi_memmap_size, NULL, map_key,
 + efi_mdesc_size, mdesc_ver);
 +efi_memmap = efi_arch_allocate_mmap_buffer(efi_memmap_size);
 +if ( !efi_memmap )
 +blexit(LUnable to allocate memory for EFI memory map);
 +
  status = efi_bs-GetMemoryMap(efi_memmap_size, efi_memmap, map_key,
efi_mdesc_size, mdesc_ver);
  if ( EFI_ERROR(status) )


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [RFC][PATCH] x86: remove vmalloc.h from asm/io.h

2015-05-29 Thread Stephen Rothwell
Nothing in asm/io.h uses anything from vmalloc.h, so remove the include
and fix up the build problems in an allmodconfig (64 bit and 32 bit)
build.

This may be the place where x86 builds get vmalloc.h implicitly included
and that tends to hide places where vmalloc() et al are added to files
but the include of vmalloc.h is forgotten.

Cc: Thomas Gleixner t...@linutronix.de
Cc: Ingo Molnar mi...@redhat.com
Cc: H. Peter Anvin h...@zytor.com
Cc: x...@kernel.org
Cc: Konrad Rzeszutek Wilk konrad.w...@oracle.com
Cc: Boris Ostrovsky boris.ostrov...@oracle.com
Cc: David Vrabel david.vra...@citrix.com
Cc: Anton Vorontsov an...@enomsg.org
Cc: Colin Cross ccr...@android.com
Cc: Kees Cook keesc...@chromium.org
Cc: Tony Luck tony.l...@intel.com
Cc: Rafael J. Wysocki r...@rjwysocki.net
Cc: Len Brown l...@kernel.org
Cc: Kristen Carlson Accardi kris...@linux.intel.com
Cc: Viresh Kumar viresh.ku...@linaro.org
Cc: Vinod Koul vinod.k...@intel.com
Cc: K. Y. Srinivasan k...@microsoft.com
Cc: Haiyang Zhang haiya...@microsoft.com
Cc: Hiral Patel hiral...@cisco.com
Cc: Suma Ramars sram...@cisco.com
Cc: Brian Uchino buch...@cisco.com
Cc: James E.J. Bottomley jbottom...@odin.com
Cc: Jaroslav Kysela pe...@perex.cz
Cc: Takashi Iwai ti...@suse.de
Cc: Andrew Morton a...@linux-foundation.org
Suggested-by: David Miller da...@davemloft.net
Signed-off-by: Stephen Rothwell s...@canb.auug.org.au

---

Based in Linus' tree of today.

There are probably more places that need vmalloc.h included, but this
passes 64 bit and 32 bit allmodconfig builds, so is a place to start.

Dave Miller suggested that I start this journey.

 arch/x86/include/asm/io.h  | 2 --
 arch/x86/kernel/crash.c| 1 +
 arch/x86/kernel/machine_kexec_64.c | 1 +
 arch/x86/mm/pageattr-test.c| 1 +
 arch/x86/mm/pageattr.c | 1 +
 arch/x86/xen/p2m.c | 1 +
 drivers/acpi/apei/erst.c   | 1 +
 drivers/cpufreq/intel_pstate.c | 1 +
 drivers/dma/mic_x100_dma.c | 1 +
 drivers/net/hyperv/netvsc.c| 1 +
 drivers/net/hyperv/rndis_filter.c  | 1 +
 drivers/scsi/fnic/fnic_debugfs.c   | 1 +
 drivers/scsi/fnic/fnic_trace.c | 1 +
 sound/pci/asihpi/hpioctl.c | 1 +
 14 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
index 34a5b93704d3..5791e7ace9db 100644
--- a/arch/x86/include/asm/io.h
+++ b/arch/x86/include/asm/io.h
@@ -197,8 +197,6 @@ extern void set_iounmap_nonlazy(void);
 
 #include asm-generic/iomap.h
 
-#include linux/vmalloc.h
-
 /*
  * Convert a virtual cached pointer to an uncached pointer
  */
diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index c76d3e37c6e1..e068d6683dba 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -22,6 +22,7 @@
 #include linux/elfcore.h
 #include linux/module.h
 #include linux/slab.h
+#include linux/vmalloc.h
 
 #include asm/processor.h
 #include asm/hardirq.h
diff --git a/arch/x86/kernel/machine_kexec_64.c 
b/arch/x86/kernel/machine_kexec_64.c
index 415480d3ea84..11546b462fa6 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -17,6 +17,7 @@
 #include linux/ftrace.h
 #include linux/io.h
 #include linux/suspend.h
+#include linux/vmalloc.h
 
 #include asm/init.h
 #include asm/pgtable.h
diff --git a/arch/x86/mm/pageattr-test.c b/arch/x86/mm/pageattr-test.c
index 6629f397b467..8ff686aa7e8c 100644
--- a/arch/x86/mm/pageattr-test.c
+++ b/arch/x86/mm/pageattr-test.c
@@ -9,6 +9,7 @@
 #include linux/random.h
 #include linux/kernel.h
 #include linux/mm.h
+#include linux/vmalloc.h
 
 #include asm/cacheflush.h
 #include asm/pgtable.h
diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index 89af288ec674..bedfc794b4ba 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -14,6 +14,7 @@
 #include linux/percpu.h
 #include linux/gfp.h
 #include linux/pci.h
+#include linux/vmalloc.h
 
 #include asm/e820.h
 #include asm/processor.h
diff --git a/arch/x86/xen/p2m.c b/arch/x86/xen/p2m.c
index b47124d4cd67..8b7f18e200aa 100644
--- a/arch/x86/xen/p2m.c
+++ b/arch/x86/xen/p2m.c
@@ -67,6 +67,7 @@
 #include linux/seq_file.h
 #include linux/bootmem.h
 #include linux/slab.h
+#include linux/vmalloc.h
 
 #include asm/cache.h
 #include asm/setup.h
diff --git a/drivers/acpi/apei/erst.c b/drivers/acpi/apei/erst.c
index ed65e9c4b5b0..3670bbab57a3 100644
--- a/drivers/acpi/apei/erst.c
+++ b/drivers/acpi/apei/erst.c
@@ -35,6 +35,7 @@
 #include linux/nmi.h
 #include linux/hardirq.h
 #include linux/pstore.h
+#include linux/vmalloc.h
 #include acpi/apei.h
 
 #include apei-internal.h
diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index 6414661ac1c4..2ba53f4f6af2 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -26,6 +26,7 @@
 #include linux/fs.h
 #include linux/debugfs.h
 #include linux/acpi.h
+#include linux/vmalloc.h
 #include trace/events/power.h
 
 #include asm/div64.h
diff --git 

Re: [Xen-devel] [PATCH] libxl: Don't insert PCI device into xenstore for HVM guests

2015-05-29 Thread Ross Lagerwall

On 05/29/2015 10:41 AM, Wei Liu wrote:

On Fri, May 29, 2015 at 08:59:45AM +0100, Ross Lagerwall wrote:

When doing passthrough of a PCI device for an HVM guest, don't insert
the device into xenstore, otherwise pciback attempts to use it which
conflicts with QEMU.

This manifests itself such that the first time a device is passed to a
domain, it succeeds. Subsequent attempts fail unless the device is
unbound from pciback or the machine rebooted.



The commit message looks sensible to me. However this might break
qemu-trad PCI passthrough if I'm not mistaken. What QEMU version are you
using?  Upstream or trad? Have you tested both of them?



qemu-trad. I haven't tested with qemu-upstream.

--
Ross Lagerwall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [xen-unstable test] 56759: regressions - FAIL

2015-05-29 Thread Andrew Cooper
On 28/05/15 11:10, Jan Beulich wrote:
 On 28.05.15 at 11:26, ian.campb...@citrix.com wrote:
 On Thu, 2015-05-28 at 09:50 +0100, Jan Beulich wrote:
 On 27.05.15 at 18:04, ian.campb...@citrix.com wrote:
 On Tue, 2015-05-26 at 14:29 +0100, Ian Campbell wrote:
 I've now managed to reproduce using the arndale on my desk.
 ... and now I've confirmed that reverting the spin lock change causes
 the issue to not happen any more.
 Considering that this issue has prevented a push for almost
 two weeks, I think we ought to consider reverting the two
 offending commits until the problem got sorted out.
 I think that would probably be wise. I'll try and figure out exactly
 what is going on and propose some patches ASAP.
 Now done and pushed.

Wait what?  This failure is not related to spinlocks; It is a networking
behavioural bug (hardware specific, even) which has been uncovered,
showing that there is a preexisting race condition.

It is not reasonable to revert a correct change because it has exposed
an existing race condition elsewhere.  IMO, this should have been a
force push to mark the test as non-blocking.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] efi: Reallocate memory map if ExitBootServices() fails

2015-05-29 Thread Ross Lagerwall

On 05/29/2015 10:45 AM, Jan Beulich wrote:

On 29.05.15 at 09:48, ross.lagerw...@citrix.com wrote:

--- a/xen/common/efi/boot.c
+++ b/xen/common/efi/boot.c
@@ -1053,14 +1053,14 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE 
*SystemTable)
  efi_arch_video_init(gop, info_size, mode_info);
  }

-efi_bs-GetMemoryMap(efi_memmap_size, NULL, map_key,
- efi_mdesc_size, mdesc_ver);
-efi_memmap = efi_arch_allocate_mmap_buffer(efi_memmap_size);
-if ( !efi_memmap )
-blexit(LUnable to allocate memory for EFI memory map);
-
  for ( retry = 0; ; retry = 1 )
  {
+efi_bs-GetMemoryMap(efi_memmap_size, NULL, map_key,
+ efi_mdesc_size, mdesc_ver);
+efi_memmap = efi_arch_allocate_mmap_buffer(efi_memmap_size);
+if ( !efi_memmap )
+blexit(LUnable to allocate memory for EFI memory map);


You can't blexit() anymore after having called ExitBootServices() once.
Admittedly even the PrintErrMesg() used for error handling a few
lines down is on the edge of being invalid (but this is a best effort
thing anyway).


OK, I'll convert it into a PrintErrMesg.



Further you should do a second allocation only if you positively
identified that the new size is larger than what the existing buffer
can hold. It may be worth allocating a couple of extra entries the
first time through anyway; perhaps that would even avoid th need
for this workaround.


OK.



Since, finally, this is only a workaround, as the spec clearly says:
After an Operating System calls ExitBootServices(), firmware boot
services are no longer available and it is illegal to call any boot
service. Even our re-invocation of GetMemoryMap() is bending
that (and I only hesitantly agreed for it to be added).



The spec kinda disagrees with that:
It is suggested that GetMemoryMap()be called immediately before calling 
ExitBootServices(). If MapKey value is incorrect, ExitBootServices() 
returns EFI_INVALID_PARAMETER and GetMemoryMap() with ExitBootServices() 
must be called again. Firmware implementation may choose to do a partial 
shutdown of the boot services during the first call to 
ExitBootServices(). EFI OS loader should not make calls to any boot 
service function other then GetMemoryMap() after the first call to 
ExitBootServices().


I think the patch does the right thing in the regard.

--
Ross Lagerwall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Xen BUG at page_alloc.c:1738 (Xen 4.5)

2015-05-29 Thread Andrew Cooper
On 29/05/15 07:24, Jason Fritcher wrote:
 On Wed, 20 May 2015, Major Hayden wrote:

 / On 05/20/2015 05:41 AM, Jan Beulich wrote:/
 /  Considering that no-one else is seeing this - is this perhaps connected/
 /  to you building Xen with pre-release gcc 5.0.1? This is also because in/
 /  order for the above to indeed occur, mmio_ro_do_page_fault()'s/
 /  put_page() would need to drop the last reference of a page, yet/
 /  page_get_owner_and_reference() doesn't obtain a reference when/
 /  a page is unallocated (and hence unowned), i.e. normally a page/
 /  would have a refcount of at least 2 here. Hence this would be/
 /  possible only due to a race, but the exact same race to be observed/
 /  on different hardware _and_ under an emulator is extremely unlikely./

 You could try with the xen.gz file from
 https://copr-be.cloud.fedoraproject.org/results/myoung/xentest/fedora-21-x86_64/xen-4.5.1-0.rc1.fc21/xen-hypervisor-4.5.1-0.rc1.fc21.x86_64.rpm
 It is roughly the same version of xen but built against Fedora 21 and gcc
 4.9.2. If that works then it probably is gcc 5.
 Greetings,

 I have run into pretty much the same issue as the original poster.

 I am running a recently updated Arch Linux system, with GCC 5.1.0,
 using UEFI and gummiboot to boot. I currently have a build of Xen
 4.4.1, built with GCC 4.9.2 from before my last update, that is
 functioning correctly on this machine. But the builds of Xen 4.5.0,
 using GCC 5 and mingw64-binutils for the EFI binary, are all failing
 when Xen starts the Linux kernel, with the same error mentioned in the
 subject. Below is the boot log I captured via the serial port.

 http://pastebin.com/bBC78306

 Wondering if my specific toolchain was the issue, I downloaded the
 Fedora 22 version of xen-hypervisor and installed its EFI Xen binary
 over my compiled binary and received an identical error message, with
 slightly different addresses in the panic dump. The Fedora version was
 compiled with GCC 5.0.1. Below is the boot log I captured from that
 binary.

 http://pastebin.com/jvg1JazC

 After finding this thread, and specifically, the quoted message above,
 I downloaded that xen-hypervisor package and installed its EFI Xen
 binary. That binary boots successfully, as seen by the captured boot
 log below.

 http://pastebin.com/DKxwaU2U

 So, while I’m not familiar enough with Xen to begin to have an idea of
 what could possibly be wrong with Xen or GCC 5 to be causing this bug,
 I’d like to do what I can to track down the issue so I can get a
 working build of Xen 4.5. :)

Are you in a position to compile identical Xen 4.5 source with two
different versions of gcc?  (current staging-4.5 staging even has the
gcc5 build fix in)

If it is a gcc compiler bug, we would expect the version compiled with
gcc 4.9 to work fine, but the one compiled with 5 to fail in the
identified manor.

~Andrew
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] libxl: Don't insert PCI device into xenstore for HVM guests

2015-05-29 Thread Wei Liu
On Fri, May 29, 2015 at 10:54:09AM +0100, Ross Lagerwall wrote:
 On 05/29/2015 10:50 AM, Wei Liu wrote:
 On Fri, May 29, 2015 at 10:43:08AM +0100, Ross Lagerwall wrote:
 On 05/29/2015 10:41 AM, Wei Liu wrote:
 On Fri, May 29, 2015 at 08:59:45AM +0100, Ross Lagerwall wrote:
 When doing passthrough of a PCI device for an HVM guest, don't insert
 the device into xenstore, otherwise pciback attempts to use it which
 conflicts with QEMU.
 
 This manifests itself such that the first time a device is passed to a
 domain, it succeeds. Subsequent attempts fail unless the device is
 unbound from pciback or the machine rebooted.
 
 
 The commit message looks sensible to me. However this might break
 qemu-trad PCI passthrough if I'm not mistaken. What QEMU version are you
 using?  Upstream or trad? Have you tested both of them?
 
 
 qemu-trad. I haven't tested with qemu-upstream.
 
 
 I don't quite get this. Doesn't qemu-trad depends on those xenstore
 nodes for PCI passthrough information?  What did I miss?
 
 
 A different set of xenstore keys are used for communication between libxl
 and QEMU. The communication between libxl and QEMU happens further up in the
 same function:
 http://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=tools/libxl/libxl_pci.c;h=e0743f8112689b340ba7de88bc8895b62105aaba;hb=HEAD#l901
 

OK. Now I get the idea. IMHO this piece of code is not in a very good
state. The problem is the way it works is very fragile. Now we have
three functions, each of which has partial responsibility of writing
some xenstore nodes. This is not your fault.

Acked-by: Wei Liu wei.l...@citrix.com

 Regards,
 -- 
 Ross Lagerwall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 for Xen 4.6 0/4] Enabling XL to set per-VCPU parameters of a domain for RTDS scheduler

2015-05-29 Thread Dario Faggioli
On Tue, 2015-05-26 at 09:59 +0100, Ian Campbell wrote:
 On Mon, 2015-05-25 at 18:59 -0500, Chong Li wrote:
 
 This series arrived in my mailbox as 5 distinct mails.
 
 Please use git send-email such that the mails arrive as a single email
 thread (i.e. each mail as a reply to the previous or to the 0th mail) or
 arrange for the same thing by hand (I highly recommend using git
 send-email though)
 
Indeed.

BTW, Chong, v1 was threaded ok, so maybe did something different this
time when sending the patches. If yes, just don't! :-)

Regards,
Dario

-- 
This happens because I choose it to happen! (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems RD Ltd., Cambridge (UK)



signature.asc
Description: This is a digitally signed message part
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/4] libxc/libxl: fill xc_hvm_build_args in libxl

2015-05-29 Thread Andrew Cooper
On 29/05/15 12:37, Wei Liu wrote:
 When building HVM guests, originally some fields of xc_hvm_build_args
 are filled in xc_hvm_build (and buried in the wrong function), some are
 set in libxl__build_hvm before passing xc_hvm_build_args to
 xc_hvm_build. This is fragile.

 After examining the code in xc_hvm_build that sets those fields, we can
 in fact move setting of mmio_start etc in libxl. This way we consolidate
 memory layout setting in libxl.

 The setting of firmware data related fields is left in xc_hvm_build
 because it depends on parsing ELF image. Those fields only point to
 scratch data that doesn't affect memory layout.

 There should be no change in the generated guest memory layout.

 Signed-off-by: Wei Liu wei.l...@citrix.com
 Cc: Ian Campbell ian.campb...@citrix.com
 Cc: Ian Jackson ian.jack...@eu.citrix.com
 ---
 Cc: Chen, Tiejun tiejun.c...@intel.com

 This might affect your RMRR patch series.

It should at least me noted that this is a semantic change in domain
construction for all other toolstacks out there, an aid to the unlucky
person forward porting something like Xapi and finding that a chunk of
work is no longer performed by libxc.


 I once said xc_hvm_build would touch various xc_hvm_build_args fields
 that would affect guest memory layout. It won't be that case anymore
 with this patch.
 ---
  tools/libxc/xc_hvm_build_x86.c | 37 +++--
  tools/libxl/libxl_dom.c| 16 
  2 files changed, 23 insertions(+), 30 deletions(-)

 diff --git a/tools/libxc/xc_hvm_build_x86.c b/tools/libxc/xc_hvm_build_x86.c
 index e45ae4a..92422bf 100644
 --- a/tools/libxc/xc_hvm_build_x86.c
 +++ b/tools/libxc/xc_hvm_build_x86.c
 @@ -88,22 +88,14 @@ static int modules_init(struct xc_hvm_build_args *args,
  return 0;
  }
  
 -static void build_hvm_info(void *hvm_info_page, uint64_t mem_size,
 -   uint64_t mmio_start, uint64_t mmio_size,
 +static void build_hvm_info(void *hvm_info_page,
 struct xc_hvm_build_args *args)
  {
  struct hvm_info_table *hvm_info = (struct hvm_info_table *)
  (((unsigned char *)hvm_info_page) + HVM_INFO_OFFSET);
 -uint64_t lowmem_end = mem_size, highmem_end = 0;
  uint8_t sum;
  int i;
  
 -if ( lowmem_end  mmio_start )
 -{
 -highmem_end = (1ull32) + (lowmem_end - mmio_start);
 -lowmem_end = mmio_start;
 -}
 -
  memset(hvm_info_page, 0, PAGE_SIZE);
  
  /* Fill in the header. */
 @@ -116,14 +108,10 @@ static void build_hvm_info(void *hvm_info_page, 
 uint64_t mem_size,
  memset(hvm_info-vcpu_online, 0xff, sizeof(hvm_info-vcpu_online));
  
  /* Memory parameters. */
 -hvm_info-low_mem_pgend = lowmem_end  PAGE_SHIFT;
 -hvm_info-high_mem_pgend = highmem_end  PAGE_SHIFT;
 +hvm_info-low_mem_pgend = args-lowmem_end  PAGE_SHIFT;
 +hvm_info-high_mem_pgend = args-highmem_end  PAGE_SHIFT;
  hvm_info-reserved_mem_pgstart = ioreq_server_pfn(0);
  
 -args-lowmem_end = lowmem_end;
 -args-highmem_end = highmem_end;
 -args-mmio_start = mmio_start;
 -
  /* Finish with the checksum. */
  for ( i = 0, sum = 0; i  hvm_info-length; i++ )
  sum += ((uint8_t *)hvm_info)[i];
 @@ -251,8 +239,6 @@ static int setup_guest(xc_interface *xch,
  xen_pfn_t *page_array = NULL;
  unsigned long i, vmemid, nr_pages = args-mem_size  PAGE_SHIFT;
  unsigned long target_pages = args-mem_target  PAGE_SHIFT;
 -uint64_t mmio_start = (1ull  32) - args-mmio_size;
 -uint64_t mmio_size = args-mmio_size;
  unsigned long entry_eip, cur_pages, cur_pfn;
  void *hvm_info_page;
  uint32_t *ident_pt;
 @@ -344,8 +330,8 @@ static int setup_guest(xc_interface *xch,
  
  for ( i = 0; i  nr_pages; i++ )
  page_array[i] = i;
 -for ( i = mmio_start  PAGE_SHIFT; i  nr_pages; i++ )
 -page_array[i] += mmio_size  PAGE_SHIFT;
 +for ( i = args-mmio_start  PAGE_SHIFT; i  nr_pages; i++ )
 +page_array[i] += args-mmio_size  PAGE_SHIFT;
  
  /*
   * Try to claim pages for early warning of insufficient memory available.
 @@ -446,7 +432,7 @@ static int setup_guest(xc_interface *xch,
* range */
   !check_mmio_hole(cur_pfn  PAGE_SHIFT,
SUPERPAGE_1GB_NR_PFNS  PAGE_SHIFT,
 -  mmio_start, mmio_size) )
 +  args-mmio_start, args-mmio_size) )
  {
  long done;
  unsigned long nr_extents = count  SUPERPAGE_1GB_SHIFT;
 @@ -545,7 +531,7 @@ static int setup_guest(xc_interface *xch,
xch, dom, PAGE_SIZE, PROT_READ | PROT_WRITE,
HVM_INFO_PFN)) == NULL )
  goto error_out;
 -build_hvm_info(hvm_info_page, v_end, mmio_start, mmio_size, args);
 +build_hvm_info(hvm_info_page, args);
  munmap(hvm_info_page, PAGE_SIZE);
  
  /* Allocate and clear 

Re: [Xen-devel] [PATCH] efi: Reallocate memory map if ExitBootServices() fails

2015-05-29 Thread Jan Beulich
 On 29.05.15 at 09:48, ross.lagerw...@citrix.com wrote:
 --- a/xen/common/efi/boot.c
 +++ b/xen/common/efi/boot.c
 @@ -1053,14 +1053,14 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE 
 *SystemTable)
  efi_arch_video_init(gop, info_size, mode_info);
  }
  
 -efi_bs-GetMemoryMap(efi_memmap_size, NULL, map_key,
 - efi_mdesc_size, mdesc_ver);
 -efi_memmap = efi_arch_allocate_mmap_buffer(efi_memmap_size);
 -if ( !efi_memmap )
 -blexit(LUnable to allocate memory for EFI memory map);
 -
  for ( retry = 0; ; retry = 1 )
  {
 +efi_bs-GetMemoryMap(efi_memmap_size, NULL, map_key,
 + efi_mdesc_size, mdesc_ver);
 +efi_memmap = efi_arch_allocate_mmap_buffer(efi_memmap_size);
 +if ( !efi_memmap )
 +blexit(LUnable to allocate memory for EFI memory map);

You can't blexit() anymore after having called ExitBootServices() once.
Admittedly even the PrintErrMesg() used for error handling a few
lines down is on the edge of being invalid (but this is a best effort
thing anyway).

Further you should do a second allocation only if you positively
identified that the new size is larger than what the existing buffer
can hold. It may be worth allocating a couple of extra entries the
first time through anyway; perhaps that would even avoid th need
for this workaround.

Since, finally, this is only a workaround, as the spec clearly says:
After an Operating System calls ExitBootServices(), firmware boot
services are no longer available and it is illegal to call any boot
service. Even our re-invocation of GetMemoryMap() is bending
that (and I only hesitantly agreed for it to be added).

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Question on QEMU restore process

2015-05-29 Thread Lengyel, Tamas
On Tue, May 26, 2015 at 5:49 PM, Ian Campbell ian.campb...@citrix.com
wrote:

 On Tue, 2015-05-26 at 17:38 +0200, Lengyel, Tamas wrote:
  Hi all,
 
  I'm wondering if someone can point me in the right direction. I'm
  trying to understand the process around domain save/restore using XL.
  I can see the xl save format and how its appended with the QEMU state
  aquired via the xen-save-devices-state qmp command (this is for
  upstream QEMU).
 
  However, I have a hard time locating the exact point where that save
  state is being loaded into the new QEMU process. I see in libxc that
  it should be dumped into /var/lib/xen/qemu-resume appended by the
  domain id of the newly created domain. It seems this was passed to
  QEMU via the -loadvm flag at one point, but on Xen 4.4 I don't see
  that flag on my QEMU processes when I restore a domain.

 libxl__build_device_model_args_new appears to pass it as -incoming
 fd:N where N is an open file descriptor which qemu will inherit from
 its parent (libxl).

 Ian.


Hi Ian,
thanks for the info. What I'm trying to do is full HVM cloning of domains
where I would also load up the correct QEMU save state for a clone domain.
For this I was thinking of simply creating a clone domain paused (xl create
-p -e clone config), copy over hvm+vcpu context of the origin domain and
dedup memory using memsharing. Afterwards, I would kill the QEMU instance
for the clone that was automatically created by xl, and replace it with a
qemu instance that loads the state using -loadvm.

Now, this loading of the QEMU instance via -loadvm doesn't seem to work. I
tried restoring a saved domain (xl restore -p -e savefile), dumped its
QEMU state via QMP, killed the QEMU process and restarted it exactly as it
was running before, except with -loadvm qemu-save instead of the
-incoming fd option. I get the following error:

/usr/local/lib/xen/bin/qemu-system-i386 -xen-domid 8 -chardev
socket,id=libxl-cmd,path=/var/run/xen/qmp-libxl-8,server,nowait
-no-shutdown -mon chardev=libxl-cmd,mode=control -chardev
socket,id=libxenstat-cmd,path=/var/run/xen/qmp-libxenstat-8,server,nowait
-mon chardev=libxenstat-cmd,mode=control -nodefaults -name windows7-sp1-x86
-vnc 0.0.0.0:0,to=99 -display none -device cirrus-vga,vgamem_mb=8 -boot
order=cd -usb -usbdevice tablet -device
e1000,id=nic0,netdev=net0,mac=00:06:5b:ba:7c:02 -netdev
type=tap,id=net0,ifname=vif8.0-emu,script=no,downscript=no -loadvm
./qemu-save -machine xenfv -m 1016 -drive
file=/dev/l1vg/windows7-sp1-x86,if=ide,index=0,media=disk,format=raw,cache=writeback
-daemonize
xen be: vkbd-0: initial backend state is wrong (InitWait)
qemu: hardware error: xen: failed to populate ram at 3f80
CPU #0:
EAX= EBX= ECX= EDX=0663
ESI= EDI= EBP= ESP=
EIP=fff0 EFL=0002 [---] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =   9300
CS =f000   9b00
SS =   9300
DS =   9300
FS =   9300
GS =   9300
LDT=   8200
TR =   8b00
GDT=  
IDT=  
CR0=6010 CR2= CR3= CR4=
DR0= DR1= DR2= DR3=
DR6=0ff0 DR7=0400
EFER=
FCW=037f FSW= [ST=0] FTW=00 MXCSR=1f80
FPR0=  FPR1= 
FPR2=  FPR3= 
FPR4=  FPR5= 
FPR6=  FPR7= 
XMM00=
XMM01=
XMM02=
XMM03=
XMM04=
XMM05=
XMM06=
XMM07=

What step am I missing in here for the (re)creation of the QEMU process?

-- 

[image: www.novetta.com]

Tamas K Lengyel

Senior Security Researcher

7921 Jones Branch Drive

McLean VA 22102

Email  tleng...@novetta.com
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v8 05/13] x86: expose CBM length and COS number information

2015-05-29 Thread Dario Faggioli
On Thu, 2015-05-28 at 14:26 +0100, Jan Beulich wrote:
  On 21.05.15 at 10:41, chao.p.p...@linux.intel.com wrote:

  --- a/xen/include/public/sysctl.h
  +++ b/xen/include/public/sysctl.h
  @@ -694,6 +694,20 @@ struct xen_sysctl_pcitopoinfo {
   typedef struct xen_sysctl_pcitopoinfo xen_sysctl_pcitopoinfo_t;
   DEFINE_XEN_GUEST_HANDLE(xen_sysctl_pcitopoinfo_t);
   
  +#define XEN_SYSCTL_PSR_CAT_get_l3_info   0
  +struct xen_sysctl_psr_cat_op {
  +uint32_t cmd;   /* IN: XEN_SYSCTL_PSR_CAT_* */
  +uint32_t target;/* IN: socket to be operated on */
 
 If this is always the socket number, why would the variable be
 named anything other than socket. If otoh subsequent patches
 use it differently, I think the comment should be omitted now
 rather than being dropped then (or it should be given its final
 wording from the beginning).
 
ISTR asking about this interface before (it might have been at the
toolstack level, though), and the answer was that it is not subsequent
patches _in_this_series_ (of course), but maybe future ones that will
modify the 'per-socket-ness' nature of this feature.

So, yes, maybe the comment should say something along those lines (and
be updated when things change).

Dario

-- 
This happens because I choose it to happen! (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems RD Ltd., Cambridge (UK)



signature.asc
Description: This is a digitally signed message part
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH V7] xen/vm_event: Clean up control-register-write vm_events and add XCR0 event

2015-05-29 Thread Jan Beulich
 On 28.05.15 at 18:32, rcojoc...@bitdefender.com wrote:
 On 05/28/2015 07:06 PM, Lengyel, Tamas wrote:
 That seems a bit odd to me that we now have both a macro and a function
 with the same name. I think making the macro at least all capital would
 avoid confusion and make the code a lot more readable. But even then
 IMHO this wrapper is not helping code readability much. Its really a
 pain to track down auto-expanding fields in macros when someone is
 trying to read the code.
 
 I think the lowercase version of the wrapper macro of the same name is a
 Xen coding style convention, but if Jan thinks going uppercase is not a
 problem that's fine with me.
 
 As for the wrapper, being right next to the function it is wrapping
 makes it hard to ignore when reading the header, and the #define is
 pretty clear on what it does.
 
 Also, while I think that the macro does help with readability (so my
 preference would be to leave it in), it's just that - a preference - so
 if Jan also agrees that we should now remove it I'll do that. After all,

Afaic it should be kept as is.

 the macro will probably go out the window (or the first parameter will
 need to be changed to VM_EVENT_##what instead of VM_EVENT_X86_##what) as
 soon as ARM control register write events will come into play.

It's in an x86-specific header, so why should it need to be changed
for ARM? If ARM will gain a similarly named function, the use sites
will still all be architecture specific, and hence both declaration and
whether or not to have a wrapper macro can remain a per-arch
decision.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC][PATCH] x86: remove vmalloc.h from asm/io.h

2015-05-29 Thread Ingo Molnar

* Stephen Rothwell s...@canb.auug.org.au wrote:

 Nothing in asm/io.h uses anything from vmalloc.h, so remove the include
 and fix up the build problems in an allmodconfig (64 bit and 32 bit)
 build.
 
 This may be the place where x86 builds get vmalloc.h implicitly included
 and that tends to hide places where vmalloc() et al are added to files
 but the include of vmalloc.h is forgotten.

Good idea.

Acked-by: Ingo Molnar mi...@kernel.org

 Based in Linus' tree of today.
 
 There are probably more places that need vmalloc.h included, but this passes 
 64 
 bit and 32 bit allmodconfig builds, so is a place to start.

Please also test x86 allnoconfig and defconfig 32/64, that tends to unearth the 
remaining places. People doing randconfig testing will find the rest.

Thanks,

Ingo

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v8 05/13] x86: expose CBM length and COS number information

2015-05-29 Thread Jan Beulich
 On 29.05.15 at 11:23, dario.faggi...@citrix.com wrote:
 On Fri, 2015-05-29 at 09:07 +0100, Jan Beulich wrote:
  On 29.05.15 at 04:47, chao.p.p...@linux.intel.com wrote:
  On Thu, May 28, 2015 at 02:26:03PM +0100, Jan Beulich wrote:
  
   --- a/xen/include/public/sysctl.h
   +++ b/xen/include/public/sysctl.h
   @@ -694,6 +694,20 @@ struct xen_sysctl_pcitopoinfo {
typedef struct xen_sysctl_pcitopoinfo xen_sysctl_pcitopoinfo_t;
DEFINE_XEN_GUEST_HANDLE(xen_sysctl_pcitopoinfo_t);

   +#define XEN_SYSCTL_PSR_CAT_get_l3_info   0
   +struct xen_sysctl_psr_cat_op {
   +uint32_t cmd;   /* IN: XEN_SYSCTL_PSR_CAT_* */
   +uint32_t target;/* IN: socket to be operated on */
  
  If this is always the socket number, why would the variable be
  named anything other than socket. If otoh subsequent patches
  use it differently, I think the comment should be omitted now
  rather than being dropped then (or it should be given its final
  wording from the beginning).
  
  Or 'target to be operated on'?
 
 Fine with me. Just not something that may end up being confusing.
 
 So, I really don't want to turn this into pure bikeshedding, but, for a
 field called 'target', a comment saying 'target to be operated on' seems
 rather pointless, and I'd go for omitting it (for now).

Right - my earlier response was merely meant to say I'm not
opposed to a non-confusing comment, not that I see a strict
need for a mostly redundant one here.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH] x86/mem_sharing: Relax sanity check for memops

2015-05-29 Thread Tamas K Lengyel
The sharing vm_event ring being enabled is not necessary for mem_sharing
memops.

Signed-off-by: Tamas K Lengyel tkleng...@sec.in.tum.de
---
 xen/arch/x86/mm/mem_sharing.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c
index 0700f00..16e329e 100644
--- a/xen/arch/x86/mm/mem_sharing.c
+++ b/xen/arch/x86/mm/mem_sharing.c
@@ -1320,10 +1320,6 @@ int 
mem_sharing_memop(XEN_GUEST_HANDLE_PARAM(xen_mem_sharing_op_t) arg)
 if ( !hap_enabled(d) || !d-arch.hvm_domain.mem_sharing_enabled )
 goto out;
 
-rc = -ENODEV;
-if ( unlikely(!d-vm_event-share.ring_page) )
-goto out;
-
 switch ( mso.op )
 {
 case XENMEM_sharing_op_nominate_gfn:
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [xen-unstable test] 56759: regressions - FAIL

2015-05-29 Thread Jan Beulich
 On 29.05.15 at 11:56, andrew.coop...@citrix.com wrote:
 On 28/05/15 11:10, Jan Beulich wrote:
 On 28.05.15 at 11:26, ian.campb...@citrix.com wrote:
 On Thu, 2015-05-28 at 09:50 +0100, Jan Beulich wrote:
 On 27.05.15 at 18:04, ian.campb...@citrix.com wrote:
 On Tue, 2015-05-26 at 14:29 +0100, Ian Campbell wrote:
 I've now managed to reproduce using the arndale on my desk.
 ... and now I've confirmed that reverting the spin lock change causes
 the issue to not happen any more.
 Considering that this issue has prevented a push for almost
 two weeks, I think we ought to consider reverting the two
 offending commits until the problem got sorted out.
 I think that would probably be wise. I'll try and figure out exactly
 what is going on and propose some patches ASAP.
 Now done and pushed.
 
 Wait what?  This failure is not related to spinlocks; It is a networking
 behavioural bug (hardware specific, even) which has been uncovered,
 showing that there is a preexisting race condition.

If Ian gives his okay, I'm fine to re-instate the reverted patches (which
incidentally even got a push during the night), but I can't really see the
proof of what you claim in any of the earlier communication.

 It is not reasonable to revert a correct change because it has exposed
 an existing race condition elsewhere.  IMO, this should have been a
 force push to mark the test as non-blocking.

That's one way to view it. I'm not sure a force push would have been
warranted here, as the regression was real. And further holding up
the tree moving forward would have been bad in that situation too,
the more that it was - as said above - almost two weeks that it had
been stuck.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [xen-unstable test] 56759: regressions - FAIL

2015-05-29 Thread Ian Campbell
On Fri, 2015-05-29 at 10:56 +0100, Andrew Cooper wrote:
 On 28/05/15 11:10, Jan Beulich wrote:
  On 28.05.15 at 11:26, ian.campb...@citrix.com wrote:
  On Thu, 2015-05-28 at 09:50 +0100, Jan Beulich wrote:
  On 27.05.15 at 18:04, ian.campb...@citrix.com wrote:
  On Tue, 2015-05-26 at 14:29 +0100, Ian Campbell wrote:
  I've now managed to reproduce using the arndale on my desk.
  ... and now I've confirmed that reverting the spin lock change causes
  the issue to not happen any more.
  Considering that this issue has prevented a push for almost
  two weeks, I think we ought to consider reverting the two
  offending commits until the problem got sorted out.
  I think that would probably be wise. I'll try and figure out exactly
  what is going on and propose some patches ASAP.
  Now done and pushed.
 
 Wait what?  This failure is not related to spinlocks; It is a networking
 behavioural bug (hardware specific, even) which has been uncovered,
 showing that there is a preexisting race condition.

That's the current _hypothesis_, but it hasn't been confirmed what is
actually happening here.

So far doing the apparently obvious fix in netback (moving the state
change to closed until after the uevent is generated) doesn't seem to
have fixed the issue. So either the hypothesis is wrong or there is
something more subtle going on.

We don't know what is causing this issue yet and therefore neither
holding up the push gate nor force pushing seem appropriate under the
circumstances.

 It is not reasonable to revert a correct change because it has exposed
 an existing race condition elsewhere.  IMO, this should have been a
 force push to mark the test as non-blocking.
 
 ~Andrew



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] xen-netfront sets partial checksum at wrong offset

2015-05-29 Thread Jan Beulich
 On 29.05.15 at 12:39, wei.l...@citrix.com wrote:
 On Fri, May 29, 2015 at 11:34:07AM +0100, Jan Beulich wrote:
  On 11.05.15 at 19:25, venkat.x.venkatsu...@oracle.com wrote:
   
  Please CC the maintainers of the driver. You can get that from
  'scripts/get_maintainer.pl'
  
  I've done that for you.
  
  Thanks, Konrad.
  
  I am copying Wei too who had fixed the below problem earlier.
  It fixed the incorrect ip_hdr(). tcp_hdr() still needs to fixed.
  
  commit d554f73df6bc35ac8f6a65e5560bf1d31dfebed9
  Author: Wei Liu wei.l...@citrix.com
  Date:   Wed Feb 19 18:48:34 2014 +
  
  xen-netfront: reset skb network header before checksum
 
 So no response at all so far from the maintainers made me look into
 this: I first thought what we need would be calls to
 skb_probe_transport_header() in skb_checksum_setup_ip() after
 each of the skb_maybe_pull_tail() functions. But
 skb_partial_csum_set() already calls skb_set_transport_header(),
 so I now think things ought to be fine without any change. Can
 you clarify what you think is missing? Or is this an issue in just the
 old (3.8.x) kernel you're using?
 
 
 I think this is the follow-up thread 20150512013424.ga7...@oracle.com
 
 And the conclusion is 3.8 is too old so the fix is not there.

Ah, right. Seems like I failed to remove the earlier mail from my list of
things to look at when this came through. Sorry for the noise then.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Xen BUG at page_alloc.c:1738 (Xen 4.5)

2015-05-29 Thread Andrew Cooper
On 29/05/15 11:50, M A Young wrote:
 On Fri, 29 May 2015, Andrew Cooper wrote:

 Are you in a position to compile identical Xen 4.5 source with two different
 versions of gcc?  (current staging-4.5 staging even has the gcc5 build fix
 in)

 If it is a gcc compiler bug, we would expect the version compiled with gcc
 4.9 to work fine, but the one compiled with 5 to fail in the identified
 manor.
 I did a bit of testing - xen-4.5.1-rc1 built on Fedora 22 (gcc5) doesn't 
 boot for me, but if I replace xen.gz with one from the same code built on 
 Fedora 21 (gcc4) then it does boot. There are rpms and build logs 
 available via 
 http://copr.fedoraproject.org/coprs/myoung/xentest/build/93366/
 if anyone else wants to do some testing.

   Michael Young

Do you have easy access to xen-syms from each build?

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] x86/paging: remove pointless current domain checks

2015-05-29 Thread Andrew Cooper
On 29/05/15 10:09, Jan Beulich wrote:
 Checking that the subject domain is not the current one is pointless
 when already having paused that domain: domain_pause() already
 ASSERT()s this to be the case.

 Signed-off-by: Jan Beulich jbeul...@suse.com

Reviewed-by: Andrew Cooper andrew.coop...@citrix.com


 --- a/xen/arch/x86/mm/hap/hap.c
 +++ b/xen/arch/x86/mm/hap/hap.c
 @@ -464,13 +464,6 @@ int hap_enable(struct domain *d, u32 mod
  
  domain_pause(d);
  
 -/* error check */
 -if ( (d == current-domain) )
 -{
 -rv = -EINVAL;
 -goto out;
 -}
 -
  old_pages = d-arch.paging.hap.total_pages;
  if ( old_pages == 0 )
  {
 --- a/xen/arch/x86/mm/shadow/common.c
 +++ b/xen/arch/x86/mm/shadow/common.c
 @@ -2972,8 +2972,7 @@ int shadow_enable(struct domain *d, u32 
  domain_pause(d);
  
  /* Sanity check the arguments */
 -if ( (d == current-domain) ||
 - shadow_mode_enabled(d) ||
 +if ( shadow_mode_enabled(d) ||
   ((mode  PG_translate)  !(mode  PG_refcounts)) ||
   ((mode  PG_external)  !(mode  PG_translate)) )
  {





 ___
 Xen-devel mailing list
 Xen-devel@lists.xen.org
 http://lists.xen.org/xen-devel

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] libxl: Don't insert PCI device into xenstore for HVM guests

2015-05-29 Thread Wei Liu
On Fri, May 29, 2015 at 08:59:45AM +0100, Ross Lagerwall wrote:
 When doing passthrough of a PCI device for an HVM guest, don't insert
 the device into xenstore, otherwise pciback attempts to use it which
 conflicts with QEMU.
 
 This manifests itself such that the first time a device is passed to a
 domain, it succeeds. Subsequent attempts fail unless the device is
 unbound from pciback or the machine rebooted.
 

The commit message looks sensible to me. However this might break
qemu-trad PCI passthrough if I'm not mistaken. What QEMU version are you
using?  Upstream or trad? Have you tested both of them?

Wei.

 Signed-off-by: Ross Lagerwall ross.lagerw...@citrix.com
 ---
  tools/libxl/libxl_pci.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
 
 diff --git a/tools/libxl/libxl_pci.c b/tools/libxl/libxl_pci.c
 index e0743f8..2552889 100644
 --- a/tools/libxl/libxl_pci.c
 +++ b/tools/libxl/libxl_pci.c
 @@ -994,7 +994,7 @@ out:
  }
  }
  
 -if (!starting)
 +if (!starting  type == LIBXL_DOMAIN_TYPE_PV)
  rc = libxl__device_pci_add_xenstore(gc, domid, pcidev, starting);
  else
  rc = 0;
 -- 
 2.1.0

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] x86/mem_sharing: Relax sanity check for memops

2015-05-29 Thread Jan Beulich
 On 29.05.15 at 11:37, tkleng...@sec.in.tum.de wrote:
 The sharing vm_event ring being enabled is not necessary for mem_sharing
 memops.

If indeed so, why would the same not apply to mem_paging memops?

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] libxl: Don't insert PCI device into xenstore for HVM guests

2015-05-29 Thread Ross Lagerwall

On 05/29/2015 10:50 AM, Wei Liu wrote:

On Fri, May 29, 2015 at 10:43:08AM +0100, Ross Lagerwall wrote:

On 05/29/2015 10:41 AM, Wei Liu wrote:

On Fri, May 29, 2015 at 08:59:45AM +0100, Ross Lagerwall wrote:

When doing passthrough of a PCI device for an HVM guest, don't insert
the device into xenstore, otherwise pciback attempts to use it which
conflicts with QEMU.

This manifests itself such that the first time a device is passed to a
domain, it succeeds. Subsequent attempts fail unless the device is
unbound from pciback or the machine rebooted.



The commit message looks sensible to me. However this might break
qemu-trad PCI passthrough if I'm not mistaken. What QEMU version are you
using?  Upstream or trad? Have you tested both of them?



qemu-trad. I haven't tested with qemu-upstream.



I don't quite get this. Doesn't qemu-trad depends on those xenstore
nodes for PCI passthrough information?  What did I miss?



A different set of xenstore keys are used for communication between 
libxl and QEMU. The communication between libxl and QEMU happens further 
up in the same function:

http://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=tools/libxl/libxl_pci.c;h=e0743f8112689b340ba7de88bc8895b62105aaba;hb=HEAD#l901

Regards,
--
Ross Lagerwall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 26/41] arm : acpi add xen environment table

2015-05-29 Thread Stefano Stabellini
On Thu, 28 May 2015, Jan Beulich wrote:
  On 28.05.15 at 14:12, stefano.stabell...@eu.citrix.com wrote:
  On Thu, 28 May 2015, Jan Beulich wrote:
   On 28.05.15 at 12:58, stefano.stabell...@eu.citrix.com wrote:
   Let's take a closer look at this table. After the boilierplate, these
   are the interesting fields:
   
   GNT Start, GNT Size
   Evtchn Intr, Evtchn Intr Flags
   
   After the table, it is clearly stated:
   
   The Grant Table region is optional.
   
   and
   
   The Event Channel Interrupt is optional.
   
   So I think there is no problem: we don't want to pass any info in that
   table? Sure, let's not pass any. We can still use it to flag the
   presence of Xen on the platform.
  
  Even more so a reason to use what base ACPI has, without any
  custom table.
  
  Could you please make a concrete suggestion with table names and fields?
 
 I already pointed you at 6.0's new FADT field Hypervisor Vendor
 Identity.

OK, that is a decent alternative.

We don't have a suitable hypercall to retrieve the evtchn PPI but we
could add one. Overall I still prefer the table approach, but if you
really don't want it, I can live with the hypercalls.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] xen-netfront sets partial checksum at wrong offset

2015-05-29 Thread Wei Liu
On Fri, May 29, 2015 at 11:34:07AM +0100, Jan Beulich wrote:
  On 11.05.15 at 19:25, venkat.x.venkatsu...@oracle.com wrote:
   
  Please CC the maintainers of the driver. You can get that from
  'scripts/get_maintainer.pl'
  
  I've done that for you.
  
  Thanks, Konrad.
  
  I am copying Wei too who had fixed the below problem earlier.
  It fixed the incorrect ip_hdr(). tcp_hdr() still needs to fixed.
  
  commit d554f73df6bc35ac8f6a65e5560bf1d31dfebed9
  Author: Wei Liu wei.l...@citrix.com
  Date:   Wed Feb 19 18:48:34 2014 +
  
  xen-netfront: reset skb network header before checksum
 
 So no response at all so far from the maintainers made me look into
 this: I first thought what we need would be calls to
 skb_probe_transport_header() in skb_checksum_setup_ip() after
 each of the skb_maybe_pull_tail() functions. But
 skb_partial_csum_set() already calls skb_set_transport_header(),
 so I now think things ought to be fine without any change. Can
 you clarify what you think is missing? Or is this an issue in just the
 old (3.8.x) kernel you're using?
 

I think this is the follow-up thread 20150512013424.ga7...@oracle.com

And the conclusion is 3.8 is too old so the fix is not there.

 (In either case netback's xenvif_tx_submit() calling
 skb_probe_transport_header() would seem pointless, as
 skb_checksum_setup() - with or without a fix - ought to be taking
 care of this anyway.)
 

Patches welcome. :-)

Wei.

 Jan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH] osstest: reduce FreeBSD install timeouts

2015-05-29 Thread Roger Pau Monne
Only the first block is expected to take longer (because it decompresses
the image and writes it to a LVM volume), the remaining commands should
execute much faster, so reduce the timeout.

Signed-off-by: Roger Pau Monné roger@citrix.com
Cc: Ian Jackson ian.jack...@eu.citrix.com
Cc: Ian Campbell ian.campb...@citrix.com
---
I've checked this with the output of OSSTest, and those blocks execute
immediately, OSSTest doesn't even report execution times on them:

http://logs.test-lab.xenproject.org/osstest/logs/57419/test-amd64-i386-freebsd10-amd64/9.ts-freebsd-install.log
---
 ts-freebsd-install | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/ts-freebsd-install b/ts-freebsd-install
index 0d6eb0c..98dad24 100755
--- a/ts-freebsd-install
+++ b/ts-freebsd-install
@@ -67,7 +67,7 @@ sub prep () {
 mount -t ufs -o ufstype=ufs2,rw $rootpartition_dev $mnt
 END
 
-target_cmd_root($ho, END, 900);
+target_cmd_root($ho, END, 15);
 mkdir -p $mnt/root/.ssh
 cat 'ENDKEYS' $mnt/root/.ssh/authorized_keys
 $authkeys
@@ -75,7 +75,7 @@ ENDKEYS
 
 END
 
-target_cmd_root($ho, END, 900);
+target_cmd_root($ho, END, 30);
 echo 'sshd_enable=YES'  $mnt/etc/rc.conf
 echo 'ifconfig_xn0=DHCP'  $mnt/etc/rc.conf
 echo 'PermitRootLogin yes'  $mnt/etc/ssh/sshd_config
-- 
1.9.5 (Apple Git-50.3)


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 0/4] Fix HVM vNUMA

2015-05-29 Thread Wei Liu
Ping...

On Mon, May 18, 2015 at 04:34:48PM +0100, Wei Liu wrote:
 Boris discovered that HVM vNUMA didn't actually work. This patch series fixes
 that.
 
 The first patch is a prerequisite patch for the actual fixes. The second patch
 is to help debugging. The fixes are in the last two patches, which can be
 squashed into one if necessary. 
 
 Wei.
 
 Wei Liu (4):
   libxc/libxl: fill xc_hvm_build_args in libxl
   libxc: print more error messages when failed
   libxc: rework vnuma bits in setup_guest
   libxl: fix HVM vNUMA
 
  tools/libxc/xc_hvm_build_x86.c | 129 
 ++---
  tools/libxl/libxl_dom.c|  48 ---
  tools/libxl/libxl_vnuma.c  |  15 -
  3 files changed, 122 insertions(+), 70 deletions(-)
 
 -- 
 1.9.1

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Xen BUG at page_alloc.c:1738 (Xen 4.5)

2015-05-29 Thread M A Young
On Fri, 29 May 2015, Andrew Cooper wrote:

 Are you in a position to compile identical Xen 4.5 source with two different
 versions of gcc?  (current staging-4.5 staging even has the gcc5 build fix
 in)
 
 If it is a gcc compiler bug, we would expect the version compiled with gcc
 4.9 to work fine, but the one compiled with 5 to fail in the identified
 manor.

I did a bit of testing - xen-4.5.1-rc1 built on Fedora 22 (gcc5) doesn't 
boot for me, but if I replace xen.gz with one from the same code built on 
Fedora 21 (gcc4) then it does boot. There are rpms and build logs 
available via 
http://copr.fedoraproject.org/coprs/myoung/xentest/build/93366/
if anyone else wants to do some testing.

Michael Young___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Xen BUG at page_alloc.c:1738 (Xen 4.5)

2015-05-29 Thread M A Young
On Fri, 29 May 2015, Andrew Cooper wrote:

 On 29/05/15 11:50, M A Young wrote:
  On Fri, 29 May 2015, Andrew Cooper wrote:
 
  Are you in a position to compile identical Xen 4.5 source with two 
  different
  versions of gcc?  (current staging-4.5 staging even has the gcc5 build fix
  in)
 
  If it is a gcc compiler bug, we would expect the version compiled with gcc
  4.9 to work fine, but the one compiled with 5 to fail in the identified
  manor.
  I did a bit of testing - xen-4.5.1-rc1 built on Fedora 22 (gcc5) doesn't 
  boot for me, but if I replace xen.gz with one from the same code built on 
  Fedora 21 (gcc4) then it does boot. There are rpms and build logs 
  available via 
  http://copr.fedoraproject.org/coprs/myoung/xentest/build/93366/
  if anyone else wants to do some testing.
 
  Michael Young
 
 Do you have easy access to xen-syms from each build?

Yes.

Michael Young

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v8 05/13] x86: expose CBM length and COS number information

2015-05-29 Thread Dario Faggioli
On Fri, 2015-05-29 at 09:07 +0100, Jan Beulich wrote:
  On 29.05.15 at 04:47, chao.p.p...@linux.intel.com wrote:
  On Thu, May 28, 2015 at 02:26:03PM +0100, Jan Beulich wrote:
  
   --- a/xen/include/public/sysctl.h
   +++ b/xen/include/public/sysctl.h
   @@ -694,6 +694,20 @@ struct xen_sysctl_pcitopoinfo {
typedef struct xen_sysctl_pcitopoinfo xen_sysctl_pcitopoinfo_t;
DEFINE_XEN_GUEST_HANDLE(xen_sysctl_pcitopoinfo_t);

   +#define XEN_SYSCTL_PSR_CAT_get_l3_info   0
   +struct xen_sysctl_psr_cat_op {
   +uint32_t cmd;   /* IN: XEN_SYSCTL_PSR_CAT_* */
   +uint32_t target;/* IN: socket to be operated on */
  
  If this is always the socket number, why would the variable be
  named anything other than socket. If otoh subsequent patches
  use it differently, I think the comment should be omitted now
  rather than being dropped then (or it should be given its final
  wording from the beginning).
  
  Or 'target to be operated on'?
 
 Fine with me. Just not something that may end up being confusing.
 
So, I really don't want to turn this into pure bikeshedding, but, for a
field called 'target', a comment saying 'target to be operated on' seems
rather pointless, and I'd go for omitting it (for now).

Dario

-- 
This happens because I choose it to happen! (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems RD Ltd., Cambridge (UK)



signature.asc
Description: This is a digitally signed message part
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] libxl: Don't insert PCI device into xenstore for HVM guests

2015-05-29 Thread Wei Liu
On Fri, May 29, 2015 at 10:43:08AM +0100, Ross Lagerwall wrote:
 On 05/29/2015 10:41 AM, Wei Liu wrote:
 On Fri, May 29, 2015 at 08:59:45AM +0100, Ross Lagerwall wrote:
 When doing passthrough of a PCI device for an HVM guest, don't insert
 the device into xenstore, otherwise pciback attempts to use it which
 conflicts with QEMU.
 
 This manifests itself such that the first time a device is passed to a
 domain, it succeeds. Subsequent attempts fail unless the device is
 unbound from pciback or the machine rebooted.
 
 
 The commit message looks sensible to me. However this might break
 qemu-trad PCI passthrough if I'm not mistaken. What QEMU version are you
 using?  Upstream or trad? Have you tested both of them?
 
 
 qemu-trad. I haven't tested with qemu-upstream.
 

I don't quite get this. Doesn't qemu-trad depends on those xenstore
nodes for PCI passthrough information?  What did I miss?

Wei.

 -- 
 Ross Lagerwall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] xen-netfront sets partial checksum at wrong offset

2015-05-29 Thread Jan Beulich
 On 11.05.15 at 19:25, venkat.x.venkatsu...@oracle.com wrote:
  
 Please CC the maintainers of the driver. You can get that from
 'scripts/get_maintainer.pl'
 
 I've done that for you.
 
 Thanks, Konrad.
 
 I am copying Wei too who had fixed the below problem earlier.
 It fixed the incorrect ip_hdr(). tcp_hdr() still needs to fixed.
 
 commit d554f73df6bc35ac8f6a65e5560bf1d31dfebed9
 Author: Wei Liu wei.l...@citrix.com
 Date:   Wed Feb 19 18:48:34 2014 +
 
 xen-netfront: reset skb network header before checksum

So no response at all so far from the maintainers made me look into
this: I first thought what we need would be calls to
skb_probe_transport_header() in skb_checksum_setup_ip() after
each of the skb_maybe_pull_tail() functions. But
skb_partial_csum_set() already calls skb_set_transport_header(),
so I now think things ought to be fine without any change. Can
you clarify what you think is missing? Or is this an issue in just the
old (3.8.x) kernel you're using?

(In either case netback's xenvif_tx_submit() calling
skb_probe_transport_header() would seem pointless, as
skb_checksum_setup() - with or without a fix - ought to be taking
care of this anyway.)

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] x86/mem_sharing: Relax sanity check for memops

2015-05-29 Thread Tamas K Lengyel
On Fri, May 29, 2015 at 11:54 AM, Jan Beulich jbeul...@suse.com wrote:

  On 29.05.15 at 11:37, tkleng...@sec.in.tum.de wrote:
  The sharing vm_event ring being enabled is not necessary for mem_sharing
  memops.

 If indeed so, why would the same not apply to mem_paging memops?

 Jan



The ring during mem_sharing is only used to signal an out-of-memory error
condition. If no listener is present for this error condition, Xen will
automatically kill the domain that was requesting more memory when none is
available during unsharing. For paging, the listener is an absolute
requirement as without it paging won't work at all. That's not the case for
sharing.

Tamas
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 26/41] arm : acpi add xen environment table

2015-05-29 Thread Jan Beulich
 On 29.05.15 at 12:31, stefano.stabell...@eu.citrix.com wrote:
 On Thu, 28 May 2015, Jan Beulich wrote:
  On 28.05.15 at 14:12, stefano.stabell...@eu.citrix.com wrote:
  Could you please make a concrete suggestion with table names and fields?
 
 I already pointed you at 6.0's new FADT field Hypervisor Vendor
 Identity.
 
 OK, that is a decent alternative.
 
 We don't have a suitable hypercall to retrieve the evtchn PPI but we
 could add one. Overall I still prefer the table approach, but if you
 really don't want it, I can live with the hypercalls.

I'm clearly not in the position to force any design decision onto the
ARM side of Xen. But I continue to be of the opinion that custom
tables should be a last resort approach only, which isn't warranted
here (anymore).

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] efi: Reallocate memory map if ExitBootServices() fails

2015-05-29 Thread Jan Beulich
 On 29.05.15 at 11:57, ross.lagerw...@citrix.com wrote:
 On 05/29/2015 10:45 AM, Jan Beulich wrote:
 On 29.05.15 at 09:48, ross.lagerw...@citrix.com wrote:
 --- a/xen/common/efi/boot.c
 +++ b/xen/common/efi/boot.c
 @@ -1053,14 +1053,14 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE 
 *SystemTable)
   efi_arch_video_init(gop, info_size, mode_info);
   }

 -efi_bs-GetMemoryMap(efi_memmap_size, NULL, map_key,
 - efi_mdesc_size, mdesc_ver);
 -efi_memmap = efi_arch_allocate_mmap_buffer(efi_memmap_size);
 -if ( !efi_memmap )
 -blexit(LUnable to allocate memory for EFI memory map);
 -
   for ( retry = 0; ; retry = 1 )
   {
 +efi_bs-GetMemoryMap(efi_memmap_size, NULL, map_key,
 + efi_mdesc_size, mdesc_ver);
 +efi_memmap = efi_arch_allocate_mmap_buffer(efi_memmap_size);
 +if ( !efi_memmap )
 +blexit(LUnable to allocate memory for EFI memory map);

 You can't blexit() anymore after having called ExitBootServices() once.
 Admittedly even the PrintErrMesg() used for error handling a few
 lines down is on the edge of being invalid (but this is a best effort
 thing anyway).
 
 OK, I'll convert it into a PrintErrMesg.

Conditionally upon being in the second iteration.

 Since, finally, this is only a workaround, as the spec clearly says:
 After an Operating System calls ExitBootServices(), firmware boot
 services are no longer available and it is illegal to call any boot
 service. Even our re-invocation of GetMemoryMap() is bending
 that (and I only hesitantly agreed for it to be added).
 
 The spec kinda disagrees with that:
 It is suggested that GetMemoryMap()be called immediately before calling 
 ExitBootServices(). If MapKey value is incorrect, ExitBootServices() 
 returns EFI_INVALID_PARAMETER and GetMemoryMap() with ExitBootServices() 
 must be called again. Firmware implementation may choose to do a partial 
 shutdown of the boot services during the first call to 
 ExitBootServices(). EFI OS loader should not make calls to any boot 
 service function other then GetMemoryMap() after the first call to 
 ExitBootServices().

This was the grounds on which the current retry loop was added.

 I think the patch does the right thing in the regard.

For x86 yes, since efi_arch_allocate_mmap_buffer() doesn't use
boot services there. For ARM no, due to its use of AllocatePool().
Iirc Daniel was also planning on changing the x86 side too, and
such a change would break things in non-obvious ways. Hence I
think depending on the ability to do another allocation after the
first ExitBootServices() call is not really a good idea.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 0/4] Fix HVM vNUMA

2015-05-29 Thread Dario Faggioli
On Fri, 2015-05-29 at 11:46 +0100, Wei Liu wrote:
 Ping...
 
FWIW, I had a look at the patches and they seemed fine, sorry to not
having said that before.

TBH, that was partly because I was expecting a v2 to show up, fixing the
(trivial) issue you identified together with Boris, and  was planning to
'Reviewed-by' that...

Dario

 On Mon, May 18, 2015 at 04:34:48PM +0100, Wei Liu wrote:
  Boris discovered that HVM vNUMA didn't actually work. This patch series 
  fixes
  that.
  
  The first patch is a prerequisite patch for the actual fixes. The second 
  patch
  is to help debugging. The fixes are in the last two patches, which can be
  squashed into one if necessary. 
  
  Wei.
  
  Wei Liu (4):
libxc/libxl: fill xc_hvm_build_args in libxl
libxc: print more error messages when failed
libxc: rework vnuma bits in setup_guest
libxl: fix HVM vNUMA
  
   tools/libxc/xc_hvm_build_x86.c | 129 
  ++---
   tools/libxl/libxl_dom.c|  48 ---
   tools/libxl/libxl_vnuma.c  |  15 -
   3 files changed, 122 insertions(+), 70 deletions(-)
  
  -- 
  1.9.1
 
 ___
 Xen-devel mailing list
 Xen-devel@lists.xen.org
 http://lists.xen.org/xen-devel

-- 
This happens because I choose it to happen! (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems RD Ltd., Cambridge (UK)



signature.asc
Description: This is a digitally signed message part
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC][PATCH] x86: remove vmalloc.h from asm/io.h

2015-05-29 Thread Stephen Rothwell
Hi Ingo,

On Fri, 29 May 2015 11:21:05 +0200 Ingo Molnar mi...@kernel.org wrote:

 Good idea.
 
 Acked-by: Ingo Molnar mi...@kernel.org

Thanks.

 Please also test x86 allnoconfig and defconfig 32/64, that tends to unearth 
 the 
 remaining places. People doing randconfig testing will find the rest.

Good idea.  the allnoconfigs produced this further patch.  I will
squash it into the original.  The defconfigs built ok.

From: Stephen Rothwell s...@canb.auug.org.au
Date: Fri, 29 May 2015 22:01:41 +1000
Subject: [PATCH] x86: more fixes for removing vmalloc.h fron asm/io.h

Signed-off-by: Stephen Rothwell s...@canb.auug.org.au
---
 arch/x86/include/asm/io.h | 1 +
 include/linux/io.h| 1 +
 2 files changed, 2 insertions(+)

diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
index 5791e7ace9db..2a3543a4db1d 100644
--- a/arch/x86/include/asm/io.h
+++ b/arch/x86/include/asm/io.h
@@ -40,6 +40,7 @@
 #include linux/compiler.h
 #include asm/page.h
 #include asm/early_ioremap.h
+#include asm/pgtable_types.h
 
 #define build_mmio_read(name, size, type, reg, barrier) \
 static inline type name(const volatile void __iomem *addr) \
diff --git a/include/linux/io.h b/include/linux/io.h
index 986f2bffea1e..cb753a2450b8 100644
--- a/include/linux/io.h
+++ b/include/linux/io.h
@@ -19,6 +19,7 @@
 #define _LINUX_IO_H
 
 #include linux/types.h
+#include linux/init.h
 #include asm/io.h
 #include asm/page.h
 
-- 
2.1.4

-- 
Cheers,
Stephen Rothwells...@canb.auug.org.au


pgpHAjsDQXDnb.pgp
Description: OpenPGP digital signature
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC][PATCH] x86: remove vmalloc.h from asm/io.h

2015-05-29 Thread Takashi Iwai
At Fri, 29 May 2015 19:18:47 +1000,
Stephen Rothwell wrote:
 
 Nothing in asm/io.h uses anything from vmalloc.h, so remove the include
 and fix up the build problems in an allmodconfig (64 bit and 32 bit)
 build.
 
 This may be the place where x86 builds get vmalloc.h implicitly included
 and that tends to hide places where vmalloc() et al are added to files
 but the include of vmalloc.h is forgotten.
 
 Cc: Thomas Gleixner t...@linutronix.de
 Cc: Ingo Molnar mi...@redhat.com
 Cc: H. Peter Anvin h...@zytor.com
 Cc: x...@kernel.org
 Cc: Konrad Rzeszutek Wilk konrad.w...@oracle.com
 Cc: Boris Ostrovsky boris.ostrov...@oracle.com
 Cc: David Vrabel david.vra...@citrix.com
 Cc: Anton Vorontsov an...@enomsg.org
 Cc: Colin Cross ccr...@android.com
 Cc: Kees Cook keesc...@chromium.org
 Cc: Tony Luck tony.l...@intel.com
 Cc: Rafael J. Wysocki r...@rjwysocki.net
 Cc: Len Brown l...@kernel.org
 Cc: Kristen Carlson Accardi kris...@linux.intel.com
 Cc: Viresh Kumar viresh.ku...@linaro.org
 Cc: Vinod Koul vinod.k...@intel.com
 Cc: K. Y. Srinivasan k...@microsoft.com
 Cc: Haiyang Zhang haiya...@microsoft.com
 Cc: Hiral Patel hiral...@cisco.com
 Cc: Suma Ramars sram...@cisco.com
 Cc: Brian Uchino buch...@cisco.com
 Cc: James E.J. Bottomley jbottom...@odin.com
 Cc: Jaroslav Kysela pe...@perex.cz
 Cc: Takashi Iwai ti...@suse.de

For the sound bits,
  Acked-by: Takashi Iwai ti...@suse.de


thanks,

Takashi

 Cc: Andrew Morton a...@linux-foundation.org
 Suggested-by: David Miller da...@davemloft.net
 Signed-off-by: Stephen Rothwell s...@canb.auug.org.au
 
 ---
 
 Based in Linus' tree of today.
 
 There are probably more places that need vmalloc.h included, but this
 passes 64 bit and 32 bit allmodconfig builds, so is a place to start.
 
 Dave Miller suggested that I start this journey.
 
  arch/x86/include/asm/io.h  | 2 --
  arch/x86/kernel/crash.c| 1 +
  arch/x86/kernel/machine_kexec_64.c | 1 +
  arch/x86/mm/pageattr-test.c| 1 +
  arch/x86/mm/pageattr.c | 1 +
  arch/x86/xen/p2m.c | 1 +
  drivers/acpi/apei/erst.c   | 1 +
  drivers/cpufreq/intel_pstate.c | 1 +
  drivers/dma/mic_x100_dma.c | 1 +
  drivers/net/hyperv/netvsc.c| 1 +
  drivers/net/hyperv/rndis_filter.c  | 1 +
  drivers/scsi/fnic/fnic_debugfs.c   | 1 +
  drivers/scsi/fnic/fnic_trace.c | 1 +
  sound/pci/asihpi/hpioctl.c | 1 +
  14 files changed, 13 insertions(+), 2 deletions(-)
 
 diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
 index 34a5b93704d3..5791e7ace9db 100644
 --- a/arch/x86/include/asm/io.h
 +++ b/arch/x86/include/asm/io.h
 @@ -197,8 +197,6 @@ extern void set_iounmap_nonlazy(void);
  
  #include asm-generic/iomap.h
  
 -#include linux/vmalloc.h
 -
  /*
   * Convert a virtual cached pointer to an uncached pointer
   */
 diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
 index c76d3e37c6e1..e068d6683dba 100644
 --- a/arch/x86/kernel/crash.c
 +++ b/arch/x86/kernel/crash.c
 @@ -22,6 +22,7 @@
  #include linux/elfcore.h
  #include linux/module.h
  #include linux/slab.h
 +#include linux/vmalloc.h
  
  #include asm/processor.h
  #include asm/hardirq.h
 diff --git a/arch/x86/kernel/machine_kexec_64.c 
 b/arch/x86/kernel/machine_kexec_64.c
 index 415480d3ea84..11546b462fa6 100644
 --- a/arch/x86/kernel/machine_kexec_64.c
 +++ b/arch/x86/kernel/machine_kexec_64.c
 @@ -17,6 +17,7 @@
  #include linux/ftrace.h
  #include linux/io.h
  #include linux/suspend.h
 +#include linux/vmalloc.h
  
  #include asm/init.h
  #include asm/pgtable.h
 diff --git a/arch/x86/mm/pageattr-test.c b/arch/x86/mm/pageattr-test.c
 index 6629f397b467..8ff686aa7e8c 100644
 --- a/arch/x86/mm/pageattr-test.c
 +++ b/arch/x86/mm/pageattr-test.c
 @@ -9,6 +9,7 @@
  #include linux/random.h
  #include linux/kernel.h
  #include linux/mm.h
 +#include linux/vmalloc.h
  
  #include asm/cacheflush.h
  #include asm/pgtable.h
 diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
 index 89af288ec674..bedfc794b4ba 100644
 --- a/arch/x86/mm/pageattr.c
 +++ b/arch/x86/mm/pageattr.c
 @@ -14,6 +14,7 @@
  #include linux/percpu.h
  #include linux/gfp.h
  #include linux/pci.h
 +#include linux/vmalloc.h
  
  #include asm/e820.h
  #include asm/processor.h
 diff --git a/arch/x86/xen/p2m.c b/arch/x86/xen/p2m.c
 index b47124d4cd67..8b7f18e200aa 100644
 --- a/arch/x86/xen/p2m.c
 +++ b/arch/x86/xen/p2m.c
 @@ -67,6 +67,7 @@
  #include linux/seq_file.h
  #include linux/bootmem.h
  #include linux/slab.h
 +#include linux/vmalloc.h
  
  #include asm/cache.h
  #include asm/setup.h
 diff --git a/drivers/acpi/apei/erst.c b/drivers/acpi/apei/erst.c
 index ed65e9c4b5b0..3670bbab57a3 100644
 --- a/drivers/acpi/apei/erst.c
 +++ b/drivers/acpi/apei/erst.c
 @@ -35,6 +35,7 @@
  #include linux/nmi.h
  #include linux/hardirq.h
  #include linux/pstore.h
 +#include linux/vmalloc.h
  #include acpi/apei.h
  
  #include apei-internal.h
 diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
 

Re: [Xen-devel] [PATCH v2 1/4] libxc/libxl: fill xc_hvm_build_args in libxl

2015-05-29 Thread Wei Liu
On Fri, May 29, 2015 at 01:29:18PM +0100, Andrew Cooper wrote:
 On 29/05/15 12:37, Wei Liu wrote:
  When building HVM guests, originally some fields of xc_hvm_build_args
  are filled in xc_hvm_build (and buried in the wrong function), some are
  set in libxl__build_hvm before passing xc_hvm_build_args to
  xc_hvm_build. This is fragile.
 
  After examining the code in xc_hvm_build that sets those fields, we can
  in fact move setting of mmio_start etc in libxl. This way we consolidate
  memory layout setting in libxl.
 
  The setting of firmware data related fields is left in xc_hvm_build
  because it depends on parsing ELF image. Those fields only point to
  scratch data that doesn't affect memory layout.
 
  There should be no change in the generated guest memory layout.
 
  Signed-off-by: Wei Liu wei.l...@citrix.com
  Cc: Ian Campbell ian.campb...@citrix.com
  Cc: Ian Jackson ian.jack...@eu.citrix.com
  ---
  Cc: Chen, Tiejun tiejun.c...@intel.com
 
  This might affect your RMRR patch series.
 
 It should at least me noted that this is a semantic change in domain
 construction for all other toolstacks out there, an aid to the unlucky
 person forward porting something like Xapi and finding that a chunk of
 work is no longer performed by libxc.
 

Right. I will write this in the commit message if I end up sending v3.

Wei.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] ACPI shutdown unreliable with win7?

2015-05-29 Thread Ian Campbell
On Fri, 2015-05-22 at 10:08 +0100, Ian Campbell wrote:
 If win7 doesn't shutdown given a power button request I'd be more
 inclined to remove the setting in osstest for those flights and let
 guest-stop go back to being never pass than to start making such changes
 to the VM config which I think would probably break the preceding
 suspend and migration tests (which aren't completely reliable, but are
 far more so than this shutdown one).

Does anyone have any ideas here or shall I propose:

-8--

From 2d1b814e65676c3cf56ce2e569491953607f53f7 Mon Sep 17 00:00:00 2001
From: Ian Campbell ian.campb...@citrix.com
Date: Fri, 29 May 2015 13:51:33 +0100
Subject: [PATCH] Turn off acpi_shutdown for windows 7

As described in 1432284841.10746.136.ca...@citrix.com /
http://lists.xen.org/archives/html/xen-devel/2015-05/msg03016.html
Windows 7 does not appear to reliably actually shutdown when asked to
via the ACPI power button.

Once this patch is applied some force pushes will likely be needed in
order for this to not appear as a regression.

Signed-off-by: Ian Campbell ian.campb...@citrix.com
Cc: Andrew Cooper andrew.coop...@citrix.com
Cc: Paul Durrant paul.durr...@citrix.com
Cc: Ian Jackson ian.jack...@eu.citrix.com,
Cc: Jan Beulich jbeul...@suse.com
---
Alternatively could make it an allowed/non-blocking failure?
---
 make-flight | 1 -
 1 file changed, 1 deletion(-)

diff --git a/make-flight b/make-flight
index 8a1fceb..5120891 100755
--- a/make-flight
+++ b/make-flight
@@ -206,7 +206,6 @@ do_hvm_win7_x64_tests () {
   job_create_test test-$xenarch$kern-$dom0arch-xl$qemuu_suffix-win7-amd64 \
 test-win xl $xenarch $dom0arch $qemuu_runvar \
 win_image=win7-x64.iso \
-win_acpi_shutdown=true \
 all_hostflags=$most_hostflags,hvm
 }
 
-- 
2.1.4




___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 for Xen 4.6 1/4] xen: enabling XL to set per-VCPU parameters of a domain for RTDS scheduler

2015-05-29 Thread Dario Faggioli
On Wed, 2015-05-27 at 12:41 +0100, Jan Beulich wrote:
  On 26.05.15 at 19:18, lichong...@gmail.com wrote:
  On Tue, May 26, 2015 at 4:08 AM, Jan Beulich jbeul...@suse.com wrote:
  On 26.05.15 at 02:05, lichong...@gmail.com wrote:
  Add two hypercalls(XEN_DOMCTL_SCHEDOP_getvcpuinfo/putvcpuinfo) to get/set 
  a
  domain's
  per-VCPU parameters. Hypercalls are handled by newly added hook
  (.adjust_vcpu) in the
  scheduler interface.
 
  Add a new data structure (struct xen_domctl_scheduler_vcpu_op) for
  transferring data
  between tool and hypervisor.
 
  Signed-off-by: Chong Li chong...@wustl.edu
  Signed-off-by: Meng Xu men...@cis.upenn.edu
  Signed-off-by: Sisu Xi xis...@gmail.com
  ---
 
  Missing brief description of changes in v2 here.
  
  Based on Dario's suggestion in PATCH v1, we think it would be better
  to make the per-vcpu get/set functionalities more general, rather than
  just serving for rtds scheduler. So, the changes in v2 are:
  
  1) Add a new hypercall, XEN_DOMCTL_scheduler_vcpu_op. Any scheduler
  can use it for per-vcpu get/set. There is a new data structure (struct
  xen_domctl_scheduler_vcpu_op), which helps transferring data (per-vcpu
  params) between tool and hypervisor when the hypercall is invoked.
  
  2) The new hypercall is handled by function sched_adjust_vcpu (in
  schedule.c), which would call a newly added hook (named as
  .adjust_vcpu, added to the scheduler interface (struct scheduler)).
  
  3) In RTDS scheduler, .adjust_vcpu hooks a function defined in
  sched_rt.c, named as rt_vcpu_cntl. This function gets/sets the
  per-vcpu params.
 
 No, that looks to be the description of the patch as a whole, not
 one of the delta between v1 and v2 (which is what I was asking for).
 
  +
  +switch ( op-cmd )
  +{
  +case XEN_DOMCTL_SCHEDOP_getvcpuinfo:
  +spin_lock_irqsave(prv-lock, flags);
  +list_for_each( iter, sdom-vcpu )
  +{
  +svc = list_entry(iter, struct rt_vcpu, sdom_elem);
  +vcpuid = svc-vcpu-vcpu_id;
  +
  +local_sched.budget = svc-budget / MICROSECS(1);
  +local_sched.period = svc-period / MICROSECS(1);
  +if ( copy_to_guest_offset(op-u.rtds.vcpus, vcpuid,
 
  What makes you think that the caller has provided enough slots?
  
  This code works in a similar way as the one for XEN_SYSCTL_numainfo.
  So if there is no enough slot in guest space, en error code is
  returned.
 
 I don't think I saw any place you returned an error here. The
 only error being returned is -EFAULT when the copy-out fails.
 
Look again at XEN_SYSCTL_numainfo, in particular, at this part:

if ( ni-num_nodes  num_nodes )
{
ret = -ENOBUFS;
i = num_nodes;
}
else
i = 0

Which, as you'll appreciate if you check from where ni-num_nodes and
num_nodes come from, if where the boundary checking we're asking for
happens.

Note how the 'i' variable is used in the rest of the block, and you'll
see what we mean, and can do something similar.

  +local_sched, 1) )
 
  You're leaking hypervisor stack contents here, but by reading
  the structure from guest memory first to honor its vcpuid field
  (this making get match put) you'd avoid this anyway.
 
  
  The structure in guest memory has nothing, and it will hold per-vcpu
  params after this XEN_DOMCTL_SCHEDOP_getvcpuinfo hypercall is well
  served.
  Does leaking hypervisor stack mean that I need to call
  local_sched.vcpuid = vcpuid before copying this local_sched to the
  guest memory?
 
 Not just that. You'd also need to make sure padding space is
 cleared. 

Indeed.

 But as said, I suppose you want to copy in what the
 guest provided anyway, in which case the leak is implicitly
 gone.
 
Chong, you're saying above The structure in guest memory has nothing.
AFAICT, what Jan means when he says make get match put, is why don't
you, for XEN_DOMCTL_SCHEDOP_getvcpuinfo, allow the caller to provide an
array with fewer elements than the number of vcpus, and put in the
vcpuid field of those elements, the id-s of the vcpus she wants to
receive information upon.

I also was about to make a similar request, as it makes things more
consistent, between get and put. After all, if we think that it is
valuable to allow batching (i.e., having an hypercall that returns the
parameters of _a_bunch_ of vcpus, not just of one), and that it makes
sense to allow that bunch of vcpus to be incomplete and sparse, this is
true for both the directions, isn't it?

  +}
  +hypercall_preempt_check();
 
  ???
  
  We've talked about this in patch v1
  (http://lists.xenproject.org/archives/html/xen-devel/2015-05/msg01152.html).
  When a domain has too many VCPUs, we need to make sure the spin lock
  does not last for too long.
 
 Right, but the call above does nothing like that. Please look at other
 uses of it to understand what needs to be done here.
 
This being said, would it make sense to put down a 

Re: [Xen-devel] ACPI shutdown unreliable with win7?

2015-05-29 Thread Ian Campbell
On Fri, 2015-05-29 at 14:14 +0100, Andrew Cooper wrote:
 On 29/05/15 14:04, Jan Beulich wrote:
  On 29.05.15 at 14:54, ian.campb...@citrix.com wrote:
  On Fri, 2015-05-22 at 10:08 +0100, Ian Campbell wrote:
  If win7 doesn't shutdown given a power button request I'd be more
  inclined to remove the setting in osstest for those flights and let
  guest-stop go back to being never pass than to start making such changes
  to the VM config which I think would probably break the preceding
  suspend and migration tests (which aren't completely reliable, but are
  far more so than this shutdown one).
  Does anyone have any ideas here or shall I propose:
  Unless we have a way to make an adjustment inside the guest for the
  power button to gain shutdown meaning, I think there's no alternative
  to the change below.
 
 You can avoid advertising S3/S4 in the ACPI tables, which iirc causes
 the same alteration to happen.
 
 Hvmloader uses the platform/acpi_s{3,4} booleans to control whether the
 relevant SSDTs are exposed.

As I said in 1432285699.10746.147.ca...@citrix.com and again in reply
to Jan just now, this test does sometimes pass.

So whether or not we are exposing the correct set of tables to the guest
there is something else non-deterministic going on here I think.

I'd also be concerned about the impact on other tests (in particular the
SRM ones) on disabling s3 and/or s4.

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [rumpuserxen test] 57518: regressions - FAIL

2015-05-29 Thread osstest service user
flight 57518 rumpuserxen real [real]
http://logs.test-lab.xenproject.org/osstest/logs/57518/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64-rumpuserxen   5 rumpuserxen-build fail REGR. vs. 33866
 build-i386-rumpuserxen5 rumpuserxen-build fail REGR. vs. 33866

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-rumpuserxen-amd64  1 build-check(1)   blocked n/a
 test-amd64-i386-rumpuserxen-i386  1 build-check(1)   blocked  n/a

version targeted for testing:
 rumpuserxen  3b91e44996ea6ae1276bce1cc44f38701c53ee6f
baseline version:
 rumpuserxen  30d72f3fc5e35cd53afd82c8179cc0e0b11146ad


People who touched revisions under test:
  Antti Kantee po...@iki.fi
  Ian Jackson ian.jack...@eu.citrix.com
  Martin Lucina mar...@lucina.net
  Wei Liu l...@liuw.name


jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 build-amd64-rumpuserxen  fail
 build-i386-rumpuserxen   fail
 test-amd64-amd64-rumpuserxen-amd64   blocked 
 test-amd64-i386-rumpuserxen-i386 blocked 



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.

(No revision log; it would be 535 lines long.)

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] ACPI shutdown unreliable with win7?

2015-05-29 Thread Jan Beulich
 On 29.05.15 at 14:54, ian.campb...@citrix.com wrote:
 On Fri, 2015-05-22 at 10:08 +0100, Ian Campbell wrote:
 If win7 doesn't shutdown given a power button request I'd be more
 inclined to remove the setting in osstest for those flights and let
 guest-stop go back to being never pass than to start making such changes
 to the VM config which I think would probably break the preceding
 suspend and migration tests (which aren't completely reliable, but are
 far more so than this shutdown one).
 
 Does anyone have any ideas here or shall I propose:

Unless we have a way to make an adjustment inside the guest for the
power button to gain shutdown meaning, I think there's no alternative
to the change below.

Jan

 -8--
 
 From 2d1b814e65676c3cf56ce2e569491953607f53f7 Mon Sep 17 00:00:00 2001
 From: Ian Campbell ian.campb...@citrix.com
 Date: Fri, 29 May 2015 13:51:33 +0100
 Subject: [PATCH] Turn off acpi_shutdown for windows 7
 
 As described in 1432284841.10746.136.ca...@citrix.com /
 http://lists.xen.org/archives/html/xen-devel/2015-05/msg03016.html 
 Windows 7 does not appear to reliably actually shutdown when asked to
 via the ACPI power button.
 
 Once this patch is applied some force pushes will likely be needed in
 order for this to not appear as a regression.
 
 Signed-off-by: Ian Campbell ian.campb...@citrix.com
 Cc: Andrew Cooper andrew.coop...@citrix.com
 Cc: Paul Durrant paul.durr...@citrix.com
 Cc: Ian Jackson ian.jack...@eu.citrix.com,
 Cc: Jan Beulich jbeul...@suse.com
 ---
 Alternatively could make it an allowed/non-blocking failure?
 ---
  make-flight | 1 -
  1 file changed, 1 deletion(-)
 
 diff --git a/make-flight b/make-flight
 index 8a1fceb..5120891 100755
 --- a/make-flight
 +++ b/make-flight
 @@ -206,7 +206,6 @@ do_hvm_win7_x64_tests () {
job_create_test test-$xenarch$kern-$dom0arch-xl$qemuu_suffix-win7-amd64 \
  test-win xl $xenarch $dom0arch $qemuu_runvar \
  win_image=win7-x64.iso \
 -win_acpi_shutdown=true \
  all_hostflags=$most_hostflags,hvm
  }
  
 -- 
 2.1.4




___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] ACPI shutdown unreliable with win7?

2015-05-29 Thread Ian Campbell
On Fri, 2015-05-29 at 14:04 +0100, Jan Beulich wrote:
  On 29.05.15 at 14:54, ian.campb...@citrix.com wrote:
  On Fri, 2015-05-22 at 10:08 +0100, Ian Campbell wrote:
  If win7 doesn't shutdown given a power button request I'd be more
  inclined to remove the setting in osstest for those flights and let
  guest-stop go back to being never pass than to start making such changes
  to the VM config which I think would probably break the preceding
  suspend and migration tests (which aren't completely reliable, but are
  far more so than this shutdown one).
  
  Does anyone have any ideas here or shall I propose:
 
 Unless we have a way to make an adjustment inside the guest for the
 power button to gain shutdown meaning, I think there's no alternative
 to the change below.

The strange this is that it does work _sometimes_, either by complete
coincidence or because there is something non-deterministic about how
Win7 reacts to this ACPI event.

Does anyone know which setting we would want to change? In particular it
would be good if I knew what to look for to check the current status...

 
 Jan
 
  -8--
  
  From 2d1b814e65676c3cf56ce2e569491953607f53f7 Mon Sep 17 00:00:00 2001
  From: Ian Campbell ian.campb...@citrix.com
  Date: Fri, 29 May 2015 13:51:33 +0100
  Subject: [PATCH] Turn off acpi_shutdown for windows 7
  
  As described in 1432284841.10746.136.ca...@citrix.com /
  http://lists.xen.org/archives/html/xen-devel/2015-05/msg03016.html 
  Windows 7 does not appear to reliably actually shutdown when asked to
  via the ACPI power button.
  
  Once this patch is applied some force pushes will likely be needed in
  order for this to not appear as a regression.
  
  Signed-off-by: Ian Campbell ian.campb...@citrix.com
  Cc: Andrew Cooper andrew.coop...@citrix.com
  Cc: Paul Durrant paul.durr...@citrix.com
  Cc: Ian Jackson ian.jack...@eu.citrix.com,
  Cc: Jan Beulich jbeul...@suse.com
  ---
  Alternatively could make it an allowed/non-blocking failure?
  ---
   make-flight | 1 -
   1 file changed, 1 deletion(-)
  
  diff --git a/make-flight b/make-flight
  index 8a1fceb..5120891 100755
  --- a/make-flight
  +++ b/make-flight
  @@ -206,7 +206,6 @@ do_hvm_win7_x64_tests () {
 job_create_test test-$xenarch$kern-$dom0arch-xl$qemuu_suffix-win7-amd64 \
   test-win xl $xenarch $dom0arch $qemuu_runvar \
   win_image=win7-x64.iso \
  -win_acpi_shutdown=true \
   all_hostflags=$most_hostflags,hvm
   }
   
  -- 
  2.1.4
 
 
 



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 for Xen 4.6 1/4] xen: enabling XL to set per-VCPU parameters of a domain for RTDS scheduler

2015-05-29 Thread Jan Beulich
 On 29.05.15 at 15:01, dario.faggi...@citrix.com wrote:
 On Wed, 2015-05-27 at 12:41 +0100, Jan Beulich wrote:
  On 26.05.15 at 19:18, lichong...@gmail.com wrote:
  On Tue, May 26, 2015 at 4:08 AM, Jan Beulich jbeul...@suse.com wrote:
  On 26.05.15 at 02:05, lichong...@gmail.com wrote:
  +
  +switch ( op-cmd )
  +{
  +case XEN_DOMCTL_SCHEDOP_getvcpuinfo:
  +spin_lock_irqsave(prv-lock, flags);
  +list_for_each( iter, sdom-vcpu )
  +{
  +svc = list_entry(iter, struct rt_vcpu, sdom_elem);
  +vcpuid = svc-vcpu-vcpu_id;
  +
  +local_sched.budget = svc-budget / MICROSECS(1);
  +local_sched.period = svc-period / MICROSECS(1);
  +if ( copy_to_guest_offset(op-u.rtds.vcpus, vcpuid,
 
  What makes you think that the caller has provided enough slots?
  
  This code works in a similar way as the one for XEN_SYSCTL_numainfo.
  So if there is no enough slot in guest space, en error code is
  returned.
 
 I don't think I saw any place you returned an error here. The
 only error being returned is -EFAULT when the copy-out fails.
 
 Look again at XEN_SYSCTL_numainfo, in particular, at this part:
 
 if ( ni-num_nodes  num_nodes )
 {
 ret = -ENOBUFS;
 i = num_nodes;
 }
 else
 i = 0
 
 Which, as you'll appreciate if you check from where ni-num_nodes and
 num_nodes come from, if where the boundary checking we're asking for
 happens.
 
 Note how the 'i' variable is used in the rest of the block, and you'll
 see what we mean, and can do something similar.

I think this was targeted at Chong rather than me (while I was
listed as To, and Chong only on Cc)?

  +}
  +hypercall_preempt_check();
 
  ???
  
  We've talked about this in patch v1
  
 (http://lists.xenproject.org/archives/html/xen-devel/2015-05/msg01152.html).
  When a domain has too many VCPUs, we need to make sure the spin lock
  does not last for too long.
 
 Right, but the call above does nothing like that. Please look at other
 uses of it to understand what needs to be done here.
 
 This being said, would it make sense to put down a threshold, below
 which we don't need to check for preemption (nor to ask to reissue the
 hypercall)?
 
 Something like, if we're dealing with a request for the parameters of N
 (== 16 ? 32 ?) vcpus, it's fine to do them all at once. Above that
 limit, we just do bunches of N vcpus, and then do a preempt check. What
 do you think Jan? And what do you think a suitable limit would be?

I have no problem with that, or with checking for preemption only
every so many iterations.

 In fact, apart from the fact that it's used incorrectly, I don't think
 that checking for preemption after _each_ step of the loop make much
 sense...

Whether checking at the beginning (but not for the first iteration)
or at the end (but not for the last one) doesn't really matter.

  --- a/xen/common/schedule.c
  +++ b/xen/common/schedule.c
  @@ -1104,6 +1104,30 @@ long sched_adjust(struct domain *d, struct 
  xen_domctl_scheduler_op *op)
   return ret;
   }
 
  +/* Adjust scheduling parameter for the vcpus of a given domain. */
  +long sched_adjust_vcpu(
  +struct domain *d,
  +struct xen_domctl_scheduler_vcpu_op *op)
  +{
  +long ret;
 
  I see no reason for this and the function return type to be long.
  
  The reason is stated in the begining.
 
 In the beginning of what? I don't see any such, and it would also be
 odd for there to be an explanation of why a particular type was chosen.

 As Chong he's saying elsewhere, he's moslty following suit. Should we
 consider a pre-patch making both sched_adjust() and
 sched_adjust_global() retunr int?

Perhaps. But the main point here is that people copying existing code
should sanity check what they copy (so to not spread oddities or even
bugs).

Jan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] ACPI shutdown unreliable with win7?

2015-05-29 Thread Jan Beulich
 On 29.05.15 at 15:14, andrew.coop...@citrix.com wrote:
 On 29/05/15 14:04, Jan Beulich wrote:
 On 29.05.15 at 14:54, ian.campb...@citrix.com wrote:
 On Fri, 2015-05-22 at 10:08 +0100, Ian Campbell wrote:
 If win7 doesn't shutdown given a power button request I'd be more
 inclined to remove the setting in osstest for those flights and let
 guest-stop go back to being never pass than to start making such changes
 to the VM config which I think would probably break the preceding
 suspend and migration tests (which aren't completely reliable, but are
 far more so than this shutdown one).
 Does anyone have any ideas here or shall I propose:
 Unless we have a way to make an adjustment inside the guest for the
 power button to gain shutdown meaning, I think there's no alternative
 to the change below.
 
 You can avoid advertising S3/S4 in the ACPI tables, which iirc causes
 the same alteration to happen.
 
 Hvmloader uses the platform/acpi_s{3,4} booleans to control whether the
 relevant SSDTs are exposed.

Which libxl even has settings for. That would perhaps be a better first
try than disabling ACPI shutdown.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v2] efi: Reallocate memory map if ExitBootServices() fails

2015-05-29 Thread Ross Lagerwall
If calling ExitBootServices() fails, the memory map size may increase.
Allocate an extra page the first time around so that we hopefully
shouldn't have to reallocate the memory map. Allocate a new memory map
if needed, although this only a fallback, because some arches may use
boot services to allocate memory which may not work after the first call
to ExitBootServices().

This was seen on the following machine when using the iscsidxe UEFI
driver. The machine would consistently fail the first call to
ExitBootServices().
System Information
Manufacturer: Supermicro
Product Name: X10SLE-F/HF
BIOS Information
Vendor: American Megatrends Inc.
Version: 2.00
Release Date: 04/24/2014

Signed-off-by: Ross Lagerwall ross.lagerw...@citrix.com
---
 xen/common/efi/boot.c | 23 ---
 1 file changed, 16 insertions(+), 7 deletions(-)

diff --git a/xen/common/efi/boot.c b/xen/common/efi/boot.c
index ef8476c..7d09b39 100644
--- a/xen/common/efi/boot.c
+++ b/xen/common/efi/boot.c
@@ -700,7 +700,7 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE 
*SystemTable)
 EFI_STATUS status;
 unsigned int i, argc;
 CHAR16 **argv, *file_name, *cfg_file_name = NULL, *options = NULL;
-UINTN map_key, info_size, gop_mode = ~0;
+UINTN memmap_size, map_key, info_size, gop_mode = ~0;
 EFI_HANDLE *handles = NULL;
 EFI_SHIM_LOCK_PROTOCOL *shim_lock;
 EFI_GRAPHICS_OUTPUT_PROTOCOL *gop = NULL;
@@ -1053,14 +1053,23 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE 
*SystemTable)
 efi_arch_video_init(gop, info_size, mode_info);
 }
 
-efi_bs-GetMemoryMap(efi_memmap_size, NULL, map_key,
- efi_mdesc_size, mdesc_ver);
-efi_memmap = efi_arch_allocate_mmap_buffer(efi_memmap_size);
-if ( !efi_memmap )
-blexit(LUnable to allocate memory for EFI memory map);
-
 for ( retry = 0; ; retry = 1 )
 {
+efi_bs-GetMemoryMap(memmap_size, NULL, map_key,
+ efi_mdesc_size, mdesc_ver);
+if ( memmap_size  efi_memmap_size )
+{
+efi_memmap_size = memmap_size + EFI_PAGE_SIZE;
+efi_memmap = efi_arch_allocate_mmap_buffer(efi_memmap_size);
+if ( !efi_memmap )
+{
+if ( retry )
+PrintErr(LUnable to allocate memory for EFI memory map);
+else
+blexit(LUnable to allocate memory for EFI memory map);
+}
+}
+
 status = efi_bs-GetMemoryMap(efi_memmap_size, efi_memmap, map_key,
   efi_mdesc_size, mdesc_ver);
 if ( EFI_ERROR(status) )
-- 
2.1.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [osstest test] 57401: tolerable FAIL - PUSHED

2015-05-29 Thread Robert Hu
On Thu, 2015-05-28 at 11:26 +0100, Ian Campbell wrote:
 On Thu, 2015-05-28 at 04:49 +, osstest service user wrote:
  flight 57401 osstest real [real]
  http://logs.test-lab.xenproject.org/osstest/logs/57401/
  
  Failures :-/ but no regressions.
 
 Lontao, Robert,
 
 The grub related bits of the nested series are now in the production
 osstest instance.
Yes, I've seen it. Thanks Ian.
 
 I'm sorry but I've had a couple of high priority interrupts wrt
 reviewing the remainder.
 
 Ian.
 



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 2/4] libxc: print more error messages when failed

2015-05-29 Thread Andrew Cooper
On 29/05/15 12:37, Wei Liu wrote:
 No functional changes introduced.

 Signed-off-by: Wei Liu wei.l...@citrix.com
 Cc: Ian Campbell ian.campb...@citrix.com
 Cc: Ian Jackson ian.jack...@eu.citrix.com

This patch reminds me of a todo item I have wanted to do for a while
using setvcpucontext rather than mapping gfn 0 and manually inserting
`jmp $0x10`

However, as for the change itself, a definite improvement.

 ---
  tools/libxc/xc_hvm_build_x86.c | 30 +++---
  1 file changed, 27 insertions(+), 3 deletions(-)

 diff --git a/tools/libxc/xc_hvm_build_x86.c b/tools/libxc/xc_hvm_build_x86.c
 index 92422bf..df4b7ed 100644
 --- a/tools/libxc/xc_hvm_build_x86.c
 +++ b/tools/libxc/xc_hvm_build_x86.c
 @@ -259,7 +259,10 @@ static int setup_guest(xc_interface *xch,
  
  memset(elf, 0, sizeof(elf));
  if ( elf_init(elf, image, image_size) != 0 )
 +{
 +PERROR(Could not initialise ELF image);
  goto error_out;
 +}
  
  xc_elf_set_logfile(xch, elf, 1);
  
 @@ -522,15 +525,24 @@ static int setup_guest(xc_interface *xch,
  DPRINTF(  1GB PAGES: 0x%016lx\n, stat_1gb_pages);
  
  if ( loadelfimage(xch, elf, dom, page_array) != 0 )
 +{
 +PERROR(Could not load ELF image);
  goto error_out;
 +}
  
  if ( loadmodules(xch, args, m_start, m_end, dom, page_array) != 0 )
 -goto error_out;
 +{
 +PERROR(Could not load ACPI modules);
 +goto error_out;
 +}
  
  if ( (hvm_info_page = xc_map_foreign_range(
xch, dom, PAGE_SIZE, PROT_READ | PROT_WRITE,
HVM_INFO_PFN)) == NULL )
 +{
 +PERROR(Could not map hvm info page);
  goto error_out;
 +}
  build_hvm_info(hvm_info_page, args);
  munmap(hvm_info_page, PAGE_SIZE);
  
 @@ -547,7 +559,10 @@ static int setup_guest(xc_interface *xch,
  }
  
  if ( xc_clear_domain_pages(xch, dom, special_pfn(0), NR_SPECIAL_PAGES) )
 -goto error_out;
 +{
 +PERROR(Could not clear special pages);
 +goto error_out;
 +}
  
  xc_hvm_param_set(xch, dom, HVM_PARAM_STORE_PFN,
   special_pfn(SPECIALPAGE_XENSTORE));
 @@ -580,7 +595,10 @@ static int setup_guest(xc_interface *xch,
  }
  
  if ( xc_clear_domain_pages(xch, dom, ioreq_server_pfn(0), 
 NR_IOREQ_SERVER_PAGES) )
 -goto error_out;
 +{
 +PERROR(Could not clear ioreq page);
 +goto error_out;
 +}
  
  /* Tell the domain where the pages are and how many there are */
  xc_hvm_param_set(xch, dom, HVM_PARAM_IOREQ_SERVER_PFN,
 @@ -595,7 +613,10 @@ static int setup_guest(xc_interface *xch,
  if ( (ident_pt = xc_map_foreign_range(
xch, dom, PAGE_SIZE, PROT_READ | PROT_WRITE,
special_pfn(SPECIALPAGE_IDENT_PT))) == NULL )
 +{
 +PERROR(Could not map special page ident_pt);
  goto error_out;
 +}
  for ( i = 0; i  PAGE_SIZE / sizeof(*ident_pt); i++ )
  ident_pt[i] = ((i  22) | _PAGE_PRESENT | _PAGE_RW | _PAGE_USER |
 _PAGE_ACCESSED | _PAGE_DIRTY | _PAGE_PSE);
 @@ -610,7 +631,10 @@ static int setup_guest(xc_interface *xch,
  char *page0 = xc_map_foreign_range(
  xch, dom, PAGE_SIZE, PROT_READ | PROT_WRITE, 0);
  if ( page0 == NULL )
 +{
 +PERROR(Could not map page0);
  goto error_out;
 +}
  page0[0] = 0xe9;
  *(uint32_t *)page0[1] = entry_eip - 5;
  munmap(page0, PAGE_SIZE);


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] ACPI shutdown unreliable with win7?

2015-05-29 Thread Andrew Cooper
On 29/05/15 14:04, Jan Beulich wrote:
 On 29.05.15 at 14:54, ian.campb...@citrix.com wrote:
 On Fri, 2015-05-22 at 10:08 +0100, Ian Campbell wrote:
 If win7 doesn't shutdown given a power button request I'd be more
 inclined to remove the setting in osstest for those flights and let
 guest-stop go back to being never pass than to start making such changes
 to the VM config which I think would probably break the preceding
 suspend and migration tests (which aren't completely reliable, but are
 far more so than this shutdown one).
 Does anyone have any ideas here or shall I propose:
 Unless we have a way to make an adjustment inside the guest for the
 power button to gain shutdown meaning, I think there's no alternative
 to the change below.

You can avoid advertising S3/S4 in the ACPI tables, which iirc causes
the same alteration to happen.

Hvmloader uses the platform/acpi_s{3,4} booleans to control whether the
relevant SSDTs are exposed.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [Qemu-devel] [PATCH 1/1] Fix device listener interface for PCI to PCI bridges

2015-05-29 Thread Don Slutz
On 05/28/15 17:05, Michael S. Tsirkin wrote:
 On Thu, May 28, 2015 at 11:03:07PM +0200, Michael S. Tsirkin wrote:
 On Thu, May 28, 2015 at 03:09:48PM -0400, Don Slutz wrote:
 On 05/28/15 08:28, Michael S. Tsirkin wrote:
 On Thu, May 28, 2015 at 07:25:50AM -0400, Don Slutz wrote:
 On 05/28/15 05:30, Michael S. Tsirkin wrote:
 On Thu, May 28, 2015 at 04:46:37AM -0400, Don Slutz wrote:
 The commit 707ff80021ccd7a68f4b3d2c44eebf87efbb41c4 assumed that a
 PCI device has a static address.  This is not true for PCI devices
 that are on the secondary bus of a PCI to PCI bridge.


...

 Not really because it's not just secondary bus number.
 Changing subordinate bus numbers can hide/unhide whole buses.

 You are right.  I have no idea what Paul Durrant was thinking about this
 case.
 And this would apply to PCI_SUBORDINATE_BUS not PCI_SECONDARY_BUS.

 Since at QEMU 2.2 Xen sends all pci-config requests to QEMU, things
 worked.
 Why worked?


The issue is that PCI to PCI bridges were not configured by hvmloader
nor by SeaBIOS.  I do have a patch
to Xen that fixes this (I was rebasing on the latest Xen to post it and
my test case stopped working).  So PCI devices that exist on the
secondary side of a PCI to PCI bridge could only be used (by Linux with
at least pci=assign-busses) after booting.


 It is not clear to me that the complexity of tracking bus
 visibility make sense.  Clearly you do.
 Well what was the point of the change?

To get config cycles that are valid that you do not get today.

 If you don't care that we get irrelevant config cycles why not
 just send them all to QEMU?


That would need to be answered by Paul Durrant and/or other people (see
below)


 Would it be better to have:

 void device_listener_change_pci_bus_num(PCIBus *b, PCIDevice *d, void
 *opaque)
 {
 uint8_t oldbus = (uint8_t)opaque;
 DEVICE_LISTENER_CALL(change_pci_bus_num, Forward, d, oldbus);
 }

 So that the above works, or to add a function to convert args?

 To know whether device is accessible to config cycles, you
 really need to scan the hierarchy from root downwards.

 Yes, that is correct.  However what I am doing here is not
 changing how QEMU checks if the device is accessible, but
 changing what pci config Xen sends to QEMU.  If the change
 to PCI_SECONDARY_BUS hides this PCI to PCI bridge, that is
 not an issue.


-Don Slutz


 Imagine a bridge with secondary bus number 5
 behind another one with subordinate numbers 1-3.
 You should not send conf cycles for bus number 5 to qemu.

 That is correct.  How ever unless Paul Durrant has an example of more then 1
 QEMU where this would make a difference, the cases I am aware of are:

 1) Xen does not send it, and returns 0x (or smaller).

 2) QEMU returns 0x (or smaller).

 I will grant that #1 is faster, but it also is only happening during start
 up and so I do not see the clear win to add more complex code to only do #1.

-Don Slutz
 It's not about faster. I assumed you need to get just the correct
 stuff out to QEMU to gain the better security as
 3996e85c1822e05c50250f8d2d1e57b6bea1229d claims.
 And if that's not the case, please educate me.

I do not know enough about this to answer here.  I have added xen-devel
list to this in the hope
that someone there can answer the better security question.

   -Don Slutz


 -- 
 MST


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [Draft C] Xen on ARM vITS Handling

2015-05-29 Thread Julien Grall
Hi Ian,

NIT: You used my Linaro email which I think is de-activated now :).

On 27/05/2015 13:48, Ian Campbell wrote:
 Here follows draft C based on previous feedback.

 Also at:

 http://xenbits.xen.org/people/ianc/vits/draftC.{pdf,html}

 I think I've captured most of the previous discussion, except where
 explicitly noted by XXX or in other replies, but please do point out
 places where I've missed something.

 One area where I am pretty sure I've dropped the ball is on the
 completion and update of `CREADR`. That conversation ended up
 bifurcating along the 1:N vs N:N mapping scheme lines, and I didn't
 manage to get the various proposals straight. Since we've now agreed on
 N:N hopefully we can reach a conclusion (no pun intended) on the
 completion aspect too (sorry that this probably means rehasing at least
 a subset of the previous thread).

 Ian.

 % Xen on ARM vITS Handling
 % Ian Campbell ian.campb...@citrix.com
 % Draft C

 # Changelog

 ## Since Draft B

 * Details of command translation (thanks to Julien and Vijay)
 * Added background on LPI Translation and Pending tablesd
 * Added background on Collections
 * Settled on `N:N` scheme for vITS:pITS mapping.
 * Rejigged section nesting a bit.
 * Since we now thing translation should be cheap, settle on
translation at scheduling time.
 * Lazy `INVALL` and `SYNC`

 ## Since Draft A

 * Added discussion of when/where command translation occurs.
 * Contention on scheduler lock, suggestion to use SOFTIRQ.
 * Handling of domain shutdown.
 * More detailed discussion of multiple vs single vits pros/cons.

 # Introduction

 ARM systems containing a GIC version 3 or later may contain one or
 more ITS logical blocks. An ITS is used to route Message Signalled
 interrupts from devices into an LPI injection on the processor.

 The following summarises the ITS hardware design and serves as a set
 of assumptions for the vITS software design. (XXX it is entirely
 possible I've horribly misunderstood how this stuff fits
 together). For full details of the ITS see the GIC Architecture
 Specification.

 ## Device Identifiers

 Each device using the ITS is associated with a unique identifier.

 The device IDs are typically described via system firmware, e.g. the
 ACPI IORT table or via device tree.

 The number of device ids is variable and can be discovered via
 `GITS_TYPER.Devbits`. This field allows an ITS to have up to 2^32
 device.

 ## Interrupt Collections

 Each interrupt is a member of an Interrupt Collection. This allows
 software to manage large numbers of physical interrupts with a small
 number of commands rather than issuing one command per interrupt.

 On a system with N processors, the ITS must provide at least N+1
 collections.

 ## Target Addresses

 The Target Address correspond to a specific GIC re-distributor. The format
 of this field depends on the value of the `GITS_TYPER.PTA` bit:

 * 1: the base address of the re-distributor target is used
 * 0: a unique processor number is used. The mapping between the
processor affinity value (`MPIDR`) and the processor number is
discoverable via `GICR_TYPER.ProcessorNumber`.

 ## ITS Translation Table

 Message signalled interrupts are translated into an LPI via an ITS
 translation table which must be configured for each device which can
 generate an MSI.

I'm not sure what is the ITS Table Table. Did you mean Interrupt
Translation Table?


 The ITS translation table maps the device id of the originating devic

s/devic/device/?

 into an Interrupt Collection and then into a target address.

 ## ITS Configuration

 The ITS is configured and managed, including establishing and
 configuring Translation Table for each device, via an in memory ring
 shared between the CPU and the ITS controller. The ring is managed via
 the `GITS_CBASER` register and indexed by `GITS_CWRITER` and
 `GITS_CREADR` registers.

 A processor adds commands to the shared ring and then updates
 `GITS_CWRITER` to make them visible to the ITS controller.

 The ITS controller processes commands from the ring and then updates
 `GITS_CREADR` to indicate the the processor that the command has been
 processed.

 Commands are processed sequentially.

 Commands sent on the ring include operational commands:

 * Routing interrupts to processors;
 * Generating interrupts;
 * Clearing the pending state of interrupts;
 * Synchronising the command queue

 and maintenance commands:

 * Map device/collection/processor;
 * Map virtual interrupt;
 * Clean interrupts;
 * Discard interrupts;

 The field `GITS_CBASER.Size` encodes the number of 4KB pages minus 0
 consisting of the command queue. This field is 8 bits which means the
 maximum size is 2^8 * 4KB = 1MB. Given that each command is 32 bytes,
 there is a maximum of 32768 commands in the queue.

 The ITS provides no specific completion notification
 mechanism. Completion is monitored by a combination of a `SYNC`
 command and either polling `GITS_CREADR` or notification via an
 

[Xen-devel] [PATCH V8] xen/vm_event: Clean up control-register-write vm_events and add XCR0 event

2015-05-29 Thread Razvan Cojocaru
As suggested by Andrew Cooper, this patch attempts to remove
some redundancy and allow for an easier time when adding vm_events
for new control registers in the future, by having a single
VM_EVENT_REASON_WRITE_CTRLREG vm_event type, meant to serve CR0,
CR3, CR4 and (newly introduced) XCR0. The actual control register
will be deduced by the new .index field in vm_event_write_ctrlreg
(renamed from vm_event_mov_to_cr). The patch has also modified
the xen-access.c test - it is now able to log CR3 events.

Signed-off-by: Razvan Cojocaru rcojoc...@bitdefender.com
Acked-by: Jan Beulich jbeul...@suse.com

---
Changes since V7:
 - Explicitly #include asm/hvm/event.h.
---
 tools/libxc/include/xenctrl.h   |9 ++---
 tools/libxc/xc_monitor.c|   40 +++---
 tools/tests/xen-access/xen-access.c |   30 +++--
 xen/arch/x86/hvm/event.c|   55 +-
 xen/arch/x86/hvm/hvm.c  |   11 +++---
 xen/arch/x86/hvm/vmx/vmx.c  |6 ++--
 xen/arch/x86/monitor.c  |   63 +--
 xen/include/asm-x86/domain.h|   12 ++-
 xen/include/asm-x86/hvm/event.h |5 ++-
 xen/include/asm-x86/monitor.h   |2 ++
 xen/include/public/domctl.h |   12 +++
 xen/include/public/vm_event.h   |   29 
 12 files changed, 114 insertions(+), 160 deletions(-)

diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index 09a7450..83fd289 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -2375,12 +2375,9 @@ int xc_mem_access_disable_emulate(xc_interface *xch, 
domid_t domain_id);
 void *xc_monitor_enable(xc_interface *xch, domid_t domain_id, uint32_t *port);
 int xc_monitor_disable(xc_interface *xch, domid_t domain_id);
 int xc_monitor_resume(xc_interface *xch, domid_t domain_id);
-int xc_monitor_mov_to_cr0(xc_interface *xch, domid_t domain_id, bool enable,
-  bool sync, bool onchangeonly);
-int xc_monitor_mov_to_cr3(xc_interface *xch, domid_t domain_id, bool enable,
-  bool sync, bool onchangeonly);
-int xc_monitor_mov_to_cr4(xc_interface *xch, domid_t domain_id, bool enable,
-  bool sync, bool onchangeonly);
+int xc_monitor_write_ctrlreg(xc_interface *xch, domid_t domain_id,
+ uint16_t index, bool enable, bool sync,
+ bool onchangeonly);
 int xc_monitor_mov_to_msr(xc_interface *xch, domid_t domain_id, bool enable,
   bool extended_capture);
 int xc_monitor_singlestep(xc_interface *xch, domid_t domain_id, bool enable);
diff --git a/tools/libxc/xc_monitor.c b/tools/libxc/xc_monitor.c
index 87ad968..63013de 100644
--- a/tools/libxc/xc_monitor.c
+++ b/tools/libxc/xc_monitor.c
@@ -45,8 +45,9 @@ int xc_monitor_resume(xc_interface *xch, domid_t domain_id)
NULL);
 }
 
-int xc_monitor_mov_to_cr0(xc_interface *xch, domid_t domain_id, bool enable,
-  bool sync, bool onchangeonly)
+int xc_monitor_write_ctrlreg(xc_interface *xch, domid_t domain_id,
+ uint16_t index, bool enable, bool sync,
+ bool onchangeonly)
 {
 DECLARE_DOMCTL;
 
@@ -54,39 +55,8 @@ int xc_monitor_mov_to_cr0(xc_interface *xch, domid_t 
domain_id, bool enable,
 domctl.domain = domain_id;
 domctl.u.monitor_op.op = enable ? XEN_DOMCTL_MONITOR_OP_ENABLE
 : XEN_DOMCTL_MONITOR_OP_DISABLE;
-domctl.u.monitor_op.event = XEN_DOMCTL_MONITOR_EVENT_MOV_TO_CR0;
-domctl.u.monitor_op.u.mov_to_cr.sync = sync;
-domctl.u.monitor_op.u.mov_to_cr.onchangeonly = onchangeonly;
-
-return do_domctl(xch, domctl);
-}
-
-int xc_monitor_mov_to_cr3(xc_interface *xch, domid_t domain_id, bool enable,
-  bool sync, bool onchangeonly)
-{
-DECLARE_DOMCTL;
-
-domctl.cmd = XEN_DOMCTL_monitor_op;
-domctl.domain = domain_id;
-domctl.u.monitor_op.op = enable ? XEN_DOMCTL_MONITOR_OP_ENABLE
-: XEN_DOMCTL_MONITOR_OP_DISABLE;
-domctl.u.monitor_op.event = XEN_DOMCTL_MONITOR_EVENT_MOV_TO_CR3;
-domctl.u.monitor_op.u.mov_to_cr.sync = sync;
-domctl.u.monitor_op.u.mov_to_cr.onchangeonly = onchangeonly;
-
-return do_domctl(xch, domctl);
-}
-
-int xc_monitor_mov_to_cr4(xc_interface *xch, domid_t domain_id, bool enable,
-  bool sync, bool onchangeonly)
-{
-DECLARE_DOMCTL;
-
-domctl.cmd = XEN_DOMCTL_monitor_op;
-domctl.domain = domain_id;
-domctl.u.monitor_op.op = enable ? XEN_DOMCTL_MONITOR_OP_ENABLE
-: XEN_DOMCTL_MONITOR_OP_DISABLE;
-domctl.u.monitor_op.event = XEN_DOMCTL_MONITOR_EVENT_MOV_TO_CR4;
+domctl.u.monitor_op.event = XEN_DOMCTL_MONITOR_EVENT_WRITE_CTRLREG;
+domctl.u.monitor_op.u.mov_to_cr.index = index;

Re: [Xen-devel] [PATCH v2] efi: Reallocate memory map if ExitBootServices() fails

2015-05-29 Thread Jan Beulich
 On 29.05.15 at 15:30, ross.lagerw...@citrix.com wrote:
 @@ -1053,14 +1053,23 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE 
 *SystemTable)
  efi_arch_video_init(gop, info_size, mode_info);
  }
  
 -efi_bs-GetMemoryMap(efi_memmap_size, NULL, map_key,
 - efi_mdesc_size, mdesc_ver);
 -efi_memmap = efi_arch_allocate_mmap_buffer(efi_memmap_size);
 -if ( !efi_memmap )
 -blexit(LUnable to allocate memory for EFI memory map);
 -
  for ( retry = 0; ; retry = 1 )
  {
 +efi_bs-GetMemoryMap(memmap_size, NULL, map_key,
 + efi_mdesc_size, mdesc_ver);
 +if ( memmap_size  efi_memmap_size )
 +{
 +efi_memmap_size = memmap_size + EFI_PAGE_SIZE;

I'd really like to make sure the value is a multiple of efi_mdesc_size
(i.e. simply adding EFI_PAGE_SIZE will not do). And that's leaving
aside whether the increase really needs to be that big.

 +efi_memmap = efi_arch_allocate_mmap_buffer(efi_memmap_size);

So considering the ARM side (which this breaks) - did you try at all
the alternative of allocating a slightly larger memory map ahead of
the loop?

 +if ( !efi_memmap )
 +{
 +if ( retry )
 +PrintErr(LUnable to allocate memory for EFI memory 
 map);
 +else
 +blexit(LUnable to allocate memory for EFI memory map);
 +}

Looking at this again (and remembering that PrintErrMesg(), as used
a few lines down, also calls blexit()), I think we should handle this
inside blexit(): If ExitBootServices() was already used enter an
infinite loop right after printing the message(s). I wonder whether,
to signal this, we shouldn't set efi_bs to NULL right after calling
ExitBootServices(), replacing the single remaining valid use with
SystemTable-BootServices-GetMemoryMap().

In no case should you allow execution to fall through here.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 for Xen 4.6 1/4] xen: enabling XL to set per-VCPU parameters of a domain for RTDS scheduler

2015-05-29 Thread Dario Faggioli
On Mon, 2015-05-25 at 19:05 -0500, Chong Li wrote:

 diff --git a/xen/common/domctl.c b/xen/common/domctl.c
 index 28aea55..8143c44 100644
 --- a/xen/common/domctl.c
 +++ b/xen/common/domctl.c
 @@ -841,6 +841,11 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) 
 u_domctl)
  copyback = 1;
  break;
  
 +case XEN_DOMCTL_scheduler_vcpu_op:
 +ret = sched_adjust_vcpu(d, op-u.scheduler_vcpu_op);
 +copyback = 1;
 +break;
 +

So, thinking more about this, do we really want a full new DOMCTL, or do
we want to instrument XEN_DOMCTL_scheduler_op? That would mean adding
two operations there, and fiddling with struct xen_domctl_scheduler_op?

The problem I see is that, whatever I imagine putting this, there would
be some redundancy.

Quick--dirty, I put together the below... would it possibly make sense?

typedef struct xen_domctl_sched_credit {
uint16_t weight;
uint16_t cap;
} xen_domctl_sched_credit_t;

typedef struct xen_domctl_sched_credit2 {
uint16_t weight;
} xen_domctl_sched_credit2_t;

typedef struct xen_domctl_sched_rtds {
uint32_t
period; 
   
uint32_t budget;
} xen_domctl_sched_rtds_t;

typedef union xen_domctl_schedparam { 
xen_domctl_sched_credit_t credit;
xen_domctl_sched_credit2_t credit2;
xen_domctl_sched_rtds_t rtds;
} xen_domctl_schedparam_t;

typedef struct xen_domctl_schedparam_vcpu {
union {
xen_domctl_sched_credit_t credit;
xen_domctl_sched_credit2_t credit2;
xen_domctl_sched_rtds_t rtds;
} s;   
uint16_t vcpuid; 
} xen_domctl_schedparam_vcpu_t;
DEFINE_XEN_GUEST_HANDLE(xen_domctl_schedparam_vcpu_t);

/* Set or get info? */
#define XEN_DOMCTL_SCHEDOP_putinfo 0
#define XEN_DOMCTL_SCHEDOP_getinfo 1
#define XEN_DOMCTL_SCHEDOP_putvcpuinfo 2
#define XEN_DOMCTL_SCHEDOP_getvcpuinfo 3 
struct xen_domctl_scheduler_op {
uint32_t sched_id;  /* XEN_SCHEDULER_* */
uint32_t cmd;   /* XEN_DOMCTL_SCHEDOP_* */ 
union {
xen_domctl_schedparam_t d;
struct {
XEN_GUEST_HANDLE_64(xen_domctl_schedparam_vcpu_t) vcpus;
uint16_t nr_vcpus;
} v; 
} u;
}; 
typedef struct xen_domctl_scheduler_op xen_domctl_scheduler_op_t;
DEFINE_XEN_GUEST_HANDLE(xen_domctl_scheduler_op_t);

I'm also attaching a (build-tested only) patch to that effect (I'm
killing SEDF in there, so I don't have to care about it, as it's going
away anyway).

Thoughts?

  /*
 - * set/get each vcpu info of each domain
 + * set/get per-domain params of a domain
   */
  static int
  rt_dom_cntl(
 @@ -1115,6 +1115,74 @@ rt_dom_cntl(
  return rc;
  }
  
 +/*
 + * set/get per-vcpu params of a domain
 + */
 +static int
 +rt_vcpu_cntl(
 +const struct scheduler *ops,
 +struct domain *d,
 +struct xen_domctl_scheduler_vcpu_op *op)
 +{

I commented this function already, while replying to Jan.
 
 +/* Adjust scheduling parameter for the vcpus of a given domain. */
 +long sched_adjust_vcpu(
 +struct domain *d,
 +struct xen_domctl_scheduler_vcpu_op *op)
 +{

Actually, I don't think you even need this function.

Especially if we take the path of doing all this as subops of
DOMCTL_scheduler_op, you can just use sched_adjust, do some more
multiplexing in there, and the call the proper scheduler hook.

Thanks and Regards,
Dario

-- 
This happens because I choose it to happen! (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems RD Ltd., Cambridge (UK)

diff --git a/xen/common/Makefile b/xen/common/Makefile
index 1cddebc..3fdf931 100644
--- a/xen/common/Makefile
+++ b/xen/common/Makefile
@@ -31,7 +31,6 @@ obj-y += rbtree.o
 obj-y += rcupdate.o
 obj-y += sched_credit.o
 obj-y += sched_credit2.o
-obj-y += sched_sedf.o
 obj-y += sched_arinc653.o
 obj-y += sched_rt.o
 obj-y += schedule.o
diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
index 953ecb0..43b086b 100644
--- a/xen/common/sched_credit.c
+++ b/xen/common/sched_credit.c
@@ -1039,25 +1039,25 @@ csched_dom_cntl(
 
 if ( op-cmd == XEN_DOMCTL_SCHEDOP_getinfo )
 {
-op-u.credit.weight = sdom-weight;
-op-u.credit.cap = sdom-cap;
+op-u.d.credit.weight = sdom-weight;
+op-u.d.credit.cap = sdom-cap;
 }
 else
 {
 ASSERT(op-cmd == XEN_DOMCTL_SCHEDOP_putinfo);
 
-if ( op-u.credit.weight != 0 )
+if ( op-u.d.credit.weight != 0 )
 {
 if ( !list_empty(sdom-active_sdom_elem) )
 {
 prv-weight -= sdom-weight * sdom-active_vcpu_count;
-prv-weight += op-u.credit.weight * sdom-active_vcpu_count;
+prv-weight += op-u.d.credit.weight * sdom-active_vcpu_count;
 }
-sdom-weight = 

Re: [Xen-devel] [PATCH V7] xen/vm_event: Clean up control-register-write vm_events and add XCR0 event

2015-05-29 Thread Razvan Cojocaru
On 05/29/2015 12:12 PM, Jan Beulich wrote:
 On 28.05.15 at 18:32, rcojoc...@bitdefender.com wrote:
 the macro will probably go out the window (or the first parameter will
 need to be changed to VM_EVENT_##what instead of VM_EVENT_X86_##what) as
 soon as ARM control register write events will come into play.
 
 It's in an x86-specific header, so why should it need to be changed
 for ARM? If ARM will gain a similarly named function, the use sites
 will still all be architecture specific, and hence both declaration and
 whether or not to have a wrapper macro can remain a per-arch
 decision.

You're right. Submitted V8 with the proper header included.


Thanks,
Razvan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] ACPI shutdown unreliable with win7?

2015-05-29 Thread Ian Campbell
On Fri, 2015-05-29 at 14:28 +0100, Jan Beulich wrote:
  On 29.05.15 at 15:14, andrew.coop...@citrix.com wrote:
  On 29/05/15 14:04, Jan Beulich wrote:
  On 29.05.15 at 14:54, ian.campb...@citrix.com wrote:
  On Fri, 2015-05-22 at 10:08 +0100, Ian Campbell wrote:
  If win7 doesn't shutdown given a power button request I'd be more
  inclined to remove the setting in osstest for those flights and let
  guest-stop go back to being never pass than to start making such changes
  to the VM config which I think would probably break the preceding
  suspend and migration tests (which aren't completely reliable, but are
  far more so than this shutdown one).
  Does anyone have any ideas here or shall I propose:
  Unless we have a way to make an adjustment inside the guest for the
  power button to gain shutdown meaning, I think there's no alternative
  to the change below.
  
  You can avoid advertising S3/S4 in the ACPI tables, which iirc causes
  the same alteration to happen.
  
  Hvmloader uses the platform/acpi_s{3,4} booleans to control whether the
  relevant SSDTs are exposed.
 
 Which libxl even has settings for. That would perhaps be a better first
 try than disabling ACPI shutdown.

I've mentioned it at least twice now but nobody seems to think it is of
note that even with the current settings it does the right thing about 1
time in 10? Is Win7 really expected to behave so randomly here? 

Also, when the test fails the guest is also not hibernating either.

Plus I've queried the impact of change the ACPI s3/s4 settings on the
save/restore/migrate tests in the flight more than once and nobody has
responded to that either.

Ian


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v3] run QEMU as non-root

2015-05-29 Thread Stefano Stabellini
Try to use xen-qemudepriv-$domname first, then xen-qemudepriv-base +
domid, finally xen-qemudepriv-shared and root if everything else fails.

The uids need to be manually created by the user or, more likely, by the
xen package maintainer.

To actually secure QEMU when running in Dom0, we need at least to
deprivilege the privcmd and xenstore interfaces, this is just the first
step in that direction.

Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com

---
Changes in v3:
- clarify doc
- handle errno == ERANGE
---
 docs/misc/qemu-deprivilege   |   36 +
 tools/libxl/libxl_dm.c   |   60 ++
 tools/libxl/libxl_internal.h |4 +++
 3 files changed, 100 insertions(+)
 create mode 100644 docs/misc/qemu-deprivilege

diff --git a/docs/misc/qemu-deprivilege b/docs/misc/qemu-deprivilege
new file mode 100644
index 000..3a61867
--- /dev/null
+++ b/docs/misc/qemu-deprivilege
@@ -0,0 +1,36 @@
+For security reasons, libxl tries to create QEMU as non-root.
+Libxl looks for the following users in this order:
+
+1) a user named xen-qemudepriv-$domname
+Where $domname is the name of the domain being created.
+To use this, you just need to create a user with the appropriate name
+for each domain. For example, if your virtual machine is named windows:
+
+adduser --system xen-qemudepriv-windows
+
+
+2) a user named xen-qemudepriv-base, adding domid to its uid
+This requires the reservation of 65536 uids from the uid of
+xen-qemudepriv-base to uid+65535.  For example, if xen-qemudepriv-base
+has uid 6000, and the domid is 25, libxl will try to use uid 6025. To
+use this mechanism, you might want to create a large number of users at
+installation time. For example:
+
+adduser --system xen-qemudepriv-base
+for i in '' $(seq 1 65335)
+do
+adduser --system xen-qemudepriv-base$i
+done
+
+
+3) a user named xen-qemudepriv-shared
+As a fall back if both 1) and 2) fail, libxl will use a single user for
+all QEMU instances. The user is named xen-qemudepriv-shared. This is
+less secure but still better than running QEMU as root. Using this is as
+simple as creating just one more user on your host:
+
+adduser --system xen-qemudepriv-shared
+
+
+4) root
+As a last resort, libxl will start QEMU as root.
diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index 0c6408d..97a921f 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -19,6 +19,8 @@
 
 #include libxl_internal.h
 #include xen/hvm/e820.h
+#include sys/types.h
+#include pwd.h
 
 static const char *libxl_tapif_script(libxl__gc *gc)
 {
@@ -439,6 +441,9 @@ static char ** libxl__build_device_model_args_new(libxl__gc 
*gc,
 int i, connection, devid;
 uint64_t ram_size;
 const char *path, *chardev;
+struct passwd pwd, *user = NULL;
+char *buf = NULL;
+long buf_size;
 
 dm_args = flexarray_make(gc, 16, 1);
 
@@ -878,6 +883,61 @@ static char ** 
libxl__build_device_model_args_new(libxl__gc *gc,
 default:
 break;
 }
+
+buf_size = sysconf(_SC_GETPW_R_SIZE_MAX);
+if (buf_size  0) {
+LOGE(ERROR, sysconf(_SC_GETPW_R_SIZE_MAX) returned error %ld, 
buf_size);
+goto end_search;
+}
+errno = 0;
+
+retry:
+if (errno == ERANGE)
+buf_size += 512;
+buf = libxl__realloc(gc, buf, buf_size);
+if (c_info-name) {
+errno = 0;
+getpwnam_r(libxl__sprintf(gc, %s-%s,
+LIBXL_QEMU_USER_PREFIX, c_info-name),
+pwd, buf, buf_size, user);
+if (user != NULL)
+goto end_search;
+if (errno == ERANGE)
+goto retry;
+}
+
+errno = 0;
+getpwnam_r(LIBXL_QEMU_USER_BASE, pwd, buf, buf_size, user);
+if (user != NULL) {
+errno = 0;
+getpwuid_r(user-pw_uid + guest_domid, pwd, buf, buf_size, user);
+if (user != NULL)
+goto end_search;
+if (errno == ERANGE)
+goto retry;
+}
+if (errno == ERANGE)
+goto retry;
+
+errno = 0;
+getpwnam_r(LIBXL_QEMU_USER_SHARED, pwd, buf, buf_size, user);
+if (user != NULL) {
+LOG(WARN, Could not find user %s-%s or user %s (+domid %d), 
falling back to %s,
+LIBXL_QEMU_USER_PREFIX, c_info-name, LIBXL_QEMU_USER_BASE,
+guest_domid, LIBXL_QEMU_USER_SHARED);
+goto end_search;
+}
+if (errno == ERANGE)
+goto retry;
+
+
+LOG(WARN, Could not find user %s, starting QEMU as root, 
LIBXL_QEMU_USER_SHARED);
+
+end_search:
+if (user) {
+flexarray_append(dm_args, -runas);
+flexarray_append(dm_args, user-pw_name);
+}
 }
 flexarray_append(dm_args, NULL);
 return (char **) flexarray_contents(dm_args);
diff --git 

Re: [Xen-devel] [Draft C] Xen on ARM vITS Handling

2015-05-29 Thread Julien Grall
Hi Vijay,

On 27/05/15 17:44, Vijay Kilari wrote:
 ## Command Translation

 Of the existing GICv3 ITS commands, `MAPC`, `MAPD`, `MAPVI`/`MAPI` are
 potentially time consuming commands as these commands creates entry in
 the Xen ITS structures, which are used to validate other ITS commands.

 `INVALL` and `SYNC` are global and potentially disruptive to other
 guests and so need consideration.

 All other ITS command like `MOVI`, `DISCARD`, `INV`, `INT`, `CLEAR`
 just validate and generate physical command.

 ### `MAPC` command translation

 Format: `MAPC vCID, vTA`

-  The GITS_TYPER.PAtype is emulated as 0. Hence vTA is always represents
   vcpu number. Hence vTA is validated against physical Collection
 IDs by querying
   ITS driver and corresponding Physical Collection ID is retrieved.
-  Each vITS will have cid_map (struct cid_mapping) which holds mapping of

Why do you speak about each vITS? The emulation is only related to one
vITS and not shared...

   Virtual Collection ID(vCID), Virtual Target address(vTA) and
   Physical Collection ID (pCID).
   If vCID entry already exists in cid_map, then that particular
 mapping is updated with
   the new pCID and vTA else new entry is made in cid_map

When you move a collection, you also have to make sure that all the
interrupts associated to it will be delivered to the new target.

I'm not sure what you are suggesting for that...


-  MAPC pCID, pTA physical ITS command is generated

We should not send any MAPC command to the physical ITS. The collection
is already mapped during Xen boot and the guest should not be able to
move the physical collection (they are shared between all the guests and
Xen).


 
Here there is no overhead, the cid_map entries are preallocated
 with size of nr_cpus
in the platform.

As said the number of collection should be at least nr_cpus + 1.

 
 - `MAPC pCID, pTA` physical ITS command is generated

 ### `MAPD` Command translation

 Format: `MAPD device, Valid, ITT IPA, ITT Size`

 `MAPD` is sent with `Valid` bit set if device needs to be added and reset
 when device is removed.

 If `Valid` bit is set:

 - Allocate memory for `its_device` struct
 - Validate ITT IPA  ITT size and update its_device struct
 - Find number of vectors(nrvecs) for this device by querying PCI
   helper function
 - Allocate nrvecs number of LPI XXX nrvecs is a function of `ITT Size`?
 - Allocate memory for `struct vlpi_map` for this device. This
   `vlpi_map` holds mapping of Virtual LPI to Physical LPI and ID.
 - Find physical ITS node with which this device is associated
 - Call `p2m_lookup` on ITT IPA addr and get physical ITT address
 - Validate ITT Size
 - Generate/format physical ITS command: `MAPD, ITT PA, ITT Size`

 Here the overhead is with memory allocation for `its_device` and `vlpi_map`

 XXX Suggestion was to preallocate some of those at device passthrough
 setup time?
 
 If Validation bit is set:
- Query its_device tree and get its_device structure for this device.
- (XXX: If pci device is hidden from dom0, does this device is added
with PHYSDEVOP_pci_device_add hypercall?)
- If device does not exists return
- If device exists in RB-tree then
   - Validate ITT IPA  ITT size and update its_device struct

To validate the ITT size you need to know the number of interrupt ID.

   - Check if device is already assigned to the domain,
 if not then
- Find number of vectors(nrvecs) for this device.
- Allocate nrvecs number of LPI
- Fetch vlpi_map for this device (preallocated at the
 time of adding
  this device to Xen). This vlpi_map holds mapping of
 Virtual LPI to
  Physical LPI and ID.
- Call p2m_lookup on ITT IPA addr and get physical ITT address
- Assign this device to this domain and mark as enabled
   - If this device already exists with the domain (Domain is
 remapping the device)
- Validate ITT IPA  ITT size and update its_device struct
- Call p2m_lookup on ITT IPA addr and get physical ITT address
- Disable all the LPIs of this device by searching
 through vlpi_map and LPI
  configuration table

Disabling all the LPIs associated to a device can be time consuming
because you have to unroute them and make sure that the physical ITS
effectively disabled it before sending the MAPD command.

Given that the software would be buggy if it send a MAPD command without
releasing all the associated interrupt we could ignore the command if
any interrupt is still enabled.

 
   - Generate/format physical ITS command: MAPD, ITT PA, ITT Size
 

 If Validation bit is not set:

 - Validate if the device exits by checking vITS device list
 - Clear all `vlpis` assigned for this device
 - Remove this device from vITS list
 - Free memory

 XXX If preallocation presumably 

Re: [Xen-devel] [PATCH V4 7/8] xen/arm64: increase MAX_VIRT_CPUS to 128 on arm64

2015-05-29 Thread Julien Grall
Hi Chen,

On 28/05/15 11:15, Chen Baozi wrote:
 From: Chen Baozi baoz...@gmail.com
 
 GIC-500 supports up to 128 cores in a single SoC. Increase MAX_VIRT_CPUS
 to 128 on arm64.

This should be the last patch. You still have to modify domain_max_vcpus
before being able to increase the number of vCPUS handled.

 Signed-off-by: Chen Baozi baoz...@gmail.com

with the patch at the end of the series:

Acked-by: Julien Grall julien.gr...@citrix.com

 ---
  xen/arch/arm/vgic-v3.c   | 1 -
  xen/include/asm-arm/config.h | 4 
  2 files changed, 4 insertions(+), 1 deletion(-)
 
 diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
 index 0da031c..be5fff1 100644
 --- a/xen/arch/arm/vgic-v3.c
 +++ b/xen/arch/arm/vgic-v3.c
 @@ -889,7 +889,6 @@ static int vgic_v3_distr_mmio_write(struct vcpu *v, 
 mmio_info_t *info)
  rank = vgic_rank_offset(v, 64, gicd_reg - GICD_IROUTER,
  DABT_DOUBLE_WORD);
  if ( rank == NULL ) goto write_ignore;
 -BUG_ON(v-domain-max_vcpus  8);
  new_irouter = *r;
  vgic_lock_rank(v, rank, flags);
  
 diff --git a/xen/include/asm-arm/config.h b/xen/include/asm-arm/config.h
 index 3b23e05..817c216 100644
 --- a/xen/include/asm-arm/config.h
 +++ b/xen/include/asm-arm/config.h
 @@ -47,7 +47,11 @@
  #define NR_CPUS 128
  #endif
  
 +#ifdef CONFIG_ARM_64
 +#define MAX_VIRT_CPUS 128
 +#else
  #define MAX_VIRT_CPUS 8
 +#endif
  
  #define asmlinkage /* Nothing needed */
  
 

Regards,

-- 
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] ACPI shutdown unreliable with win7?

2015-05-29 Thread Ian Campbell
On Fri, 2015-05-29 at 15:31 +0100, Andrew Cooper wrote:
 On 29/05/15 15:00, Ian Campbell wrote:
  On Fri, 2015-05-29 at 14:28 +0100, Jan Beulich wrote:
  On 29.05.15 at 15:14, andrew.coop...@citrix.com wrote:
  On 29/05/15 14:04, Jan Beulich wrote:
  On 29.05.15 at 14:54, ian.campb...@citrix.com wrote:
  On Fri, 2015-05-22 at 10:08 +0100, Ian Campbell wrote:
  If win7 doesn't shutdown given a power button request I'd be more
  inclined to remove the setting in osstest for those flights and let
  guest-stop go back to being never pass than to start making such 
  changes
  to the VM config which I think would probably break the preceding
  suspend and migration tests (which aren't completely reliable, but are
  far more so than this shutdown one).
  Does anyone have any ideas here or shall I propose:
  Unless we have a way to make an adjustment inside the guest for the
  power button to gain shutdown meaning, I think there's no alternative
  to the change below.
  You can avoid advertising S3/S4 in the ACPI tables, which iirc causes
  the same alteration to happen.
 
  Hvmloader uses the platform/acpi_s{3,4} booleans to control whether the
  relevant SSDTs are exposed.
  Which libxl even has settings for. That would perhaps be a better first
  try than disabling ACPI shutdown.
  I've mentioned it at least twice now but nobody seems to think it is of
  note that even with the current settings it does the right thing about 1
  time in 10? Is Win7 really expected to behave so randomly here? 
 
 Windows is far from bug free, and also a black box.  I really cannot
 answer whether this is intended behaviour or not.
 
 
  Also, when the test fails the guest is also not hibernating either.
 
  Plus I've queried the impact of change the ACPI s3/s4 settings on the
  save/restore/migrate tests in the flight more than once and nobody has
  responded to that either.
 
 save/restore/migrate necessarily need PV drivers inside the guest, and
 installing PV drivers should set the sane defaults inside the guest.
 
 What does OSSTest do with PV drivers?

Nothing AFAIK, unless they are baked into the ISOs we take from the
XenRT folks (which I think is not the case).

Yet our s/r/m tests of these guests do pass (mostly), I believe because
they end up leveraging HVM s3 but I might be wrong.

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v7 05/10] xsm: add XEN_DOMCTL_soft_reset support

2015-05-29 Thread Jan Beulich
 On 27.05.15 at 17:25, vkuzn...@redhat.com wrote:
 +static XSM_INLINE int xsm_soft_reset(XSM_DEFAULT_ARG struct domain *d1,
 +  struct domain *d2)

Hard tabs used here ...

 +static inline int xsm_soft_reset (xsm_default_t def, struct domain *d1,
 +   struct domain *d2)

... and here.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 for Xen 4.6 0/4] Enabling XL to set per-VCPU parameters of a domain for RTDS scheduler

2015-05-29 Thread Meng Xu
2015-05-29 4:15 GMT-07:00 Dario Faggioli dario.faggi...@citrix.com:

 On Tue, 2015-05-26 at 09:59 +0100, Ian Campbell wrote:
  On Mon, 2015-05-25 at 18:59 -0500, Chong Li wrote:
 
  This series arrived in my mailbox as 5 distinct mails.
 
  Please use git send-email such that the mails arrive as a single email
  thread (i.e. each mail as a reply to the previous or to the 0th mail) or
  arrange for the same thing by hand (I highly recommend using git
  send-email though)
 
 Indeed.

 BTW, Chong, v1 was threaded ok, so maybe did something different this
 time when sending the patches. If yes, just don't! :-)



​I think that's because Chong wanted to send different patches in this
patch set to different maintainers. (For example, we don't want to spam
Jan's folder with xl, libxl patch. :-) )

Is it ok to use git send-email --reply-to to attach all four patches to
the cover letter (that is this email thread) of the patch set?


Thanks,

Meng


---
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania
http://www.cis.upenn.edu/~mengxu/
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3] run QEMU as non-root

2015-05-29 Thread Ian Campbell
On Fri, 2015-05-29 at 14:47 +0100, Stefano Stabellini wrote:
 Try to use xen-qemudepriv-$domname first, then xen-qemudepriv-base +
 domid, finally xen-qemudepriv-shared and root if everything else fails.
 
 The uids need to be manually created by the user or, more likely, by the
 xen package maintainer.
 
 To actually secure QEMU when running in Dom0, we need at least to
 deprivilege the privcmd and xenstore interfaces, this is just the first
 step in that direction.
 
 Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com
 
 ---
 Changes in v3:
 - clarify doc
 - handle errno == ERANGE
 ---
  docs/misc/qemu-deprivilege   |   36 +
  tools/libxl/libxl_dm.c   |   60 
 ++
  tools/libxl/libxl_internal.h |4 +++
  3 files changed, 100 insertions(+)
  create mode 100644 docs/misc/qemu-deprivilege
 
 diff --git a/docs/misc/qemu-deprivilege b/docs/misc/qemu-deprivilege
 new file mode 100644
 index 000..3a61867
 --- /dev/null
 +++ b/docs/misc/qemu-deprivilege

Could you name this with a .txt or even a .markdown or .pandoc please.

I think you should also add a reference to this to the toplevel INSTALL
file, so there is some hope of someone seeing it.

And I presume you will update the relevant wikipages (e.g. the
installing from source one) once this change lands?

 +2) a user named xen-qemudepriv-base, adding domid to its uid
 +This requires the reservation of 65536 uids from the uid of
 +xen-qemudepriv-base to uid+65535.  For example, if xen-qemudepriv-base
 +has uid 6000, and the domid is 25, libxl will try to use uid 6025. To
 +use this mechanism, you might want to create a large number of users at
 +installation time. For example:
 +
 +adduser --system xen-qemudepriv-base
 +for i in '' $(seq 1 65335)
 +do
 +adduser --system xen-qemudepriv-base$i
 +done

I'm not sure that adduser is necessarily guaranteed to create users
sequentially, even if it might do so most of the time. Did you check
this?

Perhaps rather than doing base + domid we should just lookup the user
xen-qemudepriv-domidNNN and document creating a user for every domid?
The advantage with that is we don't need to figure out how to document
the creation of 65K _consecutive_ users.

 @@ -439,6 +441,9 @@ static char ** 
 libxl__build_device_model_args_new(libxl__gc *gc,
  int i, connection, devid;
  uint64_t ram_size;
  const char *path, *chardev;
 +struct passwd pwd, *user = NULL;
 +char *buf = NULL;
 +long buf_size;
  
  dm_args = flexarray_make(gc, 16, 1);
  
 @@ -878,6 +883,61 @@ static char ** 
 libxl__build_device_model_args_new(libxl__gc *gc,
  default:
  break;
  }
 +
 +buf_size = sysconf(_SC_GETPW_R_SIZE_MAX);
 +if (buf_size  0) {
 +LOGE(ERROR, sysconf(_SC_GETPW_R_SIZE_MAX) returned error %ld, 
 buf_size);
 +goto end_search;
 +}
 +errno = 0;
 +
 +retry:
 +if (errno == ERANGE)
 +buf_size += 512;
 +buf = libxl__realloc(gc, buf, buf_size);
 +if (c_info-name) {
 +errno = 0;
 +getpwnam_r(libxl__sprintf(gc, %s-%s,
 +LIBXL_QEMU_USER_PREFIX, c_info-name),
 +pwd, buf, buf_size, user);
 +if (user != NULL)
 +goto end_search;
 +if (errno == ERANGE)
 +goto retry;
 +}
 +
 +errno = 0;
 +getpwnam_r(LIBXL_QEMU_USER_BASE, pwd, buf, buf_size, user);
 +if (user != NULL) {
 +errno = 0;
 +getpwuid_r(user-pw_uid + guest_domid, pwd, buf, buf_size, 
 user);
 +if (user != NULL)
 +goto end_search;
 +if (errno == ERANGE)
 +goto retry;
 +}
 +if (errno == ERANGE)
 +goto retry;
 +
 +errno = 0;
 +getpwnam_r(LIBXL_QEMU_USER_SHARED, pwd, buf, buf_size, user);
 +if (user != NULL) {
 +LOG(WARN, Could not find user %s-%s or user %s (+domid %d), 
 falling back to %s,
 +LIBXL_QEMU_USER_PREFIX, c_info-name, 
 LIBXL_QEMU_USER_BASE,
 +guest_domid, LIBXL_QEMU_USER_SHARED);
 +goto end_search;
 +}
 +if (errno == ERANGE)
 +goto retry;

This is all rather repetitive, please add a helper which takes care of
allocating the buffer, retrying and logging on fail.

I don't think you need to worry about making all three attempts use the
same buffer, so the helper doesn't need to have the complexity of doing
that.

That would also let you avoid repeating all three searches when the last
one returns ERANGE.

 +
 +
 +LOG(WARN, Could not find user %s, starting QEMU as root, 
 LIBXL_QEMU_USER_SHARED);
 +
 +end_search:
 +if (user) {
 +flexarray_append(dm_args, -runas);
 +flexarray_append(dm_args, user-pw_name);
 +}
  }
  

Re: [Xen-devel] ACPI shutdown unreliable with win7?

2015-05-29 Thread Paul Durrant
 -Original Message-
 From: Paul Durrant
 Sent: 29 May 2015 16:07
 To: Ian Campbell
 Cc: Jan Beulich; Andrew Cooper; Ian Jackson; xen-devel
 Subject: RE: [Xen-devel] ACPI shutdown unreliable with win7?
 
  -Original Message-
  From: Ian Campbell [mailto:ian.campb...@citrix.com]
  Sent: 29 May 2015 15:36
  To: Paul Durrant
  Cc: Jan Beulich; Andrew Cooper; Ian Jackson; xen-devel
  Subject: Re: [Xen-devel] ACPI shutdown unreliable with win7?
 
  On Fri, 2015-05-29 at 15:25 +0100, Paul Durrant wrote:
-Original Message-
From: Ian Campbell [mailto:ian.campb...@citrix.com]
Sent: 29 May 2015 14:14
To: Jan Beulich
Cc: Andrew Cooper; Paul Durrant; Ian Jackson; xen-devel
Subject: Re: [Xen-devel] ACPI shutdown unreliable with win7?
   
On Fri, 2015-05-29 at 14:04 +0100, Jan Beulich wrote:
  On 29.05.15 at 14:54, ian.campb...@citrix.com wrote:
  On Fri, 2015-05-22 at 10:08 +0100, Ian Campbell wrote:
  If win7 doesn't shutdown given a power button request I'd be
 more
  inclined to remove the setting in osstest for those flights and let
  guest-stop go back to being never pass than to start making such
changes
  to the VM config which I think would probably break the preceding
  suspend and migration tests (which aren't completely reliable, but
  are
  far more so than this shutdown one).
 
  Does anyone have any ideas here or shall I propose:

 Unless we have a way to make an adjustment inside the guest for the
 power button to gain shutdown meaning, I think there's no
  alternative
 to the change below.
   
The strange this is that it does work _sometimes_, either by complete
coincidence or because there is something non-deterministic about
 how
Win7 reacts to this ACPI event.
   
  
   How long is the test waiting for the OS to shut down though? If you
   get unlucky, Windows will wander off to Windows Update, download a
   bazillion patches and take more than an hour to shut down. If you’re
   lucky, it may shut down in 10 seconds or less.
 
  The screenshot in e.g.
  http://logs.test-lab.xenproject.org/osstest/logs/56929/test-amd64-
 amd64-
  xl-qemut-win7-amd64/info.html
  seems to show that the guest isn't even trying to shut down, it's just
  sat there at the desktop:
 
  http://logs.test-lab.xenproject.org/osstest/logs/56929/test-amd64-
 amd64-
  xl-qemut-win7-amd64/win.guest.osstest--vnc.jpeg
 
  I think if it had hit WU there would be activity on the screen?
 
  FWIW We appear to wait 200s, if we were seeing failures due to windows
  update then I'd be inclined to extend that, but I think right now that
  would be premature, unless WU happens with no status on the screen.
 
 
 No, you'd see something. Perhaps our ACPI lid/power switch code is just
 buggy then?
 

I already said this to Ian, but for the record... From my reading of the ACPI 
spec (v 5.0, page 23 glossary entry) the SCI is an active low, shareable, 
level-sensitive interrupt, but our code (pmtimer.c:pmt_update_sci) seems to 
treat it as active high. I'll have a look at the QEMU SCI code as reference, 
but maybe it is just broken.

  Paul

   Paul
 
  Ian

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] ACPI shutdown unreliable with win7?

2015-05-29 Thread Paul Durrant
 -Original Message-
 From: Ross Philipson [mailto:ross.philip...@gmail.com]
 Sent: 29 May 2015 16:35
 To: Ian Campbell; Paul Durrant
 Cc: Andrew Cooper; xen-devel; Jan Beulich; Ian Jackson
 Subject: Re: [Xen-devel] ACPI shutdown unreliable with win7?
 
 On 05/29/2015 11:11 AM, Ian Campbell wrote:
  On Fri, 2015-05-29 at 16:06 +0100, Paul Durrant wrote:
  FWIW We appear to wait 200s, if we were seeing failures due to
 windows
  update then I'd be inclined to extend that, but I think right now that
  would be premature, unless WU happens with no status on the screen.
 
 
  No, you'd see something. Perhaps our ACPI lid/power switch code is just
 buggy then?
 
  It seems to work reliably for the WinXP tests, FWIW...
 
  Ian.
 
 
 One thing I find confusing is that the firmware code does not even have
 a power button device (PNP0C0C) or the fixed feature power button that
 is enabled in the FADT (flag == PWR_BUTTON bit 4). So I don't see how
 the shutdown is purely an ACPI function. Is there something else to the
 story? Is it relying on PV tools to do it?
 

Xen (and QEMU seemingly) implement the 'Fixed Power Button' (section 
4.8.2.2.1.1 on my spec) and this requires the PWR_BUTTON flag to be clear 
(according table 4-13). It also does not require a power button device to be 
implemented (which is presumably why this way of doing it was chosen).

  Paul

 Ross
 
 
 
  ___
  Xen-devel mailing list
  Xen-devel@lists.xen.org
  http://lists.xen.org/xen-devel
 
 
 
 --
 Ross Philipson

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] ACPI shutdown unreliable with win7?

2015-05-29 Thread Ross Philipson

On 05/29/2015 12:00 PM, Paul Durrant wrote:

-Original Message-
From: Ross Philipson [mailto:ross.philip...@gmail.com]
Sent: 29 May 2015 16:35
To: Ian Campbell; Paul Durrant
Cc: Andrew Cooper; xen-devel; Jan Beulich; Ian Jackson
Subject: Re: [Xen-devel] ACPI shutdown unreliable with win7?

On 05/29/2015 11:11 AM, Ian Campbell wrote:

On Fri, 2015-05-29 at 16:06 +0100, Paul Durrant wrote:

FWIW We appear to wait 200s, if we were seeing failures due to

windows

update then I'd be inclined to extend that, but I think right now that
would be premature, unless WU happens with no status on the screen.



No, you'd see something. Perhaps our ACPI lid/power switch code is just

buggy then?


It seems to work reliably for the WinXP tests, FWIW...

Ian.



One thing I find confusing is that the firmware code does not even have
a power button device (PNP0C0C) or the fixed feature power button that
is enabled in the FADT (flag == PWR_BUTTON bit 4). So I don't see how
the shutdown is purely an ACPI function. Is there something else to the
story? Is it relying on PV tools to do it?



Xen (and QEMU seemingly) implement the 'Fixed Power Button' (section 
4.8.2.2.1.1 on my spec) and this requires the PWR_BUTTON flag to be clear 
(according table 4-13). It also does not require a power button device to be 
implemented (which is presumably why this way of doing it was chosen).


Ah got it. I read the spec backwards - that the bit was set not clear 
for the fixed feature power button :)





   Paul


Ross




___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel




--
Ross Philipson



--
Ross Philipson

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH V4 7/8] xen/arm64: increase MAX_VIRT_CPUS to 128 on arm64

2015-05-29 Thread Andrew Cooper
On 29/05/15 17:45, Julien Grall wrote:
 On 29/05/15 17:41, Andrew Cooper wrote:
 On 29/05/15 17:20, Julien Grall wrote:
 On 29/05/15 16:51, Julien Grall wrote:
 diff --git a/xen/include/asm-arm/config.h b/xen/include/asm-arm/config.h
 index 3b23e05..817c216 100644
 --- a/xen/include/asm-arm/config.h
 +++ b/xen/include/asm-arm/config.h
 @@ -47,7 +47,11 @@
  #define NR_CPUS 128
  #endif
  
 +#ifdef CONFIG_ARM_64
 +#define MAX_VIRT_CPUS 128
 +#else
  #define MAX_VIRT_CPUS 8
 +#endif
 Looking to the last patch, the usage of MAX_VIRT_CPUS is now minimal.
 Can't finish to replace MAX_VIRT_CPUS to another corresponding value and
 drop the define?
 You cant drop MAX_VIRT_CPUS (I tried this when introducing
 max_domain_vcpus()). It is used for some conditional compilation in
 common code.
 AFAICT only in the event channel code to avoid allocating memory when
 less than 64 vCPU is used.

Correct.


 Anyway, if we can drop it. I would add a check in domain_max_vcpus for
 safety.

It has a second use currently for auditing the dom0_max_vcpus parameter
(for both x86 and ARM), but could easily be replaced with a
max_domain_vcpus() call.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH V4 5/8] tools/libxl: Set 'reg' of cpu node equal to MPIDR affinity for domU

2015-05-29 Thread Julien Grall
On 29/05/15 16:55, Ian Campbell wrote:
 On Fri, 2015-05-29 at 16:44 +0100, Julien Grall wrote:
 
 +name = GCSPRINTF(cpu@%lx, mpidr_aff);

 It's not necessary to change the cpu@.
 
 AIUI it is conventional in DT for this to match the first reg entry.

Well, it's conventional when the reg field is containing an address.

It's not the case of the cpu node as reg doesn't contain an address.
There is a mix of both in the different DTS.

Although, increment an ID make easier to read the dumped DTS for debugging.

Regards,

-- 
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [xen-unstable test] 56759: regressions - FAIL

2015-05-29 Thread Ian Campbell
On Wed, 2015-05-27 at 17:04 +0100, Ian Campbell wrote:
 Looking at the netback side though it seems like netback_remove is
 switching to state=Closed _before_ it calls kobject_uevent(...,
 KOBJ_OFFLINE) and it is this which generates the call to netback_uevent
 which tries and fails to read script and produces the error message.

I've just sent out a patch which fixes this issue, although I am still
at a loss to explain why we have only started seeing this now and only
under such specific circumstances.

 I'm still slightly concerned that perhaps the new spinlock stuff has
 some sort of bad behaviour either on arndale specifically or more
 generally for ARM systems which has pushed this particular case over the
 edge.

I did run some benchmarks (hackbench+fio on arndale domU and hackbench
on midway dom0) with and without the ticket locks and the results were
close enough that I'm basically not too worried that there is something
wrong with the ticket locks on ARM.

It still niggles somewhat not to have a good theory about why this
change had this seemingly random effect, but I've not got any good ideas
for avenues to explore and I've got other things to do so I think I'll
leave it at that.

Ian.



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] ACPI shutdown unreliable with win7?

2015-05-29 Thread Ross Philipson

On 05/29/2015 11:11 AM, Ian Campbell wrote:

On Fri, 2015-05-29 at 16:06 +0100, Paul Durrant wrote:

FWIW We appear to wait 200s, if we were seeing failures due to windows
update then I'd be inclined to extend that, but I think right now that
would be premature, unless WU happens with no status on the screen.



No, you'd see something. Perhaps our ACPI lid/power switch code is just buggy 
then?


It seems to work reliably for the WinXP tests, FWIW...

Ian.



One thing I find confusing is that the firmware code does not even have 
a power button device (PNP0C0C) or the fixed feature power button that 
is enabled in the FADT (flag == PWR_BUTTON bit 4). So I don't see how 
the shutdown is purely an ACPI function. Is there something else to the 
story? Is it relying on PV tools to do it?


Ross




___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel




--
Ross Philipson

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH] xen: netback: fix error printf format string.

2015-05-29 Thread Ian Campbell
drivers/net/xen-netback/netback.c: In function ‘xenvif_tx_build_gops’:
drivers/net/xen-netback/netback.c:1253:8: warning: format ‘%lu’ expects 
argument of type ‘long unsigned int’, but argument 5 has type ‘int’ [-Wformat=]
(txreq.offset~PAGE_MASK) + txreq.size);
^

txreq.offset and .size are uint16_t fields.

Signed-off-by: Ian Campbell ian.campb...@citrix.com
---
 drivers/net/xen-netback/netback.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/xen-netback/netback.c 
b/drivers/net/xen-netback/netback.c
index 4de46aa..a3b1cbb 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -1248,7 +1248,7 @@ static void xenvif_tx_build_gops(struct xenvif_queue 
*queue,
/* No crossing a page as the payload mustn't fragment. */
if (unlikely((txreq.offset + txreq.size)  PAGE_SIZE)) {
netdev_err(queue-vif-dev,
-  txreq.offset: %x, size: %u, end: %lu\n,
+  txreq.offset: %x, size: %u, end: %u\n,
   txreq.offset, txreq.size,
   (txreq.offset~PAGE_MASK) + txreq.size);
xenvif_fatal_tx_err(queue-vif);
-- 
1.7.10.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


  1   2   >