Re: [Xen-devel] [PATCH] build: disable default built-in rules and variables

2015-12-01 Thread Andrew Cooper
On 01/12/2015 21:19, Doug Goldstein wrote:
> Disable the built-in rules and variables from GNU make to improve
> build performance and avoid awkward corner cases with the built-in
> rules. Currently none of the implicit rules are used but this is helpful
> to do when developing changes to the build system.
>
> Signed-off-by: Doug Goldstein 

Reviewed-by: Andrew Cooper 

Trying this out, it reliably drops Xen's build time from ~58s to ~52s on
the dev box I have to hand.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] xen/pvhvm: Support more than 32 VCPUs when migrating (v3).

2015-12-01 Thread Konrad Rzeszutek Wilk
On Fri, Nov 13, 2015 at 02:49:22PM +0000, Ian Campbell wrote:
> On Fri, 2015-11-13 at 09:42 -0500, Konrad Rzeszutek Wilk wrote:
> > On Thu, Nov 12, 2015 at 04:40:06PM +0000, Ian Campbell wrote:
> > > On Fri, 2015-07-10 at 14:57 -0400, Konrad Rzeszutek Wilk wrote:
> > > > On Fri, Jul 10, 2015 at 02:37:46PM -0400, Konrad Rzeszutek Wilk
> > > > wrote:
> > > > > When Xen migrates an HVM guest, by default its shared_info can
> > > > > only hold up to 32 CPUs. As such the hypercall
> > > > > VCPUOP_register_vcpu_info was introduced which allowed us to
> > > > > setup per-page areas for VCPUs. This means we can boot PVHVM
> > > > > guest with more than 32 VCPUs. During migration the per-cpu
> > > > > structure is allocated freshly by the hypervisor (vcpu_info_mfn
> > > > > is set to INVALID_MFN) so that the newly migrated guest
> > > > > > can make a VCPUOP_register_vcpu_info hypercall.
> > > > > > 
> > > > > > Unfortunately we end up triggering this condition in Xen:
> > > > > > /* Run this command on yourself or on other offline VCPUS. */
> > > > > >  if ( (v != current) && !test_bit(_VPF_down, &v->pause_flags) )
> > > > > 
> > > > > which means we are unable to setup the per-cpu VCPU structures
> > > > > for running vCPUS. The Linux PV code paths make this work by
> > > > > iterating over every vCPU with:
> > > > > 
> > > > >  1) is target CPU up (VCPUOP_is_up hypercall?)
> > > > >  2) if yes, then VCPUOP_down to pause it.
> > > > >  3) VCPUOP_register_vcpu_info
> > > > >  4) if it was down, then VCPUOP_up to bring it back up
> > > > > 
> > > > > But since VCPUOP_down, VCPUOP_is_up, and VCPUOP_up are
> > > > > not allowed on HVM guests we can't do this. However with the
> > > > > Xen git commit f80c5623a126afc31e6bb9382268d579f0324a7a
> > > > > ("xen/x86: allow HVM guests to use hypercalls to bring up vCPUs"")
> > > > 
> > > >  I was in my local tree which was Roger's 'hvm_without_dm_v3'
> > > > looking at patches and spotted this - and thought it was already in!
> > > > 
> > > > Sorry about this patch - and please ignore it until the VCPU_op*
> > > > can be used by HVM guests.
> > > 
> > > FYI I just tripped over this while implementing ARM save/restore (in
> > > that I
> > > couldn't figure out HTF HVM VCPUs > MAX_LEGACY_VCPUS were getting their
> > > vcpu_info re-registered, which turns out to be because they aren't...).
> > > 
> > > ARM also lack the VCPU_up/down/is_up hypercalls. My plan there is
> > > simply to
> > > use on_each_cpu to do it, I can get away with this on ARM because the
> > > necessary infra (IPIs etc) are provided by the h/w virt platform (i.e.
> > > look
> > > native) so there is no reliance on Xen infra being fully up.
> > > 
> > > Not sure if that is also true of x86/PVHVM but thought I would mention
> > > it
> > > in case it seemed preferable to you.
> > 
> > Yes, but we have a hard-limit of 32 CPUs on 'HVM' guests and on the
> > shared_info structure. Hence to go above that we need to use the VCPU_op
> > calls.
> 
> I think you mean s/HVM/Linux PVHVM/? Because HVM_MAX_VCPUS in the
> hypervisor is 128.

Thank you. I confused the CPU support with event channels support.

The issue was the shared page and the events - the shared
page by default only has enough slots for 32 CPUs. Anything above that
and we need to use the fancy VCPUOP hypercalls. And that is exactly
what we end up doing during bootup. But if we do save/restore we get to
the issue that only one hypercall is allowed on HVM: VCPUOP_register_vcpu_info

And that hypercall can only be called for vCPUs which are offline,
or by the vCPU itself. However the restore path brings up
every CPU from CPU0. We could send an IPI to the other CPU such that
it would call VCPUOP_register_vcpu_info, but IPIs use events and
events do not work past 32 CPUs unless you call
VCPUOP_register_vcpu_info.

> 
> Ian.
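
For readers following the thread, the four-step PV sequence quoted above can be sketched as a small stand-alone C program. This is only a sketch under stated assumptions: the mock_* functions and the two state arrays are hypothetical stand-ins for the VCPUOP_is_up, VCPUOP_down, VCPUOP_up and VCPUOP_register_vcpu_info hypercalls, and NR_VCPUS is an illustrative value. It is not code from Linux or Xen.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_VCPUS 64u /* illustrative: more than the 32 shared_info slots */

/* Mock hypervisor state; every mock_* name here is hypothetical. */
static bool vcpu_online[NR_VCPUS];
static bool vcpu_info_registered[NR_VCPUS];

/* Model a freshly restored guest: all vCPUs running, no vcpu_info registered. */
static void mock_boot_all(void)
{
    for (unsigned int cpu = 0; cpu < NR_VCPUS; cpu++) {
        vcpu_online[cpu] = true;
        vcpu_info_registered[cpu] = false;
    }
}

static bool mock_vcpu_is_up(unsigned int cpu) { return vcpu_online[cpu]; }
static void mock_vcpu_down(unsigned int cpu)  { vcpu_online[cpu] = false; }
static void mock_vcpu_up(unsigned int cpu)    { vcpu_online[cpu] = true; }

/* Mirrors the quoted Xen check: allowed only on the caller or an offline vCPU. */
static int mock_register_vcpu_info(unsigned int caller, unsigned int target)
{
    if (target != caller && vcpu_online[target])
        return -1;
    vcpu_info_registered[target] = true;
    return 0;
}

/* Steps 1-4 from the quoted list, driven from a single vCPU. */
static int reregister_all(unsigned int caller)
{
    int failures = 0;

    for (unsigned int cpu = 0; cpu < NR_VCPUS; cpu++) {
        bool was_up = (cpu != caller) && mock_vcpu_is_up(cpu);

        if (was_up)
            mock_vcpu_down(cpu);          /* step 2 */
        if (mock_register_vcpu_info(caller, cpu) != 0)
            failures++;                   /* step 3 */
        if (was_up)
            mock_vcpu_up(cpu);            /* step 4 */
    }
    return failures;
}

static unsigned int count_registered(void)
{
    unsigned int n = 0;

    for (unsigned int cpu = 0; cpu < NR_VCPUS; cpu++)
        n += vcpu_info_registered[cpu];
    return n;
}
```

In this model a direct registration of a running remote vCPU is refused, while the down/register/up sequence succeeds for all of them. That is the part an HVM guest cannot do as long as the up/down/is_up operations are unavailable to it.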



Re: [Xen-devel] [V2 PATCH 1/9] x86/hvm: pkeys, add pkeys support for cpuid handling

2015-12-01 Thread Andrew Cooper
On 27/11/15 09:51, Huaitong Han wrote:
> diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c
> index e60929d..84d3a10 100644
> --- a/xen/arch/x86/cpu/common.c
> +++ b/xen/arch/x86/cpu/common.c
> @@ -264,8 +264,9 @@ static void __cpuinit generic_identify(struct cpuinfo_x86 
> *c)
>   /* Intel-defined flags: level 0x00000007 */
>   if ( c->cpuid_level >= 0x00000007 )
>   cpuid_count(0x00000007, 0, &tmp,
> -     &c->x86_capability[cpufeat_word(X86_FEATURE_FSGSBASE)],
> -     &tmp, &tmp);
> +&c->x86_capability[cpufeat_word(X86_FEATURE_FSGSBASE)],
> +&c->x86_capability[cpufeat_word(X86_FEATURE_PKU)],
> +&tmp);

You have some indentation issues here.

>  }
>  
>  /*
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index ea982e2..0adafe9 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -4582,6 +4582,18 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, 
> unsigned int *ebx,
>  /* Don't expose INVPCID to non-hap hvm. */
>  if ( (count == 0) && !hap_enabled(d) )
>  *ebx &= ~cpufeat_mask(X86_FEATURE_INVPCID);
> +
> +/* X86_FEATURE_PKU is not yet implemented for shadow paging
> + *
> + * Hypervisor gets guest pkru value from XSAVE state, because
> + * Hypervisor CR4 without X86_CR4_PKE disables RDPKRU instruction.
> + */
> +if ( (count == 0) && (!hap_enabled(d) || !cpu_has_xsave) )
> +*ecx &= ~cpufeat_mask(X86_FEATURE_PKU);
> +
> +if ( (count == 0) && cpu_has_pku )
> +*ecx |= (v->arch.hvm_vcpu.guest_cr[4] & X86_CR4_PKE) ?
> + cpufeat_mask(X86_FEATURE_OSPKE) : 0;

This is still buggy.  cpu_has_pku has no relevance to whether OSPKE
becomes visible.

Visibility of OSPKE is determined solely by v->arch.hvm_vcpu.guest_cr[4]
& X86_CR4_PKE and nothing else.

~Andrew
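
A minimal sketch of the rule stated in the review, assuming the architectural bit positions (CR4.PKE is bit 22, CPUID leaf 7 subleaf 0 ECX bit 3 is PKU and bit 4 is OSPKE). The function name adjust_leaf7_ecx and its pku_exposed parameter are hypothetical; pku_exposed stands in for whatever policy check the patch ends up using, and this is not the Xen implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define X86_CR4_PKE       (1u << 22) /* CR4.PKE */
#define CPUID7_ECX_PKU    (1u << 3)  /* CPUID.(EAX=7,ECX=0):ECX.PKU */
#define CPUID7_ECX_OSPKE  (1u << 4)  /* CPUID.(EAX=7,ECX=0):ECX.OSPKE */

/*
 * Hypothetical helper: filter the guest-visible leaf 7 ECX value.
 * PKU visibility depends on policy (pku_exposed); OSPKE mirrors the
 * guest's own CR4.PKE bit and depends on nothing else.
 */
static uint32_t adjust_leaf7_ecx(uint32_t ecx, uint32_t guest_cr4,
                                 bool pku_exposed)
{
    if (!pku_exposed)
        ecx &= ~CPUID7_ECX_PKU;

    ecx &= ~CPUID7_ECX_OSPKE;
    if (guest_cr4 & X86_CR4_PKE)
        ecx |= CPUID7_ECX_OSPKE;

    return ecx;
}
```

Clearing OSPKE first and then setting it from CR4.PKE means a stale host value can never leak through to the guest.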



Re: [Xen-devel] [V2 PATCH 5/9] x86/hvm: pkeys, add functions to get pkeys value from PTE

2015-12-01 Thread Andrew Cooper
On 27/11/15 09:51, Huaitong Han wrote:
> diff --git a/xen/include/asm-x86/x86_64/page.h 
> b/xen/include/asm-x86/x86_64/page.h
> index 19ab4d0..49343ec 100644
> --- a/xen/include/asm-x86/x86_64/page.h
> +++ b/xen/include/asm-x86/x86_64/page.h
> @@ -134,6 +134,25 @@ typedef l4_pgentry_t root_pgentry_t;
>  #define get_pte_flags(x) (((int)((x) >> 40) & ~0xFFF) | ((int)(x) & 0xFFF))
>  #define put_pte_flags(x) (((intpte_t)((x) & ~0xFFF) << 40) | ((x) & 0xFFF))
>  
> +/*
> + * Protection keys define a new 4-bit protection key field
> + * (PKEY) in bits 62:59 of leaf entries of the page tables.
> + * This corresponds to bit 22:19 of a 24-bit flags.
> + *
> + * Notice: Bit 22 is used by _PAGE_GNTTAB which is visible to PV guests,
> + * so Protection keys must be disabled on PV guests.
> + */
> +#define _PAGE_PKEY_BIT0 (1u<<19)   /* Protection Keys, bit 1/4 */
> +#define _PAGE_PKEY_BIT1 (1u<<20)   /* Protection Keys, bit 2/4 */
> +#define _PAGE_PKEY_BIT2 (1u<<21)   /* Protection Keys, bit 3/4 */
> +#define _PAGE_PKEY_BIT3 (1u<<22)   /* Protection Keys, bit 4/4 */
> +
> +/* The order of mask _PAGE_PKEY_BIT0 is 19 */
> +#define get_pte_pkeys(x) ((int)(get_pte_flags(x) >> 19) & 0xF)
> +
> +/* Take pkey first bit as pkey feature */
> +#define _PAGE_PKEY_BIT _PAGE_PKEY_BIT0

You have not addressed Jan's feedback from v1.

~Andrew
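
To make the bit arithmetic in the quoted comment concrete, here is a small stand-alone check, offered as a sketch. The two macros are copied from the quoted header; the helpers pkey_via_flags and pkey_direct are hypothetical names added for illustration. It shows that PTE bits 62:59 land in bits 22:19 of the 24-bit flags value, and that PTE bit 62 maps to flag bit 22, the bit the quoted comment says is used by _PAGE_GNTTAB.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t intpte_t;

/* Copied from the quoted xen/include/asm-x86/x86_64/page.h context. */
#define get_pte_flags(x) (((int)((x) >> 40) & ~0xFFF) | ((int)(x) & 0xFFF))
#define put_pte_flags(x) (((intpte_t)((x) & ~0xFFF) << 40) | ((x) & 0xFFF))

/* Hypothetical helper: extract the key through the 24-bit flags view. */
static unsigned int pkey_via_flags(intpte_t pte)
{
    return ((unsigned int)get_pte_flags(pte) >> 19) & 0xF;
}

/* Hypothetical helper: extract the key straight from PTE bits 62:59. */
static unsigned int pkey_direct(intpte_t pte)
{
    return (unsigned int)(pte >> 59) & 0xF;
}
```

Both helpers return the same value for any PTE, so a macro working on the raw PTE would be an equally valid way to express get_pte_pkeys.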



Re: [Xen-devel] [V2 PATCH 7/9] x86/hvm: pkeys, add pkeys support for guest_walk_tables

2015-12-01 Thread Andrew Cooper
On 27/11/15 09:52, Huaitong Han wrote:
> This patch adds pkeys support for guest_walk_tables.
>
> Signed-off-by: Huaitong Han 

You must CC the appropriate maintainer for this patch, which includes
the x86 MM maintainer.

> ---
>  xen/arch/x86/mm/guest_walk.c  | 65 
> +++
>  xen/include/asm-x86/hvm/hvm.h |  2 ++
>  2 files changed, 67 insertions(+)
>
> diff --git a/xen/arch/x86/mm/guest_walk.c b/xen/arch/x86/mm/guest_walk.c
> index 18d1acf..3e443b3 100644
> --- a/xen/arch/x86/mm/guest_walk.c
> +++ b/xen/arch/x86/mm/guest_walk.c
> @@ -31,6 +31,8 @@ asm(".file \"" __OBJECT_FILE__ "\"");
>  #include 
>  #include 
>  #include 
> +#include 
> +#include 

I can see why you need xstate.h, but why do you need i387.h?

>  
>  extern const uint32_t gw_page_flags[];
>  #if GUEST_PAGING_LEVELS == CONFIG_PAGING_LEVELS
> @@ -90,6 +92,53 @@ static uint32_t set_ad_bits(void *guest_p, void *walk_p, 
> int set_dirty)
>  return 0;
>  }
>  
> +#if GUEST_PAGING_LEVELS >= 4
> +uint32_t leaf_pte_pkeys_check(struct vcpu *vcpu, uint32_t pfec,
> +uint32_t pte_access, uint32_t pte_pkeys)

This is a latent linking bug for the future when 5 levels comes along.

It will probably be best to use the same trick as gw_page_flags to
compile it once but use it multiple times.

> +{
> +bool_t pkru_ad, pkru_wd;
> +bool_t ff, wf, uf, rsvdf, pkuf;
> +unsigned int pkru = 0;
> +
> +uf = pfec & PFEC_user_mode;
> +wf = pfec & PFEC_write_access;
> +rsvdf = pfec & PFEC_reserved_bit;
> +ff = pfec & PFEC_insn_fetch;
> +pkuf = pfec & PFEC_prot_key;
> +
> +if ( !cpu_has_xsave || !pkuf || is_pv_vcpu(vcpu) )
> +return 0;
> +
> +vcpu_save_fpu(vcpu);
> +pkru = *(unsigned int*)get_xsave_addr(vcpu->arch.xsave_area, 
> XSTATE_PKRU);

Style.

> +if ( unlikely(pkru) )
> +{
> +/*
> + * PKU:  additional mechanism by which the paging controls
> + * access to user-mode addresses based on the value in the
> + * PKRU register. A fault is considered as a PKU violation if all
> + * of the following conditions are true:
> + * 1.CR4_PKE=1.
> + * 2.EFER_LMA=1.
> + * 3.page is present with no reserved bit violations.
> + * 4.the access is not an instruction fetch.
> + * 5.the access is to a user page.
> + * 6.PKRU.AD=1
> + *   or The access is a data write and PKRU.WD=1
> + *and either CR0.WP=1 or it is a user access.
> + */
> +pkru_ad = READ_PKRU_AD(pkru, pte_pkeys);
> +pkru_wd = READ_PKRU_AD(pkru, pte_pkeys);
> +if ( hvm_pku_enabled(vcpu) && hvm_long_mode_enabled(vcpu) &&
> +!rsvdf && !ff && (pkru_ad ||
> +(pkru_wd && wf && (hvm_wp_enabled(vcpu) || uf))) )
> +return 1;
> +}
> +
> +return 0;
> +}
> +#endif
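
The six conditions listed in the quoted comment can be restated as a stand-alone predicate. This is a sketch only: struct pku_access and the function names are hypothetical, and the only hardware fact assumed is the architectural PKRU layout (for key i, AD is bit 2*i and WD is bit 2*i+1). Note that it reads WD from its own bit, whereas the quoted hunk calls READ_PKRU_AD for both pkru_ad and pkru_wd.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* PKRU holds two bits per key: AD at bit 2*key, WD at bit 2*key + 1. */
static bool pkru_ad(uint32_t pkru, unsigned int key)
{
    return (pkru >> (2 * key)) & 1;
}

static bool pkru_wd(uint32_t pkru, unsigned int key)
{
    return (pkru >> (2 * key + 1)) & 1;
}

/* Hypothetical description of one memory access and the relevant guest state. */
struct pku_access {
    bool cr4_pke;     /* condition 1 */
    bool efer_lma;    /* condition 2 */
    bool rsvd_fault;  /* condition 3: reserved-bit violation present */
    bool insn_fetch;  /* condition 4 */
    bool user_page;   /* condition 5 */
    bool write;       /* condition 6: data write */
    bool cr0_wp;      /* condition 6: CR0.WP */
    bool user_access; /* condition 6: access made from user mode */
};

static bool pku_violation(const struct pku_access *a, uint32_t pkru,
                          unsigned int key)
{
    if (!a->cr4_pke || !a->efer_lma || a->rsvd_fault ||
        a->insn_fetch || !a->user_page)
        return false;

    if (pkru_ad(pkru, key))
        return true;

    return pkru_wd(pkru, key) && a->write && (a->cr0_wp || a->user_access);
}
```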
> +
>  /* Walk the guest pagetables, after the manner of a hardware walker. */
>  /* Because the walk is essentially random, it can cause a deadlock 
>   * warning in the p2m locking code. Highly unlikely this is an actual
> @@ -106,6 +155,7 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
>  #if GUEST_PAGING_LEVELS >= 4 /* 64-bit only... */
>  guest_l3e_t *l3p = NULL;
>  guest_l4e_t *l4p;
> +uint32_t pkeys;
>  #endif
>  uint32_t gflags, mflags, iflags, rc = 0;
>  bool_t smep = 0, smap = 0;
> @@ -190,6 +240,7 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
>  goto out;
>  /* Get the l3e and check its flags*/
>  gw->l3e = l3p[guest_l3_table_offset(va)];
> +pkeys = guest_l3e_get_pkeys(gw->l3e);
>  gflags = guest_l3e_get_flags(gw->l3e) ^ iflags;
>  if ( !(gflags & _PAGE_PRESENT) ) {
>  rc |= _PAGE_PRESENT;
> @@ -199,6 +250,9 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
>  
>  pse1G = (gflags & _PAGE_PSE) && guest_supports_1G_superpages(v); 
>  
> +if (pse1G && leaf_pte_pkeys_check(v, pfec, gflags, pkeys))
> +rc |= _PAGE_PKEY_BIT;
> +
>  if ( pse1G )
>  {
>  /* Generate a fake l1 table entry so callers don't all 
> @@ -270,6 +324,12 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
>  
>  pse2M = (gflags & _PAGE_PSE) && guest_supports_superpages(v); 
>  
> +#if GUEST_PAGING_LEVELS >= 4
> +pkeys = guest_l2e_get_pkeys(gw->l2e);
> +if (pse2M && leaf_pte_pkeys_check(v, pfec, gflags, pkeys))
> +rc |= _PAGE_PKEY_BIT;
> +#endif
> +
>  if ( pse2M )
>  {
>  /* Special case: this guest VA is in a PSE superpage, so there's
> @@ -330,6 +390,11 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
>  goto out;
>  }
>  rc |= ((gflags & mflags) ^ mflags);
> +#if GUEST_PAGING_LEVELS >= 4
> +pkeys = guest_l1e_get_pkeys(gw->l1e);
> +if (leaf_pte_pkeys_check(v, pfec, gflags, pkeys))
> +rc |= _PAGE_PKEY_BIT;
> +#endif

As 

[Xen-devel] [PATCH] build: disable default built-in rules and variables

2015-12-01 Thread Doug Goldstein
Disable the built-in rules and variables from GNU make to improve
build performance and avoid awkward corner cases with the built-in
rules. Currently none of the implicit rules are used but this is helpful
to do when developing changes to the build system.

Signed-off-by: Doug Goldstein 
---
 xen/Makefile | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/xen/Makefile b/xen/Makefile
index fa9cf0a..3a1de99 100644
--- a/xen/Makefile
+++ b/xen/Makefile
@@ -15,6 +15,9 @@ export XEN_BUILD_HOST ?= $(shell hostname)
 export BASEDIR := $(CURDIR)
 export XEN_ROOT := $(BASEDIR)/..
 
+# Do not use make's built-in rules and variables
+MAKEFLAGS += -rR
+
 EFI_MOUNTPOINT ?= $(BOOT_DIR)/efi
 
 .PHONY: default
-- 
2.4.10




Re: [Xen-devel] [PATCH] xen-pciback: fix up cleanup path when alloc fails

2015-12-01 Thread Doug Goldstein
On 12/1/15 1:35 PM, Konrad Rzeszutek Wilk wrote:
> On Tue, Dec 01, 2015 at 11:47:17AM -0500, Konrad Rzeszutek Wilk wrote:
>> On Thu, Nov 26, 2015 at 02:32:39PM -0600, Doug Goldstein wrote:
>>> When allocating a pciback device fails, avoid the possibility of a
>>> use after free.
>>
>> Reviewed-by: Konrad Rzeszutek Wilk 
>>
>> Ugh, and it looks like xen-blkfront has the same issue.
> 
>  Nope. No problems there.
> 
> The ->probe if it fails (so xenbus_dev_probe returns the error)
> ends up in the 'probe_failed' label in really_probe which takes care by doing:
> 
> dev_set_drvdata(dev, NULL);
> 
> Wheew!
> 
> either way the patch should go in, but the 'possibility' should
> be perhaps removed? Unless there is some other path I missed?

I put 'possibility' in there because it will only happen when the
function returns failure. I was also trying to not make it sound panicky
I guess. I can resubmit the patch with that word dropped if that's
desirable.

> 
>>
>>>
>>> Reported-by: Jonathan Creekmore 
>>> Signed-off-by: Doug Goldstein 
>>> ---
>>>  drivers/xen/xen-pciback/xenbus.c | 4 +++-
>>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/xen/xen-pciback/xenbus.c 
>>> b/drivers/xen/xen-pciback/xenbus.c
>>> index 98bc345..4843741 100644
>>> --- a/drivers/xen/xen-pciback/xenbus.c
>>> +++ b/drivers/xen/xen-pciback/xenbus.c
>>> @@ -44,7 +44,6 @@ static struct xen_pcibk_device *alloc_pdev(struct 
>>> xenbus_device *xdev)
>>> dev_dbg(&xdev->dev, "allocated pdev @ 0x%p\n", pdev);
>>>  
>>> pdev->xdev = xdev;
>>> -   dev_set_drvdata(&xdev->dev, pdev);
>>>  
>>> mutex_init(&pdev->dev_lock);
>>>  
>>> @@ -58,6 +57,9 @@ static struct xen_pcibk_device *alloc_pdev(struct 
>>> xenbus_device *xdev)
>>> kfree(pdev);
>>> pdev = NULL;
>>> }
>>> +
>>> +   dev_set_drvdata(&xdev->dev, pdev);
>>> +
>>>  out:
>>> return pdev;
>>>  }
>>> -- 
>>> 2.4.10
>>>


-- 
Doug Goldstein





Re: [Xen-devel] Hotplugged devices in Xen 4.5 and domain reboot

2015-12-01 Thread Doug Goldstein
On 12/1/15 11:05 AM, Ian Campbell wrote:
> On Tue, 2015-12-01 at 18:48 +0200, Iurii Mykhalskyi wrote:
>> Thanks to all for the replies, please see my answers below:
>>
>> On 12/01/2015 05:29 PM, Wei Liu wrote:
>>> On Tue, Dec 01, 2015 at 04:58:55PM +0200, Iurii Mykhalskyi wrote:
>>>> Our real usb mass-storage device are located at driver domain (DomD).
>>>> So we
>>>> setup second block-device backend there.
>>>>
>>>> To hotplug usb mass-storage from DomD we use follow command:
>>>>
>>>> xl block-attach domU_id phy:/bla-bla,xvda10,w,backend="DomD"

>>> What happens if you run this in Dom0? I guess DomD doesn't respond to
>>> the request?
>> Yes, there is no response from DomD, because the actual storage device
>> is located there, and the toolstack gets stuck on the real device existence check.
> 
> This check is a toolstack bug which should be fixed. We've squashed some of
> them at various points, but I'd not be surprised if others remain.
> 
>>>> There was no support of attaching block-device in runtime from domain
>>>> other
>>>> to Domain-0, so we have made some hacks to allow call block-attach
>>>> command
>>>> from non-dom0 privileged domain.
>>> So this is a special use case. This is the first time I know people
>>> actually run xl block-attach in driver domain.
>> Yes, this is a special case and we do this by our solution design.
>>>> One of patches was - don't update
>>>> /var/lib/xen/userdata-d.$DOMID-$UUID.libxl-json during execution of
>>>> this
>>>> command (because this log located on dom0 rootfs and we don't have
>>>> any
>>>> access to it from DomD). So, there is no different in configs before
>>>> and
>>>> after hotplug.

>>> The state of $DOMID is recorded in libxl-json file. No wonder you lose
>>> all state.
>>>
>>> But even if you write those states, they are going to be inside driver
>>> domain.  There is no way at the moment to synthesise the state inside
>>> Dom0 and DomD into one. There is also difficulty in how you can split
>>> the synthesised and dispatch the states to multiple entities again when
>>> rebuilding a domain.
>>>
>>> So I think having multiple entities managing state of one single domain
>>> is bad. I think the proper way of making it work is to make hotplug
>>> device from domain other than Dom0 work.
>>>
>>> There is a daemon "xl devd" in driver domain. We might be able to teach
>>> it to respond to Dom0 toolstack requests. I'm a bit surprised if it
>>> doesn't do that already. Did you forget to start that daemon?
>> We can't run devd in driver domain, because it failed on connect to
>> xenstored socket (/var/run/xenstored/socket - we have xenstored running
>> only in dom0).   
> 
> devd _should_ be able to talk to xenstored over the kernel provided
> interface to the shared ring rather than the local socket.
> 
> It is certainly not expected that devd be colocated in the same domain as
> happens to be running xenstored.
> 
> If this is not working then there is another bug.

This works, but might have problems in Xen 4.5. If you're running
on Linux 3.14 or newer then you will have a problem. You need to
backport commit 9c89dc95201ffed5fead17b35754bf9440fdbdc0 if you're using
the C based xenstore and 7d418eab3b6dbdeec84bf73af301dca54368547a if
you're using the OCaml based xenstore.

You can test this locally by supplying "--disable-socket" to xenstore
when it starts up.


> 
>>> In general a driver domain would not be expected to have sufficient
>>> privilege over e.g. a guest domain's /local/domain/domU/devices to
>>> create
>>> the f.e. dirs.
> 
>> In our solution we have to create 2 full privileged domains - Dom0 and 
>> DomD, so we need 2 toolstack domains.
>>Any special privileges hack wasn't done - we need just to setup 
>> additional permissions for DomD.
> 
> I'm afraid you simply cannot have 2 toolstack domains. The toolstack is a
> singleton entity in a Xen system.
> 
> If you want to run toolstack operations from a non-toolstack domain then
> you will need to arrange for some (likely out-of-band) mechanism for such
> domains to ask the single toolstack domain to do something on their behalf.
> 
> Ian.
> 
> 


-- 
Doug Goldstein





Re: [Xen-devel] [PATCH] xen-pciback: fix up cleanup path when alloc fails

2015-12-01 Thread Konrad Rzeszutek Wilk
On Tue, Dec 01, 2015 at 02:54:33PM -0600, Doug Goldstein wrote:
> On 12/1/15 1:35 PM, Konrad Rzeszutek Wilk wrote:
> > On Tue, Dec 01, 2015 at 11:47:17AM -0500, Konrad Rzeszutek Wilk wrote:
> >> On Thu, Nov 26, 2015 at 02:32:39PM -0600, Doug Goldstein wrote:
> >>> When allocating a pciback device fails, avoid the possibility of a
> >>> use after free.
> >>
> >> Reviewed-by: Konrad Rzeszutek Wilk 
> >>
> >> Ugh, and it looks like xen-blkfront has the same issue.
> > 
> >  Nope. No problems there.
> > 
> > The ->probe if it fails (so xenbus_dev_probe returns the error)
> > ends up in the 'probe_failed' label in really_probe which takes care by 
> > doing:
> > 
> > dev_set_drvdata(dev, NULL);
> > 
> > Wheew!
> > 
> > either way the patch should go in, but the 'possibility' should
> > be perhaps removed? Unless there is some other path I missed?
> 
> I put 'possibility' in there because it will only happen when the
> function returns failure. I was also trying to not make it sound panicky

Right, but when it returns failure, the 'really_probe' will take
care of setting dev_set_drvdata(dev, NULL) - so we won't have the
use after free problem.


> I guess. I can resubmit the patch with that word dropped if that's
> desirable.

Sure, or just say: "The 'really_probe' takes care of setting
dev_set_drvdata(dev, NULL) in its failure path (which we would
exercise if the ->probe function failed), so we
are OK. However, let's be defensive as the code can change."
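
The defensive ordering under discussion can be sketched outside the kernel as follows. Everything here is a mock: struct mock_device, struct mock_pdev, mock_set_drvdata and the later_step_fails flag are hypothetical stand-ins for the xenbus device, struct xen_pcibk_device, dev_set_drvdata and the failing initialisation step. The point is only the ordering: the pointer is published once, after the last step that can free it.

```c
#include <assert.h>
#include <stdlib.h>

/* Mock of the driver-data slot on a device; not the kernel's struct device. */
struct mock_device {
    void *drvdata;
};

struct mock_pdev {
    struct mock_device *owner;
};

static void mock_set_drvdata(struct mock_device *dev, void *data)
{
    dev->drvdata = data;
}

/*
 * Allocate and initialise a mock pdev. later_step_fails is a hypothetical
 * knob that simulates the initialisation step failing after allocation.
 */
static struct mock_pdev *mock_alloc_pdev(struct mock_device *dev,
                                         int later_step_fails)
{
    struct mock_pdev *pdev = calloc(1, sizeof(*pdev));

    if (pdev == NULL)
        goto out;

    pdev->owner = dev;

    if (later_step_fails) {
        free(pdev);
        pdev = NULL;
    }

    /* Publish only now: either a fully set up object or NULL. */
    mock_set_drvdata(dev, pdev);

out:
    return pdev;
}
```

With the original ordering, the failure branch would free the object while the device still pointed at it; with this ordering the slot never holds a freed pointer, regardless of what the caller's failure path does afterwards.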

> 
> > 
> >>
> >>>
> >>> Reported-by: Jonathan Creekmore 
> >>> Signed-off-by: Doug Goldstein 
> >>> ---
> >>>  drivers/xen/xen-pciback/xenbus.c | 4 +++-
> >>>  1 file changed, 3 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/drivers/xen/xen-pciback/xenbus.c 
> >>> b/drivers/xen/xen-pciback/xenbus.c
> >>> index 98bc345..4843741 100644
> >>> --- a/drivers/xen/xen-pciback/xenbus.c
> >>> +++ b/drivers/xen/xen-pciback/xenbus.c
> >>> @@ -44,7 +44,6 @@ static struct xen_pcibk_device *alloc_pdev(struct 
> >>> xenbus_device *xdev)
> >>>   dev_dbg(&xdev->dev, "allocated pdev @ 0x%p\n", pdev);
> >>>  
> >>>   pdev->xdev = xdev;
> >>> - dev_set_drvdata(&xdev->dev, pdev);
> >>>  
> >>>   mutex_init(&pdev->dev_lock);
> >>>  
> >>> @@ -58,6 +57,9 @@ static struct xen_pcibk_device *alloc_pdev(struct 
> >>> xenbus_device *xdev)
> >>>   kfree(pdev);
> >>>   pdev = NULL;
> >>>   }
> >>> +
> >>> + dev_set_drvdata(&xdev->dev, pdev);
> >>> +
> >>>  out:
> >>>   return pdev;
> >>>  }
> >>> -- 
> >>> 2.4.10
> >>>
> 
> 
> -- 
> Doug Goldstein
> 





[Xen-devel] [xen-unstable test] 65267: regressions - FAIL

2015-12-01 Thread osstest service owner
flight 65267 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/65267/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-qemuu-nested-intel 16 debian-hvm-install/l1/l2  fail  REGR. vs. 65114

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-libvirt-xsm  7 host-ping-check-xen  fail  REGR. vs. 65114
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 15 guest-localmigrate.2  fail  blocked in 65114
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 16 guest-localmigrate/x10  fail  blocked in 65114
 test-amd64-i386-rumpuserxen-i386 10 guest-start  fail  like 65114
 test-armhf-armhf-xl-rtds 16 guest-start/debian.repeat  fail  like 65114
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stop  fail  like 65114
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop  fail  like 65114
 test-amd64-amd64-libvirt-vhd  9 debian-di-install  fail  like 65114

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 test-armhf-armhf-libvirt 14 guest-saverestore  fail  never pass
 test-armhf-armhf-libvirt 12 migrate-support-check  fail  never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2  fail  never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-check  fail  never pass
 test-amd64-i386-libvirt 12 migrate-support-check  fail  never pass
 test-armhf-armhf-xl-rtds 13 saverestore-support-check  fail  never pass
 test-armhf-armhf-xl-rtds 12 migrate-support-check  fail  never pass
 test-armhf-armhf-xl-arndale 12 migrate-support-check  fail  never pass
 test-armhf-armhf-xl-arndale 13 saverestore-support-check  fail  never pass
 test-amd64-i386-libvirt-xsm 12 migrate-support-check  fail  never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-check  fail  never pass
 test-armhf-armhf-xl-cubietruck 13 saverestore-support-check  fail  never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check  fail  never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check  fail  never pass
 test-armhf-armhf-libvirt-qcow2  9 debian-di-install  fail  never pass
 test-armhf-armhf-libvirt-raw  9 debian-di-install  fail  never pass
 test-amd64-amd64-xl-pvh-amd 11 guest-start  fail  never pass
 test-armhf-armhf-xl-vhd  9 debian-di-install  fail  never pass
 test-armhf-armhf-xl-xsm 13 saverestore-support-check  fail  never pass
 test-armhf-armhf-xl-xsm 12 migrate-support-check  fail  never pass
 test-armhf-armhf-xl 12 migrate-support-check  fail  never pass
 test-armhf-armhf-xl 13 saverestore-support-check  fail  never pass
 test-armhf-armhf-xl-multivcpu 13 saverestore-support-check  fail  never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-check  fail  never pass
 test-amd64-amd64-libvirt 12 migrate-support-check  fail  never pass
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop  fail  never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop  fail  never pass
 test-armhf-armhf-xl-credit2 13 saverestore-support-check  fail  never pass
 test-armhf-armhf-xl-credit2 12 migrate-support-check  fail  never pass

version targeted for testing:
 xen  4c6cd64519f9bc270a7278128c94e4b66e3d2077
baseline version:
 xen  713b7e4ef2aa4ec3ae697cde9c81d5a57548f9b1

Last test of basis    65114  2015-11-25 19:42:37 Z    6 days
Failing since         65141  2015-11-26 20:45:33 Z    5 days    7 attempts
Testing same since    65267  2015-12-01 01:12:37 Z    0 days    1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Ian Campbell 
  Jan Beulich 
  Kevin Tian 
  Len Brown 
  Paul Durrant 
  Roger Pau Monné 
  Wei Liu 
  Yang Zhang 

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-oldkern   

Re: [Xen-devel] xen-netfront crash when detaching network while some network activity

2015-12-01 Thread Marek Marczykowski-Górecki
On Tue, Dec 01, 2015 at 05:00:42PM -0500, Konrad Rzeszutek Wilk wrote:
> On Tue, Nov 17, 2015 at 03:45:15AM +0100, Marek Marczykowski-Górecki wrote:
> > On Wed, Oct 21, 2015 at 08:57:34PM +0200, Marek Marczykowski-Górecki wrote:
> > > On Wed, May 27, 2015 at 12:03:12AM +0200, Marek Marczykowski-Górecki 
> > > wrote:
> > > > On Tue, May 26, 2015 at 11:56:00AM +0100, David Vrabel wrote:
> > > > > On 22/05/15 12:49, Marek Marczykowski-Górecki wrote:
> > > > > > Hi all,
> > > > > > 
> > > > > > I'm experiencing xen-netfront crash when doing xl network-detach 
> > > > > > while
> > > > > > some network activity is going on at the same time. It happens only 
> > > > > > when
> > > > > > domU has more than one vcpu. Not sure if this matters, but the 
> > > > > > backend
> > > > > > is in another domU (not dom0). I'm using Xen 4.2.2. It happens on 
> > > > > > kernel
> > > > > > 3.9.4 and 4.1-rc1 as well.
> > > > > > 
> > > > > > Steps to reproduce:
> > > > > > 1. Start the domU with some network interface
> > > > > > 2. Call there 'ping -f some-IP'
> > > > > > 3. Call 'xl network-detach NAME 0'
> 
> Do you see this all the time or just on occasion?

Using above procedure - all the time.

> I tried to reproduce it and couldn't see it. Is your VM a PV or HVM?

PV, started by libvirt. This may have something to do with it: the problem didn't
exist on older Xen (4.1) with the domain started by xl. I'm not sure about the kernel
version there, but I think I've tried 3.18 there too, which has this
problem.

But I don't see anything special in domU config file (neither backend
nor frontend) - it may be some libvirt default. If that's really the
cause. Can I (and how) get any useful information about that?


-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?




Re: [Xen-devel] [PATCHv6] 01/28] build: import Kbuild/Kconfig from Linux 4.2

2015-12-01 Thread Doug Goldstein
On 11/30/15 11:19 AM, Ian Campbell wrote:
> On Mon, 2015-11-30 at 11:00 -0600, Doug Goldstein wrote:
>> Since there is a request to have KEXEC and the UARTs
>> configurable by the user
> 
> Who asked for this?
> 
> I have quite a strong preference for not adding _any_ new[*] user
> configurable options in this first pass, since I think those need to be
> considered quite carefully whereas this first series should be largely
> about the mechanics of introducing Kconfig files.
> 
> Ian.
> 
> [*] i.e. anything which is not already controllable by the current .config
> driven thing.
> 

The ARM UARTs are the takeaway I had from conversations with Julien
Grall and from reading past comments on the ML about how people can change the ARM
UARTs. Obviously if that's not desired I can drop that. I originally had
them enabled as they are in config/arm{32,64}.mk and changed them to be
user configurable later in the series.

As far as KEXEC goes, it's a user configurable option now in Rules.mk.
You can build "make kexec=n" and it will disable it. I chose that one as
an original example in v1 of how a user configurable option would look
in this scheme.

-- 
Doug Goldstein





Re: [Xen-devel] [V2 PATCH 2/9] x86/hvm: pkeys, add the flag to enable Memory Protection Keys

2015-12-01 Thread Andrew Cooper
On 27/11/15 09:51, Huaitong Han wrote:
> This patch adds the flag to enable Memory Protection Keys.
>
> Signed-off-by: Huaitong Han 

Reviewed-by: Andrew Cooper 



Re: [Xen-devel] [V2 PATCH 3/9] x86/hvm: pkeys, add pkeys support when setting CR4

2015-12-01 Thread Andrew Cooper
On 27/11/15 09:51, Huaitong Han wrote:
> This patch adds pkeys support when setting CR4.
>
> Signed-off-by: Huaitong Han 

Reviewed-by: Andrew Cooper 



Re: [Xen-devel] [V2 PATCH 8/9] x86/hvm: pkeys, add xstate support for pkeys

2015-12-01 Thread Andrew Cooper
On 27/11/15 09:52, Huaitong Han wrote:
> This patch adds xstate support for pkeys.
>
> Signed-off-by: Huaitong Han 
> ---
>  xen/arch/x86/xstate.c| 18 ++
>  xen/include/asm-x86/xstate.h |  5 -
>  2 files changed, 22 insertions(+), 1 deletion(-)
>
> diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c
> index 827e0e1..00bddb0 100644
> --- a/xen/arch/x86/xstate.c
> +++ b/xen/arch/x86/xstate.c
> @@ -294,6 +294,24 @@ unsigned int xstate_ctxt_size(u64 xcr0)
>  return _xstate_ctxt_size(xcr0);
>  }
>  
> +/*
> + * Given the xsave area and a state inside, this function returns the
> + * address of the state.
> + *
> + * This is the API that is called to get xstate address in standard format.
> + * Just because XSAVE function does not use compacted format of xsave
> + * area.
> + */
> +void *get_xsave_addr(struct xsave_struct *xsave, u32 xfeature)
> +{
> +u32 xstate_offsets, xstate_sizes, ecx, edx;
> +u32 xstate_nr = fls64(xfeature) - 1;
> +
> +cpuid_count(XSTATE_CPUID, xstate_nr, &xstate_sizes, &xstate_offsets,
> +&ecx, &edx);
> +
> +return (void *)xsave + xstate_offsets;
> +}
> +

Does this even compile?  There is already

static void *get_xsave_addr(void *xsave, unsigned int xfeature_idx)

higher in the same file.

That function should be augmented to take a struct xsave_struct *xsave,
look at whether the representation is compressed or not, and use the
appropriate offset array.
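
A minimal sketch of that suggestion, assuming the per-component offsets and sizes have already been read from CPUID leaf 0xD into tables. The struct and function names below are hypothetical and not taken from xstate.c; the sketch only handles extended components (index 2 and up) and ignores the optional 64-byte alignment of compacted components.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit 63 of XCOMP_BV marks an XSAVE area as using the compacted format. */
#define XCOMP_BV_COMPACTED   (1ULL << 63)
/* 512-byte legacy region plus the 64-byte XSAVE header. */
#define XSAVE_EXT_AREA_START 576u

/* Hypothetical stand-in for the two header fields the lookup needs. */
struct xsave_hdr_sketch {
    uint64_t xstate_bv;
    uint64_t xcomp_bv;
};

/* Hypothetical tables, as they might be filled from CPUID leaf 0xD. */
struct xstate_layout_sketch {
    uint32_t offsets[63]; /* standard-format offset of each component */
    uint32_t sizes[63];   /* size in bytes of each component */
};

/*
 * Return the byte offset of extended component idx (idx >= 2) within the
 * XSAVE area, picking the layout according to the compaction bit.
 * Assumption: no component requests 64-byte alignment.
 */
static size_t xstate_offset_sketch(const struct xsave_hdr_sketch *hdr,
                                   const struct xstate_layout_sketch *layout,
                                   unsigned int idx)
{
    size_t off = XSAVE_EXT_AREA_START;
    unsigned int i;

    /* Standard format: the offset is fixed per component. */
    if ( !(hdr->xcomp_bv & XCOMP_BV_COMPACTED) )
        return layout->offsets[idx];

    /* Compacted format: enabled components are packed in index order. */
    for ( i = 2; i < idx; i++ )
        if ( hdr->xcomp_bv & (1ULL << i) )
            off += layout->sizes[i];

    return off;
}
```

The caller would add the returned offset to the base of the save area, as the quoted patch does for the standard format only.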

>  /* Collect the information of processor's extended state */
>  void xstate_init(struct cpuinfo_x86 *c)
>  {
> diff --git a/xen/include/asm-x86/xstate.h b/xen/include/asm-x86/xstate.h
> index b95a5b5..e9abe71 100644
> --- a/xen/include/asm-x86/xstate.h
> +++ b/xen/include/asm-x86/xstate.h
> @@ -34,13 +34,15 @@
>  #define XSTATE_OPMASK  (1ULL << 5)
>  #define XSTATE_ZMM (1ULL << 6)
>  #define XSTATE_HI_ZMM  (1ULL << 7)
> +#define XSTATE_UNUSED  (1ULL << 8)

No need for this.

~Andrew



Re: [Xen-devel] xen-netfront crash when detaching network while some network activity

2015-12-01 Thread Konrad Rzeszutek Wilk
On Tue, Nov 17, 2015 at 03:45:15AM +0100, Marek Marczykowski-Górecki wrote:
> On Wed, Oct 21, 2015 at 08:57:34PM +0200, Marek Marczykowski-Górecki wrote:
> > On Wed, May 27, 2015 at 12:03:12AM +0200, Marek Marczykowski-Górecki wrote:
> > > On Tue, May 26, 2015 at 11:56:00AM +0100, David Vrabel wrote:
> > > > On 22/05/15 12:49, Marek Marczykowski-Górecki wrote:
> > > > > Hi all,
> > > > > 
> > > > > I'm experiencing xen-netfront crash when doing xl network-detach while
> > > > > some network activity is going on at the same time. It happens only 
> > > > > when
> > > > > domU has more than one vcpu. Not sure if this matters, but the backend
> > > > > is in another domU (not dom0). I'm using Xen 4.2.2. It happens on 
> > > > > kernel
> > > > > 3.9.4 and 4.1-rc1 as well.
> > > > > 
> > > > > Steps to reproduce:
> > > > > 1. Start the domU with some network interface
> > > > > 2. Call there 'ping -f some-IP'
> > > > > 3. Call 'xl network-detach NAME 0'

Do you see this all the time or just on occasion?

I tried to reproduce it and couldn't. Is your VM PV or HVM?



Re: [Xen-devel] [V2 PATCH 6/9] x86/hvm: pkeys, add functions to support PKRU access

2015-12-01 Thread Andrew Cooper
On 27/11/15 09:52, Huaitong Han wrote:
> This patch adds functions to support PKRU access.
>
> Signed-off-by: Huaitong Han 
> ---
>  xen/include/asm-x86/processor.h | 15 +++
>  1 file changed, 15 insertions(+)
>
> diff --git a/xen/include/asm-x86/processor.h b/xen/include/asm-x86/processor.h
> index 3f8411f..68d86cb 100644
> --- a/xen/include/asm-x86/processor.h
> +++ b/xen/include/asm-x86/processor.h
> @@ -342,6 +342,21 @@ static inline void write_cr4(unsigned long val)
>  asm volatile ( "mov %0,%%cr4" : : "r" (val) );
>  }
>  
> +/* Macros for PKRU domain */
> +#define PKRU_READ  0
> +#define PKRU_WRITE 1
> +#define PKRU_ATTRS 2
> +
> +/*
> + * PKRU defines 32 bits, there are 16 domains and 2 attribute bits per
> + * domain in pkru, pkeys is index to a defined domain, so the value of
> + * pte_pkeys * PKRU_ATTRS + R/W is offset of a defined domain attribute.
> + */
> +#define READ_PKRU_AD(pkru, pkey) \
> +((pkru >> (pkey * PKRU_ATTRS + PKRU_READ)) & 1)
> +#define READ_PKRU_WD(pkru, pkey) \
> +((pkru >> (pkey * PKRU_ATTRS + PKRU_WRITE)) & 1)

Both macro parameters need quoting, but these would be better still as
static inline functions.
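
For illustration, a sketch of what the static inline form might look like. The function names are hypothetical, and the fixed-width types come from <stdint.h> here rather than Xen's own headers. With the macros as posted, an argument such as `key + 1` would expand to `key + 1 * PKRU_ATTRS`, which is the precedence problem that parenthesising the parameters (or using functions) avoids.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions within each 2-bit PKRU domain, as in the quoted patch. */
#define PKRU_READ  0
#define PKRU_WRITE 1
#define PKRU_ATTRS 2

/* True if the access-disable (AD) bit is set for pkey in pkru. */
static inline bool read_pkru_ad(uint32_t pkru, unsigned int pkey)
{
    return (pkru >> (pkey * PKRU_ATTRS + PKRU_READ)) & 1;
}

/* True if the write-disable (WD) bit is set for pkey in pkru. */
static inline bool read_pkru_wd(uint32_t pkru, unsigned int pkey)
{
    return (pkru >> (pkey * PKRU_ATTRS + PKRU_WRITE)) & 1;
}
```

Because the arguments are evaluated once and typed, expressions like `read_pkru_wd(pkru, key + 1)` behave as expected without extra parentheses at the call site.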

~Andrew



Re: [Xen-devel] linux 4.4 Regression: 100% cpu usage on idle pv guest under Xen with single vcpu.

2015-12-01 Thread Sander Eikelenboom

On 2015-12-01 23:47, Boris Ostrovsky wrote:

On 11/30/2015 05:55 PM, Sander Eikelenboom wrote:

On 2015-11-30 23:54, Boris Ostrovsky wrote:

On 11/30/2015 04:46 PM, Sander Eikelenboom wrote:

On 2015-11-30 22:45, Konrad Rzeszutek Wilk wrote:

On Sat, Nov 28, 2015 at 04:47:43PM +0100, Sander Eikelenboom wrote:

Hi all,

I have just tested a 4.4-rc2 kernel (current linus tree) + the tip 
tree

pulled on top.

Running this kernel under Xen on PV-guests with multiple vcpus 
goes well (on

idle < 10% cpu usage),
but a guest with only a single vcpu doesn't idle at all, it seems 
a kworker

thread is stuck:
root   569 98.0  0.0  0 0 ?R16:02 12:47
[kworker/0:1]

Running a 4.3 kernel works fine with a single vpcu, bisecting 
would probably
quite painful since there were some breakages this merge window 
with respect

to Xen pv-guests.

There are some differences in the diff's from booting a 4.3, 
4.4-single,

4.4-multi cpu boot:


Boris has been tracking a bunch of them. I am attaching the latest 
set of

patches I've to carry on top of v4.4-rc3.


Hi Konrad,

i will test those, see if it fixes all my issues and report back


They shouldn't help you ;-( (and I just saw a message from you 
confirming this)


The first one fixes a 32-bit bug (on bare metal too). The second 
fixes

a fatal bug for 32-bit PV guests. The other two are code
improvements/cleanup.


One of these patches also fixes a bug i was having with a 
pci-passthrough device in
a HVM that wasn't working (depending on which dom0-kernel i was using 
(4.3 or 4.4)),

but didn't report yet.

Fingers crossed but i think this pv-guest single vcpu issue is the 
last i'm troubled by for now ;)


I could not reproduce this, including with your kernel config file.


Hmm that's unpleasant :-\

Hmm, another strange thing is that it doesn't seem to affect dom0 (which is
also a PV guest), only unprivileged ones.
All unprivileged PV guests seem to have the irq issue, but only with a
single vcpu do I seem to get the stuck kworker thread that got my attention.
With 2 vcpus that doesn't seem to happen, but you still get the dmesg
output and warnings about hvc.


Could it be that:

arch/x86/include/asm/i8259.h
static inline int nr_legacy_irqs(void)
{
return legacy_pic->nr_legacy_irqs;
}

returns something different in some circumstances ?

--
Sander



-boris




Re: [Xen-devel] [PATCH v2 1/2] libxl: rename libxlConsoleCallback

2015-12-01 Thread Jim Fehlig
On 11/23/2015 11:56 AM, Joao Martins wrote:
> . to a more generic name i.e. libxlDomainStartCallback,
> since it will now cover another case other than the console.
>
> Signed-off-by: Joao Martins 
> ---
>  src/libxl/libxl_domain.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/src/libxl/libxl_domain.c b/src/libxl/libxl_domain.c
> index 40dcea1..a7267b0 100644
> --- a/src/libxl/libxl_domain.c
> +++ b/src/libxl/libxl_domain.c
> @@ -854,7 +854,7 @@ libxlDomainFreeMem(libxl_ctx *ctx, libxl_domain_config 
> *d_config)
>  }
>  
>  static void
> -libxlConsoleCallback(libxl_ctx *ctx, libxl_event *ev, void *for_callback)
> +libxlDomainStartCallback(libxl_ctx *ctx, libxl_event *ev, void *for_callback)
>  {
>  virDomainObjPtr vm = for_callback;
>  size_t i;
> @@ -988,7 +988,7 @@ libxlDomainStart(libxlDriverPrivatePtr driver, 
> virDomainObjPtr vm,
>  virObjectUnlock(vm);
>  
>  aop_console_how.for_callback = vm;
> -aop_console_how.callback = libxlConsoleCallback;
> +aop_console_how.callback = libxlDomainStartCallback;

Before pushing, I wanted to change the 'aop_console_how' variable to something
more generic too, but realized it is the 'const libxl_asyncprogress_how
*aop_console_how' parameter to libxl_domain_create_{new,restore}. AFAICT, this
callback is invoked when a console becomes available for the domain. It might be
possible that network devices have not been created (devid = -1) when the
callback is invoked. I thought about adding a separate libxlDomainStartCallback
and registering it with the 'const libxl_asyncop_how *ao_how' parameter, but
this would change the synchronous behavior of libxlDomainStart and be quite a
bit more intrusive.

In the end, I think it is best to explicitly call a function that creates the
ifnames after a successful libxl_domain_create_{new,restore}. See my reply to
2/2 for an example of this idea.

Regards,
Jim




Re: [Xen-devel] linux 4.4 Regression: 100% cpu usage on idle pv guest under Xen with single vcpu.

2015-12-01 Thread Sander Eikelenboom

On 2015-11-30 23:54, Boris Ostrovsky wrote:

On 11/30/2015 04:46 PM, Sander Eikelenboom wrote:

On 2015-11-30 22:45, Konrad Rzeszutek Wilk wrote:

On Sat, Nov 28, 2015 at 04:47:43PM +0100, Sander Eikelenboom wrote:

Hi all,

I have just tested a 4.4-rc2 kernel (current linus tree) + the tip 
tree

pulled on top.

Running this kernel under Xen on PV-guests with multiple vcpus goes 
well (on

idle < 10% cpu usage),
but a guest with only a single vcpu doesn't idle at all, it seems a 
kworker

thread is stuck:
root   569 98.0  0.0  0 0 ?R16:02 12:47
[kworker/0:1]

Running a 4.3 kernel works fine with a single vpcu, bisecting would 
probably
quite painful since there were some breakages this merge window with 
respect

to Xen pv-guests.

There are some differences in the diff's from booting a 4.3, 
4.4-single,

4.4-multi cpu boot:


Boris has been tracking a bunch of them. I am attaching the latest 
set of

patches I've to carry on top of v4.4-rc3.


Hi Konrad,

i will test those, see if it fixes all my issues and report back


They shouldn't help you ;-( (and I just saw a message from you 
confirming this)


The first one fixes a 32-bit bug (on bare metal too). The second fixes
a fatal bug for 32-bit PV guests. The other two are code
improvements/cleanup.




Thanks :)

-- Sander


Between 4.3 and 4.4-single:

-NR_IRQS:4352 nr_irqs:32 16
+Using NULL legacy PIC
+NR_IRQS:4352 nr_irqs:32 0


This is fine, as long as you have 
b4ff8389ed14b849354b59ce9b360bdefcdbf99c.




-cpu 0 spinlock event irq 17
+cpu 0 spinlock event irq 1


This is strange. I wouldn't expect spinlocks to use legacy irqs.



Could it be that with your fixup:
xen/events: Always allocate legacy interrupts on PV guests
(b4ff8389ed14b849354b59ce9b360bdefcdbf99c)
for commit:
x86/irq: Probe for PIC presence before allocating descs for legacy IRQs
(8c058b0b9c34d8c8d7912880956543769323e2d8)
we now have the situation described in the commit message of
8c058b0b9c, but for Xen PV instead of Hyper-V?
(It seems both Xen and Hyper-V want to achieve the same thing but have
different, competing implementations.)

(BTW, 8c058b0b9c has a CC for stable, so it could be destined to cause
more trouble.)


--
Sander




and later on:

-hctosys: unable to open rtc device (rtc0)
+rtc_cmos rtc_cmos: hctosys: unable to read the hardware clock

+genirq: Flags mismatch irq 8.  (hvc_console) vs.  
(rtc0)

+hvc_open: request_irq failed with rc -16.
+Warning: unable to open an initial console.


between 4.4-single and 4.4-multi:

 Using NULL legacy PIC
-NR_IRQS:4352 nr_irqs:32 0
+NR_IRQS:4352 nr_irqs:48 0


This is probably OK too since nr_irqs depend on number of CPUs.

I think something is messed up with IRQ. I saw last week something
from setup_irq() generating a stack dump (warning) for rtc_cmos but
it appeared harmless at that time and now I don't see it anymore.

-boris




and later on:

-rtc_cmos rtc_cmos: hctosys: unable to read the hardware clock
+hctosys: unable to open rtc device (rtc0)

-genirq: Flags mismatch irq 8.  (hvc_console) vs.  
(rtc0)

-hvc_open: request_irq failed with rc -16.
-Warning: unable to open an initial console.

attached:
- dmesg with 4.3 kernel with 1 vcpu
- dmesg with 4.4 kernel with 1 vcpu
- dmesg with 4.4 kernel with 2 vcpus
- .config of the 4.4 kernel is attached.

-- Sander






Re: [Xen-devel] linux 4.4 Regression: 100% cpu usage on idle pv guest under Xen with single vcpu.

2015-12-01 Thread Sander Eikelenboom

On 2015-12-02 00:19, Boris Ostrovsky wrote:

On 12/01/2015 06:00 PM, Sander Eikelenboom wrote:

On 2015-12-01 23:47, Boris Ostrovsky wrote:

On 11/30/2015 05:55 PM, Sander Eikelenboom wrote:

On 2015-11-30 23:54, Boris Ostrovsky wrote:

On 11/30/2015 04:46 PM, Sander Eikelenboom wrote:

On 2015-11-30 22:45, Konrad Rzeszutek Wilk wrote:
On Sat, Nov 28, 2015 at 04:47:43PM +0100, Sander Eikelenboom 
wrote:

Hi all,

I have just tested a 4.4-rc2 kernel (current linus tree) + the 
tip tree

pulled on top.

Running this kernel under Xen on PV-guests with multiple vcpus 
goes well (on

idle < 10% cpu usage),
but a guest with only a single vcpu doesn't idle at all, it 
seems a kworker

thread is stuck:
root   569 98.0  0.0  0 0 ?R 16:02 12:47
[kworker/0:1]

Running a 4.3 kernel works fine with a single vpcu, bisecting 
would probably
quite painful since there were some breakages this merge window 
with respect

to Xen pv-guests.

There are some differences in the diff's from booting a 4.3, 
4.4-single,

4.4-multi cpu boot:


Boris has been tracking a bunch of them. I am attaching the 
latest set of

patches I've to carry on top of v4.4-rc3.


Hi Konrad,

i will test those, see if it fixes all my issues and report back


They shouldn't help you ;-( (and I just saw a message from you 
confirming this)


The first one fixes a 32-bit bug (on bare metal too). The second 
fixes

a fatal bug for 32-bit PV guests. The other two are code
improvements/cleanup.


One of these patches also fixes a bug i was having with a 
pci-passthrough device in
a HVM that wasn't working (depending on which dom0-kernel i was 
using (4.3 or 4.4)),

but didn't report yet.

Fingers crossed but i think this pv-guest single vcpu issue is the 
last i'm troubled by for now ;)


I could not reproduce this, including with your kernel config file.


Hmm that's unpleasant :-\

Hmm other strange thing is it doesn't seem to affect dom0 (which is 
also a PV guest), but only unprivileged ones
All unprivileged pv-guests seem to have the irq issue, but only with a 
single vcpu i see to get the stuck kworker thread that got my 
attention, with a 2 vcpu that doesn't seem to happen, but you still 
get the dmesg output and warnings about hvc)


Could it be that:

arch/x86/include/asm/i8259.h
static inline int nr_legacy_irqs(void)
{
return legacy_pic->nr_legacy_irqs;
}

returns something different in some circumstances ?


It should return 16 pre-8c058b0b9c34d8c8d7912880956543769323e2d8 and 0
after that commit.

This is the last number that you see in
NR_IRQS:4352 nr_irqs:48 0
line.

I think you should be able to safely revert both
b4ff8389ed14b849354b59ce9b360bdefcdbf99c and
8c058b0b9c34d8c8d7912880956543769323e2d8 and see if it makes any
difference.


-boris



That was already underway compiling :)

And it does reveal that reverting both fixes the issue, no stuck kworker 
thread .. and no:
   genirq: Flags mismatch irq 8.  (hvc_console) vs.  
(rtc0)

   hvc_open: request_irq failed with rc -16.

What I did get was a conflict when reverting
b4ff8389ed14b849354b59ce9b360bdefcdbf99c:
arch/arm64/include/asm/irq.h, although that shouldn't matter because we
are on x86 and not on ARM.


--
Sander




-- Sander



-boris




Re: [Xen-devel] [PATCH v2 2/2] libxl: implement virDomainInterfaceStats

2015-12-01 Thread Jim Fehlig
On 11/23/2015 11:57 AM, Joao Martins wrote:
> Introduce support for domainInterfaceStats API call for querying
> network interface statistics. Consequently it also enables the
> use of `virsh domifstat  ` command plus
> seeing the interfaces names instead of "-" when doing
> `virsh domiflist `.
>
> After successful guest creation we fill the network
> interfaces names based on domain, device id and append suffix
> if it's emulated in the following form: vif<domid>.<devid>[-emu].

One interesting Xen behavior that has existed for many, many years is that a PV
nic is implicitly created for each emulated nic specified in the config. The
guest OS picks which one to use. These days most will use the PV nic, and if
they are nice, "unplug" the emulated one via the unplug protocol. E.g. an HVM
guest with

 



  

results in two vif devices on the host

# ip a | grep vif
607: vif519.0-emu:  mtu 1500 qdisc pfifo_fast
master br0 state UNKNOWN group default qlen 500
608: vif519.0:  mtu 1500 qdisc pfifo_fast
master br0 state UNKNOWN group default qlen 512

both are connected to the bridge

# brctl show br0
bridge namebridge idSTP enabled   interfaces
br08000.001e676598f5noeth0
  vif519.0
  vif519.0-emu

In this case, the (not nice) guest OS is using the PV nic but did not unplug the
emulated one. So we have two interfaces, but the virDomainDef only contains one

# virsh domiflist 519
Interface  Type   Source Model   MAC
---
vif519.0-emu bridge br0-   00:16:3e:7a:35:ce

Not a fault of this patch, but we'll need to figure out how to handle the
implicitly created PV nic. The interesting case is identifying emulated nics
that have been unplugged by a nice guest, and hence no longer exist in the host
(e.g. vif519.0-emu in the above example).
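
As a rough illustration of the naming seen in the output above (vif519.0 and vif519.0-emu), here is a minimal sketch of how such a name could be built from a domain id and device id. The helper name and signature are hypothetical and are not part of libvirt or libxl.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper: write "vif<domid>.<devid>" into buf, appending
 * "-emu" for the emulated nic. Returns 0 on success, -1 if the name
 * does not fit in len bytes or formatting fails.
 */
static int format_vif_name_sketch(char *buf, size_t len,
                                  int domid, int devid, bool emulated)
{
    int n = snprintf(buf, len, "vif%d.%d%s",
                     domid, devid, emulated ? "-emu" : "");

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Handling the implicitly created PV nic would presumably mean generating both forms for the same devid and checking which of them actually exists on the host.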

> We extract the network interfaces info from libxl in
> libxlDomainStartCallback() and make ifname . On domain
> cleanup we also clear ifname, in case it was set by libvirt (i.e.
> being prefixed with "vif"). We also skip these two steps in case the name
> of the interface was manually inserted by the administrator.
>
> For getting the interface statistics we resort to virNetInterfaceStats
> and let libvirt handle the platform specific nits. Note that the latter
> is not yet supported in FreeBSD.
>
> Signed-off-by: Joao Martins 
> ---
> Changes since v3:
>  - Use libxl_device_nic_list() for getting each network interface
>  devid in DomainStartCallback.
>  - Improve error reporting by appropriately setting the right error
>  when no interface is known.
>  - Do not unlock vm if libxlDomainObjEndJob() returns false
>  - Set vm->def->net[i]->ifname on DomainStartCallback instead of
>  DomainStart.
>  - Change commit message reflecting the changes on the previous
>  item and mention correct interface names when doing domiflist.
>
> Changes since v2:
>  - Clear ifname if it's autogenerated, since otherwise will persist
>  on successive domain starts. Change commit message reflecting this
>  change.
>
> Changes since v1:
>  - Fill .ifname after domain start with generated
>  name from libxl  based on domain id and devid returned by libxl.
>  After that path validation don interfaceStats is entirely based
>  on ifname pretty much like the other drivers.
>  - Modify commit message reflecting the changes mentioned in
>  the previous item.
>  - Bump version to 1.2.22
> ---
>  src/libxl/libxl_domain.c | 29 +++
>  src/libxl/libxl_driver.c | 52 
> 
>  2 files changed, 81 insertions(+)
>
> diff --git a/src/libxl/libxl_domain.c b/src/libxl/libxl_domain.c
> index a7267b0..141f241 100644
> --- a/src/libxl/libxl_domain.c
> +++ b/src/libxl/libxl_domain.c
> @@ -728,6 +728,17 @@ libxlDomainCleanup(libxlDriverPrivatePtr driver,
>  }
>  }
>  
> +if ((vm->def->nnets)) {
> +ssize_t i;

size_t

> +
> +for (i = 0; i < vm->def->nnets; i++) {
> +virDomainNetDefPtr net = vm->def->nets[i];
> +
> +if (STRPREFIX(net->ifname, "vif"))
> +VIR_FREE(net->ifname);
> +}
> +}
> +
>  if (virAsprintf(&file, "%s/%s.xml", cfg->stateDir, vm->def->name) > 0) {
>  if (unlink(file) < 0 && errno != ENOENT && errno != ENOTDIR)
>  VIR_DEBUG("Failed to remove domain XML for %s", vm->def->name);
> @@ -857,6 +868,8 @@ static void
>  libxlDomainStartCallback(libxl_ctx *ctx, libxl_event *ev, void *for_callback)
>  {
>  virDomainObjPtr vm = for_callback;
> +libxl_device_nic *nics;
> +int nnics;
>  size_t i;
>  
>  virObjectLock(vm);
> @@ -883,6 +896,22 @@ libxlDomainStartCallback(libxl_ctx *ctx, libxl_event 
> *ev, void *for_callback)
>  

Re: [Xen-devel] linux 4.4 Regression: 100% cpu usage on idle pv guest under Xen with single vcpu.

2015-12-01 Thread Boris Ostrovsky

On 11/30/2015 05:55 PM, Sander Eikelenboom wrote:

On 2015-11-30 23:54, Boris Ostrovsky wrote:

On 11/30/2015 04:46 PM, Sander Eikelenboom wrote:

On 2015-11-30 22:45, Konrad Rzeszutek Wilk wrote:

On Sat, Nov 28, 2015 at 04:47:43PM +0100, Sander Eikelenboom wrote:

Hi all,

I have just tested a 4.4-rc2 kernel (current linus tree) + the tip 
tree

pulled on top.

Running this kernel under Xen on PV-guests with multiple vcpus 
goes well (on

idle < 10% cpu usage),
but a guest with only a single vcpu doesn't idle at all, it seems 
a kworker

thread is stuck:
root   569 98.0  0.0  0 0 ?R16:02 12:47
[kworker/0:1]

Running a 4.3 kernel works fine with a single vpcu, bisecting 
would probably
quite painful since there were some breakages this merge window 
with respect

to Xen pv-guests.

There are some differences in the diff's from booting a 4.3, 
4.4-single,

4.4-multi cpu boot:


Boris has been tracking a bunch of them. I am attaching the latest 
set of

patches I've to carry on top of v4.4-rc3.


Hi Konrad,

i will test those, see if it fixes all my issues and report back


They shouldn't help you ;-( (and I just saw a message from you 
confirming this)


The first one fixes a 32-bit bug (on bare metal too). The second fixes
a fatal bug for 32-bit PV guests. The other two are code
improvements/cleanup.


One of these patches also fixes a bug i was having with a 
pci-passthrough device in
a HVM that wasn't working (depending on which dom0-kernel i was using 
(4.3 or 4.4)),

but didn't report yet.

Fingers crossed but i think this pv-guest single vcpu issue is the 
last i'm troubled by for now ;)


I could not reproduce this, including with your kernel config file.

-boris





Re: [Xen-devel] linux 4.4 Regression: 100% cpu usage on idle pv guest under Xen with single vcpu.

2015-12-01 Thread Boris Ostrovsky

On 12/01/2015 05:51 PM, Sander Eikelenboom wrote:

On 2015-11-30 23:54, Boris Ostrovsky wrote:

On 11/30/2015 04:46 PM, Sander Eikelenboom wrote:

On 2015-11-30 22:45, Konrad Rzeszutek Wilk wrote:

On Sat, Nov 28, 2015 at 04:47:43PM +0100, Sander Eikelenboom wrote:

Hi all,

I have just tested a 4.4-rc2 kernel (current linus tree) + the tip 
tree

pulled on top.

Running this kernel under Xen on PV-guests with multiple vcpus 
goes well (on

idle < 10% cpu usage),
but a guest with only a single vcpu doesn't idle at all, it seems 
a kworker

thread is stuck:
root   569 98.0  0.0  0 0 ?R16:02 12:47
[kworker/0:1]

Running a 4.3 kernel works fine with a single vpcu, bisecting 
would probably
quite painful since there were some breakages this merge window 
with respect

to Xen pv-guests.

There are some differences in the diff's from booting a 4.3, 
4.4-single,

4.4-multi cpu boot:


Boris has been tracking a bunch of them. I am attaching the latest 
set of

patches I've to carry on top of v4.4-rc3.


Hi Konrad,

i will test those, see if it fixes all my issues and report back


They shouldn't help you ;-( (and I just saw a message from you 
confirming this)


The first one fixes a 32-bit bug (on bare metal too). The second fixes
a fatal bug for 32-bit PV guests. The other two are code
improvements/cleanup.




Thanks :)

-- Sander


Between 4.3 and 4.4-single:

-NR_IRQS:4352 nr_irqs:32 16
+Using NULL legacy PIC
+NR_IRQS:4352 nr_irqs:32 0


This is fine, as long as you have 
b4ff8389ed14b849354b59ce9b360bdefcdbf99c.




-cpu 0 spinlock event irq 17
+cpu 0 spinlock event irq 1


This is strange. I wouldn't expect spinlocks to use legacy irqs.



Could it be .. that with your fixup:
xen/events: Always allocate legacy interrupts on PV guests
(b4ff8389ed14b849354b59ce9b360bdefcdbf99c)
for commit:
x86/irq: Probe for PIC presence before allocating descs for legacy 
IRQs

(8c058b0b9c34d8c8d7912880956543769323e2d8)

that we now have the situation described in the commit message of 
8c058b0b9c, but now for Xen PV instead of

Hyper-V ?
(seems both Xen and Hyper-V want to achieve the same but have 
different competing implementations ?)


(BTW 8c058b0b9c has a CC for stable ... so could be destined to cause 
more trouble).



You mean my statement that irq 1 looks bad? That was a red herring, it 
should be fine.


-boris




--
Sander




and later on:

-hctosys: unable to open rtc device (rtc0)
+rtc_cmos rtc_cmos: hctosys: unable to read the hardware clock

+genirq: Flags mismatch irq 8.  (hvc_console) vs.  
(rtc0)

+hvc_open: request_irq failed with rc -16.
+Warning: unable to open an initial console.


between 4.4-single and 4.4-multi:

 Using NULL legacy PIC
-NR_IRQS:4352 nr_irqs:32 0
+NR_IRQS:4352 nr_irqs:48 0


This is probably OK too since nr_irqs depend on number of CPUs.

I think something is messed up with IRQ. I saw last week something
from setup_irq() generating a stack dump (warning) for rtc_cmos but
it appeared harmless at that time and now I don't see it anymore.

-boris




and later on:

-rtc_cmos rtc_cmos: hctosys: unable to read the hardware clock
+hctosys: unable to open rtc device (rtc0)

-genirq: Flags mismatch irq 8.  (hvc_console) vs.  
(rtc0)

-hvc_open: request_irq failed with rc -16.
-Warning: unable to open an initial console.

attached:
- dmesg with 4.3 kernel with 1 vcpu
- dmesg with 4.4 kernel with 1 vcpu
- dmesg with 4.4 kernel with 2 vcpus
- .config of the 4.4 kernel is attached.

-- Sander







Re: [Xen-devel] linux 4.4 Regression: 100% cpu usage on idle pv guest under Xen with single vcpu.

2015-12-01 Thread Sander Eikelenboom

On 2015-12-02 00:41, Boris Ostrovsky wrote:

On 12/01/2015 06:30 PM, Sander Eikelenboom wrote:

On 2015-12-02 00:19, Boris Ostrovsky wrote:

On 12/01/2015 06:00 PM, Sander Eikelenboom wrote:

On 2015-12-01 23:47, Boris Ostrovsky wrote:

On 11/30/2015 05:55 PM, Sander Eikelenboom wrote:

On 2015-11-30 23:54, Boris Ostrovsky wrote:

On 11/30/2015 04:46 PM, Sander Eikelenboom wrote:

On 2015-11-30 22:45, Konrad Rzeszutek Wilk wrote:
On Sat, Nov 28, 2015 at 04:47:43PM +0100, Sander Eikelenboom 
wrote:

Hi all,

I have just tested a 4.4-rc2 kernel (current linus tree) + the 
tip tree

pulled on top.

Running this kernel under Xen on PV-guests with multiple vcpus 
goes well (on

idle < 10% cpu usage),
but a guest with only a single vcpu doesn't idle at all, it 
seems a kworker

thread is stuck:
root   569 98.0  0.0  0 0 ?R 16:02 12:47
[kworker/0:1]

Running a 4.3 kernel works fine with a single vpcu, bisecting 
would probably
quite painful since there were some breakages this merge 
window with respect

to Xen pv-guests.

There are some differences in the diff's from booting a 4.3, 
4.4-single,

4.4-multi cpu boot:


Boris has been tracking a bunch of them. I am attaching the 
latest set of

patches I've to carry on top of v4.4-rc3.


Hi Konrad,

i will test those, see if it fixes all my issues and report back


They shouldn't help you ;-( (and I just saw a message from you 
confirming this)


The first one fixes a 32-bit bug (on bare metal too). The second 
fixes

a fatal bug for 32-bit PV guests. The other two are code
improvements/cleanup.


One of these patches also fixes a bug i was having with a 
pci-passthrough device in
a HVM that wasn't working (depending on which dom0-kernel i was 
using (4.3 or 4.4)),

but didn't report yet.

Fingers crossed but i think this pv-guest single vcpu issue is the 
last i'm troubled by for now ;)


I could not reproduce this, including with your kernel config file.


Hmm that's unpleasant :-\

Hmm other strange thing is it doesn't seem to affect dom0 (which is 
also a PV guest), but only unprivileged ones
All unprivileged pv-guests seem to have the irq issue, but only with 
a single vcpu i see to get the stuck kworker thread that got my 
attention, with a 2 vcpu that doesn't seem to happen, but you still 
get the dmesg output and warnings about hvc)


Could it be that:

arch/x86/include/asm/i8259.h
static inline int nr_legacy_irqs(void)
{
return legacy_pic->nr_legacy_irqs;
}

returns something different in some circumstances ?


It should return 16 pre-8c058b0b9c34d8c8d7912880956543769323e2d8 and 
0

after that commit.

This is the last number that you see in
NR_IRQS:4352 nr_irqs:48 0
line.

I think you should be able to safely revert both
b4ff8389ed14b849354b59ce9b360bdefcdbf99c and
8c058b0b9c34d8c8d7912880956543769323e2d8 and see if it makes any
difference.


-boris



That was already underway compiling :)

And it does reveal that reverting both fixes the issue, no stuck 
kworker thread .. and no:
   genirq: Flags mismatch irq 8.  (hvc_console) vs.  
(rtc0)

   hvc_open: request_irq failed with rc -16.



Let me try it again tomorrow. Can you post your guest config file, Xen
version and host HW (Intel or AMD)? 'xl info' maybe?

-boris


Guest config file == dom0 config file == the one I sent you earlier.
Host is an AMD Phenom X6.

# xl info
host   : serveerstertje
release: 4.4.0-rc3-20151201-linus-doflr-boris+
version: #1 SMP Tue Dec 1 19:02:58 CET 2015
machine: x86_64
nr_cpus: 6
max_cpu_id : 5
nr_nodes   : 1
cores_per_socket   : 6
threads_per_core   : 1
cpu_mhz: 3200
hw_caps: 
178bf3ff:efd3fbff::00011300:00802001::37ff:

virt_caps  : hvm hvm_directio
total_memory   : 20479
free_memory: 7745
sharing_freed_memory   : 0
sharing_used_memory: 0
outstanding_claims : 0
free_cpus  : 0
xen_major  : 4
xen_minor  : 7
xen_extra  : -unstable
xen_version: 4.7-unstable
xen_caps   : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 
hvm-3.0-x86_32p hvm-3.0-x86_64

xen_scheduler  : credit
xen_pagesize   : 4096
platform_params: virt_start=0x8000
xen_changeset  : Thu Nov 26 20:58:13 2015 +0100 
git:5252636-dirty
xen_commandline: dom0_mem=1536M,max:1536M loglvl=all 
loglvl_guest=all console_timestamps=datems vga=gfx-1280x1024x32 cpuidle 
cpufreq=xen com1=38400,8n1 console=vga,com1 ivrs_ioapic[6]=00:14.0 
iommu=on,verbose,debug,amd-iommu-debug conring_size=128k ucode=-1

cc_compiler: gcc-4.9.real (Debian 4.9.2-10) 4.9.2
cc_compile_by  : root
cc_compile_domain  : dyndns.org
cc_compile_date: Thu Nov 26 21:18:41 CET 2015
xend_config_format : 4

If you need and can get mor

Re: [Xen-devel] linux 4.4 Regression: 100% cpu usage on idle pv guest under Xen with single vcpu.

2015-12-01 Thread Boris Ostrovsky

On 12/01/2015 06:00 PM, Sander Eikelenboom wrote:

On 2015-12-01 23:47, Boris Ostrovsky wrote:

On 11/30/2015 05:55 PM, Sander Eikelenboom wrote:

On 2015-11-30 23:54, Boris Ostrovsky wrote:

On 11/30/2015 04:46 PM, Sander Eikelenboom wrote:

On 2015-11-30 22:45, Konrad Rzeszutek Wilk wrote:

On Sat, Nov 28, 2015 at 04:47:43PM +0100, Sander Eikelenboom wrote:

Hi all,

I have just tested a 4.4-rc2 kernel (current linus tree) + the 
tip tree

pulled on top.

Running this kernel under Xen on PV-guests with multiple vcpus 
goes well (on

idle < 10% cpu usage),
but a guest with only a single vcpu doesn't idle at all, it 
seems a kworker

thread is stuck:
root   569 98.0  0.0  0 0 ?R 16:02 12:47
[kworker/0:1]

Running a 4.3 kernel works fine with a single vpcu, bisecting 
would probably
quite painful since there were some breakages this merge window 
with respect

to Xen pv-guests.

There are some differences in the diff's from booting a 4.3, 
4.4-single,

4.4-multi cpu boot:


Boris has been tracking a bunch of them. I am attaching the 
latest set of

patches I've to carry on top of v4.4-rc3.


Hi Konrad,

i will test those, see if it fixes all my issues and report back


They shouldn't help you ;-( (and I just saw a message from you 
confirming this)


The first one fixes a 32-bit bug (on bare metal too). The second fixes
a fatal bug for 32-bit PV guests. The other two are code
improvements/cleanup.


One of these patches also fixes a bug i was having with a 
pci-passthrough device in
a HVM that wasn't working (depending on which dom0-kernel i was 
using (4.3 or 4.4)),

but didn't report yet.

Fingers crossed but i think this pv-guest single vcpu issue is the 
last i'm troubled by for now ;)


I could not reproduce this, including with your kernel config file.


Hmm that's unpleasant :-\

Hmm, the other strange thing is that it doesn't seem to affect dom0 (which
is also a PV guest), but only unprivileged ones.
All unprivileged pv-guests seem to have the irq issue, but only with a
single vcpu do I seem to get the stuck kworker thread that got my
attention; with 2 vcpus that doesn't seem to happen, but you still
get the dmesg output and warnings about hvc.


Could it be that:

arch/x86/include/asm/i8259.h
static inline int nr_legacy_irqs(void)
{
return legacy_pic->nr_legacy_irqs;
}

returns something different in some circumstances ?


It should return 16 pre-8c058b0b9c34d8c8d7912880956543769323e2d8 and 0 
after that commit.


This is the last number that you see in
NR_IRQS:4352 nr_irqs:48 0
line.
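
When comparing saved boot logs from several kernels, that last field can be pulled out mechanically. A minimal sketch, assuming the line has exactly the format quoted above; parse_nr_irqs_line is a made-up helper name, not something from the kernel tree:

```c
#include <stdio.h>

/*
 * Hypothetical helper: extract the three numbers from a boot-log line of
 * the form "NR_IRQS:<a> nr_irqs:<b> <c>". Returns 0 on success, -1 if the
 * line does not match that format.
 */
static int parse_nr_irqs_line(const char *line, int *nr_irqs_max,
                              int *nr_irqs, int *last)
{
    if (sscanf(line, "NR_IRQS:%d nr_irqs:%d %d",
               nr_irqs_max, nr_irqs, last) != 3)
        return -1;
    return 0;
}
```

For the line quoted above the third value comes back as 0, which is the number to compare between a 4.3 and a 4.4 boot.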

I think you should be able to safely revert both 
b4ff8389ed14b849354b59ce9b360bdefcdbf99c and 
8c058b0b9c34d8c8d7912880956543769323e2d8 and see if it makes any 
difference.



-boris



--
Sander



-boris


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel




Re: [Xen-devel] linux 4.4 Regression: 100% cpu usage on idle pv guest under Xen with single vcpu.

2015-12-01 Thread Boris Ostrovsky

On 12/01/2015 06:30 PM, Sander Eikelenboom wrote:

On 2015-12-02 00:19, Boris Ostrovsky wrote:

On 12/01/2015 06:00 PM, Sander Eikelenboom wrote:

On 2015-12-01 23:47, Boris Ostrovsky wrote:

On 11/30/2015 05:55 PM, Sander Eikelenboom wrote:

On 2015-11-30 23:54, Boris Ostrovsky wrote:

On 11/30/2015 04:46 PM, Sander Eikelenboom wrote:

On 2015-11-30 22:45, Konrad Rzeszutek Wilk wrote:
On Sat, Nov 28, 2015 at 04:47:43PM +0100, Sander Eikelenboom 
wrote:

Hi all,

I have just tested a 4.4-rc2 kernel (current linus tree) + the tip tree
pulled on top.

Running this kernel under Xen on PV-guests with multiple vcpus goes well
(on idle < 10% cpu usage), but a guest with only a single vcpu doesn't
idle at all; it seems a kworker thread is stuck:
root   569 98.0  0.0  0 0 ?R 16:02 12:47 [kworker/0:1]

Running a 4.3 kernel works fine with a single vcpu. Bisecting would
probably be quite painful, since there were some breakages this merge
window with respect to Xen pv-guests.

There are some differences in the diffs from booting a 4.3, 4.4-single,
4.4-multi cpu boot:


Boris has been tracking a bunch of them. I am attaching the 
latest set of

patches I've to carry on top of v4.4-rc3.


Hi Konrad,

i will test those, see if it fixes all my issues and report back


They shouldn't help you ;-( (and I just saw a message from you 
confirming this)


The first one fixes a 32-bit bug (on bare metal too). The second 
fixes

a fatal bug for 32-bit PV guests. The other two are code
improvements/cleanup.


One of these patches also fixes a bug i was having with a 
pci-passthrough device in
a HVM that wasn't working (depending on which dom0-kernel i was 
using (4.3 or 4.4)),

but didn't report yet.

Fingers crossed but i think this pv-guest single vcpu issue is the 
last i'm troubled by for now ;)


I could not reproduce this, including with your kernel config file.


Hmm that's unpleasant :-\

Hmm, the other strange thing is that it doesn't seem to affect dom0 (which
is also a PV guest), but only unprivileged ones.
All unprivileged pv-guests seem to have the irq issue, but only with a
single vcpu do I seem to get the stuck kworker thread that got my
attention; with 2 vcpus that doesn't seem to happen, but you still
get the dmesg output and warnings about hvc.


Could it be that:

arch/x86/include/asm/i8259.h
static inline int nr_legacy_irqs(void)
{
return legacy_pic->nr_legacy_irqs;
}

returns something different in some circumstances ?


It should return 16 pre-8c058b0b9c34d8c8d7912880956543769323e2d8 and 0
after that commit.

This is the last number that you see in
NR_IRQS:4352 nr_irqs:48 0
line.

I think you should be able to safely revert both
b4ff8389ed14b849354b59ce9b360bdefcdbf99c and
8c058b0b9c34d8c8d7912880956543769323e2d8 and see if it makes any
difference.


-boris



That was already underway compiling :)

And it does reveal that reverting both fixes the issue, no stuck 
kworker thread .. and no:
   genirq: Flags mismatch irq 8.  (hvc_console) vs.  
(rtc0)

   hvc_open: request_irq failed with rc -16.



Let me try it again tomorrow. Can you post your guest config file, Xen 
version and host HW (Intel or AMD)? 'xl info' maybe?


-boris




What I did get was a conflict reverting
b4ff8389ed14b849354b59ce9b360bdefcdbf99c:
arch/arm64/include/asm/irq.h, although that shouldn't matter because
we are on x86 and not on arm.


--
Sander




-- Sander



-boris


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel




Re: [Xen-devel] [PATCH] x86/HVM: XSETBV intercept needs to check CPL on SVM only

2015-12-01 Thread Tian, Kevin
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: Friday, November 27, 2015 7:02 PM
> 
> VMX doesn't need a software CPL check on the XSETBV intercept, and
> SVM can do that check without resorting to hvm_get_segment_register().
> 
> Clean up what is left of hvm_handle_xsetbv(), namely make it return a
> proper error code.
> 
> Signed-off-by: Jan Beulich 

Acked-by: Kevin Tian 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Skylake: VT-d and other error messages

2015-12-01 Thread Eric Shelton
ASRock Z170 Extreme 4 with BIOS version 2.30 was motherboard #1
Gigabyte GA-Z170XP-SLI with BIOS version F5 was motherboard #2

As I initially reported, both were reporting the error involving f0:1f.0

Not surprisingly, there are no f0:xx.y devices.  The closest I suppose is:
00:1f.0 ISA Bridge: Intel Corporation Sunrise Point-H LPC Controller (rev
31)

At the moment, I am back on vanilla 4.6.0, and no such error messages are
being reported via 'xl dmesg'.  I am guessing reporting of this error was
added post-4.6.

Eric


On Tue, Dec 1, 2015 at 9:34 PM, Tian, Kevin  wrote:

> Eric, could you provide your motherboard information and as Andrew pointed
> out what’s f0:1f.0?
>
>
>
> *From:* Andrew Cooper [mailto:andrew.coop...@citrix.com]
> *Sent:* Sunday, November 29, 2015 6:15 AM
> *To:* Eric Shelton; xen-devel; Zhang, Yang Z; Tian, Kevin; Keir Fraser;
> Jan Beulich
> *Subject:* Re: Skylake: VT-d and other error messages
>
>
>
> On 28/11/15 20:46, Eric Shelton wrote:
>
> Looking through the output of 'xl dmesg' on a Skylake system (i5-6600K), I
> found a number of error messages that I do not encounter on a Haswell-based
> system.  I have tried two motherboards from different manufacturers, with
> pretty much the same results.  Below are some of the unexpected messages:
>
>
>
> Not enabling x2APIC (upon firmware request)
>
> ...
>
> mwait-idle: does not run on family 6 model 94
>
> ...
>
> [VT-D] iommu.c:875: iommu_fault_status: Primary Pending Fault
>
> [VT-D] INTR-REMAP: Request device [:f0:1f.0] fault index 0, iommu reg
> = 82c000201000
>
> (on motherboard 1) [VT-D] INTR-REMAP: reason 22 - Present field in the
> IRTE entry is clear
>
> (on motherboard 2) [VT-D] INTR-REMAP: reason 25 - Blocked a compatibility
> format interrupt request
>
>
>
> This leads to a few questions:
>
> 1) Is there some reason x2APIC should not be enabled on Skylake?  What
> consequence, if any, is there not having x2APIC enabled?
>
>
> In this case, the firmware has set the x2apic opt-out bit in the DMAR
> table, indicating that Xen should not use x2apic.
>
> You might find an option in your BIOS to undo this; there have been enough
> errata in the past in this area that I would expect it to be a tweakable.
>
> x2apic is the extension to xapic, which permits more than 255 cpus.  So
> long as you don't have that many, there isn't a specific problem with
> missing x2apic mode.
>
>
> 2) Should mwait-idle be available on Skylake?
>
>
> We probably need to resync the mwait driver with Linux.  It is whitelisted
> on known cpu model numbers.
>
>
> 3) What about the IOMMU errors on Skylake - are they a concern?
>
>
> Yes.
>
> In both cases, PCI device f0:1f.0 is misconfigured or misbehaving.
>
> On motherboard 1, it is delivering an interrupt for which no remapping
> entry has been set up.
>
> On motherboard 2, it is delivering a compatibility-format interrupt, as
> opposed to a remappable-format interrupt.
>
> For motherboard 2, Xen should disallow such a configuration.  Either
> interrupt remapping is enabled and all devices should be configured to
> issue remappable interrupts, or interrupt remapping is disabled and
> everything should be configured to issue compatibility-format interrupts.
>
> Either way, diagnosing the problem here starts with identifying what
> f0:1f.0 is.
>
> ~Andrew
>
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] VT-d faults with Integrated Intel graphics on 4.6

2015-12-01 Thread Wu, Feng


> -Original Message-
> From: xen-devel-boun...@lists.xen.org [mailto:xen-devel-
> boun...@lists.xen.org] On Behalf Of Konrad Rzeszutek Wilk
> Sent: Wednesday, December 2, 2015 12:16 AM
> To: Tian, Kevin; Wu, Feng
> Cc: Konrad Rzeszutek Wilk; Xen-devel; Tamas K Lengyel
> Subject: Re: [Xen-devel] VT-d faults with Integrated Intel graphics on 4.6
> 
> On Thu, Aug 27, 2015 at 07:03:56AM -0400, Konrad Rzeszutek Wilk wrote:
> > On Thu, Aug 27, 2015 at 11:06:30AM +0800, Chen, Tiejun wrote:
> > > On 8/25/2015 10:43 PM, Konrad Rzeszutek Wilk wrote:
> > > >On Tue, Aug 25, 2015 at 02:55:31PM +0800, Chen, Tiejun wrote:
> > > >>On 8/25/2015 8:19 AM, Tamas K Lengyel wrote:
> > > >>>Hi everyone,
> > > >>>I saw some people passingly mention this on the list before but just
> in
> > > >>>case it has been missed, my serial is also being spammed with the
> following
> > > >>>printouts with both Xen 4.6 RC1 and the latest staging build:
> > > >>>
> > > >>>...
> > > >>>(XEN) [VT-D]DMAR:[DMA Read] Request device [:00:02.0] fault
> addr
> > > >>>33487d7000, iommu reg = 82c000201000
> > > >>>(XEN) [VT-D]DMAR: reason 06 - PTE Read access is not set
> > > >>>(XEN) [VT-D]DMAR:[DMA Read] Request device [:00:02.0] fault
> addr
> > > >>>33487d7000, iommu reg = 82c000201000
> > > >>>(XEN) [VT-D]DMAR: reason 06 - PTE Read access is not set
> > > >>>(XEN) [VT-D]DMAR:[DMA Read] Request device [:00:02.0] fault
> addr
> > > >>>33487d7000, iommu reg = 82c000201000
> > > >>>(XEN) [VT-D]DMAR: reason 06 - PTE Read access is not set
> > > >>>(XEN) [VT-D]DMAR:[DMA Read] Request device [:00:02.0] fault
> addr
> > > >>>33487d7000, iommu reg = 82c000201000
> > > >>>(XEN) [VT-D]DMAR: reason 06 - PTE Read access is not set
> > > >>>(XEN) [VT-D]DMAR:[DMA Read] Request device [:00:02.0] fault
> addr
> > > >>>2610742000, iommu reg = 82c000201000
> > > >>>(XEN) [VT-D]DMAR: reason 07 - Next page table ptr is invalid
> > > >>>...
> > > >>>
> > > >>
> > > >>What's your platform? BDW? And how much memory is set to your
> guest OS?
> > > >
> > > >I see this as well. But oddly enough - only when I use the AMT feature
> > > >(normally I just use serial console on the machine).
> > > >
> > > >The platform is /DQ67SW, BIOS
> > > >SWQ6710H.86A.0066.2012.1105.1504 11/05/2012
> > > >
> > > >There is no guest OS - this is initial domain. And I boot with 2GB:
> > > >  Released 0 page(s)
> > > >
> > > >Xen: [mem 0x-0x00099fff] usable
> > > >Xen: [mem 0x0009a800-0x000f] reserved
> > > >Xen: [mem 0x0010-0x1fff] usable
> > > >Xen: [mem 0x2000-0x201f] reserved
> > > >Xen: [mem 0x2020-0x3fff] usable
> > > >Xen: [mem 0x4000-0x401f] reserved
> > > >Xen: [mem 0x4020-0x80465fff] usable
> > > >Xen: [mem 0x80466000-0x9e855fff] unusable
> > > >Xen: [mem 0x9e856000-0x9e85efff] ACPI data
> > > >Xen: [mem 0x9e85f000-0x9e8a9fff] ACPI NVS
> > > >Xen: [mem 0x9e8aa000-0x9e8b1fff] unusable
> > > >Xen: [mem 0x9e8b2000-0x9e9a4fff] reserved
> > > >Xen: [mem 0x9e9a5000-0x9e9a6fff] unusable
> > > >Xen: [mem 0x9e9a7000-0x9ebc5fff] reserved
> > > >Xen: [mem 0x9ebc6000-0x9ebc6fff] unusable
> > > >Xen: [mem 0x9ebc7000-0x9ebd6fff] reserved
> > > >Xen: [mem 0x9ebd7000-0x9ebf4fff] ACPI NVS
> > > >Xen: [mem 0x9ebf5000-0x9ec18fff] reserved
> > > >Xen: [mem 0x9ec19000-0x9ec5bfff] ACPI NVS
> > > >Xen: [mem 0x9ec5c000-0x9ee7bfff] reserved
> > > >Xen: [mem 0x9ee7c000-0x9eff] unusable
> > > >Xen: [mem 0x9f80-0xbf9f] reserved
> > > >Xen: [mem 0xfec0-0xfec00fff] reserved
> > > >Xen: [mem 0xfed1c000-0xfed3] reserved
> > > >Xen: [mem 0xfed9-0xfed91fff] reserved
> > > >Xen: [mem 0xfee0-0xfeef] reserved
> > > >Xen: [mem 0xff00-0x] reserved
> > > >Xen: [mem 0x0001-0x00043e5f] unusable
> > > >
> > > >>
> > > >>Just at first glance at the fault address, this seems to be issued from some
> > >
> > > As you see those fault addresses are out of the normal memory range here.
> > >
> > > >>known errata on BDS and SKL.
> > > >
> > > >I am running v4.2-rc8.
> > >
> > > So I really doubt this is related to some errata. Currently the pre-fetch
> > > unit of the IOMMU unit dedicated to IGD can't work well on some platforms,
> > > so you can see these weird faults.
> >
> > Do you have some ideas for a solution/patch?
> 
> ping?
> 
> Should I use some work-around flags? There does not seem to be any
> corruption on the AMT 

[Xen-devel] [linux-mingo-tip-master test] 65271: regressions - FAIL

2015-12-01 Thread osstest service owner
flight 65271 linux-mingo-tip-master real [real]
http://logs.test-lab.xenproject.org/osstest/logs/65271/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 13 guest-localmigrate 
fail REGR. vs. 60684
 test-amd64-amd64-i386-pvgrub  9 debian-di-install fail REGR. vs. 60684
 test-amd64-i386-rumpuserxen-i386 10 guest-start   fail REGR. vs. 60684

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-libvirt-vhd  9 debian-di-install fail REGR. vs. 60684

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-intel 14 guest-saverestore fail never pass
 test-amd64-amd64-libvirt 12 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-check fail never pass
 test-amd64-i386-libvirt-xsm 12 migrate-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-amd64-qemuu-nested-intel 13 xen-boot/l1 fail never pass
 test-amd64-amd64-qemuu-nested-amd 13 xen-boot/l1 fail never pass
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stop fail never pass
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop fail never pass
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail never pass
 test-amd64-amd64-xl-pvh-amd 11 guest-start fail never pass
 test-amd64-i386-libvirt 12 migrate-support-check fail never pass

version targeted for testing:
 linux                af9dd55d758b34b968f92e7b8d140adc68ac2928
baseline version:
 linux                69f75ebe3b1d1e636c4ce0a0ee248edacc69cbe0

Last test of basis    60684  2015-08-13 04:21:46 Z  110 days
Failing since         60712  2015-08-15 18:33:48 Z  108 days   72 attempts
Testing same since    65271  2015-12-01 05:07:26 Z    0 days    1 attempts

jobs:
 build-amd64-xsm  pass
 build-i386-xsm  pass
 build-amd64  pass
 build-i386  pass
 build-amd64-libvirt  pass
 build-i386-libvirt  pass
 build-amd64-pvops  pass
 build-i386-pvops  pass
 build-amd64-rumpuserxen  pass
 build-i386-rumpuserxen  pass
 test-amd64-amd64-xl  pass
 test-amd64-i386-xl  pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm  pass
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm  pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm  pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm  pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm  pass
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm  fail
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm  pass
 test-amd64-amd64-libvirt-xsm  pass
 test-amd64-i386-libvirt-xsm  pass
 test-amd64-amd64-xl-xsm  pass
 test-amd64-i386-xl-xsm  pass
 test-amd64-amd64-qemuu-nested-amd  fail
 test-amd64-amd64-xl-pvh-amd  fail
 test-amd64-i386-qemut-rhel6hvm-amd  pass
 test-amd64-i386-qemuu-rhel6hvm-amd  pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64  pass
 test-amd64-i386-xl-qemut-debianhvm-amd64  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64  pass
 test-amd64-i386-freebsd10-amd64  pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64  pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass
 test-amd64-amd64-rumpuserxen-amd64  pass
 test-amd64-amd64-xl-qemut-win7-amd64  fail
 test-amd64-i386-xl-qemut-win7-amd64  fail
 test-amd64-amd64-xl-qemuu-win7-amd64  fail
 

Re: [Xen-devel] [PATCH v2] x86/vmx: enable PML by default

2015-12-01 Thread Tian, Kevin
> From: Kai Huang [mailto:kai.hu...@linux.intel.com]
> Sent: Friday, November 27, 2015 4:52 PM
> 
> Since PML series were merged (but disabled by default) we have conducted lots 
> of
> PML tests (live migration, GUI display) and PML has been working fine, 
> therefore
> turn it on by default.
> 
> Document of PML command line is adjusted accordingly as well.
> 
> Signed-off-by: Kai Huang 
> Tested-by: Robert Hu 
> Tested-by: Xudong Hao 

Acked-by: Kevin Tian 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Crash in set_cpu_sibling_map() booting Xen 4.6.0 on Fusion

2015-12-01 Thread Nakajima, Jun
BTW, we use "package ID", rather than "socket ID" in the SDM.
Assignment of the package IDs on a system is a BIOS matter. Basically
BIOS needs to assign package IDs to resolve APIC ID collision at early
boot time, and the convention is up to the vendor/or the specific
system configuration agents. And "contiguous package IDs" are not
required there to make it flexible. In fact, if you look at "Example
8-22. Compute the Number of Packages, Cores, and Processor
Relationships in a MP System", the sample code doesn't assume that
PACKAGE_IDs be continuous.
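
To make the point concrete: counting packages only needs the set of distinct IDs, not a dense 0..N-1 range. The following is a rough sketch of that idea, not the SDM sample code; count_packages and its array-of-IDs input are assumptions made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Count distinct package IDs in ids[0..n). Works for sparse ID sets such
 * as {0, 2, 4, 6}; nothing here assumes the IDs are contiguous.
 * O(n^2), which is fine for a sketch.
 */
static size_t count_packages(const uint32_t *ids, size_t n)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        int seen = 0;

        /* Has this package ID already appeared earlier in the array? */
        for (size_t j = 0; j < i; j++) {
            if (ids[j] == ids[i]) {
                seen = 1;
                break;
            }
        }
        if (!seen)
            count++;
    }
    return count;
}
```

Code that instead sizes a table as "highest ID + 1" or indexes by ID directly is where sparse package IDs start to hurt.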


On Thu, Nov 26, 2015 at 6:11 PM, Chao Peng  wrote:
> On Thu, Nov 26, 2015 at 12:49:42AM -0700, Jan Beulich wrote:
>> >>> On 26.11.15 at 00:27,  wrote:
>> > A few more data points: I also tested Xen 4.6 on VMware ESXi 5.5, and
>> > it yields similar results. Not surprising, since Fusion uses basically
>> > the same virtualization engine.
>> >
>> > However, ESXi offers many more choices of number of processors, number
>> > of cores, hyperthreading, etc. The weird processor ID assignment (0,
>> > 2, 4, 6, ...) occurs only with 4 or 8 processors, 1 core per socket,
>> > and no hyperthreading. If I change any of these parameters, the
>> > processor IDs become sequential.
>> >
>> > It appears in the 4- and 8-processor cases, VMware is emulating
>> > something like a Xeon E7340:
>> > https://github.com/deater/test_proc/blob/master/x86_64/x86_64.intel.6.15.11.
>> > xeon_e7340
>> >
>> > In fact someone asked a question about running Xen on this platform
>> > way back when:
>> > http://lists.xenproject.org/archives/html/xen-users/2008-05/msg00691.html
>> >
>> > Others of similar vintage assign processor IDs 0 and 3 on a
>> > 2-processor system:
>> > https://www.centos.org/forums/viewtopic.php?t=30255
>> >
>> > or even 0 and 6:
>> > http://serverfault.com/questions/302429/interpreting-cpuinfo
>> >
>> > So there are real hardware platforms with non-sequential processor
>> > IDs. They are quite ancient and don't support CAT, but that doesn't
>> > rule out the possibility of a newer or future platform behaving
>> > similarly.
>>
>> Not supporting CAT is not a criteria, since the socket data setup
>> happens unconditionally. However (and as said before), non-
>> sequential processor IDs are fine. Non-sequential socket IDs are
>> what is problematic.
>
> I asked non-sequential socket ID problem internally but I don't know if
> I can get a clear answer in the end, please just stay tuned for a while.
>
> Thanks,
> Chao
>
> ___
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel



-- 
Jun
Intel Open Source Technology Center

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 0/3] VPMU fixes

2015-12-01 Thread Tian, Kevin
> From: Boris Ostrovsky [mailto:boris.ostrov...@oracle.com]
> Sent: Wednesday, December 02, 2015 12:50 AM
> 
> * Limit VPMU support to PMU versions 2, 3 and 4 (emulated at version 3 level)
> * Always implement family 6 VPMU quirk.
>   ==>  Intel folks: is the quirk needed for all family 6 processors or can we
> limit it to certain models?

Let me confirm this information internally. btw could you provide a link
where you find out the original quirk information?

Thanks
Kevin


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Skylake: VT-d and other error messages

2015-12-01 Thread Tian, Kevin
Eric, could you provide your motherboard information and as Andrew pointed out 
what’s f0:1f.0?

From: Andrew Cooper [mailto:andrew.coop...@citrix.com]
Sent: Sunday, November 29, 2015 6:15 AM
To: Eric Shelton; xen-devel; Zhang, Yang Z; Tian, Kevin; Keir Fraser; Jan 
Beulich
Subject: Re: Skylake: VT-d and other error messages

On 28/11/15 20:46, Eric Shelton wrote:
Looking through the output of 'xl dmesg' on a Skylake system (i5-6600K), I 
found a number of error messages that I do not encounter on a Haswell-based 
system.  I have tried two motherboards from different manufacturers, with 
pretty much the same results.  Below are some of the unexpected messages:

Not enabling x2APIC (upon firmware request)
...
mwait-idle: does not run on family 6 model 94
...
[VT-D] iommu.c:875: iommu_fault_status: Primary Pending Fault
[VT-D] INTR-REMAP: Request device [:f0:1f.0] fault index 0, iommu reg = 
82c000201000
(on motherboard 1) [VT-D] INTR-REMAP: reason 22 - Present field in the IRTE 
entry is clear
(on motherboard 2) [VT-D] INTR-REMAP: reason 25 - Blocked a compatibility 
format interrupt request

This leads to a few questions:
1) Is there some reason x2APIC should not be enabled on Skylake?  What 
consequence, if any, is there not having x2APIC enabled?

In this case, the firmware has set the x2apic opt-out bit in the DMAR table, 
indicating that Xen should not use x2apic.

You might find an option in your BIOS to undo this; there have been enough 
errata in the past in this area that I would expect it to be a tweakable.

x2apic is the extension to xapic, which permits more than 255 cpus.  So long as 
you don't have that many, there isn't a specific problem with missing x2apic 
mode.
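
The CPU limit follows from the width of the APIC ID: xAPIC mode uses 8-bit IDs, with 0xFF serving as the broadcast ID, while x2APIC widens the ID to 32 bits. A small sketch of that check, where apic_id_fits_xapic is an illustrative name and not a Xen function:

```c
#include <stdbool.h>
#include <stdint.h>

/* xAPIC mode: APIC IDs are 8 bits wide; 0xFF is the broadcast ID. */
#define XAPIC_BROADCAST_ID 0xFFu

/* Illustrative helper: can this APIC ID be addressed in xAPIC mode? */
static bool apic_id_fits_xapic(uint32_t apic_id)
{
    return apic_id < XAPIC_BROADCAST_ID;
}
```

As long as every CPU's APIC ID passes this check, staying in xAPIC mode costs nothing in terms of addressable CPUs.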


2) Should mwait-idle be available on Skylake?

We probably need to resync the mwait driver with Linux.  It is whitelisted on 
known cpu model numbers.


3) What about the IOMMU errors on Skylake - are they a concern?

Yes.

In both cases, PCI device f0:1f.0 is misconfigured or misbehaving.

On motherboard 1, it is delivering an interrupt for which no remapping entry 
has been set up.

On motherboard 2, it is delivering a compatibility-format interrupt, as
opposed to a remappable-format interrupt.

For motherboard 2, Xen should disallow such a configuration.  Either interrupt
remapping is enabled and all devices should be configured to issue remappable
interrupts, or interrupt remapping is disabled and everything should be
configured to issue compatibility-format interrupts.

Either way, diagnosing the problem here starts with identifying what f0:1f.0 is.

~Andrew
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCHv4] xen/gntdev: add ioctl for grant copy

2015-12-01 Thread David Vrabel
Add IOCTL_GNTDEV_GRANT_COPY to allow applications to copy between user
space buffers and grant references.

This interface is similar to the GNTTABOP_copy hypercall ABI except
the local buffers are provided using a virtual address (instead of a
GFN and offset).  To avoid userspace from having to page align its
buffers the driver will use two or more ops if required.

If the ioctl returns 0, the application must check the status of each
segment with the segments status field.  If the ioctl returns a -ve
error code (EINVAL or EFAULT), the status of individual ops is
undefined.

Signed-off-by: David Vrabel 
---
v4:
- Use readable page for source and writeable for dest.
- Disallow local -> local so at most 1 page is needed per op.
- Increase batch size to 32.
- Clarify comments/docs.

v3:
- Rewrite with different API that matches the capabilities of the
  hypervisor ABI and eliminates some of the size/alignment
  restrictions.
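
Illustration of the "two or more ops" behaviour described in the commit message: a local buffer that is not page aligned is cut at every page boundary, and each piece becomes one op. This is a standalone sketch, not code from the patch; SKETCH_PAGE_SIZE, chunk_len and ops_needed are names invented here, and a 4096-byte page is assumed.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size, for illustration only */

/* Largest chunk starting at addr that stays within one page. */
static size_t chunk_len(uintptr_t addr, size_t remaining)
{
    size_t room = SKETCH_PAGE_SIZE - (addr & (SKETCH_PAGE_SIZE - 1));

    return remaining < room ? remaining : room;
}

/* Number of single-page ops needed to cover [addr, addr + len). */
static unsigned int ops_needed(uintptr_t addr, size_t len)
{
    unsigned int ops = 0;

    while (len > 0) {
        size_t n = chunk_len(addr, len);

        addr += n;
        len -= n;
        ops++;
    }
    return ops;
}
```

A 200-byte copy starting at offset 4000 within a page, for example, splits into a 96-byte op and a 104-byte op.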
---
 drivers/xen/gntdev.c  | 200 ++
 include/uapi/xen/gntdev.h |  50 
 2 files changed, 250 insertions(+)

diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index 1be5dd0..a25f6aa 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -748,6 +748,203 @@ static long gntdev_ioctl_notify(struct gntdev_priv *priv, 
void __user *u)
return rc;
 }
 
+#define GNTDEV_COPY_BATCH 32
+
+struct gntdev_copy_batch {
+   struct gnttab_copy ops[GNTDEV_COPY_BATCH];
+   struct page *pages[GNTDEV_COPY_BATCH];
+   s16 __user *status[GNTDEV_COPY_BATCH];
+   unsigned int nr_ops;
+   unsigned int nr_pages;
+};
+
+static int gntdev_get_page(struct gntdev_copy_batch *batch, void __user *virt,
+  bool writeable, unsigned long *gfn)
+{
+   unsigned long addr = (unsigned long)virt;
+   struct page *page;
+   unsigned long xen_pfn;
+   int ret;
+
+   ret = get_user_pages_fast(addr, 1, writeable, &page);
+   if (ret < 0)
+   return ret;
+
+   batch->pages[batch->nr_pages++] = page;
+
+   xen_pfn = page_to_xen_pfn(page) + XEN_PFN_DOWN(addr & ~PAGE_MASK);
+   *gfn = pfn_to_gfn(xen_pfn);
+
+   return 0;
+}
+
+static void gntdev_put_pages(struct gntdev_copy_batch *batch)
+{
+   unsigned int i;
+
+   for (i = 0; i < batch->nr_pages; i++)
+   put_page(batch->pages[i]);
+   batch->nr_pages = 0;
+}
+
+static int gntdev_copy(struct gntdev_copy_batch *batch)
+{
+   unsigned int i;
+
+   gnttab_batch_copy(batch->ops, batch->nr_ops);
+   gntdev_put_pages(batch);
+
+   /*
+* For each completed op, update the status if the op failed
+* and all previous ops for the segment were successful.
+*/
+   for (i = 0; i < batch->nr_ops; i++) {
+   s16 status = batch->ops[i].status;
+   s16 old_status;
+
+   if (status == GNTST_okay)
+   continue;
+
+   if (__get_user(old_status, batch->status[i]))
+   return -EFAULT;
+
+   if (old_status != GNTST_okay)
+   continue;
+
+   if (__put_user(status, batch->status[i]))
+   return -EFAULT;
+   }
+
+   batch->nr_ops = 0;
+   return 0;
+}
+
+static int gntdev_grant_copy_seg(struct gntdev_copy_batch *batch,
+struct gntdev_grant_copy_segment *seg,
+s16 __user *status)
+{
+   uint16_t copied = 0;
+
+   /*
+* Disallow local -> local copies since there is only space in
+* batch->pages for one page per-op and this would be a very
+* expensive memcpy().
+*/
+   if (!(seg->flags & (GNTCOPY_source_gref | GNTCOPY_dest_gref)))
+   return -EINVAL;
+
+   /* Can't cross page if source/dest is a grant ref. */
+   if (seg->flags & GNTCOPY_source_gref) {
+   if (seg->source.foreign.offset + seg->len > XEN_PAGE_SIZE)
+   return -EINVAL;
+   }
+   if (seg->flags & GNTCOPY_dest_gref) {
+   if (seg->dest.foreign.offset + seg->len > XEN_PAGE_SIZE)
+   return -EINVAL;
+   }
+
+   if (put_user(GNTST_okay, status))
+   return -EFAULT;
+
+   while (copied < seg->len) {
+   struct gnttab_copy *op;
+   void __user *virt;
+   size_t len, off;
+   unsigned long gfn;
+   int ret;
+
+   if (batch->nr_ops >= GNTDEV_COPY_BATCH) {
+   ret = gntdev_copy(batch);
+   if (ret < 0)
+   return ret;
+   }
+
+   len = seg->len - copied;
+
+   op = &batch->ops[batch->nr_ops];
+   op->flags = 0;
+
+   if (seg->flags & GNTCOPY_source_gref) {
+   op->source.u.ref = seg->source.foreign.ref;
+   

[Xen-devel] [PATCH] x86/time: Don't use EFI's GetTime call by default

2015-12-01 Thread Ross Lagerwall
When EFI is used, don't use EFI's GetTime() to get the time, because it
is broken on many platforms. From Linux commit 7efe665903d0 ("rtc:
Disable EFI rtc for x86"):
"Disable it explicitly for x86 so that we don't give users false
hope that this driver will work - it won't, and your machine is likely
to crash."

Signed-off-by: Ross Lagerwall 
---
 xen/arch/x86/time.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
index 30d52c4..28895c2 100644
--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -679,20 +679,28 @@ static void __get_cmos_time(struct rtc_time *rtc)
 rtc->year += 100;
 }
 
+/* EFI's GetTime() is frequently broken so don't use it by default. */
+#undef USE_EFI_GET_TIME
+
 static unsigned long get_cmos_time(void)
 {
-unsigned long res, flags;
+#ifdef USE_EFI_GET_TIME
+unsigned long res;
+#endif
+unsigned long flags;
 struct rtc_time rtc;
 unsigned int seconds = 60;
 static bool_t __read_mostly cmos_rtc_probe;
 boolean_param("cmos-rtc-probe", cmos_rtc_probe);
 
+#ifdef USE_EFI_GET_TIME
 if ( efi_enabled )
 {
 res = efi_get_time();
 if ( res )
 return res;
 }
+#endif
 
 if ( likely(!(acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_CMOS_RTC)) )
 cmos_rtc_probe = 0;
-- 
2.4.3


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] xen-pciback: fix up cleanup path when alloc fails

2015-12-01 Thread Konrad Rzeszutek Wilk
On Thu, Nov 26, 2015 at 02:32:39PM -0600, Doug Goldstein wrote:
> When allocating a pciback device fails, avoid the possibility of a
> use after free.

Reviewed-by: Konrad Rzeszutek Wilk 

Ugh, and it looks like xen-blkfront has the same issue.

> 
> Reported-by: Jonathan Creekmore 
> Signed-off-by: Doug Goldstein 
> ---
>  drivers/xen/xen-pciback/xenbus.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/xen/xen-pciback/xenbus.c 
> b/drivers/xen/xen-pciback/xenbus.c
> index 98bc345..4843741 100644
> --- a/drivers/xen/xen-pciback/xenbus.c
> +++ b/drivers/xen/xen-pciback/xenbus.c
> @@ -44,7 +44,6 @@ static struct xen_pcibk_device *alloc_pdev(struct 
> xenbus_device *xdev)
>   dev_dbg(&xdev->dev, "allocated pdev @ 0x%p\n", pdev);
>  
>   pdev->xdev = xdev;
> - dev_set_drvdata(&xdev->dev, pdev);
>  
>   mutex_init(&pdev->dev_lock);
>  
> @@ -58,6 +57,9 @@ static struct xen_pcibk_device *alloc_pdev(struct 
> xenbus_device *xdev)
>   kfree(pdev);
>   pdev = NULL;
>   }
> +
> + dev_set_drvdata(&xdev->dev, pdev);
> +
>  out:
>   return pdev;
>  }
> -- 
> 2.4.10
> 
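
The pattern the patch applies, publishing the pointer only after every step that can fail, can be modelled in userspace roughly as below. The sketch_* types and the init_ok parameter are stand-ins invented for this illustration; the real code uses struct xenbus_device, struct xen_pcibk_device and dev_set_drvdata().

```c
#include <stdlib.h>

/* Minimal stand-ins for the kernel types; names are illustrative. */
struct sketch_device { void *drvdata; };
struct sketch_pdev   { struct sketch_device *xdev; };

/*
 * Allocate a pdev and publish it in dev->drvdata only after every
 * fallible step has finished. init_ok simulates the step that may fail.
 */
static struct sketch_pdev *sketch_alloc_pdev(struct sketch_device *dev,
                                             int init_ok)
{
    struct sketch_pdev *pdev = calloc(1, sizeof(*pdev));

    if (pdev == NULL)
        goto out;

    pdev->xdev = dev;

    if (!init_ok) {
        free(pdev);
        pdev = NULL;
    }

    /* Publish last: on failure this stores NULL, never a freed pointer. */
    dev->drvdata = pdev;
out:
    return pdev;
}
```

With the store done before the failure path, as in the old code, drvdata would keep pointing at freed memory after a failed init.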

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2] libxl: Introduce a template for devices with a controller

2015-12-01 Thread George Dunlap
On Tue, Dec 1, 2015 at 3:58 PM, Wei Liu  wrote:
> On Tue, Dec 01, 2015 at 12:09:58PM +, George Dunlap wrote:
> [...]
>> diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
>> index 6b73848..44e2951 100644
>> --- a/tools/libxl/libxl.h
>> +++ b/tools/libxl/libxl.h
>> @@ -1396,6 +1396,71 @@ void libxl_vtpminfo_list_free(libxl_vtpminfo *, int 
>> nr_vtpms);
>>   *
>>   *   This function does not interact with the guest and therefore
>>   *   cannot block on the guest.
>> + *
>> + * Controllers
>> + * ---
>> + *
>> + * Most devices are treated individually.  Some classes of device,
>> + * however, like USB or SCSI, inherently have the need to have a
>> + * hierarchy of different levels, with lower-level devices "attached"
>> + * to higher-level ones.  USB for instance has "controllers" at the
>> + * top, which have buses, on which are devices, which consist of
>> + * multiple interfaces.  SCSI has "hosts" at the top, then buses,
>> + * targets, and LUNs.
>> + *
>> + * In that case, for each <class>, there will be a set of functions
>> + * and types for each <level>.  For example, for <class>=usb, there
>> + * may be <levels> ctrl (controller) and dev (device), with ctrl being
>> + * level 0.
>> + *
>> + * libxl_device_<class><level0>_<function> will act more or
>
> Missed "level0" comment from Chunyan?

The only comment of Chunyan's I could find that has  in it is
actually correcting  => .  Did I
misunderstand, or did you? :-)

 -George



Re: [Xen-devel] [PATCH] x86/time: Don't use EFI's GetTime call by default

2015-12-01 Thread Jan Beulich
>>> On 01.12.15 at 17:57,  wrote:
> When EFI is used, don't use EFI's GetTime() to get the time, because it
> is broken on many platforms. From Linux commit 7efe665903d0 ("rtc:
> Disable EFI rtc for x86"):
> "Disable it explicitly for x86 so that we don't give users false
> hope that this driver will work - it won't, and your machine is likely
> to crash."
> 
> Signed-off-by: Ross Lagerwall 

NAK, since being conceptually wrong (and both of my systems work
fine). Vendors should get their firmware fixed, and by not using
runtime service functions we would give them even less reason to
do so. Until then we have "efi=no-rs".

Jan




Re: [Xen-devel] [PATCH v2] libxl: Introduce a template for devices with a controller

2015-12-01 Thread Wei Liu
On Tue, Dec 01, 2015 at 05:03:28PM +, George Dunlap wrote:
> On Tue, Dec 1, 2015 at 3:58 PM, Wei Liu  wrote:
> > On Tue, Dec 01, 2015 at 12:09:58PM +, George Dunlap wrote:
> > [...]
> >> diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
> >> index 6b73848..44e2951 100644
> >> --- a/tools/libxl/libxl.h
> >> +++ b/tools/libxl/libxl.h
> >> @@ -1396,6 +1396,71 @@ void libxl_vtpminfo_list_free(libxl_vtpminfo *, int 
> >> nr_vtpms);
> >>   *
> >>   *   This function does not interact with the guest and therefore
> >>   *   cannot block on the guest.
> >> + *
> >> + * Controllers
> >> + * ---
> >> + *
> >> + * Most devices are treated individually.  Some classes of device,
> >> + * however, like USB or SCSI, inherently have the need to have a
> >> + * hierarchy of different levels, with lower-level devices "attached"
> >> + * to higher-level ones.  USB for instance has "controllers" at the
> >> + * top, which have buses, on which are devices, which consist of
> >> + * multiple interfaces.  SCSI has "hosts" at the top, then buses,
> >> + * targets, and LUNs.
> >> + *
> >> + * In that case, for each <class>, there will be a set of functions
> >> + * and types for each <level>.  For example, for <class>=usb, there
> >> + * may be <levels> ctrl (controller) and dev (device), with ctrl being
> >> + * level 0.
> >> + *
> >> + * libxl_device_<class><level0>_<function> will act more or
> >
> > Missed "level0" comment from Chunyan?
> 
> The only comment of Chunyan's I could find that has  in it is
> actually correcting  => .  Did I
> misunderstand, or did you? :-)

Oops. I misread. Sorry about the noise.

Wei.

> 
>  -George



[Xen-devel] [PATCH v2 0/3] VPMU fixes

2015-12-01 Thread Boris Ostrovsky
* Limit VPMU support to PMU versions 2, 3 and 4 (emulated at version 3 level)
* Always implement family 6 VPMU quirk.
  ==>  Intel folks: is the quirk needed for all family 6 processors or can we
limit it to certain models?
* Update (or rather restore) arch VPMU files maintainership

Boris Ostrovsky (3):
  x86/VPMU: Support only versions 2 through 4 of architectural
performance monitoring
  x86/VPMU: No need to check whether VPMU quirk is needed on Intel
  MAINTAINERS: Restore original maintainership of arch VPMU files

 MAINTAINERS   |  2 +
 xen/arch/x86/cpu/vpmu_intel.c | 91 ---
 2 files changed, 26 insertions(+), 67 deletions(-)

-- 
1.8.1.4




[Xen-devel] [PATCH v2 3/3] MAINTAINERS: Restore original maintainership of arch VPMU files

2015-12-01 Thread Boris Ostrovsky
It was lost when vpmu* files were moved from xen/arch/x86/hvm/{vmx|svm}/ to
xen/arch/x86/cpu/

Signed-off-by: Boris Ostrovsky 
Cc: ian.campb...@citrix.com
Cc: ian.jack...@eu.citrix.com
Cc: t...@xen.org
Cc: suravee.suthikulpa...@amd.com
Cc: aravind.gopalakrish...@amd.com
---
 MAINTAINERS | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index e376646..87bc106 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -117,6 +117,7 @@ M:  Suravee Suthikulpanit 
 M: Aravind Gopalakrishnan 
 S: Supported
 F: xen/arch/x86/hvm/svm/
+F:  xen/arch/x86/cpu/vpmu_amd.c
 
 ARINC653 SCHEDULER
 M: Josh Whitehead 
@@ -205,6 +206,7 @@ S:  Supported
 F: xen/arch/x86/hvm/vmx/
 F: xen/arch/x86/mm/p2m-ept.c
 F: xen/include/asm-x86/hvm/vmx/
+F:  xen/arch/x86/cpu/vpmu_intel.c
 
 IOMMU VENDOR INDEPENDENT CODE
 M: Jan Beulich 
-- 
1.8.1.4




[Xen-devel] [PATCH v2 1/3] x86/VPMU: Support only versions 2 through 4 of architectural performance monitoring

2015-12-01 Thread Boris Ostrovsky
We need to have at least version 2 since it's the first version to
support various control and status registers (such as
MSR_CORE_PERF_GLOBAL_CTRL) that VPMU relies on always having.

Since we don't fully support version 4 yet report it as version 3 in
CPUID.

With explicit testing for PMU version we can now remove CPUID model
check.

Signed-off-by: Boris Ostrovsky 
---

v2:
* Support PMU version 4 (emulated at version 3 level)
* Minor code adjustments


 xen/arch/x86/cpu/vpmu_intel.c | 73 ++-
 1 file changed, 23 insertions(+), 50 deletions(-)

diff --git a/xen/arch/x86/cpu/vpmu_intel.c b/xen/arch/x86/cpu/vpmu_intel.c
index d5ea7fe..a267a3c 100644
--- a/xen/arch/x86/cpu/vpmu_intel.c
+++ b/xen/arch/x86/cpu/vpmu_intel.c
@@ -733,11 +733,11 @@ static void core2_vpmu_do_cpuid(unsigned int input,
 unsigned int *eax, unsigned int *ebx,
 unsigned int *ecx, unsigned int *edx)
 {
-if (input == 0x1)
+switch ( input )
 {
-struct vpmu_struct *vpmu = vcpu_vpmu(current);
+case 0x1:
 
-if ( vpmu_is_set(vpmu, VPMU_CPU_HAS_DS) )
+if ( vpmu_is_set(vcpu_vpmu(current), VPMU_CPU_HAS_DS) )
 {
 /* Switch on the 'Debug Store' feature in CPUID.EAX[1]:EDX[21] */
 *edx |= cpufeat_mask(X86_FEATURE_DS);
@@ -746,6 +746,13 @@ static void core2_vpmu_do_cpuid(unsigned int input,
 if ( cpu_has(&current_cpu_data, X86_FEATURE_DSCPL) )
 *ecx |= cpufeat_mask(X86_FEATURE_DSCPL);
 }
+break;
+
+case 0xa:
+/* Since we don't fully emulate version 4 report version 3 */
+if ( MASK_EXTR(*eax, PMU_VERSION_MASK) == 4 )
+*eax = (*eax & ~PMU_VERSION_MASK) | MASK_INSR(3, PMU_VERSION_MASK);
+break;
 }
 }
 
@@ -955,59 +962,25 @@ int vmx_vpmu_initialise(struct vcpu *v)
 int __init core2_vpmu_init(void)
 {
 u64 caps;
+unsigned int version = 0;
 
-if ( current_cpu_data.x86 != 6 )
+if ( current_cpu_data.cpuid_level >= 0xa )
+version = MASK_EXTR(cpuid_eax(0xa), PMU_VERSION_MASK);
+
+if ( version == 4 )
+printk(XENLOG_INFO "VPMU: PMU version 4 is not fully supported. "
+   "Emulating version 3\n");
+else if ( (version != 2) && (version != 3) )
 {
-printk(XENLOG_WARNING "VPMU: only family 6 is supported\n");
+printk(XENLOG_WARNING "VPMU: PMU version %u is not supported\n",
+   version);
 return -EINVAL;
 }
 
-switch ( current_cpu_data.x86_model )
+if ( current_cpu_data.x86 != 6 )
 {
-/* Core2: */
-case 0x0f: /* original 65 nm celeron/pentium/core2/xeon, "Merom"/"Conroe" */
-case 0x16: /* single-core 65 nm celeron/core2solo "Merom-L"/"Conroe-L" */
-case 0x17: /* 45 nm celeron/core2/xeon "Penryn"/"Wolfdale" */
-case 0x1d: /* six-core 45 nm xeon "Dunnington" */
-
-case 0x2a: /* SandyBridge */
-case 0x2d: /* SandyBridge, "Romley-EP" */
-
-/* Nehalem: */
-case 0x1a: /* 45 nm nehalem, "Bloomfield" */
-case 0x1e: /* 45 nm nehalem, "Lynnfield", "Clarksfield", "Jasper Forest" */
-case 0x2e: /* 45 nm nehalem-ex, "Beckton" */
-
-/* Westmere: */
-case 0x25: /* 32 nm nehalem, "Clarkdale", "Arrandale" */
-case 0x2c: /* 32 nm nehalem, "Gulftown", "Westmere-EP" */
-case 0x2f: /* 32 nm Westmere-EX */
-
-case 0x3a: /* IvyBridge */
-case 0x3e: /* IvyBridge EP */
-
-/* Haswell: */
-case 0x3c:
-case 0x3f:
-case 0x45:
-case 0x46:
-
-/* Broadwell */
-case 0x3d:
-case 0x4f:
-case 0x56:
-
-/* future: */
-case 0x4e:
-
-/* next gen Xeon Phi */
-case 0x57:
-break;
-
-default:
-printk(XENLOG_WARNING "VPMU: Unsupported CPU model %#x\n",
-   current_cpu_data.x86_model);
-return -EINVAL;
+printk(XENLOG_WARNING "VPMU: only family 6 is supported\n");
+return -EINVAL;
 }
 
 arch_pmc_cnt = core2_get_arch_pmc_count();
-- 
1.8.1.4
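
The version handling in this patch can be sketched outside of Xen as below. This is an illustration under stated assumptions, not the hypervisor code: FIELD_EXTR()/FIELD_INSR() are local helpers written in the spirit of MASK_EXTR()/MASK_INSR(), PMU_VERSION_MASK_SKETCH assumes the version ID sits in bits 7:0 of CPUID leaf 0xa EAX, and both function names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the PMU version ID occupies bits 7:0 of CPUID.0xA:EAX. */
#define PMU_VERSION_MASK_SKETCH 0xffu

/* Local field helpers: (m & -m) isolates the lowest set bit of the mask. */
#define FIELD_EXTR(v, m) (((v) & (m)) / ((m) & -(m)))
#define FIELD_INSR(v, m) (((v) * ((m) & -(m))) & (m))

/* Report version 4 as version 3; leave every other value untouched. */
uint32_t clamp_pmu_version(uint32_t eax)
{
    if (FIELD_EXTR(eax, PMU_VERSION_MASK_SKETCH) == 4)
        eax = (eax & ~PMU_VERSION_MASK_SKETCH) |
              FIELD_INSR(3u, PMU_VERSION_MASK_SKETCH);
    return eax;
}

/* Accept versions 2 through 4, mirroring the init-time check. */
int pmu_version_supported(unsigned int version)
{
    return version >= 2 && version <= 4;
}
```

Only the version field is rewritten; the remaining bits of EAX pass through unchanged.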




[Xen-devel] [PATCH v2 2/3] x86/VPMU: No need to check whether VPMU quirk is needed on Intel

2015-12-01 Thread Boris Ostrovsky
We only support family 6 so quirk handling is always needed.

Signed-off-by: Boris Ostrovsky 
---
 xen/arch/x86/cpu/vpmu_intel.c | 18 +-
 1 file changed, 1 insertion(+), 17 deletions(-)

diff --git a/xen/arch/x86/cpu/vpmu_intel.c b/xen/arch/x86/cpu/vpmu_intel.c
index a267a3c..86d390f 100644
--- a/xen/arch/x86/cpu/vpmu_intel.c
+++ b/xen/arch/x86/cpu/vpmu_intel.c
@@ -106,24 +106,11 @@ static const unsigned int regs_off =
  * 1 (or another value != 0) into it.
  * There exist no errata and the real cause of this behaviour is unknown.
  */
-bool_t __read_mostly is_pmc_quirk;
-
-static void check_pmc_quirk(void)
-{
-if ( current_cpu_data.x86 == 6 )
-is_pmc_quirk = 1;
-else
-is_pmc_quirk = 0;
-}
-
 static void handle_pmc_quirk(u64 msr_content)
 {
 int i;
 u64 val;
 
-if ( !is_pmc_quirk )
-return;
-
 val = msr_content;
 for ( i = 0; i < arch_pmc_cnt; i++ )
 {
@@ -812,8 +799,7 @@ static int core2_vpmu_do_interrupt(struct cpu_user_regs *regs)
 rdmsrl(MSR_CORE_PERF_GLOBAL_STATUS, msr_content);
 if ( msr_content )
 {
-if ( is_pmc_quirk )
-handle_pmc_quirk(msr_content);
+handle_pmc_quirk(msr_content);
 core2_vpmu_cxt->global_status |= msr_content;
 msr_content = 0xC000000700000000 | ((1 << arch_pmc_cnt) - 1);
 wrmsrl(MSR_CORE_PERF_GLOBAL_OVF_CTRL, msr_content);
@@ -998,8 +984,6 @@ int __init core2_vpmu_init(void)
   sizeof(uint64_t) * fixed_pmc_cnt +
   sizeof(struct xen_pmu_cntr_pair) * arch_pmc_cnt;
 
-check_pmc_quirk();
-
 if ( sizeof(struct xen_pmu_data) + sizeof(uint64_t) * fixed_pmc_cnt +
  sizeof(struct xen_pmu_cntr_pair) * arch_pmc_cnt > PAGE_SIZE )
 {
-- 
1.8.1.4




Re: [Xen-devel] [PATCH] iommu/quirk: disable shared EPT for Sandybridge and earlier processors.

2015-12-01 Thread Anshul Makkar
Based on the discussion below, can I assume there is agreement on using the
processor model for filtering, or will the chipset ID be the preferred candidate?

Thanks
Anshul Makkar

-Original Message-
From: Tian, Kevin [mailto:kevin.t...@intel.com] 
Sent: 26 November 2015 07:17
To: Malcolm Crossley ; Jan Beulich 
; Andrew Cooper ; Anshul Makkar 

Cc: Zhang, Yang Z ; xen-devel@lists.xen.org
Subject: RE: [Xen-devel] [PATCH] iommu/quirk: disable shared EPT for 
Sandybridge and earlier processors.

> From: Malcolm Crossley [mailto:malcolm.cross...@citrix.com]
> Sent: Wednesday, November 25, 2015 11:59 PM
> 
> On 25/11/15 15:38, Jan Beulich wrote:
>  On 25.11.15 at 16:13,  wrote:
> >> On 25/11/15 10:49, Jan Beulich wrote:
> >> On 25.11.15 at 11:28,  wrote:
>  On 24/11/15 17:41, Jan Beulich wrote:
>  On 24.11.15 at 18:17,  wrote:
> >> --- a/xen/drivers/passthrough/vtd/quirks.c
> >> +++ b/xen/drivers/passthrough/vtd/quirks.c
> >> @@ -320,6 +320,20 @@ void __init platform_quirks_init(void)
> >>  /* Tylersburg interrupt remap quirk */
> >>  if ( iommu_intremap )
> >>  tylersburg_intremap_quirk();
> >> +
> >> +/*
> >> + * Disable shared EPT ("sharept") on Sandybridge and older processors
> >> + * by default.
> >> + * SandyBridge has no huge page support for IOTLB which leads to fallback
> >> + * on 4k pages and leads to performance degradation.
> >> + *
> >> + * Shared EPT ("sharept") will be disabled only if user has not
> >> + * provided explicit choice on the command line thus iommu_hap_pt_share is
> >> + * at its initialized value of -1.
> >> + */
> >> +if ( (boot_cpu_data.x86 == 0x06 && (boot_cpu_data.x86_model <= 0x2F ||
> >> +  boot_cpu_data.x86_model == 0x36)) && (iommu_hap_pt_share == -1) )
> >> +iommu_hap_pt_share = 0;
> > If we really want to do this, then I think we should key this on 
> > EPT but not VT-d having 2M support, instead of on CPU models.
>  This check is already performed by vtd_ept_page_compatible()
> >>> Yeah, I realized there would be such a check on the way home.
> >>>
>  The problem is that SandyBridge IOMMUs advertise 2M support and 
>  do function with it, but cannot cache 2MB translations in the IOTLBs.
> 
>  As a result, attempting to use 2M translations causes 
>  substantially worse performance than 4K translations.
> >>> So commit message and comment should make this more explicit, to 
> >>> avoid the impression "IOTLB" isn't just the relatively common 
> >>> mis-naming of "IOMMU".
> >>>
> >>> Plus I guess the sharing won't need suppressing if !opt_hap_2mb?
> >>>
> >>> Further the model based check is relatively broad, and includes 
> >>> Atoms (0x36 actually is one), which can't be considered 
> >>> "Sandybridge or older" imo.
> >>>
> >>> And finally I'm not fully convinced using CPU model info to deduce 
> >>> chipset behavior is entirely correct (albeit perhaps in practice 
> >>> it'll be fine except maybe when running Xen itself virtualized).
> >>
> >> What else would you suggest? I can't think of any better 
> >> identifying information.
> >
> > Chipset IDs / revisions?
> 
> In this case the IOMMU is integrated into the Sandybridge-EP processor itself.
> Unfortunately there's no register to query the IOTLB configuration of 
> the IOMMU and so we're stuck identifying it via the processor model number
> itself.
> 
> Malcolm
> 

I'm OK to use processor model here, though ideally Jan is right. :-)

Thanks
Kevin
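
For reference, the model-based condition from the quoted patch can be written as a standalone predicate. This is a sketch of that one check only; the function name and parameters are hypothetical, and it does not implement the PCI-ID-based alternative raised in the thread.

```c
#include <assert.h>

/*
 * Returns 1 when shared EPT would be switched off by default:
 * family 6, model <= 0x2F or == 0x36, and no explicit user choice
 * (hap_pt_share still at its initial value of -1).
 */
int should_disable_shared_ept(unsigned int family, unsigned int model,
                              int hap_pt_share)
{
    int matched_model = (family == 0x06) &&
                        (model <= 0x2F || model == 0x36);

    return matched_model && (hap_pt_share == -1);
}
```

Written this way, the breadth Jan objects to is easy to see: every family 6 model up to 0x2F matches, as does 0x36, which he identifies as an Atom.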



Re: [Xen-devel] Hotplugged devices in Xen 4.5 and domain reboot

2015-12-01 Thread Ian Campbell
On Tue, 2015-12-01 at 18:48 +0200, Iurii Mykhalskyi wrote:
> Thanks to all for the replies, please see my answers below:
> 
> On 12/01/2015 05:29 PM, Wei Liu wrote:
> > On Tue, Dec 01, 2015 at 04:58:55PM +0200, Iurii Mykhalskyi wrote:
> > > Our real usb mass-storage device are located at driver domain (DomD).
> > > So we
> > > setup second block-device backend there.
> > > 
> > > To hotplug usb mass-storage from DomD we use follow command:
> > > 
> > > xl block-attach domU_id phy:/bla-bla,xvda10,w,backend="DomD"
> > > 
> > What happens if you run this in Dom0? I guess DomD doesn't respond to
> > the request?
>     Yes, there is no responded from domD, because actual storage device
> are located there, and toolstack stuck on real device existence check.

This check is a toolstack bug which should be fixed. We've squashed some of
them at various points, but I'd not be surprised if others remain.

> > > There was no support of attaching block-device in runtime from domain
> > > other
> > > to Domain-0, so we have made some hacks to allow call block-attach
> > > command
> > > from non-dom0 privileged domain.
> > So this is a special use case. This is the first time I know people
> > actually run xl block-attach in driver domain.
>     Yes, this is special case and we this by our solution design.
> > > One of patches was - don't update
> > > /var/lib/xen/userdata-d.$DOMID-$UUID.libxl-json during execution of
> > > this
> > > command (because this log located on dom0 rootfs and we don't have
> > > any
> > > access to it from DomD). So, there is no different in configs before
> > > and
> > > after hotplug.
> > > 
> > The state of $DOMID is recorded in libxl-json file. No wonder you lose
> > all state.
> > 
> > But even if you write those states, they are going to be inside driver
> > domain.  There is no way at the moment to synthesise the state inside
> > Dom0 and DomD into one. There is also difficulty in how you can split
> > the synthesised and dispatch the states to multiple entities again when
> > rebuilding a domain.
> > 
> > So I think having multiple entities managing state of one single domain
> > is bad. I think the proper way of making it work is to make hotplug
> > device from domain other than Dom0 work.
> > 
> > There is a daemon "xl devd" in driver domain. We might be able to teach
> > it to response to Dom0 toostack request. I'm a bit surprised if it
> > doesn't do that already. Did you forget to start that daemon?
>     We can't run devd in driver domain, because it failed on connect to
> xenstored socket (/var/run/xenstored/socket - we have xenstored running
> only in dom0).   

devd _should_ be able to talk to xenstored over the kernel provided
interface to the shared ring rather than the local socket.

It is certainly not expected that devd be colocated in the same domain as
happens to be running xenstored.

If this is not working then there is another bug.

> > In general a driver domain would not be expected to have sufficient
> > privilege over e.g. a guest domain's /local/domain/domU/devices to
> > create
> > the f.e. dirs.

>     In our solution we have to create 2 full privileged domains - Dom0 and 
> DomD, so we need 2 toolstack domains.
>    Any special privileges hack wasn't done - we need just to setup additional 
> permissions for DomD.

I'm afraid you simply cannot have 2 toolstack domains. The toolstack is a
singleton entity in a Xen system.

If you want to run toolstack operations from a non-toolstack domain then
you will need to arrange for some (likely out-of-band) mechanism for such
domains to ask the single toolstack domain to do something on their behalf.

Ian.



[Xen-devel] [PATCH 0/2] libxc: domain builder related enhancements

2015-12-01 Thread Juergen Gross
Add some more checks for having got allocated memory in the domain
builder. Related to this add an INVALID_PFN macro to be able to do
these checks.

Juergen Gross (2):
  libxc: replace INVALID_P2M_ENTRY by INVALID_PFN
  libxc: do proper return code checking of allocator in domain builder

 tools/libxc/include/xc_dom.h  |  2 +-
 tools/libxc/xc_compression.c  | 10 +-
 tools/libxc/xc_core.c |  4 ++--
 tools/libxc/xc_dom_arm.c  |  2 +-
 tools/libxc/xc_dom_core.c | 11 ---
 tools/libxc/xc_dom_x86.c  | 12 +++-
 tools/libxc/xc_offline_page.c |  2 +-
 7 files changed, 29 insertions(+), 14 deletions(-)

-- 
2.6.2




[Xen-devel] [PATCH 1/2] libxc: replace INVALID_P2M_ENTRY by INVALID_PFN

2015-12-01 Thread Juergen Gross
INVALID_P2M_ENTRY is defined as (xen_pfn_t)-1 and is often used
according to its type for an invalid pfn. Change the name of the
macro to INVALID_PFN.

Signed-off-by: Juergen Gross 
---
 tools/libxc/include/xc_dom.h  |  2 +-
 tools/libxc/xc_compression.c  | 10 +-
 tools/libxc/xc_core.c |  4 ++--
 tools/libxc/xc_dom_arm.c  |  2 +-
 tools/libxc/xc_dom_core.c |  4 ++--
 tools/libxc/xc_dom_x86.c  |  2 +-
 tools/libxc/xc_offline_page.c |  2 +-
 7 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h
index 43a65ee..3c94b57 100644
--- a/tools/libxc/include/xc_dom.h
+++ b/tools/libxc/include/xc_dom.h
@@ -19,7 +19,7 @@
 #include 
 #include 
 
-#define INVALID_P2M_ENTRY   ((xen_pfn_t)-1)
+#define INVALID_PFN ((xen_pfn_t)-1)
 
 /* --- typedefs and structs  */
 
diff --git a/tools/libxc/xc_compression.c b/tools/libxc/xc_compression.c
index b1b16e8..89c1114 100644
--- a/tools/libxc/xc_compression.c
+++ b/tools/libxc/xc_compression.c
@@ -217,7 +217,7 @@ char *get_cache_page(comp_ctx *ctx, xen_pfn_t pfn,
 
 /* If the list is full, evict a page from the tail end. */
 item = ctx->page_list_tail;
-if (item->pfn != INVALID_P2M_ENTRY)
+if (item->pfn != INVALID_PFN)
 ctx->pfn2cache[item->pfn] = NULL;
 
 item->pfn = pfn;
@@ -278,7 +278,7 @@ void invalidate_cache_page(comp_ctx *ctx, xen_pfn_t pfn)
 ctx->page_list_tail = item;
 }
 ctx->pfn2cache[pfn] = NULL;
-(ctx->page_list_tail)->pfn = INVALID_P2M_ENTRY;
+(ctx->page_list_tail)->pfn = INVALID_PFN;
 }
 }
 
@@ -295,7 +295,7 @@ int xc_compression_add_page(xc_interface *xch, comp_ctx *ctx,
 /* pagetable page */
 if (israw)
 invalidate_cache_page(ctx, pfn);
-ctx->sendbuf_pfns[ctx->pfns_len] = israw ? INVALID_P2M_ENTRY : pfn;
+ctx->sendbuf_pfns[ctx->pfns_len] = israw ? INVALID_PFN : pfn;
 memcpy(ctx->inputbuf + ctx->pfns_len * XC_PAGE_SIZE, page, XC_PAGE_SIZE);
 ctx->pfns_len++;
 
@@ -329,7 +329,7 @@ int xc_compression_compress_pages(xc_interface *xch, comp_ctx *ctx,
 cache_copy = NULL;
 current_page = ctx->inputbuf + ctx->pfns_index * XC_PAGE_SIZE;
 
-if (ctx->sendbuf_pfns[ctx->pfns_index] == INVALID_P2M_ENTRY)
+if (ctx->sendbuf_pfns[ctx->pfns_index] == INVALID_PFN)
 israw = 1;
 else
 cache_copy = get_cache_page(ctx,
@@ -518,7 +518,7 @@ comp_ctx *xc_compression_create_context(xc_interface *xch,
 
 for (i = 0; i < num_cache_pages; i++)
 {
-ctx->cache[i].pfn = INVALID_P2M_ENTRY;
+ctx->cache[i].pfn = INVALID_PFN;
 ctx->cache[i].page = ctx->cache_base + i * XC_PAGE_SIZE;
 ctx->cache[i].prev = (i == 0) ? NULL : &(ctx->cache[i - 1]);
 ctx->cache[i].next = ((i+1) == num_cache_pages)? NULL :
diff --git a/tools/libxc/xc_core.c b/tools/libxc/xc_core.c
index 011336c..d792566 100644
--- a/tools/libxc/xc_core.c
+++ b/tools/libxc/xc_core.c
@@ -808,13 +808,13 @@ xc_domain_dumpcore_via_callback(xc_interface *xch,
 gmfn = p2m[i];
 else
 gmfn = ((uint64_t *)p2m)[i];
-if ( gmfn == INVALID_P2M_ENTRY )
+if ( gmfn == INVALID_PFN )
 continue;
 }
 else
 {
 gmfn = ((uint32_t *)p2m)[i];
-if ( gmfn == (uint32_t)INVALID_P2M_ENTRY )
+if ( gmfn == (uint32_t)INVALID_PFN )
continue;
 }
 
diff --git a/tools/libxc/xc_dom_arm.c b/tools/libxc/xc_dom_arm.c
index d9a6371..64a8b67 100644
--- a/tools/libxc/xc_dom_arm.c
+++ b/tools/libxc/xc_dom_arm.c
@@ -439,7 +439,7 @@ static int meminit(struct xc_dom_image *dom)
 if ( dom->p2m_host == NULL )
 return -EINVAL;
 for ( pfn = 0; pfn < p2m_size; pfn++ )
-dom->p2m_host[pfn] = INVALID_P2M_ENTRY;
+dom->p2m_host[pfn] = INVALID_PFN;
 
 /* setup initial p2m and allocate guest memory */
 for ( i = 0; i < GUEST_RAM_BANKS && dom->rambank_size[i]; i++ )
diff --git a/tools/libxc/xc_dom_core.c b/tools/libxc/xc_dom_core.c
index 8967970..d0c6596 100644
--- a/tools/libxc/xc_dom_core.c
+++ b/tools/libxc/xc_dom_core.c
@@ -971,7 +971,7 @@ int xc_dom_update_guest_p2m(struct xc_dom_image *dom)
   __FUNCTION__, dom->p2m_size);
 p2m_32 = dom->p2m_guest;
 for ( i = 0; i < dom->p2m_size; i++ )
-if ( dom->p2m_host[i] != INVALID_P2M_ENTRY )
+if ( dom->p2m_host[i] != INVALID_PFN )
 p2m_32[i] = dom->p2m_host[i];
 else
 p2m_32[i] = (uint32_t) - 1;
@@ -981,7 +981,7 @@ int xc_dom_update_guest_p2m(struct xc_dom_image *dom)
   __FUNCTION__, dom->p2m_size);
 p2m_64 = 

[Xen-devel] [PATCH 2/2] libxc: do proper return code checking of allocator in domain builder

2015-12-01 Thread Juergen Gross
Signed-off-by: Juergen Gross 
---
 tools/libxc/xc_dom_core.c |  7 ++-
 tools/libxc/xc_dom_x86.c  | 10 ++
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/tools/libxc/xc_dom_core.c b/tools/libxc/xc_dom_core.c
index d0c6596..841e7dc 100644
--- a/tools/libxc/xc_dom_core.c
+++ b/tools/libxc/xc_dom_core.c
@@ -630,7 +630,7 @@ xen_pfn_t xc_dom_alloc_page(struct xc_dom_image *dom, char *name)
 pfn = dom->pfn_alloc_end - dom->rambase_pfn;
 
 if ( xc_dom_chk_alloc_pages(dom, name, 1) )
-return (xen_pfn_t)-1;
+return INVALID_PFN;
 
 DOMPRINTF("%-20s:   %-12s : 0x%" PRIx64 " (pfn 0x%" PRIpfn ")",
   __FUNCTION__, name, start, pfn);
@@ -1107,7 +1107,12 @@ int xc_dom_build_image(struct xc_dom_image *dom)
 if ( dom->arch_hooks->alloc_pgtables(dom) != 0 )
 goto err;
 if ( dom->alloc_bootstack )
+{
 dom->bootstack_pfn = xc_dom_alloc_page(dom, "boot stack");
+if ( dom->bootstack_pfn == INVALID_PFN )
+goto err;
+}
+
 DOMPRINTF("%-20s: virt_alloc_end : 0x%" PRIx64 "",
   __FUNCTION__, dom->virt_alloc_end);
 DOMPRINTF("%-20s: virt_pgtab_end : 0x%" PRIx64 "",
diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c
index 7c77e69..71b042e 100644
--- a/tools/libxc/xc_dom_x86.c
+++ b/tools/libxc/xc_dom_x86.c
@@ -537,10 +537,20 @@ static int alloc_magic_pages(struct xc_dom_image *dom)
 {
 /* allocate special pages */
 dom->start_info_pfn = xc_dom_alloc_page(dom, "start info");
+if ( dom->start_info_pfn == INVALID_PFN )
+return -1;
 dom->xenstore_pfn = xc_dom_alloc_page(dom, "xenstore");
+if ( dom->xenstore_pfn == INVALID_PFN )
+return -1;
 dom->console_pfn = xc_dom_alloc_page(dom, "console");
+if ( dom->console_pfn == INVALID_PFN )
+return -1;
 if ( xc_dom_feature_translated(dom) )
+{
 dom->shared_info_pfn = xc_dom_alloc_page(dom, "shared info");
+if ( dom->shared_info_pfn == INVALID_PFN )
+return -1;
+}
 dom->alloc_bootstack = 1;
 
 return 0;
-- 
2.6.2
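
The pattern these two patches apply, a sentinel return value that every caller checks, can be sketched in isolation as follows. All names here (pfn_sketch_t, dom_sketch, alloc_page_sketch(), alloc_magic_pages_sketch()) are hypothetical, and the toy allocator only stands in for xc_dom_alloc_page(); it is not libxc code.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t pfn_sketch_t;                /* stand-in for xen_pfn_t */
#define INVALID_PFN_SKETCH ((pfn_sketch_t)-1) /* all-ones sentinel */

struct dom_sketch {
    pfn_sketch_t next_free;      /* next pfn to hand out */
    pfn_sketch_t limit;          /* first pfn that is out of range */
    pfn_sketch_t start_info_pfn;
    pfn_sketch_t xenstore_pfn;
    pfn_sketch_t console_pfn;
};

/* Toy allocator: returns the sentinel once the range is exhausted. */
pfn_sketch_t alloc_page_sketch(struct dom_sketch *dom)
{
    if (dom->next_free >= dom->limit)
        return INVALID_PFN_SKETCH;
    return dom->next_free++;
}

/* Check each allocation and stop at the first failure. */
int alloc_magic_pages_sketch(struct dom_sketch *dom)
{
    dom->start_info_pfn = alloc_page_sketch(dom);
    if (dom->start_info_pfn == INVALID_PFN_SKETCH)
        return -1;
    dom->xenstore_pfn = alloc_page_sketch(dom);
    if (dom->xenstore_pfn == INVALID_PFN_SKETCH)
        return -1;
    dom->console_pfn = alloc_page_sketch(dom);
    if (dom->console_pfn == INVALID_PFN_SKETCH)
        return -1;
    return 0;
}
```

Without the checks, a failed allocation would leave the sentinel stored in one of the pfn fields, to be used later as though it were a real page.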




Re: [Xen-devel] [PATCH] iommu/quirk: disable shared EPT for Sandybridge and earlier processors.

2015-12-01 Thread Jan Beulich
>>> On 01.12.15 at 17:45,  wrote:
> Based on the discussion below, can I assume there is agreement on using the
> processor model for filtering, or will the chipset ID be the preferred candidate?

I think the subsequent suggestion by Andrew makes it even more
desirable to remain independent of CPU model here. As said before,
we should simply leverage the PCI IDs we already have quirks for.

Jan




Re: [Xen-devel] [PATCH] build: fix clean rule to cover objects in unvisited subdirs

2015-12-01 Thread Jan Beulich
>>> On 01.12.15 at 17:34,  wrote:
>> On Dec 1, 2015, at 10:07 AM, Jan Beulich  wrote:
>> 
>> For one build run, yes. But then you can (a) build individual object
>> files and (b) as mentioned above change configuration (implying
>> that you know what you're doing). Also you could, using the
>> example above, do a kexec=y build, then a kexec=n one, then
>> notice you needed to clean in between, so you then clean using
>> kexec=n and build again with that option, but cleaning again
>> would still leave the kexec files around.
>> 
>> And btw., we have a similar issue already when you switch
>> between arches (no cleaning happens cross-arch).
> 
> OK, so you are working on a different assumption than I was. I was
> treating the clean rule as needing to be run when you are wanting to
> explicitly rebuild all object files needed for the current build 
> configuration 
> (i.e., only cleaning files that would be linked into the current hypervisor 
> build).
> It sounds like you are expecting the clean rule to clean out all object
> files no matter whether they are part of the current build configuration
> or not. 
> 
> Working on that assumption, it seems like running a:
> find . -name "*.o" -type f -delete
> from the xen/ directory would accomplish that and would be less
> fragile than trying to grab various different variables and munge
> them to try to grab all possible .o files specified by the system. Plus,
> the find command would likely execute quicker. 
> 
> Does something like that seem acceptable?

I can't see an immediate reason why it would not be, as long
as it's clear that this won't eliminate the need to recurse into
the subdirectories. But I'd certainly recommend to wait for
other feedback (namely by other hypervisor maintainers)
before you go that route.

Also please note that -delete is not a standard primary, so
would need replacing.

Also the same global approach could then perhaps be used to
remove all the .*.d files.

Jan



Re: [Xen-devel] [PATCH] x86: Disable IOAPIC earlier during shutdown

2015-12-01 Thread Jan Beulich
>>> On 01.12.15 at 17:38,  wrote:
> On 01/12/15 16:26, Jan Beulich wrote:
> On 01.12.15 at 17:13,  wrote:
>>> Commit fc0c3fa2ad5c ("x86/IO-APIC: fix setup of Xen internally used IRQs
>>> (take 2)") introduced a regression on some hardware where Xen would hang
>>> during shutdown, repeating the following message:
>>> APIC error on CPU0: 08(08), Receive accept error
>>>
>>> This appears to be because an interrupt (in this case from the serial
>>> console) destined for a CPU other than the boot CPU is left unhandled so
>>> an APIC error on CPU 0 is generated instead.
>>>
>>> To fix this, disable the IOAPIC before each CPU's local APIC is
>>> disabled so that these interrupts are not generated.
>> But wouldn't a similar issue occur for MSI or MSI-like (IOMMU)
>> interrupts? I.e. shouldn't we perhaps invoke fixup_irqs() after
>> having restricted cpu_online_map to just CPU0?
> 
> a fixup_irq()s in __stop_this_cpu() might do it, although there will be
> heavy lock contention on all the irq descriptors.
> 
> A better option would be to run fixup_irq()s once and make them all
> point to cpu0, then take the others down.  This will probably involve
> passing a parameter to fixup_irq()s to conditionally override its use of
> the cpu_online_map.

The latter was what I actually had in mind, I just didn't check
whether we can re-write cpu_online_map up front (which it looks
like we can't). So yes, passing fixup_irqs() a cpumask would be
the way to go.

Jan




Re: [Xen-devel] Hotplugged devices in Xen 4.5 and domain reboot

2015-12-01 Thread Iurii Mykhalskyi

Thanks to all for the replies, please see my answers below:

On 12/01/2015 05:29 PM, Wei Liu wrote:
> On Tue, Dec 01, 2015 at 04:58:55PM +0200, Iurii Mykhalskyi wrote:
> > Our real usb mass-storage device are located at driver domain (DomD). So we
> > setup second block-device backend there.
> >
> > To hotplug usb mass-storage from DomD we use follow command:
> >
> > xl block-attach domU_id phy:/bla-bla,xvda10,w,backend="DomD"
> >
> What happens if you run this in Dom0? I guess DomD doesn't respond to
> the request?

Yes, there is no responded from domD, because actual storage device
are located there, and toolstack stuck on real device existence check.

> > There was no support of attaching block-device in runtime from domain other
> > to Domain-0, so we have made some hacks to allow call block-attach command
> > from non-dom0 privileged domain.
> So this is a special use case. This is the first time I know people
> actually run xl block-attach in driver domain.

Yes, this is special case and we this by our solution design.

> > One of patches was - don't update
> > /var/lib/xen/userdata-d.$DOMID-$UUID.libxl-json during execution of this
> > command (because this log located on dom0 rootfs and we don't have any
> > access to it from DomD). So, there is no different in configs before and
> > after hotplug.
> >
> The state of $DOMID is recorded in libxl-json file. No wonder you lose
> all state.
>
> But even if you write those states, they are going to be inside driver
> domain.  There is no way at the moment to synthesise the state inside
> Dom0 and DomD into one. There is also difficulty in how you can split
> the synthesised and dispatch the states to multiple entities again when
> rebuilding a domain.
>
> So I think having multiple entities managing state of one single domain
> is bad. I think the proper way of making it work is to make hotplug
> device from domain other than Dom0 work.
>
> There is a daemon "xl devd" in driver domain. We might be able to teach
> it to response to Dom0 toostack request. I'm a bit surprised if it
> doesn't do that already. Did you forget to start that daemon?

We can't run devd in driver domain, because it failed on connect to
xenstored socket (/var/run/xenstored/socket - we have xenstored running
only in dom0).

> Roger, Ian and Ian, any thought?
>
> Wei.


On 12/01/2015 05:41 PM, Ian Campbell wrote:
>On Tue, 2015-12-01 at 15:29 +, Wei Liu wrote:
>>On Tue, Dec 01, 2015 at 04:58:55PM +0200, Iurii Mykhalskyi wrote:
>>>Our real usb mass-storage device are located at driver domain (DomD).
>>>So we
>>>setup second block-device backend there.
>>>
>>>To hotplug usb mass-storage from DomD we use follow command:
>>>
>>>xl block-attach domU_id phy:/bla-bla,xvda10,w,backend="DomD"
>>>
>>What happens if you run this in Dom0? I guess DomD doesn't respond to
>>the request?
>>
>>>There was no support of attaching block-device in runtime from domain
>>>other
>>>to Domain-0, so we have made some hacks to allow call block-attach
>>>command
>>>from non-dom0 privileged domain.
>>So this is a special use case. This is the first time I know people
>>actually run xl block-attach in driver domain.
>Toolstack commands (xl *) should be run in the toolstack domain, not in the
>driver domain.
>
>I don't think it should be expected that the latter work (at least not
>without a large amount of development work).
>
>In general a driver domain would not be expected to have sufficient
>privilege over e.g. a guest domain's /local/domain/domU/devices to create
>the f.e. dirs.

In our solution we have to create 2 fully privileged domains - Dom0 and
DomD, so we need 2 toolstack domains.
No special privilege hack was done - we just need to set up
additional permissions for DomD.

>>There is a daemon "xl devd" in driver domain. We might be able to teach
>>it to response to Dom0 toostack request. I'm a bit surprised if it
>>doesn't do that already. Did you forget to start that daemon?
>That's the entire purpose of that daemon, isn't it?

I can't find any useful documentation or usage examples for this
daemon. Could you please point me to any documentation about it?
Thank you.

>
>>Roger, Ian and Ian, any thought?
>>
>>Wei.


On 12/01/2015 05:56 PM, Roger Pau Monné wrote:
>Hello,
>
>El 01/12/15 a les 15.58, Iurii Mykhalskyi ha escrit:
>>Our real usb mass-storage device are located at driver domain (DomD). So we
>>setup second block-device backend there.
>>
>>To hotplug usb mass-storage from DomD we use follow command:
>>
>>xl block-attach domU_id phy:/bla-bla,xvda10,w,backend="DomD"
>This is not possible by design, you should only be able to execute `xl
>block-attach ...` from the control domain (Dom0). This is due to the
>fact that attaching a new device to a guest requires write permissions
>in the guest xenstore paths, which the driver domain should not have.

As I mentioned earlier, we need this case by the design of our solution.
And DomD in our setup has most of the same privileges as Dom0,
so access to xenstore isn't the problem.

>>There was no support of attaching block-device in runtime from domain other
>>to Domain-0, so we have made some hacks to allow 

Re: [Xen-devel] [PATCH] x86/time: Don't use EFI's GetTime call by default

2015-12-01 Thread David Vrabel
On 01/12/15 16:57, Ross Lagerwall wrote:
> When EFI is used, don't use EFI's GetTime() to get the time, because it
> is broken on many platforms.
[...]
> --- a/xen/arch/x86/time.c
> +++ b/xen/arch/x86/time.c
> @@ -679,20 +679,28 @@ static void __get_cmos_time(struct rtc_time *rtc)
>  rtc->year += 100;
>  }
>  
> +/* EFI's GetTime() is frequently broken so don't use it by default. */
> +#undef USE_EFI_GET_TIME
> +
>  static unsigned long get_cmos_time(void)
>  {
> -unsigned long res, flags;
> +#ifdef USE_EFI_GET_TIME
> +unsigned long res;
> +#endif

You could move this res into the if ( efi_enabled ) below.

> +unsigned long flags;
>  struct rtc_time rtc;
>  unsigned int seconds = 60;
>  static bool_t __read_mostly cmos_rtc_probe;
>  boolean_param("cmos-rtc-probe", cmos_rtc_probe);
>  
> +#ifdef USE_EFI_GET_TIME
>  if ( efi_enabled )
>  {
>  res = efi_get_time();
>  if ( res )
>  return res;
>  }
> +#endif

David
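
David's suggestion above can be sketched in isolation: declaring res inside the if block means a single #ifdef region covers both the variable and its use. This is a minimal stand-alone sketch, not the Xen code; efi_enabled, efi_get_time() and read_cmos_time() below are hypothetical stubs with made-up return values.

```c
#include <assert.h>
#include <stdbool.h>

/* EFI's GetTime() is frequently broken, so it stays disabled by default. */
#undef USE_EFI_GET_TIME

/* Hypothetical stubs standing in for the real Xen symbols. */
bool efi_enabled = true;
unsigned long efi_get_time(void) { return 1000UL; }
unsigned long read_cmos_time(void) { return 42UL; }

unsigned long get_time_sketch(void)
{
#ifdef USE_EFI_GET_TIME
    if ( efi_enabled )
    {
        /* res lives only in this scope, so no second #ifdef is needed. */
        unsigned long res = efi_get_time();

        if ( res )
            return res;
    }
#endif
    /* Fall back to the CMOS clock when EFI time is disabled or unusable. */
    return read_cmos_time();
}

/* Usage: with USE_EFI_GET_TIME undefined, only the CMOS path is taken. */
void example_get_time(void)
{
    assert(get_time_sketch() == 42UL);
}
```

With the macro left undefined, the EFI branch is compiled out entirely and no unused-variable warning can arise for res.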




Re: [Xen-devel] Hotplugged devices in Xen 4.5 and domain reboot

2015-12-01 Thread Wei Liu
On Tue, Dec 01, 2015 at 06:48:32PM +0200, Iurii Mykhalskyi wrote:
> On 12/01/2015 05:41 PM, Ian Campbell wrote:
> >On Tue, 2015-12-01 at 15:29 +, Wei Liu wrote:
> >>On Tue, Dec 01, 2015 at 04:58:55PM +0200, Iurii Mykhalskyi wrote:
> >>>Our real usb mass-storage device are located at driver domain (DomD).
> >>>So we
> >>>setup second block-device backend there.
> >>>
> >>>To hotplug usb mass-storage from DomD we use follow command:
> >>>
> >>>xl block-attach domU_id phy:/bla-bla,xvda10,w,backend="DomD"
> >>>
> >>What happens if you run this in Dom0? I guess DomD doesn't respond to
> >>the request?
> >>
> >>>There was no support of attaching block-device in runtime from domain
> >>>other
> >>>to Domain-0, so we have made some hacks to allow call block-attach
> >>>command
> >>>from non-dom0 privileged domain.
> >>So this is a special use case. This is the first time I know people
> >>actually run xl block-attach in driver domain.
> >Toolstack commands (xl *) should be run in the toolstack domain, not in the
> >driver domain.
> >
> >I don't think it should be expected that the latter work (at least not
> >without a large amount of development work).
> >
> >In general a driver domain would not be expected to have sufficient
> >privilege over e.g. a guest domain's /local/domain/domU/devices to create
> >the f.e. dirs.
> In our solution we have to create 2 full privileged domains - Dom0 and
> DomD, so we need 2 toolstack domains.
>Any special privileges hack wasn't done - we need just to setup
> additional permissions for DomD.
> >>There is a daemon "xl devd" in driver domain. We might be able to teach
> >>it to response to Dom0 toostack request. I'm a bit surprised if it
> >>doesn't do that already. Did you forget to start that daemon?
> >That's the entire purpose of that daemon, isn't it?
> I can't find any valuable documentation or examples of use for this
> daemon. Can you point me please to any documentation about it, please?
>Thank you.

You just run "xl devd" in driver domain with a init script or systemd
unit.

To help debug, run xl -F devd

> >
> >>Roger, Ian and Ian, any thought?
> >>
> >>Wei.
> 
> On 12/01/2015 05:56 PM, Roger Pau Monné wrote:
> >Hello,
> >
> >El 01/12/15 a les 15.58, Iurii Mykhalskyi ha escrit:
> >>Our real usb mass-storage device are located at driver domain (DomD). So we
> >>setup second block-device backend there.
> >>
> >>To hotplug usb mass-storage from DomD we use follow command:
> >>
> >>xl block-attach domU_id phy:/bla-bla,xvda10,w,backend="DomD"
> >This is not possible by design, you should only be able to execute `xl
> >block-attach ...` from the control domain (Dom0). This is due to the
> >fact that attaching a new device to a guest requires write permissions
> >in the guest xenstore paths, which the driver domain should not have.
> As I mention earlier, we need this case by design of our solution. And
> DomD in our most of same privilegies as Dom0,
>so access to xenstore isn't point of the problem.

It's a bit confusing because this email contains a) the design and b)
workaround in Xen to meet your design.

As I understand it, you have a driver domain (DomD), and you need to
attach the device inside DomD to DomU. At the moment Xen _should_ be
able to do that. There might be bugs, but we shall fix them. There are
bugs because not that many people use such a setup.

Note that the toolstack doesn't care whether DomD is privileged or not.

> >>There was no support of attaching block-device in runtime from domain other
> >>to Domain-0, so we have made some hacks to allow call block-attach command
> >>from non-dom0 privileged domain.
> >Do you have either the `xl devd` command running or udev rules
> >correctly setup inside of the driver domain?
> Yes, we have setup our own udev rules, that executes "xl block-attach
> .." during usb stick insert.
> >Does something like the following work? If not, could you paste the
> >error when running it with -vvv.
> >
> >xl block-attach DomU 
> >format=raw,vdev=hdc,access=rw,backend=DomD,target=/path/to/dev
> In dom0 we have next issue:
> /libxl: error: libxl_device.c:283:libxl__device_disk_set_backend: Disk
> vdev=xvda10 failed to stat: /dev/sda1: No such file or directory//-
> /this issue occurs due to missing /dev/sda1 device (all hardware are
> placed in DomD domain).
> 

Looks like the path that is stat'ed is not correct.

Wei.

>   In domD with our patches - all ok. /dev/sda1 successfully forwarded
> to DomU.
> 
> >
> >Roger.
> >
> 
> With the best regards, Iurii.
> 
> 
> 





Re: [Xen-devel] Hotplugged devices in Xen 4.5 and domain reboot

2015-12-01 Thread Roger Pau Monné
El 01/12/15 a les 17.48, Iurii Mykhalskyi ha escrit:
>> Does something like the following work? If not, could you paste the
>> error when running it with -vvv.
>>
>> xl block-attach DomU
>> format=raw,vdev=hdc,access=rw,backend=DomD,target=/path/to/dev
> In dom0 we have next issue:
> /libxl: error: libxl_device.c:283:libxl__device_disk_set_backend: Disk
> vdev=xvda10 failed to stat: /dev/sda1: No such file or directory//-
> /this issue occurs due to missing /dev/sda1 device (all hardware are
> placed in DomD domain).

I'm not sure how you can get to this path; the libxl chunk in
stable-4.5 is:

271 if (disk->format == LIBXL_DISK_FORMAT_EMPTY) {
272 if (!disk->is_cdrom) {
273 LOG(ERROR, "Disk vdev=%s is empty but not cdrom", disk->vdev);
274 return ERROR_INVAL;
275 }
276 memset(&a.stab, 0, sizeof(a.stab));
277 } else if ((disk->backend == LIBXL_DISK_BACKEND_UNKNOWN ||
278 disk->backend == LIBXL_DISK_BACKEND_PHY) &&
279disk->backend_domid == LIBXL_TOOLSTACK_DOMID &&
280!disk->script) {
281 if (stat(disk->pdev_path, &a.stab)) {
282 LOGE(ERROR, "Disk vdev=%s failed to stat: %s",
283 disk->vdev, disk->pdev_path);
284 return ERROR_INVAL;
285 }
286 }

So it seems that block-attach is ignoring the 'backend=foo' field in 
the disk configuration?

Can you paste the full output of the execution with -vvv?

Roger.
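
The reasoning above rests on the condition at lines 277-280: the local stat() is attempted only when the backend lives in the toolstack domain. A minimal sketch of that predicate follows; the struct, enum and constant names are simplified, hypothetical stand-ins for the libxl ones, and the domain id 5 is invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the libxl types quoted above. */
enum backend_kind { BACKEND_UNKNOWN, BACKEND_PHY, BACKEND_OTHER };

struct disk_sketch {
    enum backend_kind backend;
    uint32_t backend_domid;   /* domain that hosts the backend */
    const char *script;       /* hotplug script, NULL if none */
};

#define TOOLSTACK_DOMID_SKETCH 0u

/* Same shape as the quoted condition: only local, script-less phy/unknown disks. */
bool needs_local_stat(const struct disk_sketch *d)
{
    return (d->backend == BACKEND_UNKNOWN || d->backend == BACKEND_PHY) &&
           d->backend_domid == TOOLSTACK_DOMID_SKETCH &&
           d->script == NULL;
}

void example_needs_local_stat(void)
{
    struct disk_sketch local = { BACKEND_PHY, 0, NULL };
    struct disk_sketch in_domd = { BACKEND_PHY, 5, NULL };  /* 5: made-up DomD id */

    assert(needs_local_stat(&local));     /* backend in the toolstack domain */
    assert(!needs_local_stat(&in_domd));  /* backend elsewhere: no local stat */
}
```

If the backend domain had been picked up from the disk specification, the second case would apply and the "failed to stat" error could not be reached, which is why the error suggests the backend field was ignored.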




[Xen-devel] [PATCH v3 0/4] xen/arm: p2m: Add support to remove empty translation table

2015-12-01 Thread Julien Grall
Hello,

The main purpose of this patch series is to allow creation of a superpage
where it has previously been shattered.

The first patch is not related to the main purpose of this series but fixes a
latent bug I found while looking at the p2m code.

For the changes, see each patch.

Sincerely yours,

Julien Grall (4):
  xen/arm: p2m: Flush for every exit paths in apply_p2m_changes
  xen/arm: p2m: Store the page for each mapping
  xen/arm: p2m: Introduce a helper to remove an entry in the page table
  xen/arm: p2m: Remove translation table when it's empty

 xen/arch/arm/p2m.c   | 86 +---
 xen/include/asm-arm/mm.h |  6 
 2 files changed, 87 insertions(+), 5 deletions(-)

-- 
2.1.4




[Xen-devel] [PATCH v3 1/4] xen/arm: p2m: Flush for every exit paths in apply_p2m_changes

2015-12-01 Thread Julien Grall
Currently, the TLB is not flushed if an error occurred while updating the
stage-2 p2m. However, the TLB will contain stale mappings for any entry
updated so far.

To avoid such a situation, flush on every exit path when the variable
"flush" is set.
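
The pattern described above, a single exit label that flushes whenever any entry was touched, can be sketched outside of Xen as follows. All names here (apply_changes_sketch, flush_tlb_stub, flush_count) are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

int flush_count;  /* counts calls to the stub flush */

void flush_tlb_stub(void) { flush_count++; }

/* Apply n updates; the update at index fail_at (if any) fails with -1. */
int apply_changes_sketch(int n, int fail_at)
{
    bool flush = false;
    int rc = 0;
    int i;

    for ( i = 0; i < n; i++ )
    {
        if ( i == fail_at )
        {
            rc = -1;
            goto out;   /* error: earlier entries may already be modified */
        }
        flush = true;   /* an entry was updated, stale TLB entries may exist */
    }

out:
    /* Every exit path, success or error, passes through this flush. */
    if ( flush )
        flush_tlb_stub();

    return rc;
}

void example_flush_on_error(void)
{
    flush_count = 0;
    assert(apply_changes_sketch(4, 2) == -1); /* fails after two updates */
    assert(flush_count == 1);                 /* the flush still happened */

    flush_count = 0;
    assert(apply_changes_sketch(4, 0) == -1); /* fails before any update */
    assert(flush_count == 0);                 /* nothing to flush */
}
```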

Signed-off-by: Julien Grall 
Acked-by: Ian Campbell 

---

Changes in v2:
- Add Ian's Acked-by
- Fix typoes
---
 xen/arch/arm/p2m.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index e396c40..f910cab 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -1010,7 +1010,7 @@ static int apply_p2m_changes(struct domain *d,
 if ( (egfn - sgfn) > progress && !(progress & mask) )
 {
 rc = progress;
-goto tlbflush;
+goto out;
 }
 break;
 }
@@ -1096,15 +1096,13 @@ static int apply_p2m_changes(struct domain *d,
 
 rc = 0;
 
-tlbflush:
+out:
 if ( flush )
 {
 flush_tlb_domain(d);
 iommu_iotlb_flush(d, sgfn, egfn - sgfn);
 }
 
-out:
-
 if ( rc < 0 && ( op == INSERT || op == ALLOCATE ) &&
  addr != start_gpaddr )
 {
-- 
2.1.4




[Xen-devel] [linux-linus test] 65264: regressions - FAIL

2015-12-01 Thread osstest service owner
flight 65264 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/65264/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-xl-qemut-winxpsp3  6 xen-boot  fail REGR. vs. 59254
 test-amd64-i386-qemut-rhel6hvm-intel  6 xen-boot  fail REGR. vs. 59254
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm  6 xen-boot  fail REGR. vs. 59254
 test-amd64-i386-freebsd10-i386  6 xen-boot  fail REGR. vs. 59254
 test-amd64-i386-rumpuserxen-i386  6 xen-boot  fail REGR. vs. 59254
 test-amd64-i386-xl-qemut-debianhvm-amd64  6 xen-boot  fail REGR. vs. 59254
 test-amd64-i386-freebsd10-amd64  6 xen-boot  fail REGR. vs. 59254
 test-amd64-i386-xl-qemuu-win7-amd64  6 xen-boot  fail REGR. vs. 59254
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1  6 xen-boot  fail REGR. vs. 59254
 test-amd64-i386-xl-xsm  6 xen-boot  fail REGR. vs. 59254
 test-amd64-i386-xl  6 xen-boot  fail REGR. vs. 59254
 test-amd64-i386-xl-qemut-win7-amd64  6 xen-boot  fail REGR. vs. 59254
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm  6 xen-boot  fail REGR. vs. 59254
 test-amd64-i386-qemut-rhel6hvm-amd  6 xen-boot  fail REGR. vs. 59254
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm  6 xen-boot  fail REGR. vs. 59254
 test-armhf-armhf-xl-arndale  6 xen-boot  fail REGR. vs. 59254
 test-armhf-armhf-xl  6 xen-boot  fail REGR. vs. 59254
 test-armhf-armhf-xl-cubietruck  6 xen-boot  fail REGR. vs. 59254
 test-armhf-armhf-xl-credit2  6 xen-boot  fail REGR. vs. 59254
 test-amd64-i386-pair  10 xen-boot/dst_host  fail REGR. vs. 59254
 test-amd64-i386-pair  9 xen-boot/src_host  fail REGR. vs. 59254
 test-amd64-i386-qemuu-rhel6hvm-amd  6 xen-boot  fail REGR. vs. 59254
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm  15 guest-localmigrate.2  fail REGR. vs. 59254
 test-amd64-i386-xl-qemuu-debianhvm-amd64  6 xen-boot  fail REGR. vs. 59254
 test-amd64-i386-xl-qemuu-ovmf-amd64  6 xen-boot  fail REGR. vs. 59254
 test-armhf-armhf-xl-xsm  6 xen-boot  fail REGR. vs. 59254
 test-amd64-i386-qemuu-rhel6hvm-intel  6 xen-boot  fail REGR. vs. 59254
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm  15 guest-localmigrate.2  fail REGR. vs. 59254
 test-armhf-armhf-xl-multivcpu  6 xen-boot  fail REGR. vs. 59254
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1  6 xen-boot  fail REGR. vs. 59254
 test-amd64-i386-xl-qemuu-winxpsp3  6 xen-boot  fail REGR. vs. 59254

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-libvirt-xsm  6 xen-boot  fail REGR. vs. 59254
 test-amd64-amd64-libvirt-xsm  6 xen-boot  fail REGR. vs. 59254
 test-amd64-i386-libvirt  6 xen-boot  fail REGR. vs. 59254
 test-armhf-armhf-libvirt-xsm  6 xen-boot  fail REGR. vs. 59254
 test-armhf-armhf-xl-rtds  6 xen-boot  fail REGR. vs. 59254
 test-armhf-armhf-libvirt  6 xen-boot  fail REGR. vs. 59254
 test-amd64-i386-xl-raw  6 xen-boot  fail baseline untested
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm  6 xen-boot  fail baseline untested
 test-armhf-armhf-xl-vhd  6 xen-boot  fail baseline untested
 test-amd64-i386-libvirt-pair  10 xen-boot/dst_host  fail baseline untested
 test-amd64-i386-libvirt-pair  9 xen-boot/src_host  fail baseline untested
 test-amd64-amd64-libvirt-vhd  9 debian-di-install  fail baseline untested
 test-amd64-amd64-i386-pvgrub  6 xen-boot  fail baseline untested
 test-armhf-armhf-libvirt-raw  6 xen-boot  fail baseline untested
 test-armhf-armhf-libvirt-qcow2  6 xen-boot  fail baseline untested
 test-amd64-amd64-libvirt  18 guest-start/debian.repeat  fail blocked in 59254
 test-amd64-amd64-xl-qemuu-win7-amd64  17 guest-stop  fail like 59254

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail never pass
 test-amd64-amd64-xl-pvh-intel  14 guest-saverestore  fail never pass
 test-amd64-amd64-qemuu-nested-amd  6 xen-boot  fail never pass
 test-amd64-amd64-libvirt  12 migrate-support-check  fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm  10 migrate-support-check  fail never pass
 test-amd64-amd64-qemuu-nested-intel  13 xen-boot/l1  fail never pass
 test-amd64-amd64-xl-qemut-win7-amd64  17 guest-stop  fail never pass

version targeted for testing:
 linux                31ade3b83e1821da5fbb2f11b5b3d4ab2ec39db8
baseline version:
 linux                45820c294fe1b1a9df495d57f40585ef2d069a39

Last test of basis    59254  2015-07-09 04:20:48 Z

[Xen-devel] [PATCH v3 3/4] xen/arm: p2m: Introduce a helper to remove an entry in the page table

2015-12-01 Thread Julien Grall
Factorize the code to remove an entry in p2m_remove_pte so we can re-use
it later.

Signed-off-by: Julien Grall 
Acked-by: Ian Campbell 

---

Changes in v2:
- Add Ian's acked-by
- Fix typoes
---
 xen/arch/arm/p2m.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index f28ae3f..ae0acf0 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -367,6 +367,14 @@ static inline void p2m_write_pte(lpae_t *p, lpae_t pte, 
bool_t flush_cache)
 clean_dcache(*p);
 }
 
+static inline void p2m_remove_pte(lpae_t *p, bool_t flush_cache)
+{
+lpae_t pte;
+
+memset(&pte, 0x00, sizeof(pte));
+p2m_write_pte(p, pte, flush_cache);
+}
+
 /*
  * Allocate a new page table page and hook it in via the given entry.
  * apply_one_level relies on this returning 0 on success
@@ -839,8 +847,7 @@ static int apply_one_level(struct domain *d,
 
 *flush = true;
 
-memset(&pte, 0x00, sizeof(pte));
-p2m_write_pte(entry, pte, flush_cache);
+p2m_remove_pte(entry, flush_cache);
 p2m_mem_access_radix_set(p2m, paddr_to_pfn(*addr), p2m_access_rwx);
 
 *addr += level_size;
-- 
2.1.4




Re: [Xen-devel] [PATCH v3 0/2] block/xen-blkfront: Support non-indirect grant with 64KB page granularity

2015-12-01 Thread Julien Grall
Hi Konrad,

On 01/12/15 15:37, Konrad Rzeszutek Wilk wrote:
> On Wed, Nov 18, 2015 at 06:57:23PM +, Julien Grall wrote:
>> Hi all,
>>
>> This is a follow-up on the previous discussion [1] related to guest using 
>> 64KB
>> page granularity which doesn't boot when the backend isn't using indirect
>> descriptor.
>>
>> This has been successfully tested on ARM64 with both 64KB and 4KB page
>> granularity guests and QEMU as the backend. Indeed QEMU doesn't support
>> indirect descriptor.
>>
>> This series is based on xentip/for-linus-4.4 which include the support for
>> 64KB Linux guest.
> 
> In the meantime the multi-queue patches have been put in the queue
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git 
> #devel/for-jens-4.5
> 
> I will try rebasing the patches on top of that.

It will likely clash with the multiqueue changes. I will rebase this
patch series and resend it.

Regards,

-- 
Julien Grall



[Xen-devel] [PATCH v3 4/4] xen/arm: p2m: Remove translation table when it's empty

2015-12-01 Thread Julien Grall
Currently, the translation table is left in place even if no entries are
in use. Because of how the p2m code has been implemented, replacing a
translation table by a block (i.e. superpage) is not supported. Therefore,
any mapping of a superpage size will be split into smaller chunks, making
the translation less efficient.

Replacing a table by a block when a new mapping is added would be too
complicated because it requires checking whether all the upper levels are
unused and freeing them if necessary.

Instead, we will remove the empty translation table when the mappings are
removed. To avoid going through the whole table checking whether any entry
is in use, a counter representing the number of entries currently in use is
kept per translation table and updated when an entry changes state (i.e.
valid <-> invalid).

As Xen allocates a page for each translation table, it's possible to
store the counter in the struct page_info. A new field p2m_refcount has
been introduced in the inuse union for this purpose. This is fine as the
page is only used by the P2M code and nobody touches the other field of
the union, type_info.

For the record, type_info has not been used because it would require
more work to use it properly, as Xen on ARM doesn't yet have the concept
of type.

Once Xen has finished removing a mapping and all the references to each
translation table have been updated, the levels will be walked backward to
check whether we first need to free an unused translation table at a
higher level and then the lower levels. This allows propagating the
number of references and freeing multiple translation tables at different
levels in one go.
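
The counting scheme described above can be sketched independently of the p2m code. This is a minimal illustration with hypothetical names (table_sketch, set_entry, table_is_empty) and a four-entry table; it only shows that the counter changes on valid <-> invalid transitions and that emptiness can then be tested without scanning the table.

```c
#include <assert.h>
#include <stdbool.h>

#define ENTRIES_SKETCH 4   /* tiny table for illustration; hypothetical */

struct table_sketch {
    bool valid[ENTRIES_SKETCH];
    unsigned int refcount;  /* number of valid entries */
};

/* Change one entry; adjust the counter only on valid <-> invalid transitions. */
void set_entry(struct table_sketch *t, unsigned int idx, bool new_valid)
{
    bool old_valid = t->valid[idx];

    if ( old_valid && !new_valid )
        t->refcount--;
    else if ( !old_valid && new_valid )
        t->refcount++;

    t->valid[idx] = new_valid;
}

/* The table can be freed without scanning it when the counter is zero. */
bool table_is_empty(const struct table_sketch *t)
{
    return t->refcount == 0;
}

void example_refcount(void)
{
    struct table_sketch t = { { false }, 0 };

    set_entry(&t, 1, true);
    set_entry(&t, 1, true);   /* valid -> valid: counter unchanged */
    set_entry(&t, 3, true);
    assert(t.refcount == 2);

    set_entry(&t, 1, false);
    set_entry(&t, 3, false);
    assert(table_is_empty(&t));
}
```

The sketch uses plain increments, matching the commit message's point that a lock held by the caller makes atomic helpers unnecessary; no locking is modelled here.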

Signed-off-by: Julien Grall 

---
Changes in v3:
- Fix indentation in the code again. I hope it's good now.
- Fix reference counting when the entry became valid

Changes in v2:
- Introduce a new field p2m_refcount in the inuse union
- Update the commit message to explain why we didn't use
type_info
- Fix indentation in the code
- Explain what protect the page in update_reference_mapping
---
 xen/arch/arm/p2m.c   | 65 
 xen/include/asm-arm/mm.h |  6 +
 2 files changed, 71 insertions(+)

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index ae0acf0..2190908 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -427,6 +427,8 @@ static int p2m_create_table(struct domain *d, lpae_t *entry,
 
  write_pte(&p[i], pte);
  }
+
+ page->u.inuse.p2m_refcount = LPAE_ENTRIES;
 }
 else
 clear_page(p);
@@ -936,6 +938,20 @@ static int apply_one_level(struct domain *d,
 BUG(); /* Should never get here */
 }
 
+/*
+ * The page is only used by the P2M code which is protected by the p2m->lock.
+ * So we can avoid to use atomic helpers.
+ */
+static void update_reference_mapping(struct page_info *page,
+ lpae_t old_entry,
+ lpae_t new_entry)
+{
+if ( p2m_valid(old_entry) && !p2m_valid(new_entry) )
+page->u.inuse.p2m_refcount--;
+else if ( !p2m_valid(old_entry) && p2m_valid(new_entry) )
+page->u.inuse.p2m_refcount++;
+}
+
 static int apply_p2m_changes(struct domain *d,
  enum p2m_operation op,
  paddr_t start_gpaddr,
@@ -961,6 +977,8 @@ static int apply_p2m_changes(struct domain *d,
 const bool_t preempt = !is_idle_vcpu(current);
 bool_t flush = false;
 bool_t flush_pt;
+PAGE_LIST_HEAD(free_pages);
+struct page_info *pg;
 
 /* Some IOMMU don't support coherent PT walk. When the p2m is
  * shared with the CPU, Xen has to make sure that the PT changes have
@@ -1070,6 +1088,7 @@ static int apply_p2m_changes(struct domain *d,
 {
 unsigned offset = offsets[level];
 lpae_t *entry = &mappings[level][offset];
+lpae_t old_entry = *entry;
 
 ret = apply_one_level(d, entry,
   level, flush_pt, op,
@@ -1078,6 +1097,10 @@ static int apply_p2m_changes(struct domain *d,
   mattr, t, a);
 if ( ret < 0 ) { rc = ret ; goto out; }
 count += ret;
+
+if ( ret != P2M_ONE_PROGRESS_NOP )
+update_reference_mapping(pages[level], old_entry, *entry);
+
 /* L3 had better have done something! We cannot descend any 
further */
 BUG_ON(level == 3 && ret == P2M_ONE_DESCEND);
 if ( ret != P2M_ONE_DESCEND ) break;
@@ -1099,6 +1122,45 @@ static int apply_p2m_changes(struct domain *d,
 }
 /* else: next level already valid */
 }
+
+BUG_ON(level > 3);
+
+if ( op == REMOVE )
+{
+for ( ; level > P2M_ROOT_LEVEL; level-- )
+{
+lpae_t old_entry;
+lpae_t *entry;
+unsigned int offset;
+
+  

[Xen-devel] [PATCH v3 2/4] xen/arm: p2m: Store the page for each mapping

2015-12-01 Thread Julien Grall
The page will be used later for reference counting, so we need quick
access to the page associated with the mapping.

Signed-off-by: Julien Grall 
Acked-by: Ian Campbell 

---

Changes in v2:
- Add Ian's acked-by
---
 xen/arch/arm/p2m.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index f910cab..f28ae3f 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -942,6 +942,7 @@ static int apply_p2m_changes(struct domain *d,
 int rc, ret;
 struct p2m_domain *p2m = &d->arch.p2m;
 lpae_t *mappings[4] = { NULL, NULL, NULL, NULL };
+struct page_info *pages[4] = { NULL, NULL, NULL, NULL };
 paddr_t addr, orig_maddr = maddr;
 unsigned int level = 0;
 unsigned int cur_root_table = ~0;
@@ -964,7 +965,10 @@ static int apply_p2m_changes(struct domain *d,
 
 /* Static mapping. P2M_ROOT_PAGES > 1 are handled below */
 if ( P2M_ROOT_PAGES == 1 )
+{
 mappings[P2M_ROOT_LEVEL] = __map_domain_page(p2m->root);
+pages[P2M_ROOT_LEVEL] = p2m->root;
+}
 
 addr = start_gpaddr;
 while ( addr < end_gpaddr )
@@ -1047,6 +1051,7 @@ static int apply_p2m_changes(struct domain *d,
 unmap_domain_page(mappings[P2M_ROOT_LEVEL]);
 mappings[P2M_ROOT_LEVEL] =
 __map_domain_page(p2m->root + root_table);
+pages[P2M_ROOT_LEVEL] = p2m->root + root_table;
 cur_root_table = root_table;
 /* Any mapping further down is now invalid */
 for ( i = P2M_ROOT_LEVEL; i < 4; i++ )
@@ -1079,6 +1084,7 @@ static int apply_p2m_changes(struct domain *d,
 if ( mappings[level+1] )
 unmap_domain_page(mappings[level+1]);
 mappings[level+1] = map_domain_page(_mfn(entry->p2m.base));
+pages[level+1] = mfn_to_page(entry->p2m.base);
 cur_offset[level] = offset;
 /* Any mapping further down is now invalid */
 for ( i = level+1; i < 4; i++ )
-- 
2.1.4




[Xen-devel] Regression: Xen guest with 5G of RAM on 32bit fail to boot

2015-12-01 Thread Anthony PERARD
Hi,

Under Xen, a guest with 5G of RAM, with a 32bit QEMU binary (well, with a
32bit dom0), does not boot anymore. QEMU calls abort() with "Bad ram offset efffd000".

This issue first appeared in 4ed023ce2a39ab5812d33cf4d819def168965a7f (Round
up RAMBlock sizes to host page sizes).

The problem is in qemu_ram_alloc_internal(), where 'size' and 'maxsize' are
now truncated to 32 bits because 'qemu_host_page_size' is a uintptr_t
in the HOST_PAGE_ALIGN macro.

ram_addr_t is uint64_t when compiled with --enable-xen.

Regards,

-- 
Anthony PERARD
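
The kind of truncation reported above can be reproduced in a few lines. This is only a sketch under assumptions: the HOST_PAGE_ALIGN macro is not quoted in the message, so the align-up expression below is an assumed shape, and uint32_t stands in for uintptr_t on a 32-bit host. All identifiers ending in _sketch are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for uintptr_t on a 32-bit host. */
typedef uint32_t host_uintptr_sketch_t;

const host_uintptr_sketch_t page_size_sketch = 4096;
const host_uintptr_sketch_t page_mask_sketch = ~(host_uintptr_sketch_t)(4096 - 1);

/* Align with a 32-bit mask: the mask zero-extends and clears the upper 32 bits. */
uint64_t align_with_narrow_mask(uint64_t size)
{
    return (size + page_size_sketch - 1) & page_mask_sketch;
}

/* Align entirely in 64-bit arithmetic: the full value is preserved. */
uint64_t align_with_wide_mask(uint64_t size)
{
    uint64_t page = page_size_sketch;

    return (size + page - 1) & ~(page - 1);
}

void example_truncation(void)
{
    uint64_t five_gib = 5ULL << 30;   /* 0x140000000 */

    assert(align_with_narrow_mask(five_gib) == 0x40000000ULL);   /* 1 GiB left */
    assert(align_with_wide_mask(five_gib) == 0x140000000ULL);    /* unchanged */
}
```

Sizes below 4 GiB are unaffected by the narrow mask, which would be consistent with the failure only showing up for larger guests.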



Re: [Xen-devel] [PATCH v2] libxl: Introduce a template for devices with a controller

2015-12-01 Thread Chun Yan Liu


>>> On 12/1/2015 at 08:09 PM, in message
<1448971798-3498-1-git-send-email-george.dun...@eu.citrix.com>, George Dunlap
 wrote: 
> We have several outstanding patch series which add devices that have 
> two levels: a controller and individual devices attached to that 
> controller. 
>  
> In the interest of consistency, this patch introduces a section that 
> sketches out a template for interfaces for such devices. 

Acked.

>  
> Signed-off-by: George Dunlap  
> Acked-by: Juergen Gross  
> --- 
> CC: Ian Campbell  
> CC: Ian Jackson  
> CC: Wei Liu  
> CC: Juergen Gross  
> CC: Chun Yan Liu  
> CC: Olaf Hering  
>  
> Changes in v2: 
> - Fixed typos 
>  
> Changes in v1 (since the RFC): 
>  
> - Use  rather than , and  rather than specifying 
>   controller and device.  The idea being to allow SCSI to use 
>   terminology more natural to it (i.e., scsihost, scsitarget, scsilun) 
>   rather than naming things after USB (controller & device). 
>  
> - Do not require each  to have a deviceid, but just a unique 
>   naming schema. 
>  
> - Allow multiple levels. 
>  
> - Include the paragraph about domain configuration lists. 
> --- 
>  tools/libxl/libxl.h | 65  
> + 
>  1 file changed, 65 insertions(+) 
>  
> diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h 
> index 6b73848..44e2951 100644 
> --- a/tools/libxl/libxl.h 
> +++ b/tools/libxl/libxl.h 
> @@ -1396,6 +1396,71 @@ void libxl_vtpminfo_list_free(libxl_vtpminfo *, int  
> nr_vtpms); 
>   * 
>   *   This function does not interact with the guest and therefore 
>   *   cannot block on the guest. 
> + * 
> + * Controllers 
> + * --- 
> + * 
> + * Most devices are treated individually.  Some classes of device, 
> + * however, like USB or SCSI, inherently have the need to have a 
> + * hierarchy of different levels, with lower-level devices "attached" 
> + * to higher-level ones.  USB for instance has "controllers" at the 
> + * top, which have buses, on which are devices, which consist of 
> + * multiple interfaces.  SCSI has "hosts" at the top, then buses, 
> + * targets, and LUNs. 
> + * 
> + * In that case, for each <class>, there will be a set of functions
> + * and types for each <level>.  For example, for <class>=usb, there
> + * may be <levels> ctrl (controller) and dev (device), with ctrl being
> + * level 0.
> + *
> + * libxl_device_<class><level0>_<function> will act more or
> + * less like top-level non-bus devices: they will either create or
> + * accept a libxl_devid which will be unique within the
> + * <class><level0> libxl_devid namespace.
> + *
> + * Lower-level devices must have a unique way to be identified.  One
> + * way to do this would be to name it via the name of the next level
> + * up plus an index; for instance, <ctrl devid, port number>.  Another
> + * way would be to have another devid namespace for that level.  This
> + * identifier will be used for queries and removals.
> + *
> + * Lower-level devices will include in their
> + * libxl_device_<class><level> struct a field referring to the unique
> + * index of the level above.  For instance, libxl_device_usbdev might
> + * contain the controller devid. 
> + * 
> + * In the case where there are multiple different ways to implement a 
> + * given device -- for instance, one which is fully PV and one which 
> + * uses an emulator -- the controller will contain a field which 
> + * specifies what type of implementation is used.  The implementations 
> + * of individual devices will be known by the controller to which they 
> + * are attached. 
> + * 
> + * If libxl_device_<class><level>_add receives an empty reference to
> + * the level above, it may return an error.  Or it may (but is not 
> + * required to) automatically choose a suitable device in the level 
> + * above to which to attach the new device at this level.  It may also 
> + * (but is not required to) automatically create a new device at the 
> + * level above if no suitable devices exist.  Each class should 
> + * document its behavior. 
> + * 
> + * libxl_device_<class><level>_list will list all devices of <class>
> + * at <level> in the domain.  For example, libxl_device_usbctrl_list
> + * will list all usb controllers; libxl_class_usbdev_list will list 
> + * all usb devices across all controllers. 
> + * 
> + * For each class, the domain config file will contain a single list 
> + * for each level.  libxl will first iterate through the list of 
> + * top-level devices, then iterate through each level down in turn, 
> + * adding devices to devices in the level above.  For instance, there 
> + * will be one list for all usb controllers, and one list for all usb 
> + * devices. 
> + *  
> + * If libxl_device_<class><level>_add automatically creates
> + * higher-level devices as necessary, then it is permissible for the 
> + * higher-level lists to be empty and the device list to have devices 
> + * 

Re: [Xen-devel] [RFC PATCHv2 0/3]: x86/ept: reduce translation invalidation impact

2015-12-01 Thread Tian, Kevin
> From: David Vrabel [mailto:david.vra...@citrix.com]
> Sent: Saturday, November 14, 2015 2:50 AM
> 
> This RFC series improves the performance of EPT by reducing the impact
> of the translation invalidations (ept_sync_domain()).  Two approaches
> are used:
> 
> a) Removing unnecessary invalidations after fixing misconfigured
>entries (after a type change).
> 
> b) Deferring invalidations until the p2m write lock is released.

Do you have a sense which one above incurs more overhead?

> 
> Prior to this change a 16 VCPU guest could not be successfully
> migrated on an (admittedly slow) 160 PCPU box because the p2m write
> lock was held for such extended periods of time.  This starved the
> read lock needed (by the toolstack) to map the domain's memory,
> triggering the watchdog.
> 
> After this change a 64 VCPU guest could be successfully migrated.
> 
> ept_sync_domain() is very expensive because:
> 
> a) it uses on_selected_cpus() and the IPI cost can be particularly
>high for a multi-socket machine.
> 
> b) on_selected_cpus() is serialized by its own spin lock.
> 
> On this particular box, ept_sync_domain() could take ~3-5 ms.
> 
> Simply using a fair rw lock was not sufficient to resolve this (but it
> was an improvement) as the cost of the ept_sync_domain calls() was
> still delaying the read locks enough for the watchdog to trigger (the
> toolstack maps a batch of 1024 GFNs at a time, which means trying to
> acquire the p2m read lock 1024 times).
> 
> Changes in v2:
> 
> - Use a per-p2m (not per-CPU) list for page table pages to be freed.
> - Hold the write lock while updating the synced_mask.
> 
> David




Re: [Xen-devel] [V2 PATCH 8/9] x86/hvm: pkeys, add xstate support for pkeys

2015-12-01 Thread Han, Huaitong

> Does this even compile?  There is already
> 
> static void *get_xsave_addr(void *xsave, unsigned int xfeature_idx)
> 
> higher in the same file.
> 
> That function should be augmented to take a struct xsave_struct
> *xsave,
> look at whether the representation is compressed or not, and use the
> appropriate offset array.
> 
That is only because I pulled the staging branch before "static void
*get_xsave_addr(void *xsave, unsigned int xfeature_idx)" was added.


Re: [Xen-devel] [PATCHv2 2/3] mm: don't free pages until mm locks are released

2015-12-01 Thread Tian, Kevin
> From: David Vrabel [mailto:david.vra...@citrix.com]
> Sent: Saturday, November 14, 2015 2:50 AM
> 
> If a page is freed without translations being invalidated, and the page is
> subsequently allocated to another domain, a guest with a cached
> translation will still be able to access the page.
> 
> Currently translations are invalidated before releasing the page ref, but
> while still holding the mm locks.  To allow translations to be invalidated
> without holding the mm locks, we need to keep a reference to the page
> for a bit longer in some cases.
> 
> [ This seems difficult to a) verify as correct; and b) difficult to get
> correct in the future.  A better suggestion would be useful.  Perhaps
> using something like pg->tlbflush_needed mechanism that already exists
> for pages from PV guests? ]

Per-page flag looks clean in general, but not an expert here. Tim might
have a better idea.

> 
> Signed-off-by: David Vrabel 
> ---
>  xen/arch/x86/mm/p2m.c | 9 +++--
>  xen/common/memory.c   | 2 +-
>  2 files changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
> index ed0bbd7..e2c82b1 100644
> --- a/xen/arch/x86/mm/p2m.c
> +++ b/xen/arch/x86/mm/p2m.c
> @@ -2758,6 +2758,7 @@ int p2m_add_foreign(struct domain *tdom, unsigned long 
> fgfn,
>  p2m_type_t p2mt, p2mt_prev;
>  unsigned long prev_mfn, mfn;
>  struct page_info *page;
> +struct page_info *prev_page = NULL;
>  int rc;
>  struct domain *fdom;
> 
> @@ -2805,6 +2806,9 @@ int p2m_add_foreign(struct domain *tdom, unsigned long 
> fgfn,
>  prev_mfn = mfn_x(get_gfn(tdom, gpfn, &p2mt_prev));
>  if ( mfn_valid(_mfn(prev_mfn)) )
>  {
> +prev_page = mfn_to_page(_mfn(prev_mfn));
> +get_page(prev_page, tdom);
> +
>  if ( is_xen_heap_mfn(prev_mfn) )
>  /* Xen heap frames are simply unhooked from this phys slot */
>  guest_physmap_remove_page(tdom, gpfn, prev_mfn, 0);
> @@ -2823,14 +2827,15 @@ int p2m_add_foreign(struct domain *tdom, unsigned long
> fgfn,
>   "gpfn:%lx mfn:%lx fgfn:%lx td:%d fd:%d\n",
>   gpfn, mfn, fgfn, tdom->domain_id, fdom->domain_id);
> 
> -put_page(page);
> -
>  /*
>   * This put_gfn for the above get_gfn for prev_mfn.  We must do this
>   * after set_foreign_p2m_entry so another cpu doesn't populate the gpfn
>   * before us.
>   */
>  put_gfn(tdom, gpfn);
> +if ( prev_page )
> +put_page(prev_page);
> +put_page(page);
> 
>  out:
>  if ( fdom )
> diff --git a/xen/common/memory.c b/xen/common/memory.c
> index a3bffb7..571c754 100644
> --- a/xen/common/memory.c
> +++ b/xen/common/memory.c
> @@ -272,8 +272,8 @@ int guest_remove_page(struct domain *d, unsigned long 
> gmfn)
> 
>  guest_physmap_remove_page(d, gmfn, mfn, 0);
> 
> -put_page(page);
>  put_gfn(d, gmfn);
> +put_page(page);
> 
>  return 1;
>  }
> --
> 2.1.4




[Xen-devel] [libvirt test] 65281: regressions - FAIL

2015-12-01 Thread osstest service owner
flight 65281 libvirt real [real]
http://logs.test-lab.xenproject.org/osstest/logs/65281/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-armhf-libvirt             5 libvirt-build         fail REGR. vs. 63340
 test-amd64-amd64-libvirt-vhd    9 debian-di-install     fail REGR. vs. 63340

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt-qcow2  1 build-check(1)        blocked  n/a
 test-armhf-armhf-libvirt-xsm    1 build-check(1)        blocked  n/a
 test-armhf-armhf-libvirt        1 build-check(1)        blocked  n/a
 test-armhf-armhf-libvirt-raw    1 build-check(1)        blocked  n/a
 test-amd64-i386-libvirt        12 migrate-support-check fail   never pass
 test-amd64-i386-libvirt-xsm    12 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt       12 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-xsm   12 migrate-support-check fail   never pass

version targeted for testing:
 libvirt  2340f3ebfbf58f40ffbe159bde0b67c07d7f1ce5
baseline version:
 libvirt  3c7590e0a435d833895fc7b5be489e53e223ad95

Last test of basis    63340  2015-10-28 04:19:47 Z   35 days
Failing since         63352  2015-10-29 04:20:29 Z   34 days   29 attempts
Testing same since    65281  2015-12-01 14:24:19 Z    0 days    1 attempts


People who touched revisions under test:
  Andrea Bolognani 
  Chen Hanxiao 
  Christian Loehle 
  Cole Robinson 
  Daniel P. Berrange 
  Daniel Veillard 
  Dmitry Andreev 
  Erik Skultety 
  Guido Günther 
  Jim Fehlig 
  Jiri Denemark 
  Joao Martins 
  John Ferlan 
  Ján Tomko 
  Laine Stump 
  Luyao Huang 
  Marc-André Lureau 
  Martin Kletzander 
  Maxim Perevedentsev 
  Michal Privoznik 
  Michel Normand 
  Mikhail Feoktistov 
  Nikolay Shirokovskiy 
  Pavel Hrdina 
  Peter Krempa 
  Richard Weinberger 
  Roman Bogorodskiy 
  Stefan Berger 
  Stefan Berger 
  Wang Yufei 
  Wei Jiangang 

jobs:
 build-amd64-xsm                                      pass
 build-armhf-xsm                                      pass
 build-i386-xsm                                       pass
 build-amd64                                          pass
 build-armhf                                          pass
 build-i386                                           pass
 build-amd64-libvirt                                  pass
 build-armhf-libvirt                                  fail
 build-i386-libvirt                                   pass
 build-amd64-pvops                                    pass
 build-armhf-pvops                                    pass
 build-i386-pvops                                     pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm   pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm    pass
 test-amd64-amd64-libvirt-xsm                         pass
 test-armhf-armhf-libvirt-xsm                         blocked
 test-amd64-i386-libvirt-xsm                          pass
 test-amd64-amd64-libvirt                             pass
 test-armhf-armhf-libvirt                             blocked
 test-amd64-i386-libvirt                              pass
 test-amd64-amd64-libvirt-pair                        pass
 test-amd64-i386-libvirt-pair                         pass
 test-armhf-armhf-libvirt-qcow2                       blocked
 test-armhf-armhf-libvirt-raw                         blocked
 test-amd64-amd64-libvirt-vhd                         fail



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at

Re: [Xen-devel] [PATCHv2 1/3] x86/ept: remove unnecessary sync after resolving misconfigured entries

2015-12-01 Thread Tian, Kevin
> From: David Vrabel [mailto:david.vra...@citrix.com]
> Sent: Saturday, November 14, 2015 2:50 AM
> 
> When using EPT, type changes are done with the following steps:
> 
> 1. Set entry as invalid (misconfigured) by settings a reserved memory
> type.
> 
> 2. Flush all EPT and combined translations (ept_sync_domain()).
> 
> 3. Fixup misconfigured entries as required (on EPT_MISCONFIG vmexits or
> when explicitly setting an entry.
> 
> Since resolve_misconfig() only updates entries that were misconfigured,
> there is no need to invalidate any translations since the hardware
> does not cache misconfigured translations (vol 3, section 28.3.2).
> 
> Remove the unnecessary (and very expensive) ept_sync_domain() calls).
> 
> Signed-off-by: David Vrabel 
> Reviewed-by: Andrew Cooper 

Acked-by: Kevin Tian 



[Xen-devel] [PATCH v2] libxc: try to find last used pfn when migrating

2015-12-01 Thread Juergen Gross
For migration the last used pfn of a guest is needed to size the
logdirty bitmap and as an upper bound of the page loop. Unfortunately
there are pv-kernels advertising a much higher maximum pfn than they
are really using in order to support memory hotplug. This leads to
the Xen tools allocating much more memory during migration than is
really needed.

Try to find the last used guest pfn of a pv-domU by scanning the p2m
tree from the last entry towards its start and searching for an entry
that is not invalid.

Normally the mid pages of the p2m tree containing only invalid entries
are reused, so we can just scan the top page for identical
entries and skip all but the first one.
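
The backward scan described above could look roughly like the sketch
below. It assumes a flat in-memory p2m array and a made-up INVALID_ENTRY
marker; last_used_pfn() is a hypothetical name, not a libxc function, and
the skipping of identical mid pages is left out to keep it short:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical marker for an unused p2m slot (not the real libxc constant). */
#define INVALID_ENTRY UINT64_MAX

/*
 * Scan a flat p2m array from its last entry towards its start and return
 * the index of the last entry that is not invalid.  Returns -1 when every
 * entry is invalid or the array is empty.
 */
static long last_used_pfn(const uint64_t *p2m, size_t nr_entries)
{
    assert(p2m != NULL || nr_entries == 0);

    for ( size_t i = nr_entries; i > 0; i-- )
        if ( p2m[i - 1] != INVALID_ENTRY )
            return (long)(i - 1);

    return -1;
}
```

The returned index, rather than the advertised maximum pfn, is what would
then size the logdirty bitmap and bound the page loop.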

Signed-off-by: Juergen Gross 
---
Changes in V2:
- Modified comments regarding setup callback in structures
  xc_sr_save_ops and xc_sr_restore_ops (suggested by Andrew).
- Got rid of calls of xc_domain_maximum_gpfn() especially in the pv
  case (suggested by Wei).

---
 tools/libxc/xc_sr_common.h| 14 -
 tools/libxc/xc_sr_common_x86_pv.c | 19 +-
 tools/libxc/xc_sr_save.c  | 24 ---
 tools/libxc/xc_sr_save_x86_hvm.c  | 14 +
 tools/libxc/xc_sr_save_x86_pv.c   | 41 ---
 5 files changed, 66 insertions(+), 46 deletions(-)

diff --git a/tools/libxc/xc_sr_common.h b/tools/libxc/xc_sr_common.h
index 64f6082..9aecde2 100644
--- a/tools/libxc/xc_sr_common.h
+++ b/tools/libxc/xc_sr_common.h
@@ -54,9 +54,11 @@ struct xc_sr_save_ops
   void **page);
 
 /**
- * Set up local environment to restore a domain.  This is called before
- * any records are written to the stream.  (Typically querying running
- * domain state, setting up mappings etc.)
+ * Set up local environment to save a domain. (Typically querying
+ * running domain state, setting up mappings etc.)
+ *
+ * This is called once before any common setup has occurred, allowing for
+ * guest-specific adjustments to be made to common state.
  */
 int (*setup)(struct xc_sr_context *ctx);
 
@@ -121,8 +123,10 @@ struct xc_sr_restore_ops
 int (*localise_page)(struct xc_sr_context *ctx, uint32_t type, void *page);
 
 /**
- * Set up local environment to restore a domain.  This is called before
- * any records are read from the stream.
+ * Set up local environment to restore a domain.
+ *
+ * This is called once before any common setup has occurred, allowing for
+ * guest-specific adjustments to be made to common state.
  */
 int (*setup)(struct xc_sr_context *ctx);
 
diff --git a/tools/libxc/xc_sr_common_x86_pv.c 
b/tools/libxc/xc_sr_common_x86_pv.c
index eb68c07..f233c87 100644
--- a/tools/libxc/xc_sr_common_x86_pv.c
+++ b/tools/libxc/xc_sr_common_x86_pv.c
@@ -68,8 +68,7 @@ uint64_t mfn_to_cr3(struct xc_sr_context *ctx, xen_pfn_t _mfn)
 int x86_pv_domain_info(struct xc_sr_context *ctx)
 {
 xc_interface *xch = ctx->xch;
-unsigned int guest_width, guest_levels, fpp;
-xen_pfn_t max_pfn;
+unsigned int guest_width, guest_levels;
 
 /* Get the domain width */
 if ( xc_domain_get_guest_width(xch, ctx->domid, &guest_width) )
@@ -89,25 +88,9 @@ int x86_pv_domain_info(struct xc_sr_context *ctx)
 }
 ctx->x86_pv.width = guest_width;
 ctx->x86_pv.levels = guest_levels;
-fpp = PAGE_SIZE / ctx->x86_pv.width;
 
 DPRINTF("%d bits, %d levels", guest_width * 8, guest_levels);
 
-/* Get the domain's size */
-if ( xc_domain_maximum_gpfn(xch, ctx->domid, &max_pfn) < 0 )
-{
-PERROR("Unable to obtain guests max pfn");
-return -1;
-}
-
-if ( max_pfn > 0 )
-{
-ctx->x86_pv.max_pfn = max_pfn;
-ctx->x86_pv.p2m_frames = (ctx->x86_pv.max_pfn + fpp) / fpp;
-
-DPRINTF("max_pfn %#lx, p2m_frames %d", max_pfn, 
ctx->x86_pv.p2m_frames);
-}
-
 return 0;
 }
 
diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c
index 0c12e56..cefcef5 100644
--- a/tools/libxc/xc_sr_save.c
+++ b/tools/libxc/xc_sr_save.c
@@ -677,6 +677,10 @@ static int setup(struct xc_sr_context *ctx)
 DECLARE_HYPERCALL_BUFFER_SHADOW(unsigned long, dirty_bitmap,
 &ctx->save.dirty_bitmap_hbuf);
 
+rc = ctx->save.ops.setup(ctx);
+if ( rc )
+goto err;
+
 dirty_bitmap = xc_hypercall_buffer_alloc_pages(
xch, dirty_bitmap, 
NRPAGES(bitmap_size(ctx->save.p2m_size)));
 ctx->save.batch_pfns = malloc(MAX_BATCH_SIZE *
@@ -692,10 +696,6 @@ static int setup(struct xc_sr_context *ctx)
 goto err;
 }
 
-rc = ctx->save.ops.setup(ctx);
-if ( rc )
-goto err;
-
 rc = 0;
 
  err:
@@ -824,7 +824,6 @@ int xc_domain_save(xc_interface *xch, int io_fd, uint32_t 
dom,
uint32_t max_iters, uint32_t max_factor, uint32_t flags,
struct save_callbacks* callbacks, int hvm)
 {
-xen_pfn_t nr_pfns;
 

[Xen-devel] [ovmf test] 65278: all pass - PUSHED

2015-12-01 Thread osstest service owner
flight 65278 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/65278/

Perfect :-)
All tests in this flight passed
version targeted for testing:
 ovmf 911f3dede219d2bb220954768f5e853e0dd976c1
baseline version:
 ovmf 8786ba4fe7220d24c41bf1386b7186eabda00a0c

Last test of basis    65258  2015-11-30 14:43:23 Z    1 days
Testing same since    65278  2015-12-01 12:24:31 Z    0 days    1 attempts


People who touched revisions under test:
  "Anbazhagan, Baraneedharan" 
  "Paolo Bonzini" 
  "Yao, Jiewen" 
  Anbazhagan, Baraneedharan 
  Laszlo Ersek 
  Michael Kinney 
  Paolo Bonzini 
  Ruiyu Ni 
  Star Zeng 
  Yao, Jiewen 
  Yonghong Zhu 

jobs:
 build-amd64-xsm                                      pass
 build-i386-xsm                                       pass
 build-amd64                                          pass
 build-i386                                           pass
 build-amd64-libvirt                                  pass
 build-i386-libvirt                                   pass
 build-amd64-pvops                                    pass
 build-i386-pvops                                     pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64                 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64                  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

+ branch=ovmf
+ revision=911f3dede219d2bb220954768f5e853e0dd976c1
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x '!=' x/home/osstest/repos/lock ']'
++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock
++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push ovmf 
911f3dede219d2bb220954768f5e853e0dd976c1
+ branch=ovmf
+ revision=911f3dede219d2bb220954768f5e853e0dd976c1
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']'
+ . ./cri-common
++ . ./cri-getconfig
++ umask 002
+ select_xenbranch
+ case "$branch" in
+ tree=ovmf
+ xenbranch=xen-unstable
+ '[' xovmf = xlinux ']'
+ linuxbranch=
+ '[' x = x ']'
+ qemuubranch=qemu-upstream-unstable
+ select_prevxenbranch
++ ./cri-getprevxenbranch xen-unstable
+ prevxenbranch=xen-4.6-testing
+ '[' x911f3dede219d2bb220954768f5e853e0dd976c1 = x ']'
+ : tested/2.6.39.x
+ . ./ap-common
++ : osst...@xenbits.xen.org
+++ getconfig OsstestUpstream
+++ perl -e '
use Osstest;
readglobalconfig();
print $c{"OsstestUpstream"} or die $!;
'
++ :
++ : git://xenbits.xen.org/xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/xen.git
++ : git://xenbits.xen.org/qemu-xen-traditional.git
++ : git://git.kernel.org
++ : git://git.kernel.org/pub/scm/linux/kernel/git
++ : git
++ : git://xenbits.xen.org/libvirt.git
++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git
++ : git://xenbits.xen.org/libvirt.git
++ : git://xenbits.xen.org/rumpuser-xen.git
++ : git
++ : git://xenbits.xen.org/rumpuser-xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/rumpuser-xen.git
+++ besteffort_repo https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ local repo=https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ cached_repo https://github.com/rumpkernel/rumpkernel-netbsd-src 
'[fetch=try]'
+++ local 

Re: [Xen-devel] xen 4.5.0 rtds scheduler perform poorly with 2vms

2015-12-01 Thread Meng Xu
Hi Lars and Dario,

2015-12-01 4:11 GMT-06:00 Lars Kurth :
>
> I wonder whether we need to add some health warnings and recommended 
> background reading to http://wiki.xenproject.org/wiki/RTDS-Based-Scheduler


Maybe we could add a health warning and a link to this discussion?
Misconfiguration of the system will usually cause performance
degradation, even with the other schedulers, such as ARINC653, credit,
and credit2.
What I'm wondering is how much expert information we should expose to
users. Sometimes exposing too much information is not so helpful;
more information can just cause more confusion.

What type of information do you think we should include?

Thanks,

Meng


>
>
> Lars
>
> > On 1 Dec 2015, at 08:59, Dario Faggioli  wrote:
> >
> > On Sun, 2015-11-29 at 11:44 -0500, Meng Xu wrote:
> >> 2015-11-29 11:27 GMT-05:00 Dario Faggioli 
> >> :
> >>>
> >>> Mmmm... As I said many times, I don't remember much of all those RT
> >>> schedulability formulas, but, is really that simple?
> >>
> >> Ah, let me clarify...
> >> It is not that simple. ;-) I just simplify it, hoping it can simplify
> >> the problem and highlight the possible reason.
> >>
> > Ok, glad to know I haven't completely lost my mind, or anything like
> > that! :-)
> >
> >>> I mean, if the in-
> >>> guest scheduling algorithm is global (e.g., global-EDF), the task
> >>> could
> >>> migrate, couldn't it?
> >>
> >> Yes. If these partial VCPUs happen to be scheduled "sequentially",
> >> the
> >> OS inside VM can migrate the task and make the task keep running. But
> >> that is not the worst-case for the OS.
> >>
> > Right, I see it now, and (FWIW) I absolutely agree with the worst-case
> > analysis you provided (thanks). I did not get the fact that you were
> > talking about the worst-case, sorry for the noise. :-D
> >
> >> The detailed illustration of the worst case scenario is at Arvind's
> >> paper: http://link.springer.com/article/10.1007%2Fs11241-009-9073-x
> >> My latest journal paper
> >> (http://link.springer.com/article/10.1007%2Fs11241-015-9223-2)
> >> tighten
> >> the resource supply bound function of the MPR model. I believe the
> >> equations are too boring to most of people in the mailing list.
> >>
> >> So let's avoid the complex equations here. ;-)
> >>
> > Thanks for this too! :-)
> >
> > Regards,
> > Dario
> > --
> > <> (Raistlin Majere)
> > -
> > Dario Faggioli, Ph.D, http://about.me/dario.faggioli
> > Senior Software Engineer, Citrix Systems R Ltd., Cambridge (UK)
> >
> > ___
> > Xen-devel mailing list
> > Xen-devel@lists.xen.org
> > http://lists.xen.org/xen-devel
>



-- 


---
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania
http://www.cis.upenn.edu/~mengxu/



Re: [Xen-devel] Regression: Xen guest with 5G of RAM on 32bit fail to boot

2015-12-01 Thread Dr. David Alan Gilbert
* Anthony PERARD (anthony.per...@citrix.com) wrote:
> Hi,
> 
> Under Xen, a guest with 5G of RAM, with a 32bit binary QEMU (well, with a
> 32bit dom0) does not boot anymore. QEMU abort() with "Bad ram offset 
> efffd000".
> 
> This issue first appear in 4ed023ce2a39ab5812d33cf4d819def168965a7f (Round
> up RAMBlock sizes to host page sizes).
> 
> The problem is in qemu_ram_alloc_internal() where 'size' and 'maxsize' are
> now been truncate to 32bit, due to 'qemu_host_page_size' been an uintptr_t
> in the HOST_PAGE_ALIGN macro.
> 
> ram_add_t is uint64_t when compiled with --enable-xen.

Hmm, that's a fun problem.
Would changing qemu_host_page_[size|mask] to ram_addr_t  work?

Dave

> 
> Regards,
> 
> -- 
> Anthony PERARD
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Xen-devel] Build problems with xen 4.7

2015-12-01 Thread M A Young
On Tue, 1 Dec 2015, Jan Beulich wrote:

> I.e. there must be something different in how make gets invoked or
> the environment set up.

It happens if CFLAGS is set to anything as an environment variable, e.g.
export CFLAGS=" "
make dist-xen

Michael Young



Re: [Xen-devel] Regression: Xen guest with 5G of RAM on 32bit fail to boot

2015-12-01 Thread Anthony PERARD
On Tue, Dec 01, 2015 at 06:37:36PM +, Dr. David Alan Gilbert wrote:
> * Anthony PERARD (anthony.per...@citrix.com) wrote:
> > Hi,
> > 
> > Under Xen, a guest with 5G of RAM, with a 32bit binary QEMU (well, with a
> > 32bit dom0) does not boot anymore. QEMU abort() with "Bad ram offset 
> > efffd000".
> > 
> > This issue first appear in 4ed023ce2a39ab5812d33cf4d819def168965a7f (Round
> > up RAMBlock sizes to host page sizes).
> > 
> > The problem is in qemu_ram_alloc_internal() where 'size' and 'maxsize' are
> > now been truncate to 32bit, due to 'qemu_host_page_size' been an uintptr_t
> > in the HOST_PAGE_ALIGN macro.
> > 
> > ram_add_t is uint64_t when compiled with --enable-xen.
> 
> Hmm, that's a fun problem.
> Would changing qemu_host_page_[size|mask] to ram_addr_t  work?

Yes, well, I did change the type to uint64_t and I could boot a guest. With
ram_addr_t, it works fine as well.
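
For anyone following along, here is a small standalone sketch of the
truncation. It assumes the alignment is a plain add-and-mask and models a
32-bit host's uintptr_t with uint32_t; the type and function names are
made up for illustration and are not QEMU's:

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096u

/* Stand-in for uintptr_t on a 32-bit host (assumption for illustration). */
typedef uint32_t host_uintptr_t;
/* Stand-in for a 64-bit ram_addr_t, as with --enable-xen. */
typedef uint64_t ram_addr_sketch_t;

/*
 * Page mask kept in a 32-bit variable: it is zero-extended to
 * 0x00000000fffff000 in the 64-bit AND, so bits 32 and up are cleared.
 */
static ram_addr_sketch_t align_with_narrow_mask(ram_addr_sketch_t addr)
{
    host_uintptr_t mask = ~(host_uintptr_t)(PAGE_SIZE_BYTES - 1);

    return (addr + PAGE_SIZE_BYTES - 1) & mask;
}

/* Page mask kept in the 64-bit address type: the upper bits survive. */
static ram_addr_sketch_t align_with_wide_mask(ram_addr_sketch_t addr)
{
    ram_addr_sketch_t mask = ~(ram_addr_sketch_t)(PAGE_SIZE_BYTES - 1);

    return (addr + PAGE_SIZE_BYTES - 1) & mask;
}
```

With a 5G size (0x140000000) the narrow variant yields 0x40000000, while
the wide variant leaves the value unchanged.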

-- 
Anthony PERARD



Re: [Xen-devel] [PATCH v3 0/2] block/xen-blkfront: Support non-indirect grant with 64KB page granularity

2015-12-01 Thread Konrad Rzeszutek Wilk
On Tue, Dec 01, 2015 at 05:55:48PM +, Julien Grall wrote:
> Hi Konrad,
> 
> On 01/12/15 15:37, Konrad Rzeszutek Wilk wrote:
> > On Wed, Nov 18, 2015 at 06:57:23PM +, Julien Grall wrote:
> >> Hi all,
> >>
> >> This is a follow-up on the previous discussion [1] related to guest using 
> >> 64KB
> >> page granularity which doesn't boot when the backend isn't using indirect
> >> descriptor.
> >>
> >> This has been successfully tested on ARM64 with both 64KB and 4KB page
> >> granularity guests and QEMU as the backend. Indeed QEMU doesn't support
> >> indirect descriptor.
> >>
> >> This series is based on xentip/for-linus-4.4 which include the support for
> >> 64KB Linux guest.
> > 
> > In the meantime the multi-queue patches have been put in the queue
> > 
> > git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git 
> > #devel/for-jens-4.5
> > 
> > I will try rebasing the patches on top of that.
> 
> It will likely clash with the multiqueue changes. I will rebase this
> patch series and resend it.

I got patch #1 ported over (see attached). Testing it now.
> 
> Regards,
> 
> -- 
> Julien Grall
From ebbda22e54e6557188298a3e1d6c0dcf4b04da26 Mon Sep 17 00:00:00 2001
From: Julien Grall 
Date: Wed, 18 Nov 2015 18:57:24 +
Subject: [PATCH] block/xen-blkfront: Introduce blkif_ring_get_request
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The code to get a request is always the same. Therefore we can factorize
it in a single function.

Signed-off-by: Julien Grall 
Acked-by: Roger Pau Monné 
Signed-off-by: Konrad Rzeszutek Wilk 
---
 drivers/block/xen-blkfront.c | 30 --
 1 file changed, 20 insertions(+), 10 deletions(-)

diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index 4f77d36..38af260 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -481,6 +481,24 @@ static int blkif_ioctl(struct block_device *bdev, fmode_t 
mode,
return 0;
 }
 
+static unsigned long blkif_ring_get_request(struct blkfront_ring_info *rinfo,
+   struct request *req,
+   struct blkif_request **ring_req)
+{
+   unsigned long id;
+   struct blkfront_info *info = rinfo->dev_info;
+
+   *ring_req = RING_GET_REQUEST(&rinfo->ring, rinfo->ring.req_prod_pvt);
+   rinfo->ring.req_prod_pvt++;
+
+   id = get_id_from_freelist(rinfo);
+   rinfo->shadow[id].request = req;
+
+   (*ring_req)->u.rw.id = id;
+
+   return id;
+}
+
 static int blkif_queue_discard_req(struct request *req, struct 
blkfront_ring_info *rinfo)
 {
struct blkfront_info *info = rinfo->dev_info;
@@ -488,9 +506,7 @@ static int blkif_queue_discard_req(struct request *req, 
struct blkfront_ring_inf
unsigned long id;
 
/* Fill out a communications ring structure. */
-   ring_req = RING_GET_REQUEST(&rinfo->ring, rinfo->ring.req_prod_pvt);
-   id = get_id_from_freelist(rinfo);
-   rinfo->shadow[id].request = req;
+   id = blkif_ring_get_request(rinfo, req, &ring_req);
 
ring_req->operation = BLKIF_OP_DISCARD;
ring_req->u.discard.nr_sectors = blk_rq_sectors(req);
@@ -501,8 +517,6 @@ static int blkif_queue_discard_req(struct request *req, 
struct blkfront_ring_inf
else
ring_req->u.discard.flag = 0;
 
-   rinfo->ring.req_prod_pvt++;
-
/* Keep a private copy so we can reissue requests when recovering. */
rinfo->shadow[id].req = *ring_req;
 
@@ -635,9 +649,7 @@ static int blkif_queue_rw_req(struct request *req, struct 
blkfront_ring_info *ri
}
 
/* Fill out a communications ring structure. */
-   ring_req = RING_GET_REQUEST(&rinfo->ring, rinfo->ring.req_prod_pvt);
-   id = get_id_from_freelist(rinfo);
-   rinfo->shadow[id].request = req;
+   id = blkif_ring_get_request(rinfo, req, &ring_req);
 
BUG_ON(info->max_indirect_segments == 0 &&
   GREFS(req->nr_phys_segments) > BLKIF_MAX_SEGMENTS_PER_REQUEST);
@@ -716,8 +728,6 @@ static int blkif_queue_rw_req(struct request *req, struct 
blkfront_ring_info *ri
if (setup.segments)
kunmap_atomic(setup.segments);
 
-   rinfo->ring.req_prod_pvt++;
-
/* Keep a private copy so we can reissue requests when recovering. */
rinfo->shadow[id].req = *ring_req;
 
-- 
2.1.0



Re: [Xen-devel] [PATCHv2] 1/3] libxc: prefer using privcmd character device

2015-12-01 Thread Doug Goldstein
On 12/1/15 5:46 AM, Ian Campbell wrote:
> On Tue, 2015-11-24 at 14:14 -0600, Doug Goldstein wrote:
>> Prefer using the character device over the proc file if the character
>> device exists. This follows similar conversions of xenbus to avoid
>> issues with FMODE_ATOMIC_POS added in Linux 3.14 and newer.
>>
>> CC: Ian Jackson 
>> CC: Stefano Stabellini 
>> CC: Ian Campbell 
>> CC: Wei Liu 
>> Signed-off-by: Doug Goldstein 
>> ---
>>  tools/libxc/xc_linux_osdep.c | 9 -
>>  1 file changed, 8 insertions(+), 1 deletion(-)
>>
>> diff --git a/tools/libxc/xc_linux_osdep.c b/tools/libxc/xc_linux_osdep.c
>> index 76c55ff..c078b3d 100644
>> --- a/tools/libxc/xc_linux_osdep.c
>> +++ b/tools/libxc/xc_linux_osdep.c
>> @@ -46,7 +46,14 @@
>>  static xc_osdep_handle linux_privcmd_open(xc_interface *xch)
>>  {
>>  int flags, saved_errno;
>> -int fd = open("/proc/xen/privcmd", O_RDWR);
>> +int fd = open("/dev/xen/privcmd", O_RDWR); /* prefer this newer
>> interface */
>> +
>> +if ( fd == -1 && ( errno == ENOENT || errno == ENXIO ||
>> +   errno == ENODEV || errno == EACCES ))
> 
> This adds EACCESS to the set Ian suggested would be tolerable in his reply
> to v1. I'm leaning towards thinking that if the device is present but not
> openable by the current user then that's a system configuration error which
> should be reported.
> 
> Anyone want to argue otherwise?
> 

I'll drop EACCES. I was just trying to be proactive for another possible
error.
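
To make the agreed behaviour concrete, here is a minimal sketch of the
fallback with EACCES left out, so that a device which exists but cannot
be opened is reported rather than papered over. open_with_fallback() is a
hypothetical helper taking the two paths as parameters; it is not the
actual libxc code:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Try the preferred node first and fall back to the legacy one only when
 * the preferred node is absent (ENOENT, ENXIO or ENODEV).  Any other
 * error, EACCES included, is returned to the caller with errno intact.
 */
static int open_with_fallback(const char *preferred, const char *legacy)
{
    int fd = open(preferred, O_RDWR);

    if ( fd == -1 &&
         (errno == ENOENT || errno == ENXIO || errno == ENODEV) )
        fd = open(legacy, O_RDWR);

    return fd;
}
```

A caller would pass "/dev/xen/privcmd" as the preferred path and
"/proc/xen/privcmd" as the legacy one.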

-- 
Doug Goldstein





Re: [Xen-devel] [PATCH] xen-pciback: fix up cleanup path when alloc fails

2015-12-01 Thread Doug Goldstein
On 12/1/15 10:47 AM, Konrad Rzeszutek Wilk wrote:
> On Thu, Nov 26, 2015 at 02:32:39PM -0600, Doug Goldstein wrote:
>> When allocating a pciback device fails, avoid the possibility of a
>> use after free.
> 
> Reviewed-by: Konrad Rzeszutek Wilk 
> 
> Ugh, and it looks like xen-blkfront has the same issue.

I believe that case is covered because xen_blkbk_remove() is called in
all the failure cases of xen_blkbk_probe() in that case.

> 
>>
>> Reported-by: Jonathan Creekmore 
>> Signed-off-by: Doug Goldstein 
>> ---
>>  drivers/xen/xen-pciback/xenbus.c | 4 +++-
>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/xen/xen-pciback/xenbus.c 
>> b/drivers/xen/xen-pciback/xenbus.c
>> index 98bc345..4843741 100644
>> --- a/drivers/xen/xen-pciback/xenbus.c
>> +++ b/drivers/xen/xen-pciback/xenbus.c
>> @@ -44,7 +44,6 @@ static struct xen_pcibk_device *alloc_pdev(struct 
>> xenbus_device *xdev)
>>  dev_dbg(&xdev->dev, "allocated pdev @ 0x%p\n", pdev);
>>  
>>  pdev->xdev = xdev;
>> -dev_set_drvdata(&xdev->dev, pdev);
>>  
>>  mutex_init(&pdev->dev_lock);
>>  
>> @@ -58,6 +57,9 @@ static struct xen_pcibk_device *alloc_pdev(struct 
>> xenbus_device *xdev)
>>  kfree(pdev);
>>  pdev = NULL;
>>  }
>> +
>> +dev_set_drvdata(&xdev->dev, pdev);
>> +
>>  out:
>>  return pdev;
>>  }
>> -- 
>> 2.4.10
>>
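
The ordering change in the quoted patch can be sketched in isolation as
follows. This is a userspace model with made-up types (device_sketch,
pdev_sketch) and an init_fails flag standing in for a failing
initialisation step; it is not the pciback code:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the device and its driver-private data. */
struct device_sketch { void *drvdata; };
struct pdev_sketch { struct device_sketch *owner; };

/*
 * Publish the private data only after initialisation has finished, so
 * the device never keeps a pointer to memory that was already freed.
 */
static struct pdev_sketch *alloc_pdev_sketch(struct device_sketch *dev,
                                             int init_fails)
{
    struct pdev_sketch *pdev = calloc(1, sizeof(*pdev));

    if ( pdev == NULL )
        goto out;

    pdev->owner = dev;

    if ( init_fails )
    {
        free(pdev);
        pdev = NULL;
    }

    /* Either a fully set-up pdev or NULL is stored, never a stale pointer. */
    dev->drvdata = pdev;

 out:
    return pdev;
}
```

Had the pointer been stored before the failing step, dev->drvdata would
still refer to the freed allocation afterwards.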


-- 
Doug Goldstein





Re: [Xen-devel] [PATCH] x86/time: Don't use EFI's GetTime call by default

2015-12-01 Thread Andrew Cooper
On 01/12/15 17:26, Jan Beulich wrote:
 On 01.12.15 at 17:57,  wrote:
>> When EFI is used, don't use EFI's GetTime() to get the time, because it
>> is broken on many platforms. From Linux commit 7efe665903d0 ("rtc:
>> Disable EFI rtc for x86"):
>> "Disable it explicitly for x86 so that we don't give users false
>> hope that this driver will work - it won't, and your machine is likely
>> to crash."
>>
>> Signed-off-by: Ross Lagerwall 
> NAK, since being conceptually wrong (and both of my systems work
> fine). Vendors should get their firmware fixed, and by not using
> runtime service functions we would give them even less reason to
> do so. Until then we have "efi=no-rs".

This is completely unreasonable.

It is not conceptually wrong.  GetTime() is very well known to be
completely broken, especially after ExitBootServices(), to the point that
every other EFI implementation (including Windows) completely avoids it.

The fact that your two systems don't crash immediately is curious, but
they are not representative of systems in general.  Not a single
Broadwell or Skylake platform I have access to boots in EFI mode if
GetTime() is used (and these come from 4 different manufacturers).

Vendors will not fix their firmware.  Disabling all runtime services is
not a reasonable alternative.

This is a firmware bug just like many others and needs to be worked
around by default like the others.

Anything else is actively damaging to the Xen community.  People just
get frustrated when it doesn't work (especially if the problem has been
identified and a fix rejected upstream) and will move elsewhere instead.

Any situation where a command line override is required to make Xen boot
is a bug in Xen and should be fixed.  This is why we have __init time
quirks.  It doesn't matter if we have some truly horrendous workarounds;
Xen needs to be able to boot by default wherever possible.

~Andrew



[Xen-devel] [PATCHv3 3/3] xendomains initscript: test for privcmd char device

2015-12-01 Thread Doug Goldstein
Allow the init script to continue if either the character device or the
proc file is available.

CC: Ian Jackson 
CC: Stefano Stabellini 
CC: Ian Campbell 
CC: Wei Liu 
Signed-off-by: Doug Goldstein 
Acked-by: Ian Jackson 
---
 tools/hotplug/Linux/xendomains.in | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/hotplug/Linux/xendomains.in 
b/tools/hotplug/Linux/xendomains.in
index 0603842..686f061 100644
--- a/tools/hotplug/Linux/xendomains.in
+++ b/tools/hotplug/Linux/xendomains.in
@@ -45,7 +45,7 @@ fi
 
 # Correct exit code would probably be 5, but it's enough
 # if xend complains if we're not running as privileged domain
-if ! [ -e /proc/xen/privcmd ]; then
+if ! [ -e /dev/xen/privcmd ] && ! [ -e /proc/xen/privcmd ]; then
exit 0
 fi
 
-- 
2.4.10




[Xen-devel] [PATCHv3 1/3] libxc: prefer using privcmd character device

2015-12-01 Thread Doug Goldstein
Prefer using the character device over the proc file if the character
device exists. This follows similar conversions of xenbus to avoid
issues with FMODE_ATOMIC_POS added in Linux 3.14 and newer.

CC: Ian Jackson 
CC: Stefano Stabellini 
CC: Ian Campbell 
CC: Wei Liu 
Signed-off-by: Doug Goldstein 
---
 tools/libxc/xc_linux_osdep.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/tools/libxc/xc_linux_osdep.c b/tools/libxc/xc_linux_osdep.c
index 76c55ff..c3a3a14 100644
--- a/tools/libxc/xc_linux_osdep.c
+++ b/tools/libxc/xc_linux_osdep.c
@@ -46,7 +46,13 @@
 static xc_osdep_handle linux_privcmd_open(xc_interface *xch)
 {
 int flags, saved_errno;
-int fd = open("/proc/xen/privcmd", O_RDWR);
+int fd = open("/dev/xen/privcmd", O_RDWR); /* prefer this newer interface */
+
+if ( fd == -1 && ( errno == ENOENT || errno == ENXIO || errno == ENODEV ))
+{
+/* Fallback to /proc/xen/privcmd */
+fd = open("/proc/xen/privcmd", O_RDWR);
+}
 
 if ( fd == -1 )
 {
-- 
2.4.10
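
For illustration only, the fallback described in this patch can be sketched as a standalone helper. This is a sketch under stated assumptions, not the libxc code: open_with_fallback() and its two path parameters are hypothetical names, and the set of errno values treated as "device node absent" is simply copied from the hunk above.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of libxc): try the preferred node first
 * and fall back to the legacy node only when the preferred one appears
 * to be absent. Any other error (e.g. EACCES) is reported unchanged.
 */
static int open_with_fallback(const char *preferred, const char *legacy)
{
    int fd = open(preferred, O_RDWR);

    if ( fd == -1 &&
         (errno == ENOENT || errno == ENXIO || errno == ENODEV) )
        fd = open(legacy, O_RDWR);

    return fd; /* -1 with errno set if both attempts failed */
}
```

Restricting the fallback to the "not there" errors means a permission problem on the character device is still reported to the caller rather than being masked by a second open attempt.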




[Xen-devel] [PATCHv3 2/3] update outdated header comment on privcmd.h

2015-12-01 Thread Doug Goldstein
The BSDs have always accessed privcmd via /dev/xen/privcmd while Linux
has used /proc/xen/privcmd but things are shifting to /dev/xen/privcmd
as well.

CC: Ian Jackson 
CC: Stefano Stabellini 
CC: Ian Campbell 
CC: Wei Liu 
Signed-off-by: Doug Goldstein 
Acked-by: Ian Jackson 
---
 tools/include/xen-sys/FreeBSD/privcmd.h| 2 +-
 tools/include/xen-sys/Linux/privcmd.h  | 2 +-
 tools/include/xen-sys/NetBSD/privcmd.h | 2 +-
 tools/include/xen-sys/NetBSDRump/privcmd.h | 2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/tools/include/xen-sys/FreeBSD/privcmd.h 
b/tools/include/xen-sys/FreeBSD/privcmd.h
index 0434d4d..cf1241f 100644
--- a/tools/include/xen-sys/FreeBSD/privcmd.h
+++ b/tools/include/xen-sys/FreeBSD/privcmd.h
@@ -1,7 +1,7 @@
 /**
  * privcmd.h
  *
- * Interface to /proc/xen/privcmd.
+ * Interface to /dev/xen/privcmd.
  *
  * Copyright (c) 2003-2005, K A Fraser
  *
diff --git a/tools/include/xen-sys/Linux/privcmd.h 
b/tools/include/xen-sys/Linux/privcmd.h
index 5be860a..e4e666a 100644
--- a/tools/include/xen-sys/Linux/privcmd.h
+++ b/tools/include/xen-sys/Linux/privcmd.h
@@ -1,7 +1,7 @@
 /**
  * privcmd.h
  * 
- * Interface to /proc/xen/privcmd.
+ * Interface to /dev/xen/privcmd.
  * 
  * Copyright (c) 2003-2005, K A Fraser
  * 
diff --git a/tools/include/xen-sys/NetBSD/privcmd.h 
b/tools/include/xen-sys/NetBSD/privcmd.h
index 1296b30..555bad9 100644
--- a/tools/include/xen-sys/NetBSD/privcmd.h
+++ b/tools/include/xen-sys/NetBSD/privcmd.h
@@ -30,7 +30,7 @@
 #ifndef __NetBSD_PRIVCMD_H__
 #define __NetBSD_PRIVCMD_H__
 
-/* Interface to /proc/xen/privcmd */
+/* Interface to /dev/xen/privcmd */
 
 typedef struct privcmd_hypercall
 {
diff --git a/tools/include/xen-sys/NetBSDRump/privcmd.h 
b/tools/include/xen-sys/NetBSDRump/privcmd.h
index 1296b30..555bad9 100644
--- a/tools/include/xen-sys/NetBSDRump/privcmd.h
+++ b/tools/include/xen-sys/NetBSDRump/privcmd.h
@@ -30,7 +30,7 @@
 #ifndef __NetBSD_PRIVCMD_H__
 #define __NetBSD_PRIVCMD_H__
 
-/* Interface to /proc/xen/privcmd */
+/* Interface to /dev/xen/privcmd */
 
 typedef struct privcmd_hypercall
 {
-- 
2.4.10




Re: [Xen-devel] [PATCH] xen-pciback: fix up cleanup path when alloc fails

2015-12-01 Thread Konrad Rzeszutek Wilk
On Tue, Dec 01, 2015 at 11:47:17AM -0500, Konrad Rzeszutek Wilk wrote:
> On Thu, Nov 26, 2015 at 02:32:39PM -0600, Doug Goldstein wrote:
> > When allocating a pciback device fails, avoid the possibility of a
> > use after free.
> 
> Reviewed-by: Konrad Rzeszutek Wilk 
> 
> Ugh, and it looks like xen-blkfront has the same issue.

 Nope. No problems there.

The ->probe, if it fails (so xenbus_dev_probe returns the error),
ends up at the 'probe_failed' label in really_probe, which takes care of it by doing:

dev_set_drvdata(dev, NULL);

Whew!

Either way the patch should go in, but the 'possibility' should
perhaps be removed? Unless there is some other path I missed?

> 
> > 
> > Reported-by: Jonathan Creekmore 
> > Signed-off-by: Doug Goldstein 
> > ---
> >  drivers/xen/xen-pciback/xenbus.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/xen/xen-pciback/xenbus.c 
> > b/drivers/xen/xen-pciback/xenbus.c
> > index 98bc345..4843741 100644
> > --- a/drivers/xen/xen-pciback/xenbus.c
> > +++ b/drivers/xen/xen-pciback/xenbus.c
> > @@ -44,7 +44,6 @@ static struct xen_pcibk_device *alloc_pdev(struct 
> > xenbus_device *xdev)
> > dev_dbg(>dev, "allocated pdev @ 0x%p\n", pdev);
> >  
> > pdev->xdev = xdev;
> > -   dev_set_drvdata(>dev, pdev);
> >  
> > mutex_init(>dev_lock);
> >  
> > @@ -58,6 +57,9 @@ static struct xen_pcibk_device *alloc_pdev(struct 
> > xenbus_device *xdev)
> > kfree(pdev);
> > pdev = NULL;
> > }
> > +
> > +   dev_set_drvdata(>dev, pdev);
> > +
> >  out:
> > return pdev;
> >  }
> > -- 
> > 2.4.10
> > 
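
As a rough user-space illustration of the ordering the quoted patch moves to (publish the pointer only after the object is fully initialised), here is a minimal sketch. The struct and function names are hypothetical stand-ins, not the pciback types, and init_ok merely simulates whether the late initialisation step succeeds.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel objects; not the pciback types. */
struct owner { void *drvdata; };
struct priv  { struct owner *owner; int ready; };

/* init_ok simulates whether the late initialisation step succeeds. */
static struct priv *alloc_priv(struct owner *o, int init_ok)
{
    struct priv *p = calloc(1, sizeof(*p));

    if ( p == NULL )
        return NULL;

    p->owner = o;

    if ( !init_ok )
    {
        /* Nothing has been published yet, so freeing here is safe. */
        free(p);
        return NULL;
    }

    p->ready = 1;

    /* Publish the pointer only once the object is fully set up. */
    o->drvdata = p;
    return p;
}
```

With this ordering no other code can reach the object through the owner while the failure path is freeing it.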



Re: [Xen-devel] Build problems with xen 4.7

2015-12-01 Thread Olaf Hering
On Tue, Dec 01, M A Young wrote:

> It happens if CFLAGS is set to anything as a environment variable, eg.
> export CFLAGS=" "
> make dist-xen

This never worked. I have this in xen.spec to work around the way
%configure is implemented in rpm:

%configure 
unset CFLAGS
unset CXXFLAGS
unset FFLAGS
unset LDFLAGS
make 


Olaf



Re: [Xen-devel] Build problems with xen 4.7

2015-12-01 Thread Andrew Cooper
On 01/12/15 19:40, Olaf Hering wrote:
> On Tue, Dec 01, M A Young wrote:
>
>> It happens if CFLAGS is set to anything as a environment variable, eg.
>> export CFLAGS=" "
>> make dist-xen
> This never worked.

Works for me with CentOS-based RPM builds. 
(https://github.com/xenserver/xen-4.6.pg/blob/master/master/builder-makefiles.patch#L413
from XenServer)

>  I have this in xen.spec to workaround the way
> %configure is implemented in rpm:
>
> %configure 
> unset CFLAGS
> unset CXXFLAGS
> unset FFLAGS
> unset LDFLAGS
> make 

Requiring these unsets is definitely buggy behaviour, and should be fixed.

For me, the example given by Michael breaks even earlier.

andrewcoop@andrewcoop:/local/xen.git/xen$ CFLAGS=" " make -j4 -s
 __  ___  ____  
 \ \/ /___ _ __   | || |  / /_  / |   _ __  _ __ ___
  \  // _ \ '_ \  | || |_| '_ \ | |__| '_ \| '__/ _ \
  /  \  __/ | | | |__   _| (_) || |__| |_) | | |  __/
 /_/\_\___|_| |_||_|(_)___(_)_|  | .__/|_|  \___|
 |_|
Fields of 'compat_gnttab_cache_flush' not found in 'compat/grant_table.h'
Makefile:70: recipe for target 'compat/.xlat/grant_table.h' failed
make[2]: *** [compat/.xlat/grant_table.h] Error 1
make[2]: *** Waiting for unfinished jobs
Fields of 'compat_mem_access_op' not found in 'compat/memory.h'
Makefile:70: recipe for target 'compat/.xlat/memory.h' failed
make[2]: *** [compat/.xlat/memory.h] Error 1
Makefile:104: recipe for target '/local/xen.git/xen/xen' failed
make[1]: *** [/local/xen.git/xen/xen] Error 2
Makefile:29: recipe for target 'build' failed
make: *** [build] Error 2


~Andrew



Re: [Xen-devel] pvgrub "Error 9: Unknown boot failure" booting Debian Jessie kernel (Was: Re: [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported)

2015-12-01 Thread Ian Campbell
On Tue, 2015-12-01 at 08:41 +0100, Juergen Gross wrote:
> >> I'm not quite sure what to make of this, in particular I don't see
> anything
> >> in kexec.c which obviously looks after unmapping the heap or brk
> areas.
> > 
> > Nah, this backtrace shows a normal allocation path while
> > uncompressing the kernel image. I'd expect something like that.
> > Why shouldn't mini-os make use of pfn 4d81 somewhere?

That pfn ends up right in the middle of the next-kernels vaddr mapping,
so at best it indicates some sort of disconnect/overlap between the
mini-os memory allocator and the domain-builder memory allocator.

Since it seems to be in the middle of the padding area (which might
have been new in ea7c8a3d0e82, I'm not sure, it seems to be more
explicit at the least) it occurred to me on the way home last night
that maybe we need to unmap the padding area as well.

I'll try that and your suggested patch below as well once I get to the
office this morning.

> > 
> > I guess there is something wrong either in mini-os's memory
> > allocator (not very likely) or in kexec_allocate(). I'll try to
> > check those.
> 
> Hmm, kexec_allocate() seems to be a little bit fishy.
> 
> I suspect a problem in the loop for the case new_pfn == i. I think
> in this case the p2m list will be written with a wrong entry.
> 
> Ian, could you verify via:
> 
> diff --git a/stubdom/grub/kexec.c b/stubdom/grub/kexec.c
> index 8fd9ff9..9421023 100644
> --- a/stubdom/grub/kexec.c
> +++ b/stubdom/grub/kexec.c
> @@ -131,6 +131,8 @@ int kexec_allocate(struct xc_dom_image *dom)
>   /* Store destination PFN of currently requested page. */
>   pages_moved2pfns[i] = new_pfn;
> 
> + BUG_ON(new_pfn == i);
> +
>   /* Put old page at new PFN */
>   dom->p2m_host[new_pfn] = old_mfn;
> 
> 
> Juergen



Re: [Xen-devel] pvgrub "Error 9: Unknown boot failure" booting Debian Jessie kernel (Was: Re: [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported)

2015-12-01 Thread Ian Campbell
On Tue, 2015-12-01 at 08:15 +0100, Juergen Gross wrote:
> 
> BTW: up to now I haven't managed to reproduce your problem. My
> domains are just booting fine up to now. Is there a way I could
> get the domain image which is failing? Maybe I could just try
> to use that on my test machine with the same domain config you are
> using.

I think it should be sufficient just to drop the Debian kernel into an
empty filesystem attached to the guest, just doing "kernel
(hd0,0)/boot/vmlinuz ; boot" from the grub command line seems to
replicate the issue for me. Probably easier than actually going through
an installation.

I can send you the vmlinuz if you think that approach is worth trying.

Ian.



Re: [Xen-devel] pvgrub "Error 9: Unknown boot failure" booting Debian Jessie kernel (Was: Re: [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported)

2015-12-01 Thread Juergen Gross
On 01/12/15 09:30, Ian Campbell wrote:
> On Tue, 2015-12-01 at 08:41 +0100, Juergen Gross wrote:
>>>> I'm not quite sure what to make of this, in particular I don't see
>> anything
>>>> in kexec.c which obviously looks after unmapping the heap or brk
>> areas.
>>>
>>> Nah, this backtrace shows a normal allocation path while
>>> uncompressing the kernel image. I'd expect something like that.
>>> Why shouldn't mini-os make use of pfn 4d81 somewhere?
> 
> That pfn ends up right in the middle of the next-kernels vaddr mapping,
> so at best it indicates some sort of disconnect/overlap between the
> mini-os memory allocator and the domain-builder memory allocator.

I don't think so.

mini-os just allocates single pages and keeps the relation to the
(future) pfn of that page. The p2m list is adjusted later to move the
allocated page to that pfn before activating the new kernel.

> Since it seems to be in the middle of the padding area (which might
> have been new in ea7c8a3d0e82, I'm not sure, it seems to be more
> explicit at the least) it occurred to me on the way home last night
> that maybe we need to unmap the padding area as well.

We do. The page tables need to be unmapped independently as they
have been mapped explicitly during setup_pgtables(dom). All the
mini-os mappings are removed in a loop just after that.

> I'll try that and your suggested patch below as well once I get to the
> office this morning.

Thanks.


Juergen



Re: [Xen-devel] xen 4.5.0 rtds scheduler perform poorly with 2vms

2015-12-01 Thread Dario Faggioli
On Sun, 2015-11-29 at 11:44 -0500, Meng Xu wrote:
> 2015-11-29 11:27 GMT-05:00 Dario Faggioli 
> :
> > 
> > Mmmm... As I said many times, I don't remember much of all those RT
> > schedulability formulas, but, is really that simple?
> 
> Ah, let me clarify...
> It is not that simple. ;-) I just simplify it, hoping it can simplify
> the problem and highlight the possible reason.
> 
Ok, glad to know I haven't completely lost my mind, or anything like
that! :-)

> > I mean, if the in-
> > guest scheduling algorithm is global (e.g., global-EDF), the task
> > could
> > migrate, couldn't it?
> 
> Yes. If these partial VCPUs happen to be scheduled "sequentially",
> the
> OS inside VM can migrate the task and make the task keep running. But
> that is not the worst-case for the OS.
> 
Right, I see it now, and (FWIW) I absolutely agree with the worst-case
analysis you provided (thanks). I did not get the fact that you were
talking about the worst-case, sorry for the noise. :-D

> The detailed illustration of the worst case scenario is at Arvind's
> paper: http://link.springer.com/article/10.1007%2Fs11241-009-9073-x
> My latest journal paper
> (http://link.springer.com/article/10.1007%2Fs11241-015-9223-2)
> tighten
> the resource supply bound function of the MPR model. I believe the
> equations are too boring to most of people in the mailing list.
> 
> So let's avoid the complex equations here. ;-)
> 
Thanks for this too! :-)

Regards,
Dario
-- 
<> (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)





Re: [Xen-devel] [PATCH 1/2] x86/VPMU: Support only versions 2 and 3 of architectural performance monitoring

2015-12-01 Thread Dietmar Hahn
Am Montag 30 November 2015, 13:38:40 schrieb Boris Ostrovsky:
> We need to have at least version 2 since it's the first version to
> support various control and status registers (such as
> MSR_CORE_PERF_GLOBAL_CTRL) that VPMU relies on always having.
> 
> With explicit testing for PMU version we can now remove CPUID model
> check.
> 
> Signed-off-by: Boris Ostrovsky 
> ---
>  xen/arch/x86/cpu/vpmu_intel.c | 55 
> +++
>  1 file changed, 8 insertions(+), 47 deletions(-)
> 
> diff --git a/xen/arch/x86/cpu/vpmu_intel.c b/xen/arch/x86/cpu/vpmu_intel.c
> index d5ea7fe..bb4ddcc 100644
> --- a/xen/arch/x86/cpu/vpmu_intel.c
> +++ b/xen/arch/x86/cpu/vpmu_intel.c
> @@ -955,59 +955,20 @@ int vmx_vpmu_initialise(struct vcpu *v)
>  int __init core2_vpmu_init(void)
>  {
>  u64 caps;
> +unsigned int version = 0;
>  
> -if ( current_cpu_data.x86 != 6 )
> +if ( current_cpu_data.cpuid_level >= 0xa )
> +version = cpuid_eax(0xa) & 0xff;
> +if ( (version != 2) && (version != 3) )
>  {
> -printk(XENLOG_WARNING "VPMU: only family 6 is supported\n");
> +printk(XENLOG_WARNING "VPMU: version %d is not supported\n", 
> version);
>  return -EINVAL;

But this means that all (newer?) processors with version=4 are not supported
even though the SDM 3B says:
"Processors supporting architectural performance monitoring version 4 also
 supports version 1, 2, and 3, ..."

Should we not just print a hint that version 4 capabilities are not supported
and fake this cpuid field to version 3 for the guests?

Dietmar.

>  }
>  
> -switch ( current_cpu_data.x86_model )
> +if ( current_cpu_data.x86 != 6 )
>  {
> -/* Core2: */
> -case 0x0f: /* original 65 nm celeron/pentium/core2/xeon, 
> "Merom"/"Conroe" */
> -case 0x16: /* single-core 65 nm celeron/core2solo 
> "Merom-L"/"Conroe-L" */
> -case 0x17: /* 45 nm celeron/core2/xeon "Penryn"/"Wolfdale" */
> -case 0x1d: /* six-core 45 nm xeon "Dunnington" */
> -
> -case 0x2a: /* SandyBridge */
> -case 0x2d: /* SandyBridge, "Romley-EP" */
> -
> -/* Nehalem: */
> -case 0x1a: /* 45 nm nehalem, "Bloomfield" */
> -case 0x1e: /* 45 nm nehalem, "Lynnfield", "Clarksfield", "Jasper 
> Forest" */
> -case 0x2e: /* 45 nm nehalem-ex, "Beckton" */
> -
> -/* Westmere: */
> -case 0x25: /* 32 nm nehalem, "Clarkdale", "Arrandale" */
> -case 0x2c: /* 32 nm nehalem, "Gulftown", "Westmere-EP" */
> -case 0x2f: /* 32 nm Westmere-EX */
> -
> -case 0x3a: /* IvyBridge */
> -case 0x3e: /* IvyBridge EP */
> -
> -/* Haswell: */
> -case 0x3c:
> -case 0x3f:
> -case 0x45:
> -case 0x46:
> -
> -/* Broadwell */
> -case 0x3d:
> -case 0x4f:
> -case 0x56:
> -
> -/* future: */
> -case 0x4e:
> -
> -/* next gen Xeon Phi */
> -case 0x57:
> -break;
> -
> -default:
> -printk(XENLOG_WARNING "VPMU: Unsupported CPU model %#x\n",
> -   current_cpu_data.x86_model);
> -return -EINVAL;
> +printk(XENLOG_WARNING "VPMU: only family 6 is supported\n");
> +return -EINVAL;
>  }
>  
>  arch_pmc_cnt = core2_get_arch_pmc_count();
> 
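
To make the suggestion above concrete, here is a minimal sketch of clamping the reported version instead of rejecting it. It is not the vpmu_intel.c code: vpmu_effective_version() and the two macros are hypothetical names, and the clamp relies solely on the SDM statement quoted above that version 4 hardware also supports versions 1 to 3.

```c
#include <stdint.h>

#define VPMU_MIN_VERSION 2u  /* first version with the global control MSRs */
#define VPMU_MAX_VERSION 3u  /* highest version this sketch would drive */

/*
 * Hypothetical helper: given EAX from CPUID leaf 0xa, return the
 * architectural PMU version to operate at, or 0 if the hardware is too
 * old. Newer versions are clamped to VPMU_MAX_VERSION, not rejected.
 */
static unsigned int vpmu_effective_version(uint32_t leaf_a_eax)
{
    unsigned int version = leaf_a_eax & 0xff;

    if ( version < VPMU_MIN_VERSION )
        return 0;

    if ( version > VPMU_MAX_VERSION )
        return VPMU_MAX_VERSION;

    return version;
}
```

A caller would treat a return value of 0 as "unsupported", and would expose the returned value, rather than the raw one, to guests.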

-- 
Company details: http://ts.fujitsu.com/imprint.html



Re: [Xen-devel] pvgrub "Error 9: Unknown boot failure" booting Debian Jessie kernel (Was: Re: [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported)

2015-12-01 Thread Ian Campbell
On Tue, 2015-12-01 at 09:53 +0100, Juergen Gross wrote:
> On 01/12/15 09:30, Ian Campbell wrote:
> > On Tue, 2015-12-01 at 08:41 +0100, Juergen Gross wrote:
> > > > > I'm not quite sure what to make of this, in particular I don't
> > > > > see
> > > anything
> > > > > in kexec.c which obviously looks after unmapping the heap or brk
> > > areas.
> > > > 
> > > > Nah, this backtrace shows a normal allocation path while
> > > > uncompressing the kernel image. I'd expect something like that.
> > > > Why shouldn't mini-os make use of pfn 4d81 somewhere?
> > 
> > That pfn ends up right in the middle of the next-kernels vaddr mapping,
> > so at best it indicates some sort of disconnect/overlap between the
> > mini-os memory allocator and the domain-builder memory allocator.
> 
> I don't think so.
> 
> mini-os just allocates single pages and keeps the relation to the
> (future) pfn of that page. The p2m list is adjusted later to move the
> allocated page to that pfn before activating the new kernel.

Ah, I was wondering how it could possibly work so I half expected I must be
missing something.

> > Since it seems to be in the middle of the padding area (which might
> > have been new in ea7c8a3d0e82, I'm not sure, it seems to be more
> > explicit at the least) it occurred to me on the way home last night
> > that maybe we need to unmap the padding area as well.
> 
> We do. The page tables need to be unmapped independently as they
> have been mapped explicitly during setup_pgtables(dom). All the
> mini-os mappings are removed in a loop just after that.

"a loop" is this:
    /* Unmap day0 pages to avoid having a r/w mapping of the future page table */
    for (pfn = 0; pfn < allocated; pfn++)
        munmap((void*) pages[pfn], PAGE_SIZE);

In my debugging this extends only to the end of the actual mappings, not to
the end of the padding, e.g. for me it is extending to "Unmap pfns 0 ..
0x4c0f" while the unexpected PT pfn is at 0x4d80 and the padding area
extends to pfn 0x5000.

> > I'll try that and your suggested patch below as well once I get to the
> > office this morning.
> 
> Thanks.

The BUG_ON doesn't seem to be triggering. I'm not seeing pfn==0x4d80 going
anywhere near kexec_allocate, the highest is 0x4c0f.

Maybe the issue is that the ->allocate hook (==kexec_allocate) isn't called
from xc_dom_alloc_pad?



Re: [Xen-devel] pvgrub "Error 9: Unknown boot failure" booting Debian Jessie kernel (Was: Re: [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported)

2015-12-01 Thread Ian Campbell
On Tue, 2015-12-01 at 10:01 +, Ian Campbell wrote:
> 
> > > I'll try that and your suggested patch below as well once I get to the
> > > office this morning.
> > 
> > Thanks.
> 
> The BUG_ON doesn't seem to be triggering. I'm not seeing pfn==0x4d80 going
> anywhere near kexec_allocate, the highest is 0x4c0f.
> 
> Maybe the issue is that the ->allocate hook (==kexec_allocate) isn't called
> from xc_dom_alloc_pad?

That seems like it might be the answer, this patchlet fixes it for me:

diff --git a/tools/libxc/xc_dom_core.c b/tools/libxc/xc_dom_core.c
index 5d6c3ba..6d3f97a 100644
--- a/tools/libxc/xc_dom_core.c
+++ b/tools/libxc/xc_dom_core.c
@@ -579,7 +579,13 @@ static int xc_dom_alloc_pad(struct xc_dom_image *dom, 
xen_vaddr_t boundary)
 }
 pages = (boundary - dom->virt_alloc_end) / page_size;
 
-return xc_dom_chk_alloc_pages(dom, "padding", pages);
+if ( xc_dom_chk_alloc_pages(dom, "padding", pages) )
+return -1;
+
+if (dom->allocate)
+dom->allocate(dom);
+
+return 0;
 }
 
 int xc_dom_alloc_segment(struct xc_dom_image *dom,



Re: [Xen-devel] [PATCH] libxc: correct domain builder for 64 bit guest with 32 bit tools

2015-12-01 Thread Wei Liu
On Tue, Dec 01, 2015 at 08:49:49AM +0100, Juergen Gross wrote:
> Commit 8c45adec18e0512c3d34dcafb13414ecba21be6a ("create unmapped
> initrd in domain builder if supported") introduced an error for
> building a 64 bit guest with a 32 bit toolset.
> 
> The initrd start address and size where stored in an unsigned long
> instead of using a 64 bit type.
> 
> Signed-off-by: Juergen Gross 
> Tested-by: Boris Ostrovsky 

Acked-by: Wei Liu 

Thanks for fixing this.

> ---
>  tools/libxc/include/xc_dom.h | 11 ---
>  1 file changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h
> index 2176216..fd8c5e8 100644
> --- a/tools/libxc/include/xc_dom.h
> +++ b/tools/libxc/include/xc_dom.h
> @@ -98,9 +98,14 @@ struct xc_dom_image {
>  xen_vaddr_t virt_alloc_end;
>  xen_vaddr_t bsd_symtab_start;
>  
> -/* initrd parameters as specified in start_info page */
> -unsigned long initrd_start;
> -unsigned long initrd_len;
> +/*
> + * initrd parameters as specified in start_info page
> + * Depending on capabilities of the booted kernel this may be a virtual
> + * address or a pfn. Type is neutral and large enough to hold a virtual
> + * address of a 64 bit kernel even with 32 bit toolstack.
> + */
> +uint64_t initrd_start;
> +uint64_t initrd_len;
>  
>  unsigned int alloc_bootstack;
>  xen_vaddr_t virt_pgtab_end;
> -- 
> 2.6.2
> 
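
A small sketch of why the field width matters, assuming a 32-bit toolstack where `unsigned long` is 32 bits wide. uint32_t stands in for that narrow type so the effect is visible on any host; the function names are hypothetical and the address used in any test of it is just an example value.

```c
#include <stdint.h>

/* uint32_t models `unsigned long` as seen by a 32-bit toolstack. */
static uint32_t store_narrow(uint64_t guest_addr)
{
    return (uint32_t)guest_addr;   /* upper 32 bits are silently dropped */
}

/* uint64_t holds a 64-bit guest virtual address regardless of host width. */
static uint64_t store_wide(uint64_t guest_addr)
{
    return guest_addr;
}

/* Returns 1 if the value survives a round trip through the narrow type. */
static int survives_narrow(uint64_t guest_addr)
{
    return (uint64_t)store_narrow(guest_addr) == guest_addr;
}
```

Any address above 4 GiB, such as a typical 64-bit kernel virtual address, fails the round trip through the narrow type, which matches the failure mode described in the commit message.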



Re: [Xen-devel] xen 4.5.0 rtds scheduler perform poorly with 2vms

2015-12-01 Thread Lars Kurth
I wonder whether we need to add some health warnings and recommended background 
reading to http://wiki.xenproject.org/wiki/RTDS-Based-Scheduler
Lars

> On 1 Dec 2015, at 08:59, Dario Faggioli  wrote:
> 
> On Sun, 2015-11-29 at 11:44 -0500, Meng Xu wrote:
>> 2015-11-29 11:27 GMT-05:00 Dario Faggioli 
>> :
>>>  
>>> Mmmm... As I said many times, I don't remember much of all those RT
>>> schedulability formulas, but, is really that simple?
>> 
>> Ah, let me clarify...
>> It is not that simple. ;-) I just simplify it, hoping it can simplify
>> the problem and highlight the possible reason.
>> 
> Ok, glad to know I haven't completely lost my mind, or anything like
> that! :-)
> 
>>> I mean, if the in-
>>> guest scheduling algorithm is global (e.g., global-EDF), the task
>>> could
>>> migrate, couldn't it?
>> 
>> Yes. If these partial VCPUs happen to be scheduled "sequentially",
>> the
>> OS inside VM can migrate the task and make the task keep running. But
>> that is not the worst-case for the OS.
>> 
> Right, I see it now, and (FWIW) I absolutely agree with the worst-case
> analysis you provided (thanks). I did not get the fact that you were
> talking about the worst-case, sorry for the noise. :-D
> 
>> The detailed illustration of the worst case scenario is at Arvind's
>> paper: http://link.springer.com/article/10.1007%2Fs11241-009-9073-x
>> My latest journal paper
>> (http://link.springer.com/article/10.1007%2Fs11241-015-9223-2)
>> tighten
>> the resource supply bound function of the MPR model. I believe the
>> equations are too boring to most of people in the mailing list.
>> 
>> So let's avoid the complex equations here. ;-)
>> 
> Thanks for this too! :-)
> 
> Regards,
> Dario
> -- 
> <> (Raistlin Majere)
> -
> Dario Faggioli, Ph.D, http://about.me/dario.faggioli
> Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
> 
> ___
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel




Re: [Xen-devel] [PATCH XEN v5 13/23] tools: Refactor foreign memory mapping into libxenforeignmemory

2015-12-01 Thread Paul Durrant
> -Original Message-
> From: Ian Campbell [mailto:ian.campb...@citrix.com]
> Sent: 30 November 2015 09:52
> To: Paul Durrant; Andrew Cooper; Ian Jackson
> Cc: Wei Liu; xen-devel@lists.xen.org; win-pv-de...@lists.xenproject.org
> Subject: Re: [PATCH XEN v5 13/23] tools: Refactor foreign memory mapping
> into libxenforeignmemory
> 
> On Sun, 2015-11-29 at 09:54 +, Paul Durrant wrote:
> > > -Original Message-
> > [snip]
> > > > C99 was 16 years ago now, I'm struggling to think of a reason not to
> > > > move
> > > > the baseline for tools stuff at least to that.
> > > >
> > > > https://en.wikipedia.org/wiki/Visual_C%2B%2B might be one such
> reason
> > > > I
> > > > suppose, although I'm not convinced a libvchan port to Windows, even
> > > > if
> > > not
> > > > entirely hypothetical, would be using any of xen.git/tools/libs/*
> > > > rather
> > > > than the equivalent frameworks provided by the Windows PV drivers.
> > >
> > > It would be nice to at least be able to use the same header files, for
> > > ease of porting userspace software.
> > >
> >
> > It's possible that libvchan on Windows will make use of the tools/libs
> > headers. As Andy says it would ease porting client software.
> >
> > > In this case, VLAs are just being used as an aid for the compiler to
> > > spot errors.  It doesn't change the API/ABI, and could be #ifdef'd
> > > around, if we care both for using C99 in general, and Windows support.
> > >
> >
> > We still compile with VS2012 in Citrix and Xen Project uses VS2013 so we
> > can't rely on C99. A #ifdef here does seem like the best solution.
> 
> Would you expect new projects (i.e. stuff based on tools/libs) to continue
> to have requirements to build with those older versions? (Although with
> Andy's link saying even VS2015 doesn't do VLAs maybe it is a bit moot).
> 

It would certainly be preferable to have something that did build under a 
sufficiently old C standard to cover as many environments as is reasonable, and 
I think any supported MSVC build environment should be included in that set 
(although I have no problem using #ifdefs to achieve that).

> I don't think using VLA here is worth an ifdef, but I think it is worth
> reordering the arguments such that once we end up with a new enough
> compiler baseline (in $donkeys years) we can switch without changing the
> ABI (I think that's the case).
> 

IIUC the use of VLAs does not change the calling convention in any 
way, right? It would still mean a pointer to err_out is passed and the use of a 
VLA is so that bounds checking can be performed?
I certainly don't think we should get ourselves into a situation where 
additions to compilers or C standards do cause changes in the ABI (and we 
already have to be careful with Windows 32-bit since it defaults to pascal 
calling convention).

  Paul
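
For reference, the #ifdef approach discussed above could look roughly like the sketch below. It is only an illustration: ARRAY_OF() and count_errors() are hypothetical names, not the libxenforeignmemory interface. In both branches the parameter is adjusted to `int *`, so the calling convention is identical; the length-annotated form only gives the compiler and the reader more information. Note that the length argument has to precede the array for the annotated form to be expressible.

```c
#include <stddef.h>

#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L && \
    !defined(__STDC_NO_VLA__)
#define ARRAY_OF(n) n   /* int err[n]: documents the expected length */
#else
#define ARRAY_OF(n)     /* int err[]: same pointer type after adjustment */
#endif

/* Counts the non-zero entries in err[0..num-1]. */
static int count_errors(size_t num, int err[ARRAY_OF(num)])
{
    size_t i;
    int failed = 0;

    for ( i = 0; i < num; i++ )
        if ( err[i] != 0 )
            failed++;

    return failed;
}
```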

> Ian.



Re: [Xen-devel] [qemu-mainline test] 65237: regressions - FAIL

2015-12-01 Thread Ian Campbell
On Mon, 2015-11-30 at 18:47 +, Anthony PERARD wrote:
> On Mon, Nov 30, 2015 at 09:40:28AM +, osstest service owner wrote:
> > flight 65237 qemu-mainline real [real]
> > http://logs.test-lab.xenproject.org/osstest/logs/65237/
> > 
> > Regressions :-(
> > 
> > Tests which did not succeed and are blocking,
> > including tests which could not be run:
> >  test-amd64-i386-xl-qemuu-ovmf-amd64  9 debian-hvm-install fail REGR. vs. 
> > 64579
> >  test-amd64-i386-xl-qemuu-debianhvm-amd64 9 debian-hvm-install fail REGR. 
> > vs. 64579
> 
> In this test, qemu SIGABRT with "Bad ram offset efffd000" as an error
> message. The test started to fail when osstest started to test with a guest
> with 5000 of memory instead of 768.

That's still a QEMU and/or Xen bug though, right?

IOW you aren't trying to suggest this is an osstest bug, I think.

NB the size of the guest ram depends on the amount of host RAM available,
so this will vary across hosts.

The apparent switch to 5000 for test-amd64-i386-xl-qemuu-ovmf-amd64 was
from merlot1 starting in flight 65078, but there was a pass on merlot1
at 63741. My guess is that a change in one of the other trees between those
two has exposed the issue.

The test-amd64-i386-xl-qemuu-debianhvm-amd64 case[1] started in 65147 when
it ran on nocera1, but nocera* only went into service last week and haven't
run any flights on this branch before now.

The bisector is having a look at the merlot case[2], there are some interim
progress reports on the osstest-output list[3] (search for 
test-amd64-i386-xl-qemuu-ovmf-amd64).

Looking at the complexity of the graph it'll probably take a little while
to reach a verdict, although it does appear to be making some progress.

Ian.

[0] 
http://logs.test-lab.xenproject.org/osstest/results/history/test-amd64-i386-xl-qemuu-ovmf-amd64/qemu-mainline.html
[1] 
http://logs.test-lab.xenproject.org/osstest/results/history/test-amd64-i386-xl-qemuu-debianhvm-amd64/qemu-mainline.html
[2] 
http://logs.test-lab.xenproject.org/osstest/results/bisect/qemu-mainline/test-amd64-i386-xl-qemuu-ovmf-amd64.debian-hvm-install.html
[3] 
http://lists.xenproject.org/archives/html/osstest-output/2015-11/threads.html

http://lists.xenproject.org/archives/html/osstest-output/2015-12/threads.html




Re: [Xen-devel] pvgrub "Error 9: Unknown boot failure" booting Debian Jessie kernel (Was: Re: [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported)

2015-12-01 Thread Wei Liu
On Tue, Dec 01, 2015 at 10:04:38AM +, Ian Campbell wrote:
> On Tue, 2015-12-01 at 10:01 +, Ian Campbell wrote:
> > 
> > > > I'll try that and your suggested patch below as well once I get to the
> > > > office this morning.
> > > 
> > > Thanks.
> > 
> > The BUG_ON doesn't seem to be triggering. I'm not seeing pfn==0x4d80 going
> > anywhere near kexec_allocate, the highest is 0x4c0f.
> > 
> > Maybe the issue is that the ->allocate hook (==kexec_allocate) isn't called
> > from xc_dom_alloc_pad?
> 
> That seems like it might be the answer, this patchlet fixes it for me:
> 
> diff --git a/tools/libxc/xc_dom_core.c b/tools/libxc/xc_dom_core.c
> index 5d6c3ba..6d3f97a 100644
> --- a/tools/libxc/xc_dom_core.c
> +++ b/tools/libxc/xc_dom_core.c
> @@ -579,7 +579,13 @@ static int xc_dom_alloc_pad(struct xc_dom_image *dom, xen_vaddr_t boundary)
>      }
>      pages = (boundary - dom->virt_alloc_end) / page_size;
>  
> -    return xc_dom_chk_alloc_pages(dom, "padding", pages);
> +    if ( xc_dom_chk_alloc_pages(dom, "padding", pages) )
> +        return -1;
> +
> +    if (dom->allocate)
> +        dom->allocate(dom);
> +
> +    return 0;
>  }
>  
>  int xc_dom_alloc_segment(struct xc_dom_image *dom,

Currently there are three places that call dom->allocate (if we include
the call in the proposed diff). I think it would be better if we push
dom->allocate down to xc_dom_chk_alloc_pages, then refactor
xc_dom_alloc_page to use xc_dom_chk_alloc_pages.
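
To make the shape of that refactoring concrete, here is a rough standalone
sketch. It is not the real libxc code: struct dom_image, chk_alloc_pages,
alloc_pad, alloc_page and counting_hook are hypothetical, simplified stand-ins
for struct xc_dom_image, xc_dom_chk_alloc_pages, xc_dom_alloc_pad,
xc_dom_alloc_page and a hook such as kexec_allocate, and the return types are
simplified to int. The only point is that once the hook is invoked from the
one checking helper, every caller that allocates pages gets it for free.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct xc_dom_image: only the fields this
 * sketch needs, not the real libxc layout. */
struct dom_image {
    unsigned long total_pages;    /* how many pages the guest may use */
    unsigned long pfn_alloc_end;  /* next unallocated pfn */
    unsigned int allocate_calls;  /* demo only: counts hook invocations */
    int (*allocate)(struct dom_image *dom); /* optional per-arch hook */
};

/* Single choke point: check the request, account for the pages, then
 * notify the hook (if any) so it always sees the new allocation. */
static int chk_alloc_pages(struct dom_image *dom, unsigned long pages)
{
    assert(dom != NULL);

    if ( pages > dom->total_pages - dom->pfn_alloc_end )
        return -1;

    dom->pfn_alloc_end += pages;

    if ( dom->allocate )
        return dom->allocate(dom);

    return 0;
}

/* Padding no longer needs its own call to the hook. */
static int alloc_pad(struct dom_image *dom, unsigned long pages)
{
    return chk_alloc_pages(dom, pages);
}

/* A single-page allocation goes through the same helper. */
static int alloc_page(struct dom_image *dom)
{
    return chk_alloc_pages(dom, 1);
}

/* Demo hook standing in for something like kexec_allocate. */
static int counting_hook(struct dom_image *dom)
{
    dom->allocate_calls++;
    return 0;
}
```

With this layout a failed check returns before the hook runs, so the hook
only ever sees allocations that actually happened.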

Just some thoughts after a quick look at the code. I will see what I can
do after confirming this is the culprit.

Wei.



[Xen-devel] [qemu-mainline test] 65252: regressions - FAIL

2015-12-01 Thread osstest service owner
flight 65252 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/65252/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-xl-qemuu-ovmf-amd64  9 debian-hvm-install fail REGR. vs. 64579
 test-amd64-i386-xl-qemuu-debianhvm-amd64 9 debian-hvm-install fail REGR. vs. 64579

Tests which are failing intermittently (not blocking):
 test-armhf-armhf-libvirt-raw  6 xen-boot fail pass in 65237

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 9 debian-hvm-install fail REGR. vs. 64579
 test-amd64-amd64-qemuu-nested-intel 16 debian-hvm-install/l1/l2 fail baseline untested
 test-armhf-armhf-xl-rtds 11 guest-start  fail   like 64579

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt-raw  9 debian-di-install fail in 65237 never pass
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-armhf-armhf-libvirt 14 guest-saverestore fail   never pass
 test-armhf-armhf-libvirt 12 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 13 saverestore-support-check fail never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-check fail   never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-libvirt 12 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 13 saverestore-support-check fail  never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-check fail  never pass
 test-armhf-armhf-xl-vhd   9 debian-di-install fail   never pass
 test-armhf-armhf-libvirt-qcow2  9 debian-di-install fail never pass
 test-armhf-armhf-xl-credit2  13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-check fail   never pass
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop  fail never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop fail never pass
 test-armhf-armhf-xl  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl  13 saverestore-support-check fail   never pass
 test-armhf-armhf-libvirt-xsm 12 migrate-support-check fail   never pass
 test-armhf-armhf-libvirt-xsm 14 guest-saverestore fail   never pass
 test-armhf-armhf-xl-xsm  13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-xsm  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-arndale  13 saverestore-support-check fail   never pass
 test-amd64-i386-libvirt  12 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt-vhd  9 debian-di-install fail   never pass

version targeted for testing:
 qemuu                714487515dbe0c65d5904251e796cd3a5b3579fb
baseline version:
 qemuu                9be060f5278dc0d732ebfcf2bf0a293f88b833eb

Last test of basis    64579  2015-11-17 15:37:49 Z   13 days
Failing since         64797  2015-11-19 03:03:30 Z   12 days   11 attempts
Testing same since    65167  2015-11-27 19:59:25 Z    3 days    3 attempts


People who touched revisions under test:
  "Dr. David Alan Gilbert" 
  "Eugene (jno) Dvurechenski" 
  Alberto Garcia 
  Alistair Francis 
  Andreas Färber 
  Andrew Baumann 
  Anthony PERARD 
  Bandan Das 
  Daniel P. Berrange 
  Denis V. Lunev 
  Dr. David Alan Gilbert 
  Eduardo Habkost 
  Eric Blake 
  Eugene (jno) Dvurechenski 
  Fam Zheng 
  François Baldassari 
  Gerd Hoffmann 
  Greg Kurz 
  Ildar Isaev 
  James Hogan 
  Jason J. Herne 
  Jason Wang 
  John Arbuckle 
  John Clarke 
  John Snow 
  Juan Quintela 
  Kevin Wolf 
  Leon Alrae 
  Marc-André Lureau 
  Markus Armbruster 
  Max 
