Re: [Xen-devel] [PATCH v15 05/10] x86: add multiboot2 protocol support for EFI platforms

2017-02-16 Thread Jan Beulich
>>> On 16.02.17 at 22:49,  wrote:
> On Thu, Feb 16, 2017 at 02:29:45AM -0700, Jan Beulich wrote:
>> >>> On 15.02.17 at 22:53,  wrote:
>> > On Wed, Feb 15, 2017 at 03:22:02AM -0700, Jan Beulich wrote:
>> >> >>> On 14.02.17 at 19:38,  wrote:
>> >> > --- a/xen/arch/x86/boot/head.S
>> >> > +++ b/xen/arch/x86/boot/head.S
>> >> > @@ -394,10 +394,18 @@ __start:
>> >> >
>> >> >  /* EFI IA-32 platforms are not supported. */
>> >> >  cmpl    $MULTIBOOT2_TAG_TYPE_EFI32,MB2_tag_type(%ecx)
>> >> > +/*
>> >> > + * Here we should zap vga_text_buffer. However, we can disable
>> >> > + * VGA updates in simpler and more reliable way later.
>> >> > + */
>> >> >  je  .Lmb2_efi_ia_32
>> >> >
>> >> >  /* Bootloader shutdown EFI x64 boot services. */
>> >> >  cmpl    $MULTIBOOT2_TAG_TYPE_EFI64,MB2_tag_type(%ecx)
>> >> > +/*
>> >> > + * Here we should zap vga_text_buffer. However, we can disable
>> >> > + * VGA updates in simpler and more reliable way later.
>> >> > + */
>> >> >  je  .Lmb2_no_bs
>> >>
>> >> I'm afraid I don't view these comments as helpful in understanding
>> >> the whole situation. That's partly because I don't follow both the
>> >> "simpler" and "more reliable" parts (using just the information here,
>> >
>> > OK, I will clarify it.
>> >
>> >> i.e. leaving aside what you've given as explanation earlier, albeit I
>> >> don't think that was fully clarifying things either), and partly
>> >> because I continue to think that the explanation should go where
>> >> the labels are (which is what I had meant to suggest with my
>> >> comment placement in reply to v14). Nor does the adjustment
>> >
>> > OK.
>> >
>> >> above help (me) understand the correctness of the dual use of
>> >> .Lmb2_no_bs.
>> >
>> > What do you mean by "dual use of .Lmb2_no_bs."? I would like to be sure.
>>
>> As said in v14 review, it's being jumped to from two rather different
>> places, and hence the VGA aspect isn't obviously the same for both.
> 
> OK, I will try to clarify. If a bootloader called us using __efi64_mb2_start
> we are sure that we are running on EFI platform and there is no VGA there.
> It means that we can safely zap vga_text_buffer unconditionally in first steps
> (we do that in second instruction). Then we do not need to take care about
> that in case of error. And one of these errors is lack of
> MULTIBOOT2_TAG_TYPE_EFI_BS tag. It means that EFI boot services are shut
> down. So, we are in a black hole.

Well, see - this is one of my problems with the overall approach here.
Running on EFI in no way means there _is_ no VGA there, it only
means there _may not be_ any VGA. With boot services shut down
and without serial console you have no way of informing the user, so
making an attempt at writing something to VGA may still be helpful.
In the worst case the memory writes go nowhere, to RAM, or to an
unrelated device (both of the latter rather unlikely on x86). Granted
there's the second problem of you perhaps not knowing the video
mode, and hence having a hard time producing output that's also
readable.

> We have to inform user about that and halt the system. And that is why we
> jump to .Lmb2_no_bs here.
> 
> On the other hand if the bootloader called us using start label then in most
> cases we are running on legacy BIOS platforms. However, if the bootloader also
> provided MULTIBOOT2_TAG_TYPE_EFI64 tag here then we are sure that we are
> running on EFI platform and EFI boot services are shut down. This happens
> when we are loaded by old boot loader which does not understand
> MULTIBOOT2_HEADER_TAG_EFI_BS and MULTIBOOT2_HEADER_TAG_ENTRY_ADDRESS_EFI64
> tags. So, as above we can jump to .Lmb2_no_bs here too.

However, when the boot loader invoked our start label, can't we be
sure there is VGA (at least as much or as little as there would be on
non-EFI platforms, where headless systems certainly also exist)? I
don't think the boot loader can reasonably invoke our legacy entry
point with the system in a state that's not legacy compatible (which,
among other things, means if there is VGA, then it would be in
traditional 80x25 text mode). Hence this second use of the label
ought to avoid suppressing the VGA output attempt in any case.

> I hope that helps.

The above aside, yes, it does. An abbreviated variant of this is
what I would hope to have attached as comment to the error
handling labels, namely to help readers understand why some of
them inhibit VGA output while others don't.

Jan
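
The distinction argued for above can be sketched in C. This is only one reading of the thread, not code from Xen: the enum names, the function classify_early_boot() and its parameters are hypothetical, and the EFI32 case is left out. It assumes that entry via __efi64_mb2_start has already zapped vga_text_buffer, while entry via the legacy start label implies a legacy-compatible state in which a VGA write attempt is still worthwhile.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for illustration only; none of these exist in Xen. */
enum entry_point {
    ENTRY_LEGACY_START,     /* boot loader jumped to the "start" label */
    ENTRY_EFI64_MB2_START   /* boot loader jumped to __efi64_mb2_start */
};

enum early_action {
    CONTINUE_BOOT,          /* no error on this path */
    HALT_WITHOUT_VGA,       /* error; vga_text_buffer was zapped earlier */
    HALT_TRY_VGA            /* error; still attempt a VGA text message */
};

/*
 * Classify the two error cases that both jump to .Lmb2_no_bs in the patch.
 * The EFI32 tag (.Lmb2_efi_ia_32) is deliberately not modelled here.
 */
static enum early_action classify_early_boot(enum entry_point entry,
                                             bool has_efi64_tag,
                                             bool has_efi_bs_tag)
{
    assert(entry == ENTRY_LEGACY_START || entry == ENTRY_EFI64_MB2_START);

    if (entry == ENTRY_EFI64_MB2_START)
        /* EFI entry without the EFI_BS tag: boot services are gone. */
        return has_efi_bs_tag ? CONTINUE_BOOT : HALT_WITHOUT_VGA;

    /* Legacy entry with an EFI64 tag: old loader, boot services gone. */
    return has_efi64_tag ? HALT_TRY_VGA : CONTINUE_BOOT;
}
```

A comment of roughly this shape next to each error label would tell readers which of the two paths reaches it and why only one of them suppresses VGA output.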

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [ovmf baseline-only test] 68571: all pass

2017-02-16 Thread Platform Team regression test user
This run is configured for baseline tests only.

flight 68571 ovmf real [real]
http://osstest.xs.citrite.net/~osstest/testlogs/logs/68571/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf b173ad78519b2ade309019614b52e1453727e20d
baseline version:
 ovmf fd12acdeff7a04ad34ccb95103eb6204b8901749

Last test of basis    68568  2017-02-16 15:17:39 Z    0 days
Testing same since    68571  2017-02-17 00:50:59 Z    0 days    1 attempts


People who touched revisions under test:
  Haojian Zhuang 
  Jiaxin Wu 
  Ruiyu Ni 
  Wu Jiaxin 

jobs:
 build-amd64-xsm                       pass
 build-i386-xsm                        pass
 build-amd64                           pass
 build-i386                            pass
 build-amd64-libvirt                   pass
 build-i386-libvirt                    pass
 build-amd64-pvops                     pass
 build-i386-pvops                      pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64  pass
 test-amd64-i386-xl-qemuu-ovmf-amd64   pass



sg-report-flight on osstest.xs.citrite.net
logs: /home/osstest/logs
images: /home/osstest/images

Logs, config files, etc. are available at
http://osstest.xs.citrite.net/~osstest/testlogs/logs

Test harness code can be found at
http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary


Push not applicable.


commit b173ad78519b2ade309019614b52e1453727e20d
Author: Jiaxin Wu 
Date:   Wed Feb 15 14:32:14 2017 +0800

NetworkPkg/HttpBootDxe: Update to check specified media type

IANA has approved below new media type for EFI http(s) boot usage:
  application/vnd.efi.img
  application/vnd.efi.iso

HTTP boot driver should be updated to check the above media type
from Content-Type header field.

Cc: Ye Ting 
Cc: Fu Siyuan 
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Wu Jiaxin 
Reviewed-by: Fu Siyuan 
Reviewed-by: Ye Ting 

commit 6c6452c6e23a89be0e565501500e83c136e3fcbd
Author: Jiaxin Wu 
Date:   Thu Feb 16 09:15:11 2017 +0800

NetworkPkg/HttpBootDxe: Declare the functions as EFIAPI to pass the GCC build

Cc: Laszlo Ersek 
Cc: Gerd Hoffmann 
Cc: Ye Ting 
Cc: Fu Siyuan 
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Wu Jiaxin 
Reviewed-by: Laszlo Ersek 
Build-tested-by: Laszlo Ersek 

commit d176bb3c5c28e0c89ae86995ecd6b9e21b4e0b9f
Author: Haojian Zhuang 
Date:   Mon Feb 13 15:53:00 2017 +0800

ArmPlatformPkg/PL061Gpio: fix the offset value in Get function

When call PL061GetPins() or PL061SetPins(), should use GPIO_PIN_MASK(offset)
as parameter, not offset.

Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Haojian Zhuang 
Reviewed-by: Ard Biesheuvel 

commit d164a0e31bf8aa5bc8f9a184a02648585ff4f0d7
Author: Haojian Zhuang 
Date:   Mon Feb 13 15:52:59 2017 +0800

ArmPlatformPkg/PL061: remove duplicated PL061_GPIO_DATA_REG

PL061_GPIO_DATA_REG offset is referenced in PL061EffectiveAddress ()
already. So remove the duplicated reference when invoke PL061GetPins ()
or PL061SetPins ().

Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Haojian Zhuang 
Reviewed-by: Ard Biesheuvel 

commit 31d7be0135b8e8b95508daa6484bebba6280af15
Author: Ruiyu Ni 
Date:   Thu Feb 9 17:24:05 2017 +0800

ShellPkg/pci: Report error when invalid value is specified for "-ec"

Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Ruiyu Ni 
Reviewed-by: Jaben Carsey 



Re: [Xen-devel] [PATCH 09/35] x86: Convert remaining uses of pr_warning to pr_warn

2017-02-16 Thread Pekka Paalanen
On Thu, 16 Feb 2017 23:11:22 -0800
Joe Perches  wrote:

> To enable eventual removal of pr_warning
> 
> This makes pr_warn use consistent for arch/x86
> 
> Prior to this patch, there were 46 uses of pr_warning and
> 122 uses of pr_warn in arch/x86
> 
> Miscellanea:
> 
> o Coalesce a few formats and realign arguments
> o Convert a couple of multiple line printks to single line
> 
> Signed-off-by: Joe Perches 
> ---
>  arch/x86/kernel/amd_gart_64.c          | 12 +++--
>  arch/x86/kernel/apic/apic.c            | 46 --
>  arch/x86/kernel/apic/apic_noop.c       |  2 +-
>  arch/x86/kernel/setup_percpu.c         |  4 +--
>  arch/x86/kernel/tboot.c                | 15 ++-
>  arch/x86/kernel/tsc_sync.c             |  8 +++---
>  arch/x86/mm/kmmio.c                    |  8 +++---
>  arch/x86/mm/mmio-mod.c                 |  5 ++--
>  arch/x86/mm/numa.c                     | 12 -
>  arch/x86/mm/numa_emulation.c           |  6 ++---
>  arch/x86/mm/testmmiotrace.c            |  5 ++--
>  arch/x86/oprofile/op_x86_model.h       |  6 ++---
>  arch/x86/platform/olpc/olpc-xo15-sci.c |  2 +-
>  arch/x86/platform/sfi/sfi.c            |  3 +--
>  arch/x86/xen/debugfs.c                 |  2 +-
>  arch/x86/xen/setup.c                   |  2 +-
>  16 files changed, 63 insertions(+), 75 deletions(-)
> 

Hi,

seems fine to me, even though I haven't been involved in the kernel
side for years.

For the hunks quoted below *only*:
Reviewed-by: Pekka Paalanen 

> diff --git a/arch/x86/mm/kmmio.c b/arch/x86/mm/kmmio.c
> index afc47f5c9531..ad70518cdcc7 100644
> --- a/arch/x86/mm/kmmio.c
> +++ b/arch/x86/mm/kmmio.c
> @@ -187,8 +187,8 @@ static int arm_kmmio_fault_page(struct kmmio_fault_page *f)
>   int ret;
>   WARN_ONCE(f->armed, KERN_ERR pr_fmt("kmmio page already armed.\n"));
>   if (f->armed) {
> - pr_warning("double-arm: addr 0x%08lx, ref %d, old %d\n",
> -f->addr, f->count, !!f->old_presence);
> + pr_warn("double-arm: addr 0x%08lx, ref %d, old %d\n",
> + f->addr, f->count, !!f->old_presence);
>   }
>   ret = clear_page_presence(f, true);
>   WARN_ONCE(ret < 0, KERN_ERR pr_fmt("arming at 0x%08lx failed.\n"),
> @@ -335,8 +335,8 @@ static int post_kmmio_handler(unsigned long condition, struct pt_regs *regs)
>* something external causing them (f.e. using a debugger while
>* mmio tracing enabled), or erroneous behaviour
>*/
> - pr_warning("unexpected debug trap on CPU %d.\n",
> -smp_processor_id());
> + pr_warn("unexpected debug trap on CPU %d\n",
> + smp_processor_id());
>   goto out;
>   }
>  
> diff --git a/arch/x86/mm/mmio-mod.c b/arch/x86/mm/mmio-mod.c
> index bef36622e408..706ae44d1af7 100644
> --- a/arch/x86/mm/mmio-mod.c
> +++ b/arch/x86/mm/mmio-mod.c
> @@ -407,7 +407,7 @@ static void enter_uniprocessor(void)
>   }
>  out:
>   if (num_online_cpus() > 1)
> - pr_warning("multiple CPUs still online, may miss events.\n");
> + pr_warn("multiple CPUs still online, may miss events\n");
>  }
>  
>  static void leave_uniprocessor(void)
> @@ -431,8 +431,7 @@ static void leave_uniprocessor(void)
>  static void enter_uniprocessor(void)
>  {
>   if (num_online_cpus() > 1)
> - pr_warning("multiple CPUs are online, may miss events. "
> -"Suggest booting with maxcpus=1 kernel argument.\n");
> + pr_warn("multiple CPUs are online, may miss events. Suggest booting with maxcpus=1 kernel argument.\n");
>  }
>  
>  static void leave_uniprocessor(void)

> diff --git a/arch/x86/mm/testmmiotrace.c b/arch/x86/mm/testmmiotrace.c
> index 38868adf07ea..4a55e453296d 100644
> --- a/arch/x86/mm/testmmiotrace.c
> +++ b/arch/x86/mm/testmmiotrace.c
> @@ -121,9 +121,8 @@ static int __init init(void)
>   return -ENXIO;
>   }
>  
> - pr_warning("WARNING: mapping %lu kB @ 0x%08lx in PCI address space, "
> -"and writing 16 kB of rubbish in there.\n",
> -size >> 10, mmio_address);
> + pr_warn("WARNING: mapping %lu kB @ 0x%08lx in PCI address space, and writing 16 kB of rubbish in there\n",
> + size >> 10, mmio_address);
>   do_test(size);
>   do_test_bulk_ioremapping();
>   pr_info("All done.\n");


Thanks,
pq




Re: [Xen-devel] Unshared IOMMU issues

2017-02-16 Thread Jan Beulich
>>> On 16.02.17 at 19:09,  wrote:
> On 16/02/17 16:34, Jan Beulich wrote:
>>>>> On 16.02.17 at 17:11,  wrote:
>>> On 16/02/17 15:52, Jan Beulich wrote:
>>>>>>> On 16.02.17 at 16:02,  wrote:
>>>>> On Thu, Feb 16, 2017 at 11:36 AM, Jan Beulich  wrote:
>>>>>>>>> On 15.02.17 at 18:43,  wrote:
>>>>>>> 1.
>>>>>>> I need:
>>>>>>> Allow P2M core on ARM to update IOMMU mapping from the first
>>>>>>> "p2m_set_entry".
>>>>>>> I do:
>>>>>>> I explicitly set need_iommu flag for *every* guest domain during
>>>>>>> iommu_domain_init() on ARM in case if page table is not shared.
>>>>>>> At that moment I have no knowledge about will any device be assigned
>>>>>>> to this domain or not. I am just want to receive all mapping updates
>>>>>>> from P2M code. The P2M will update IOMMU mapping only when need_iommu
>>>>>>> is set and page table is not shared.
>>>>>>> I have doubts:
>>>>>>> Is it correct to just force need_iommu flag?
>>>>>>
>>>>>> No, I don't think so. This is a waste of a measurable amount of
>>>>>> resources when page tables aren't shared.
>>>>>>
>>>>>>> Or maybe another flag should be introduced?
>>>>>>
>>>>>> Not sure what you think of here. Where's the problem with building
>>>>>> IOMMU page tables at the time the first device gets assigned, just
>>>>>> like x86 does?
>>>>> OK, I have already had a look at  arch_iommu_populate_page_table() for
>>>>> x86.
>>>>> I don't know at the moment how this solution can help me.
>>>>> There are a least two points the prevent me from doing the similar thing.
>>>>> 1. For create IOMMU mapping I need both mfn and gfn. (+ flags).
>>>>> I am able to get mfn only. How can I find corresponding gfn?
>>>>
>>>> As the x86 one shows, via mfn_to_gmfn(). If ARM doesn't have
>>>> this, perhaps it needs to gain it?
>>>
>>> Looking at the x86 implementation, mfn_to_gmfn is using a table for that
>>> indexed by the MFN. This is requiring virtual address space that is
>>> already scarce on ARM32 and also using physical memory.
>>>
>>> I am not convinced this is the right things to do on ARM as the only
>>> user so far will be the IOMMU code.
>>>
>>> Another solution would be to go through the stage-2 page table and
>>> replicate all the mappings.
>>
>> That's certainly an option, if you want to save the memory (and
>> VA space on ARM32). It only makes the x86 model of establishing
>> the mappings slightly more compute intensive.
> 
> I made a quick calculation, ARM32 supports up 40-bit PA and IPA (e.g 
> guest address), which means 28-bits of MFN/GFN. The GFN would have to be 
> stored in a 32-bit for alignment, so we would need 2^28 * 4 = 1GiB of 
> virtual address space and potentially physical memory.
> We don't have 1GB of VA space free on 32-bit right now.

Right, you'd have to pay a performance price here. Either, as you
say, by looking the translations up from the stage-2 tables, or by
using some on demand mapping scheme for the table here.

> ARM64 currently supports up to 48-bit PA and 48-bit IPA, which means 
> 36-bits of MFN/GFN. The GFN would have to be stored in 64-bit for 
> alignment, so we would need 2^36 * 8 = 512GiB of virtual address space 
> and potentially physical memory. While virtual address space is not a 
> problem, the memory is a problem for embedded platform. We want Xen to 
> be as lean as possible.

Which then leaves the stage-2 table lookup as the only option. Of
course one might consider a hybrid model - memory constrained
systems could go the stage-2 table lookup route, but on larger
systems the cheap direct table lookup could be used.
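
The sizing arithmetic quoted above (2^28 * 4 = 1GiB for ARM32, 2^36 * 8 = 512GiB for ARM64) can be reproduced with a small C helper. This is a minimal sketch: the function name m2p_table_bytes() is hypothetical, and it assumes 4 KiB pages (a 12-bit page offset), which is what the 40-bit to 28-bit and 48-bit to 36-bit reductions in the thread imply.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12   /* assumed 4 KiB pages, matching the thread's numbers */

/*
 * Bytes needed for a flat table indexed by MFN, holding one GFN entry per
 * possible frame. m2p_table_bytes() is a hypothetical helper, not Xen code.
 */
static uint64_t m2p_table_bytes(unsigned int pa_bits, unsigned int entry_bytes)
{
    /* Keep the shift and the multiplication well inside 64 bits. */
    assert(pa_bits > PAGE_SHIFT && pa_bits <= 52);
    assert(entry_bytes > 0 && entry_bytes <= 8);

    return (UINT64_C(1) << (pa_bits - PAGE_SHIFT)) * entry_bytes;
}
```

Calling m2p_table_bytes(40, 4) gives 2^30 bytes (1 GiB) and m2p_table_bytes(48, 8) gives 2^39 bytes (512 GiB), which is the cost that makes the stage-2 walk attractive on memory-constrained systems.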

> I though a bit more on the advantage to create the IOMMU page tables 
> later on.
> 
> For devices assigned at domain creation, we know that devices will be 
> assigned so we could let Xen and populated IOMMU while allocating the 
> memory for the domain.
> 
> For hotplug devices, this would only happen for PCI as integrated 
> devices cannot be hotplug. As we go towards emulating a root complex in 
> Xen rather than the PV approach, you would need the root complex to be 
> instantiated when the domain is created (unless we want to hotplug 
> too?). IHMO, if you assign a root complex is likely that you will want 
> to assign a PCI afterwards. So allocating page tables at that time 
> sounds sensible.
> 
> This would avoid to walk the stage-2 page tables at runtime.

Well, in the end it's your call, but I don't think this is an acceptable
model in the general case. Quite often - see the Viridian support in
x86 Xen for a very good example - distros (XenServer in this case)
enable functionality even if a guest (Linux in the case here) would
never really want to make use of it. Also you need to keep in mind
that for an admin it is better to not have to take care of all
eventualities before first starting a (perhaps long running) guest.
Granted we have a number of other limitations of that same kind,
but if such can be 

[Xen-devel] [PATCH 2/2] x86/xen: use capabilities instead of fake cpuid values

2017-02-16 Thread Juergen Gross
When running as pv domain xen_cpuid() is being used instead of
native_cpuid(). In xen_cpuid() the aperf/mperf feature is indicated
as not being present by special casing the related cpuid leaf.

Instead of delivering fake cpuid values clear the cpu capability bit
for aperf/mperf instead.

Signed-off-by: Juergen Gross 
---
 arch/x86/xen/enlighten.c | 11 +++
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 83399ce..0eebb75 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -301,9 +301,6 @@ xen_running_on_version_or_later(unsigned int major, unsigned int minor)
return false;
 }
 
-#define CPUID_THERM_POWER_LEAF 6
-#define APERFMPERF_PRESENT 0
-
 static __read_mostly unsigned int cpuid_leaf1_edx_mask = ~0;
 static __read_mostly unsigned int cpuid_leaf1_ecx_mask = ~0;
 
@@ -337,11 +334,6 @@ static void xen_cpuid(unsigned int *ax, unsigned int *bx,
*dx = cpuid_leaf5_edx_val;
return;
 
-   case CPUID_THERM_POWER_LEAF:
-   /* Disabling APERFMPERF for kernel usage */
-   maskecx = ~(1 << APERFMPERF_PRESENT);
-   break;
-
case 0xb:
/* Suppress extended topology stuff */
maskebx = 0;
@@ -462,6 +454,9 @@ static void __init xen_init_cpuid_mask(void)
if (xen_check_mwait())
cpuid_leaf1_ecx_set_mask = (1 << (X86_FEATURE_MWAIT % 32));
 
+   /* Disable APERFMPERF feature. */
+   setup_clear_cpu_cap(X86_FEATURE_APERFMPERF);
+
/* Disable DCA feature. */
setup_clear_cpu_cap(X86_FEATURE_DCA);
 }
-- 
2.10.2
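
The difference between the two approaches in this patch can be illustrated outside the kernel. The sketch below is not the kernel's implementation: caps[], NCAP_WORDS, FEATURE_EXAMPLE, clear_cap() and has_cap() are hypothetical stand-ins for the kernel's capability bitmap and setup_clear_cpu_cap(). Only the bit position 0 in ECX of CPUID leaf 6 is taken from the removed APERFMPERF_PRESENT define.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real kernel bitmap and helpers differ. */
#define NCAP_WORDS      4
#define FEATURE_EXAMPLE (3 * 32 + 28)   /* illustrative: word 3, bit 28 */

/* Old approach: filter the raw register every time the leaf is queried. */
static uint32_t mask_leaf6_ecx(uint32_t ecx)
{
    return ecx & ~(UINT32_C(1) << 0);   /* bit 0, as in APERFMPERF_PRESENT */
}

/* New approach: clear the feature once in a capability bitmap. */
static void clear_cap(uint32_t caps[NCAP_WORDS], unsigned int feature)
{
    assert(feature / 32 < NCAP_WORDS);
    caps[feature / 32] &= ~(UINT32_C(1) << (feature % 32));
}

/* Test one feature bit in the capability bitmap. */
static int has_cap(const uint32_t caps[NCAP_WORDS], unsigned int feature)
{
    assert(feature / 32 < NCAP_WORDS);
    return (caps[feature / 32] >> (feature % 32)) & 1;
}
```

With the bitmap approach the decision is recorded once at init time, so code that tests the capability does not depend on every cpuid call being routed through a filtering wrapper.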




[Xen-devel] [PATCH 1/2] x86/xen: don't indicate DCA support in pv domains

2017-02-16 Thread Juergen Gross
Xen doesn't support DCA (direct cache access) for pv domains. Clear
the corresponding capability indicator.

Signed-off-by: Juergen Gross 
---
 arch/x86/xen/enlighten.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 51ef952..83399ce 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -461,6 +461,9 @@ static void __init xen_init_cpuid_mask(void)
cpuid_leaf1_ecx_mask &= ~xsave_mask; /* disable XSAVE & OSXSAVE */
if (xen_check_mwait())
cpuid_leaf1_ecx_set_mask = (1 << (X86_FEATURE_MWAIT % 32));
+
+   /* Disable DCA feature. */
+   setup_clear_cpu_cap(X86_FEATURE_DCA);
 }
 
 static void xen_set_debugreg(int reg, unsigned long val)
-- 
2.10.2




[Xen-devel] [PATCH 0/2] x86/xen: cpuid() cleanup

2017-02-16 Thread Juergen Gross
Reduce special casing of xen_cpuid() and disable DCA feature for pv
domains as it isn't supported under Xen.

Juergen Gross (2):
  x86/xen: don't indicate DCA support in pv domains
  x86/xen: use capabilities instead of fake cpuid values

 arch/x86/xen/enlighten.c | 14 ++
 1 file changed, 6 insertions(+), 8 deletions(-)

-- 
2.10.2




[Xen-devel] [PATCH 09/35] x86: Convert remaining uses of pr_warning to pr_warn

2017-02-16 Thread Joe Perches
To enable eventual removal of pr_warning

This makes pr_warn use consistent for arch/x86

Prior to this patch, there were 46 uses of pr_warning and
122 uses of pr_warn in arch/x86

Miscellanea:

o Coalesce a few formats and realign arguments
o Convert a couple of multiple line printks to single line

Signed-off-by: Joe Perches 
---
 arch/x86/kernel/amd_gart_64.c          | 12 +++--
 arch/x86/kernel/apic/apic.c            | 46 --
 arch/x86/kernel/apic/apic_noop.c       |  2 +-
 arch/x86/kernel/setup_percpu.c         |  4 +--
 arch/x86/kernel/tboot.c                | 15 ++-
 arch/x86/kernel/tsc_sync.c             |  8 +++---
 arch/x86/mm/kmmio.c                    |  8 +++---
 arch/x86/mm/mmio-mod.c                 |  5 ++--
 arch/x86/mm/numa.c                     | 12 -
 arch/x86/mm/numa_emulation.c           |  6 ++---
 arch/x86/mm/testmmiotrace.c            |  5 ++--
 arch/x86/oprofile/op_x86_model.h       |  6 ++---
 arch/x86/platform/olpc/olpc-xo15-sci.c |  2 +-
 arch/x86/platform/sfi/sfi.c            |  3 +--
 arch/x86/xen/debugfs.c                 |  2 +-
 arch/x86/xen/setup.c                   |  2 +-
 16 files changed, 63 insertions(+), 75 deletions(-)

diff --git a/arch/x86/kernel/amd_gart_64.c b/arch/x86/kernel/amd_gart_64.c
index 63ff468a7986..6bb37027cd70 100644
--- a/arch/x86/kernel/amd_gart_64.c
+++ b/arch/x86/kernel/amd_gart_64.c
@@ -535,10 +535,8 @@ static __init unsigned long check_iommu_size(unsigned long aper, u64 aper_size)
iommu_size -= round_up(a, PMD_PAGE_SIZE) - a;
 
if (iommu_size < 64*1024*1024) {
-   pr_warning(
-   "PCI-DMA: Warning: Small IOMMU %luMB."
-   " Consider increasing the AGP aperture in BIOS\n",
-   iommu_size >> 20);
+   pr_warn("PCI-DMA: Warning: Small IOMMU %luMB. Consider increasing the AGP aperture in BIOS\n",
+   iommu_size >> 20);
}
 
return iommu_size;
@@ -690,8 +688,7 @@ static __init int init_amd_gatt(struct agp_kern_info *info)
 
  nommu:
/* Should not happen anymore */
-   pr_warning("PCI-DMA: More than 4GB of RAM and no IOMMU\n"
-  "falling back to iommu=soft.\n");
+   pr_warn("PCI-DMA: More than 4GB of RAM and no IOMMU - falling back to iommu=soft\n");
return -1;
 }
 
@@ -756,8 +753,7 @@ int __init gart_iommu_init(void)
!gart_iommu_aperture ||
(no_agp && init_amd_gatt() < 0)) {
if (max_pfn > MAX_DMA32_PFN) {
-   pr_warning("More than 4GB of memory but GART IOMMU not available.\n");
-   pr_warning("falling back to iommu=soft.\n");
+   pr_warn("More than 4GB of memory but GART IOMMU not available - falling back to iommu=soft\n");
}
return 0;
}
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index 4261b3282ad9..37e9129da8b3 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -685,8 +685,8 @@ calibrate_by_pmtimer(long deltapm, long *delta, long *deltatsc)
 
res = (((u64)deltapm) *  mult) >> 22;
do_div(res, 100);
-   pr_warning("APIC calibration not consistent "
-  "with PM-Timer: %ldms instead of 100ms\n",(long)res);
+   pr_warn("APIC calibration not consistent with PM-Timer: %ldms instead of 100ms\n",
+   (long)res);
 
/* Correct the lapic counter value */
res = (((u64)(*delta)) * pm_100ms);
@@ -805,7 +805,7 @@ static int __init calibrate_APIC_clock(void)
 */
if (lapic_timer_frequency < (100 / HZ)) {
local_irq_enable();
-   pr_warning("APIC frequency too slow, disabling apic timer\n");
+   pr_warn("APIC frequency too slow, disabling apic timer\n");
return -1;
}
 
@@ -848,8 +848,8 @@ static int __init calibrate_APIC_clock(void)
local_irq_enable();
 
if (levt->features & CLOCK_EVT_FEAT_DUMMY) {
-   pr_warning("APIC timer disabled due to verification failure\n");
-   return -1;
+   pr_warn("APIC timer disabled due to verification failure\n");
+   return -1;
}
 
return 0;
@@ -923,7 +923,7 @@ static void local_apic_timer_interrupt(void)
 * spurious.
 */
if (!evt->event_handler) {
-   pr_warning("Spurious LAPIC timer interrupt on cpu %d\n", cpu);
+   pr_warn("Spurious LAPIC timer interrupt on cpu %d\n", cpu);
/* Switch it off */
lapic_timer_shutdown(evt);
return;
@@ -1503,11 +1503,11 @@ static int __init setup_nox2apic(char *str)
int apicid = native_apic_msr_read(APIC_ID);
 
if (apicid >= 255) {
-   pr_warning("Apicid: %08x, cannot enforce 

[Xen-devel] [PATCH 00/35] treewide trivial patches converting pr_warning to pr_warn

2017-02-16 Thread Joe Perches
There are ~4300 uses of pr_warn and ~250 uses of the older
pr_warning in the kernel source tree.

Make the use of pr_warn consistent across all kernel files.

This excludes all files in tools/ as there is a separate
define pr_warning for that directory tree and pr_warn is
not used in tools/.

Done with 'sed s/\bpr_warning\b/pr_warn/' and some emacsing.

Miscellanea:

o Coalesce formats and realign arguments

Some files not compiled - no cross-compilers

Joe Perches (35):
  alpha: Convert remaining uses of pr_warning to pr_warn
  ARM: ep93xx: Convert remaining uses of pr_warning to pr_warn
  arm64: Convert remaining uses of pr_warning to pr_warn
  arch/blackfin: Convert remaining uses of pr_warning to pr_warn
  ia64: Convert remaining use of pr_warning to pr_warn
  powerpc: Convert remaining uses of pr_warning to pr_warn
  sh: Convert remaining uses of pr_warning to pr_warn
  sparc: Convert remaining use of pr_warning to pr_warn
  x86: Convert remaining uses of pr_warning to pr_warn
  drivers/acpi: Convert remaining uses of pr_warning to pr_warn
  block/drbd: Convert remaining uses of pr_warning to pr_warn
  gdrom: Convert remaining uses of pr_warning to pr_warn
  drivers/char: Convert remaining use of pr_warning to pr_warn
  clocksource: Convert remaining use of pr_warning to pr_warn
  drivers/crypto: Convert remaining uses of pr_warning to pr_warn
  fmc: Convert remaining use of pr_warning to pr_warn
  drivers/gpu: Convert remaining uses of pr_warning to pr_warn
  drivers/ide: Convert remaining uses of pr_warning to pr_warn
  drivers/input: Convert remaining uses of pr_warning to pr_warn
  drivers/isdn: Convert remaining uses of pr_warning to pr_warn
  drivers/macintosh: Convert remaining uses of pr_warning to pr_warn
  drivers/media: Convert remaining use of pr_warning to pr_warn
  drivers/mfd: Convert remaining uses of pr_warning to pr_warn
  drivers/mtd: Convert remaining uses of pr_warning to pr_warn
  drivers/of: Convert remaining uses of pr_warning to pr_warn
  drivers/oprofile: Convert remaining uses of pr_warning to pr_warn
  drivers/platform: Convert remaining uses of pr_warning to pr_warn
  drivers/rapidio: Convert remaining use of pr_warning to pr_warn
  drivers/scsi: Convert remaining use of pr_warning to pr_warn
  drivers/sh: Convert remaining use of pr_warning to pr_warn
  drivers/tty: Convert remaining uses of pr_warning to pr_warn
  drivers/video: Convert remaining uses of pr_warning to pr_warn
  kernel/trace: Convert remaining uses of pr_warning to pr_warn
  lib: Convert remaining uses of pr_warning to pr_warn
  sound/soc: Convert remaining uses of pr_warning to pr_warn

 arch/alpha/kernel/perf_event.c |  4 +-
 arch/arm/mach-ep93xx/core.c|  4 +-
 arch/arm64/include/asm/syscall.h   |  8 ++--
 arch/arm64/kernel/hw_breakpoint.c  |  8 ++--
 arch/arm64/kernel/smp.c|  4 +-
 arch/blackfin/kernel/nmi.c |  2 +-
 arch/blackfin/kernel/ptrace.c  |  2 +-
 arch/blackfin/mach-bf533/boards/stamp.c|  2 +-
 arch/blackfin/mach-bf537/boards/cm_bf537e.c|  2 +-
 arch/blackfin/mach-bf537/boards/cm_bf537u.c|  2 +-
 arch/blackfin/mach-bf537/boards/stamp.c|  2 +-
 arch/blackfin/mach-bf537/boards/tcm_bf537.c|  2 +-
 arch/blackfin/mach-bf561/boards/cm_bf561.c |  2 +-
 arch/blackfin/mach-bf561/boards/ezkit.c|  2 +-
 arch/blackfin/mm/isram-driver.c|  4 +-
 arch/ia64/kernel/setup.c   |  6 +--
 arch/powerpc/kernel/pci-common.c   |  4 +-
 arch/powerpc/mm/init_64.c  |  5 +--
 arch/powerpc/mm/mem.c  |  3 +-
 arch/powerpc/platforms/512x/mpc512x_shared.c   |  4 +-
 arch/powerpc/platforms/85xx/socrates_fpga_pic.c|  7 ++--
 arch/powerpc/platforms/86xx/mpc86xx_hpcn.c |  2 +-
 arch/powerpc/platforms/pasemi/dma_lib.c|  4 +-
 arch/powerpc/platforms/powernv/opal.c  |  8 ++--
 arch/powerpc/platforms/powernv/pci-ioda.c  | 10 ++---
 arch/powerpc/platforms/ps3/device-init.c   | 14 +++
 arch/powerpc/platforms/ps3/mm.c|  4 +-
 arch/powerpc/platforms/ps3/os-area.c   |  2 +-
 arch/powerpc/platforms/pseries/iommu.c |  8 ++--
 arch/powerpc/platforms/pseries/setup.c |  4 +-
 arch/powerpc/sysdev/fsl_pci.c  |  9 ++---
 arch/powerpc/sysdev/mpic.c | 10 ++---
 arch/powerpc/sysdev/xics/icp-native.c  | 10 ++---
 arch/powerpc/sysdev/xics/ics-opal.c|  4 +-
 arch/powerpc/sysdev/xics/ics-rtas.c|  4 +-
 arch/powerpc/sysdev/xics/xics-common.c |  8 ++--
 arch/sh/boards/mach-sdk7786/nmi.c  |  2 +-
 arch/sh/drivers/pci/fixups-sdk7786.c   |  2 +-
 arch/sh/kernel/io_trapped.c

Re: [Xen-devel] qemu-upstream triggering OOM killer

2017-02-16 Thread Jan Beulich
>>> On 16.02.17 at 19:38,  wrote:
> On Thu, 16 Feb 2017, Jan Beulich wrote:
>> >>> On 16.02.17 at 16:23,  wrote:
>> >>>> On 14.02.17 at 15:56,  wrote:
>> >> On Fri, Feb 10, 2017 at 02:54:23AM -0700, Jan Beulich wrote:
>> >>> Not so far. It appears to happen when grub clears the screen
>> >>> before displaying its graphical menu, so I'd rather suspect an issue
>> >>> with a graphics related change (the one you pointed out isn't).
>> >> 
>> >> I tried to reproduce this, by limiting the amount of memory available to
>> >> qemu using cgroups, but about 44MB of memory is enough to boot a guest
>> >> (tried Ubuntu and Debian).
>> > 
>> > Okay, not a qemuu regression after all, but a libxc one. It just so
>> > happens that qemut tries to allocate a much larger amount, which
>> > triggers mmap() failure earlier and hence doesn't manage to trigger
>> > the oom killer. Patch (almost) on its way.
>> 
>> Patch sent, allowing that guest to get further (and Windows to
>> properly boot). However, now the guest is stuck right at the point
>> where X wants to switch to its designated video mode, with qemu
>> (for somewhere between half a minute and a minute) consuming
>> one full CPU's bandwidth. Once qemu's CPU consumption went
>> down, no further progress is being made though.
>> 
>> Again I'd be thankful for hints on how to debug such a situation.
> 
> I would bisect it. It's probably due to a change in the cirrus vga code
> or common vga code. It might be worth testing with stdvga=1 to narrow it
> down.

No need to bisect - I finally remembered the behavior matching a
regression I had spotted back in December with a security backport
to one of our older trees. Commit 913a87885f ("display: cirrus:
ignore source pitch value as needed in blit_is_unsafe") needs
backporting.

Considering that this has been around for a while, it raises another
question: Are regression fixes being actively looked for by the two
of you, or are we depending on people running into issues for
necessary fixes to be pulled in?

Jan




[Xen-devel] [xen-unstable test] 105861: regressions - FAIL

2017-02-16 Thread osstest service owner
flight 105861 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/105861/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-armhf-armhf-xl-credit2 15 guest-start/debian.repeat fail REGR. vs. 105840

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-libvirt             13 saverestore-support-check fail like 105840
 test-armhf-armhf-libvirt-xsm         13 saverestore-support-check fail like 105840
 test-amd64-i386-xl-qemuu-win7-amd64  16 guest-stop                fail like 105840
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop                fail like 105840
 test-amd64-i386-xl-qemut-win7-amd64  16 guest-stop                fail like 105840
 test-armhf-armhf-libvirt-raw         12 saverestore-support-check fail like 105840
 test-amd64-amd64-xl-rtds              9 debian-install            fail like 105840
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop                fail like 105840

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl   1 build-check(1)   blocked  n/a
 build-arm64-libvirt   1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-credit2   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-rtds  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-xsm   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-check fail   never pass
 test-amd64-i386-libvirt  12 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-check fail   never pass
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 build-arm64-xsm   5 xen-build fail   never pass
 build-arm64   5 xen-build fail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 build-arm64-pvops 5 kernel-build fail   never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-check fail   never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2  fail never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-credit2  13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl  12 migrate-support-check fail   never pass
 test-armhf-armhf-libvirt 12 migrate-support-check fail   never pass
 test-armhf-armhf-xl  13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 13 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt-xsm 12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-vhd  11 migrate-support-check fail   never pass
 test-armhf-armhf-xl-vhd  12 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-xsm  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-xsm  13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-arndale  13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-check fail  never pass
 test-armhf-armhf-xl-multivcpu 13 saverestore-support-check fail  never pass
 test-armhf-armhf-xl-rtds 12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-rtds 13 saverestore-support-check fail   never pass
 test-armhf-armhf-libvirt-raw 11 migrate-support-check fail   never pass

version targeted for testing:
 xen  7127d53fe891f9ea67357587a33a7aaba4b55f45
baseline version:
 xen  1e88db4701d6e2d00c04795e6aacaea942b617e6

Last test of basis   105840  2017-02-16 07:20:11 Z    0 days
Testing same since   105861  2017-02-16 20:17:44 Z    0 days    1 attempts


People who touched revisions under test:
  George Dunlap 
  Juergen Gross 
  Wei Liu 

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  fail
 build-armhf-xsm

Re: [Xen-devel] Unable to boot Xen 4.8 with iommu=0

2017-02-16 Thread Tian, Kevin
> From: Tian, Kevin
> Sent: Friday, February 17, 2017 11:35 AM
> > >>
> > >> Or wait - do you have the same issue if you use
> > >> "iommu=no,no-intremap"? In which case the problem would be
> > >> that "iommu=no" should clear more than just "iommu_enable", or
> > >> code checking iommu_intremap early (before iommu_setup()
> > >> manages to clear it in the case here) would need to made look at
> > >> both variables. Oddly enough acpi_parse_dmar() only bails if
> > >> both variables are clear, which suggests to me that
> > >> iommu_enable is intended to have two different meanings in
> > >> different contexts (master flag vs. controlling just DMA
> > >> remapping). Kevin, Feng - any thoughts here?
> > >
> > > iommu=no,no-intremap boots fine with "(XEN) Using APIC driver default"
> >
> > Thanks for confirming.
> >
> > Kevin, Feng, we now depend on your input regarding the intentions
> > with the two variables.
> >
> 
> Feng just left Intel. Let me take a look at code to understand the
> rationale behind.
> 

Jan, it looks like this is caused by your change back in 2012:

commit 7a8f6d0607a38c64506b4e8b473d955bf8e2a71f
Author: Jan Beulich 
Date:   Fri Nov 2 17:15:30 2012 +0100

Before that commit, iommu_enable was consistently the master flag. I'm
still trying to understand the background; please elaborate if you still
remember the reasoning.

I agree we should stick to one meaning once the above issue is cleared up.

Given that this commit is pretty old, I'm also curious why this has only
been reported against 4.8. Tamas, did iommu=0 work for you before 4.8, or
is 4.8 simply the first version on which you tried iommu=0 on a platform
supporting interrupt remapping?

Thanks
Kevin



[Xen-devel] [xen-unstable-smoke test] 105871: regressions - trouble: broken/fail/pass

2017-02-16 Thread osstest service owner
flight 105871 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/105871/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-libvirt 14 guest-saverestore fail REGR. vs. 105852
 test-armhf-armhf-xl 15 guest-start/debian.repeat fail REGR. vs. 105852
 test-amd64-amd64-xl-qemuu-debianhvm-i386 12 guest-saverestore fail REGR. vs. 105852

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-xl-xsm   1 build-check(1)   blocked  n/a
 build-arm64   5 xen-build fail   never pass
 build-arm64-pvops 5 kernel-build fail   never pass
 test-armhf-armhf-xl  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl  13 saverestore-support-check fail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-check fail   never pass

version targeted for testing:
 xen  378384399ed661bed711221a5d8dbdac66b8e851
baseline version:
 xen  7127d53fe891f9ea67357587a33a7aaba4b55f45

Last test of basis   105852  2017-02-16 14:01:33 Z    0 days
Failing since        105857  2017-02-16 16:01:30 Z    0 days    7 attempts
Testing same since   105862  2017-02-16 22:01:53 Z    0 days    4 attempts


People who touched revisions under test:
  Andrew Cooper 
  Daniel Kiper 
  Jan Beulich 
  Julien Grall 
  Stefano Stabellini 

jobs:
 build-amd64  pass
 build-arm64  fail
 build-armhf  pass
 build-amd64-libvirt  pass
 build-arm64-pvops fail
 test-armhf-armhf-xl  fail
 test-arm64-arm64-xl-xsm  broken  
 test-amd64-amd64-xl-qemuu-debianhvm-i386 fail
 test-amd64-amd64-libvirt fail



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.


commit 378384399ed661bed711221a5d8dbdac66b8e851
Author: Stefano Stabellini 
Date:   Fri Feb 10 18:05:22 2017 -0800

arm: read/write rank->vcpu atomically

We don't need a lock in vgic_get_target_vcpu anymore, solving the
following lock inversion bug: the rank lock should be taken first, then
the vgic lock. However, gic_update_one_lr is called with the vgic lock
held, and it calls vgic_get_target_vcpu, which tries to obtain the rank
lock.

Coverity-ID: 1381855
Coverity-ID: 1381853

Signed-off-by: Stefano Stabellini 
Reviewed-by: Julien Grall 

commit 79903e50dba9e7442c9b7ca424661bb020e9dbf2
Author: Jan Beulich 
Date:   Thu Feb 16 18:11:42 2017 +0100

x86emul: catch exceptions occurring in stubs

Before adding more use of stubs cloned from decoded guest insns, guard
ourselves against mistakes there: Should an exception (with the
noteworthy exception of #PF) occur inside the stub, forward it to the
guest.

Since the exception fixup table entry can't encode the address of the
faulting insn itself, attach it to the return address instead. This at
once provides a convenient place to hand the exception information
back: The return address is being overwritten by it before branching to
the recovery code.

Take the opportunity and (finally!) add symbol resolution to the
respective log messages (the new one is intentionally not being coded
that way, as it covers stub addresses only, which don't have symbols
associated).

Also take the opportunity and make search_one_extable() static again.

Suggested-by: Andrew Cooper 
Signed-off-by: Jan Beulich 
Reviewed-by: Andrew Cooper 

commit 8c935f5ff1cac422b4de21cbab69e13d2ebb25be
Author: Daniel Kiper 
Date:   Thu Feb 16 18:10:04 2017 +0100

[Xen-devel] [PATCH 11/19] tools/xen-mceinj: fix the type of cpu number

2017-02-16 Thread Haozhong Zhang
Use uint32_t rather than int to match the type of
xen_mc_physcpuinfo.ncpus.

Signed-off-by: Haozhong Zhang 
---
Cc: Ian Jackson 
Cc: Wei Liu 
---
 tools/tests/mce-test/tools/xen-mceinj.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/tools/tests/mce-test/tools/xen-mceinj.c b/tools/tests/mce-test/tools/xen-mceinj.c
index 51abc8a..5f70a61 100644
--- a/tools/tests/mce-test/tools/xen-mceinj.c
+++ b/tools/tests/mce-test/tools/xen-mceinj.c
@@ -161,7 +161,7 @@ static int flush_msr_inj(xc_interface *xc_handle)
 return xc_mca_op(xc_handle, &mc);
 }
 
-static int mca_cpuinfo(xc_interface *xc_handle)
+static uint32_t mca_cpuinfo(xc_interface *xc_handle)
 {
 struct xen_mc mc;
 
@@ -176,16 +176,19 @@ static int mca_cpuinfo(xc_interface *xc_handle)
 return 0;
 }
 
-static int inject_cmci(xc_interface *xc_handle, int cpu_nr)
+static int inject_cmci(xc_interface *xc_handle, uint32_t cpu_nr)
 {
 struct xen_mc mc;
-int nr_cpus;
+uint32_t nr_cpus;
 
 memset(&mc, 0, sizeof(struct xen_mc));
 
 nr_cpus = mca_cpuinfo(xc_handle);
 if (!nr_cpus)
 err(xc_handle, "Failed to get mca_cpuinfo");
+if (cpu_nr >= nr_cpus)
+err(xc_handle, "-c %"PRIu32" is larger than %"PRIu32,
+cpu_nr, nr_cpus - 1);
 
 mc.cmd = XEN_MC_inject_v2;
 mc.interface_version = XEN_MCA_INTERFACE_VERSION;
@@ -420,7 +423,7 @@ int main(int argc, char *argv[])
 int c, opt_index;
 uint32_t domid;
 xc_interface *xc_handle;
-int cpu_nr;
+uint32_t cpu_nr;
 uint64_t gaddr, max_gpa;
 
 /* Default Value */
@@ -444,7 +447,7 @@ int main(int argc, char *argv[])
 dump=1;
 break;
 case 'c':
-cpu_nr = strtol(optarg, &optarg, 10);
+cpu_nr = strtoul(optarg, &optarg, 10);
 if ( strlen(optarg) != 0 )
 err(xc_handle, "Please input a digit parameter for CPU");
 break;
-- 
2.10.1
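
For illustration, the unsigned parsing and range check that this patch
introduces can be written as one standalone helper. This is only a minimal
sketch under stated assumptions, not code from the patch: parse_cpu_nr() is
a hypothetical name, and rejecting a leading sign or whitespace is a
stricter policy than the patch itself applies.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not part of xen-mceinj): parse a decimal CPU
 * number and check it against the number of CPUs reported by the host.
 * Returns 0 on success, -1 on malformed or out-of-range input.
 */
static int parse_cpu_nr(const char *arg, uint32_t nr_cpus, uint32_t *out)
{
    char *end;
    unsigned long val;

    assert(arg != NULL && out != NULL);

    /* strtoul() accepts a leading '-' or blanks, so insist on a digit. */
    if (arg[0] < '0' || arg[0] > '9')
        return -1;

    errno = 0;
    val = strtoul(arg, &end, 10);
    if (errno != 0 || *end != '\0')
        return -1;              /* overflow or trailing garbage */

    if (val >= nr_cpus)
        return -1;              /* valid CPU numbers are 0 .. nr_cpus - 1 */

    *out = (uint32_t)val;
    return 0;
}
```

Keeping the value unsigned from the parser onwards avoids a signed/unsigned
comparison against the ncpus field, which is what the type change is for.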




[Xen-devel] [PATCH 19/19] tools/xen-mceinj: support injecting LMCE

2017-02-16 Thread Haozhong Zhang
If option '-l' or '--lmce' is specified and the host supports LMCE,
xen-mceinj will inject an LMCE to the CPU specified by '-c' (or to CPU0
if '-c' is not present).

Signed-off-by: Haozhong Zhang 
---
Cc: Ian Jackson 
Cc: Wei Liu 
---
 tools/libxc/include/xenctrl.h   |  1 +
 tools/libxc/xc_misc.c   | 25 +++
 tools/tests/mce-test/tools/xen-mceinj.c | 57 +++--
 3 files changed, 81 insertions(+), 2 deletions(-)

diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index 85d7fe5..2598952 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -1968,6 +1968,7 @@ int xc_cpuid_apply_policy(xc_interface *xch,
 void xc_cpuid_to_str(const unsigned int *regs,
  char **strs); /* some strs[] may be NULL if ENOMEM */
 int xc_mca_op(xc_interface *xch, struct xen_mc *mc);
+int xc_mca_op_cpumap(xc_interface *xch, struct xen_mc *mc, xc_cpumap_t cpumap);
 #endif
 
 struct xc_px_val {
diff --git a/tools/libxc/xc_misc.c b/tools/libxc/xc_misc.c
index 0fc6c22..24f7fdf 100644
--- a/tools/libxc/xc_misc.c
+++ b/tools/libxc/xc_misc.c
@@ -341,6 +341,31 @@ int xc_mca_op(xc_interface *xch, struct xen_mc *mc)
 xc_hypercall_bounce_post(xch, mc);
 return ret;
 }
+
+int xc_mca_op_cpumap(xc_interface *xch, struct xen_mc *mc, xc_cpumap_t cpumap)
+{
+int ret;
+DECLARE_HYPERCALL_BOUNCE(cpumap, 0, XC_HYPERCALL_BUFFER_BOUNCE_IN);
+
+if ( cpumap )
+{
+HYPERCALL_BOUNCE_SET_SIZE(cpumap,
+  (mc->u.mc_inject_v2.cpumap.nr_bits + 7) / 8);
+if ( xc_hypercall_bounce_pre(xch, cpumap) )
+{
+PERROR("Could not bounce cpumap memory buffer");
+return -1;
+}
+set_xen_guest_handle(mc->u.mc_inject_v2.cpumap.bitmap, cpumap);
+}
+
+ret = xc_mca_op(xch, mc);
+
+if ( cpumap )
+xc_hypercall_bounce_post(xch, cpumap);
+
+return ret;
+}
 #endif
 
 int xc_perfc_reset(xc_interface *xch)
diff --git a/tools/tests/mce-test/tools/xen-mceinj.c b/tools/tests/mce-test/tools/xen-mceinj.c
index 5f70a61..b2eb7d3 100644
--- a/tools/tests/mce-test/tools/xen-mceinj.c
+++ b/tools/tests/mce-test/tools/xen-mceinj.c
@@ -56,6 +56,8 @@
 #define MSR_IA32_MC0_MISC0x0403
 #define MSR_IA32_MC0_CTL20x0280
 
+#define MCG_STATUS_LMCE  0x8
+
 struct mce_info {
 const char *description;
 uint8_t mcg_stat;
@@ -113,6 +115,7 @@ static struct mce_info mce_table[] = {
 #define LOGFILE stdout
 
 int dump;
+int lmce;
 struct xen_mc_msrinject msr_inj;
 
 static void Lprintf(const char *fmt, ...)
@@ -213,6 +216,42 @@ static int inject_mce(xc_interface *xc_handle, int cpu_nr)
 return xc_mca_op(xc_handle, &mc);
 }
 
+static int inject_lmce(xc_interface *xc_handle, uint32_t cpu_nr)
+{
+struct xen_mc mc;
+uint8_t *cpumap = NULL;
+size_t cpumap_size, line, shift;
+uint32_t nr_cpus;
+int ret;
+
+nr_cpus = mca_cpuinfo(xc_handle);
+if ( !nr_cpus )
+err(xc_handle, "Failed to get mca_cpuinfo");
+if ( cpu_nr >= nr_cpus )
+err(xc_handle, "-c %"PRIu32" is larger than %"PRIu32,
+cpu_nr, nr_cpus - 1);
+
+memset(&mc, 0, sizeof(struct xen_mc));
+mc.cmd = XEN_MC_inject_v2;
+mc.interface_version = XEN_MCA_INTERFACE_VERSION;
+mc.u.mc_inject_v2.flags |= XEN_MC_INJECT_TYPE_LMCE;
+
+cpumap_size = (nr_cpus + 7) / 8;
+cpumap = malloc(cpumap_size);
+if ( !cpumap )
+err(xc_handle, "Failed to allocate cpumap\n");
+memset(cpumap, 0, cpumap_size);
+line = cpu_nr / 8;
+shift = cpu_nr % 8;
+memset(cpumap + line, 1 << shift, 1);
+
+mc.u.mc_inject_v2.cpumap.nr_bits = cpumap_size * 8;
+ret = xc_mca_op_cpumap(xc_handle, &mc, cpumap);
+
+free(cpumap);
+return ret;
+}
+
 static uint64_t bank_addr(int bank, int type)
 {
 uint64_t addr;
@@ -331,8 +370,15 @@ static int inject(xc_interface *xc_handle, struct mce_info *mce,
   uint32_t cpu_nr, uint32_t domain, uint64_t gaddr)
 {
 int ret = 0;
+uint8_t mcg_status = mce->mcg_stat;
 
-ret = inject_mcg_status(xc_handle, cpu_nr, mce->mcg_stat, domain);
+if ( lmce )
+{
+if ( mce->cmci )
+err(xc_handle, "No support to inject CMCI as LMCE");
+mcg_status |= MCG_STATUS_LMCE;
+}
+ret = inject_mcg_status(xc_handle, cpu_nr, mcg_status, domain);
 if ( ret )
 err(xc_handle, "Failed to inject MCG_STATUS MSR");
 
@@ -355,6 +401,8 @@ static int inject(xc_interface *xc_handle, struct mce_info *mce,
 err(xc_handle, "Failed to inject MSR");
 if ( mce->cmci )
 ret = inject_cmci(xc_handle, cpu_nr);
+else if ( lmce )
+ret = inject_lmce(xc_handle, cpu_nr);
 else
 ret = inject_mce(xc_handle, cpu_nr);
 if ( ret )
@@ -394,6 +442,7 @@ static struct option opts[] = {
 {"dump", 0, 0, 'D'},
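
For illustration, the cpumap that inject_lmce() builds in this patch is a
byte-granular bitmap with exactly one bit set. A minimal standalone sketch
of that construction follows; single_cpu_map() is a hypothetical name, and
calloc() is used here in place of the patch's malloc() plus memset().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: allocate a zeroed bitmap covering nr_cpus bits
 * and set only the bit for cpu_nr. The caller frees the result.
 * Returns NULL on invalid arguments or allocation failure.
 */
static uint8_t *single_cpu_map(uint32_t nr_cpus, uint32_t cpu_nr, size_t *size)
{
    uint8_t *map;

    assert(size != NULL);
    if (nr_cpus == 0 || cpu_nr >= nr_cpus)
        return NULL;

    *size = ((size_t)nr_cpus + 7) / 8;      /* round up to whole bytes */
    map = calloc(*size, 1);
    if (!map)
        return NULL;

    /* Byte index is cpu_nr / 8, bit within that byte is cpu_nr % 8. */
    map[cpu_nr / 8] = (uint8_t)(1u << (cpu_nr % 8));
    return map;
}
```

With 12 CPUs and CPU 9 selected, this yields a two-byte map whose second
byte is 0x02, which matches what the memset(cpumap + line, 1 << shift, 1)
call in the patch produces.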
 

[Xen-devel] [PATCH 10/19] x86/mce: always write 0 to MSR_IA32_MCG_STATUS on Intel CPU

2017-02-16 Thread Haozhong Zhang
An attempt to write to MSR_IA32_MCG_STATUS with any value other than 0
results in #GP on Intel CPUs.

Signed-off-by: Haozhong Zhang 
---
Cc: Christoph Egger 
Cc: Liu Jinsong 
Cc: Jan Beulich 
Cc: Andrew Cooper 
---
 xen/arch/x86/cpu/mcheck/mce.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/cpu/mcheck/mce.c b/xen/arch/x86/cpu/mcheck/mce.c
index 28bf579..95a9da3 100644
--- a/xen/arch/x86/cpu/mcheck/mce.c
+++ b/xen/arch/x86/cpu/mcheck/mce.c
@@ -538,7 +538,14 @@ void mcheck_cmn_handler(const struct cpu_user_regs *regs)
 gstatus = mca_rdmsr(MSR_IA32_MCG_STATUS);
 if ((gstatus & MCG_STATUS_MCIP) != 0) {
 mce_printk(MCE_CRITICAL, "MCE: Clear MCIP@ last step");
-mca_wrmsr(MSR_IA32_MCG_STATUS, gstatus & ~MCG_STATUS_MCIP);
+if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL )
+/*
+ * Intel SDM 3: An attempt to write to IA32_MCG_STATUS
+ * with any value other than 0 would result in #GP.
+ */
+mca_wrmsr(MSR_IA32_MCG_STATUS, 0);
+else
+mca_wrmsr(MSR_IA32_MCG_STATUS, gstatus & ~MCG_STATUS_MCIP);
 }
 mce_barrier_exit(&mce_trap_bar);
 
-- 
2.10.1
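
For illustration, the vendor-dependent choice this patch makes can be
isolated into a pure function that returns the value to write back. This is
a sketch under assumptions: enum cpu_vendor and mcg_status_clear_value()
are hypothetical names, and MCG_STATUS_MCIP is taken to be bit 2 of
IA32_MCG_STATUS.

```c
#include <stdint.h>

#define MCG_STATUS_MCIP (1ULL << 2)   /* machine check in progress */

/* Illustrative stand-in for the vendor check done via boot_cpu_data. */
enum cpu_vendor { VENDOR_INTEL, VENDOR_OTHER };

/*
 * Value to write to IA32_MCG_STATUS at the end of the handler: always 0
 * on Intel (anything else would raise #GP), otherwise the current
 * status with only MCIP cleared.
 */
static uint64_t mcg_status_clear_value(enum cpu_vendor vendor,
                                       uint64_t gstatus)
{
    if (vendor == VENDOR_INTEL)
        return 0;

    return gstatus & ~MCG_STATUS_MCIP;
}
```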




[Xen-devel] [PATCH 08/19] x86/mce: set mcinfo_comm.type and .size in x86_mcinfo_reserve()

2017-02-16 Thread Haozhong Zhang
All existing calls to x86_mcinfo_reserve() are followed by statements
that set the size and the type of the reserved space, so move them into
x86_mcinfo_reserve() to simplify the code.

Signed-off-by: Haozhong Zhang 
---
Cc: Christoph Egger 
Cc: Liu Jinsong 
Cc: Jan Beulich 
Cc: Andrew Cooper 
---
 xen/arch/x86/cpu/mcheck/mcaction.c  |  4 +---
 xen/arch/x86/cpu/mcheck/mce.c   | 16 
 xen/arch/x86/cpu/mcheck/mce.h   |  2 +-
 xen/arch/x86/cpu/mcheck/mce_amd.c   |  4 +---
 xen/arch/x86/cpu/mcheck/mce_intel.c |  6 +-
 5 files changed, 12 insertions(+), 20 deletions(-)

diff --git a/xen/arch/x86/cpu/mcheck/mcaction.c b/xen/arch/x86/cpu/mcheck/mcaction.c
index 9cf2499..cc90e7c 100644
--- a/xen/arch/x86/cpu/mcheck/mcaction.c
+++ b/xen/arch/x86/cpu/mcheck/mcaction.c
@@ -13,14 +13,12 @@ mci_action_add_pageoffline(int bank, struct mc_info *mi,
 if (!mi)
 return NULL;
 
-rec = x86_mcinfo_reserve(mi, sizeof(*rec));
+rec = x86_mcinfo_reserve(mi, sizeof(*rec), MC_TYPE_RECOVERY);
 if (!rec) {
 mi->flags |= MCINFO_FLAGS_UNCOMPLETE;
 return NULL;
 }
 
-rec->common.type = MC_TYPE_RECOVERY;
-rec->common.size = sizeof(*rec);
 rec->mc_bank = bank;
 rec->action_types = MC_ACTION_PAGE_OFFLINE;
 rec->action_info.page_retire.mfn = mfn;
diff --git a/xen/arch/x86/cpu/mcheck/mce.c b/xen/arch/x86/cpu/mcheck/mce.c
index f682520..28bf579 100644
--- a/xen/arch/x86/cpu/mcheck/mce.c
+++ b/xen/arch/x86/cpu/mcheck/mce.c
@@ -204,7 +204,7 @@ static void mca_init_bank(enum mca_source who,
 if (!mi)
 return;
 
-mib = x86_mcinfo_reserve(mi, sizeof(*mib));
+mib = x86_mcinfo_reserve(mi, sizeof(*mib), MC_TYPE_BANK);
 if (!mib)
 {
 mi->flags |= MCINFO_FLAGS_UNCOMPLETE;
@@ -213,8 +213,6 @@ static void mca_init_bank(enum mca_source who,
 
 mib->mc_status = mca_rdmsr(MSR_IA32_MCx_STATUS(bank));
 
-mib->common.type = MC_TYPE_BANK;
-mib->common.size = sizeof (struct mcinfo_bank);
 mib->mc_bank = bank;
 
 if (mib->mc_status & MCi_STATUS_MISCV)
@@ -250,8 +248,6 @@ static int mca_init_global(uint32_t flags, struct mcinfo_global *mig)
 struct domain *d;
 
 /* Set global information */
-mig->common.type = MC_TYPE_GLOBAL;
-mig->common.size = sizeof (struct mcinfo_global);
 status = mca_rdmsr(MSR_IA32_MCG_STATUS);
 mig->mc_gstatus = status;
 mig->mc_domid = mig->mc_vcpuid = -1;
@@ -351,7 +347,7 @@ mcheck_mca_logout(enum mca_source who, struct mca_banks *bankmask,
 if ( (mctc = mctelem_reserve(which)) != NULL ) {
 mci = mctelem_dataptr(mctc);
 mcinfo_clear(mci);
-mig = x86_mcinfo_reserve(mci, sizeof(*mig));
+mig = x86_mcinfo_reserve(mci, sizeof(*mig), MC_TYPE_GLOBAL);
 /* mc_info should at least hold up the global information */
 ASSERT(mig);
 mca_init_global(mc_flags, mig);
@@ -804,7 +800,7 @@ static void mcinfo_clear(struct mc_info *mi)
 x86_mcinfo_nentries(mi) = 0;
 }
 
-void *x86_mcinfo_reserve(struct mc_info *mi, int size)
+void *x86_mcinfo_reserve(struct mc_info *mi, uint16_t size, uint16_t type)
 {
 int i;
 unsigned long end1, end2;
@@ -831,7 +827,11 @@ void *x86_mcinfo_reserve(struct mc_info *mi, int size)
 /* there's enough space. add entry. */
 x86_mcinfo_nentries(mi)++;
 
-return memset(mic_index, 0, size);
+memset(mic_index, 0, size);
+mic_index->size = size;
+mic_index->type = type;
+
+return mic_index;
 }
 
 static void x86_mcinfo_apei_save(
diff --git a/xen/arch/x86/cpu/mcheck/mce.h b/xen/arch/x86/cpu/mcheck/mce.h
index 56877c1..2f4e7a4 100644
--- a/xen/arch/x86/cpu/mcheck/mce.h
+++ b/xen/arch/x86/cpu/mcheck/mce.h
@@ -146,7 +146,7 @@ typedef struct mcinfo_extended *(*x86_mce_callback_t)
 (struct mc_info *, uint16_t, uint64_t);
 extern void x86_mce_callback_register(x86_mce_callback_t);
 
-void *x86_mcinfo_reserve(struct mc_info *mi, int size);
+void *x86_mcinfo_reserve(struct mc_info *mi, uint16_t size, uint16_t type);
 void x86_mcinfo_dump(struct mc_info *mi);
 
 static inline int mce_vendor_bank_msr(const struct vcpu *v, uint32_t msr)
diff --git a/xen/arch/x86/cpu/mcheck/mce_amd.c b/xen/arch/x86/cpu/mcheck/mce_amd.c
index 599e465..fe51be9 100644
--- a/xen/arch/x86/cpu/mcheck/mce_amd.c
+++ b/xen/arch/x86/cpu/mcheck/mce_amd.c
@@ -218,15 +218,13 @@ amd_f10_handler(struct mc_info *mi, uint16_t bank, uint64_t status)
 if ( !(status & MCi_STATUS_MISCV) )
 return NULL;
 
-mc_ext = x86_mcinfo_reserve(mi, sizeof(*mc_ext));
+mc_ext = x86_mcinfo_reserve(mi, sizeof(*mc_ext), MC_TYPE_EXTENDED);
 if ( !mc_ext )
 {
 mi->flags |= MCINFO_FLAGS_UNCOMPLETE;
 return NULL;
 }
 
-mc_ext->common.type = MC_TYPE_EXTENDED;
-mc_ext->common.size = 
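
For illustration, the pattern this patch adopts (the reserve function
stamps the record header so callers no longer do it themselves) can be
shown on a toy buffer. This is a minimal sketch, not Xen's implementation:
struct rec_common, struct rec_buf and rec_reserve() are hypothetical
stand-ins, and the 256-byte buffer size is arbitrary.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for the common header at the start of each record. */
struct rec_common {
    uint16_t type;
    uint16_t size;
};

/* Toy record buffer; the real mc_info layout is different. */
struct rec_buf {
    size_t used;
    unsigned char data[256];
};

/*
 * Reserve `size` bytes, zero them, and fill in the common header so
 * that callers do not have to set type and size themselves.
 * Returns NULL if the record cannot hold a header or does not fit.
 */
static void *rec_reserve(struct rec_buf *buf, uint16_t size, uint16_t type)
{
    struct rec_common hdr = { .type = type, .size = size };
    unsigned char *p;

    if (size < sizeof(hdr) || size > sizeof(buf->data) - buf->used)
        return NULL;

    p = buf->data + buf->used;
    buf->used += size;

    memset(p, 0, size);
    memcpy(p, &hdr, sizeof(hdr));   /* header is written via memcpy */
    return p;
}
```

Passing the type at reservation time means a record can never be left with
an unset or stale header, which is the simplification the patch description
refers to.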

[Xen-devel] [PATCH 07/19] x86/vmce: include domain/vcpu id in debug messages

2017-02-16 Thread Haozhong Zhang
Signed-off-by: Haozhong Zhang 
---
Cc: Christoph Egger 
Cc: Liu Jinsong 
Cc: Jan Beulich 
Cc: Andrew Cooper 
---
 xen/arch/x86/cpu/mcheck/vmce.c | 35 ++++++++++++++++++++---------------
 1 file changed, 20 insertions(+), 15 deletions(-)

diff --git a/xen/arch/x86/cpu/mcheck/vmce.c b/xen/arch/x86/cpu/mcheck/vmce.c
index 5f002e3..d83a3f2 100644
--- a/xen/arch/x86/cpu/mcheck/vmce.c
+++ b/xen/arch/x86/cpu/mcheck/vmce.c
@@ -110,15 +110,16 @@ static int bank_mce_rdmsr(const struct vcpu *v, uint32_t msr, uint64_t *val)
 case MSR_IA32_MC0_CTL:
 /* stick all 1's to MCi_CTL */
 *val = ~0UL;
-mce_printk(MCE_VERBOSE, "MCE: rd MC%u_CTL %#"PRIx64"\n", bank, *val);
+mce_printk(MCE_VERBOSE, "MCE: %pv: rd MC%u_CTL %#"PRIx64"\n",
+   v, bank, *val);
 break;
 case MSR_IA32_MC0_STATUS:
 if ( bank < GUEST_MC_BANK_NUM )
 {
 *val = v->arch.vmce.bank[bank].mci_status;
 if ( *val )
-mce_printk(MCE_VERBOSE, "MCE: rd MC%u_STATUS %#"PRIx64"\n",
-   bank, *val);
+mce_printk(MCE_VERBOSE, "MCE: %pv: rd MC%u_STATUS %#"PRIx64"\n",
+   v, bank, *val);
 }
 break;
 case MSR_IA32_MC0_ADDR:
@@ -126,8 +127,8 @@ static int bank_mce_rdmsr(const struct vcpu *v, uint32_t msr, uint64_t *val)
 {
 *val = v->arch.vmce.bank[bank].mci_addr;
 if ( *val )
-mce_printk(MCE_VERBOSE, "MCE: rd MC%u_ADDR %#"PRIx64"\n",
-   bank, *val);
+mce_printk(MCE_VERBOSE, "MCE: %pv: rd MC%u_ADDR %#"PRIx64"\n",
+   v, bank, *val);
 }
 break;
 case MSR_IA32_MC0_MISC:
@@ -135,8 +136,8 @@ static int bank_mce_rdmsr(const struct vcpu *v, uint32_t msr, uint64_t *val)
 {
 *val = v->arch.vmce.bank[bank].mci_misc;
 if ( *val )
-mce_printk(MCE_VERBOSE, "MCE: rd MC%u_MISC %#"PRIx64"\n",
-   bank, *val);
+mce_printk(MCE_VERBOSE, "MCE: %pv: rd MC%u_MISC %#"PRIx64"\n",
+   v, bank, *val);
 }
 break;
 default:
@@ -178,16 +179,16 @@ int vmce_rdmsr(uint32_t msr, uint64_t *val)
 *val = cur->arch.vmce.mcg_status;
 if (*val)
 mce_printk(MCE_VERBOSE,
-   "MCE: rd MCG_STATUS %#"PRIx64"\n", *val);
+   "MCE: %pv: rd MCG_STATUS %#"PRIx64"\n", cur, *val);
 break;
 case MSR_IA32_MCG_CAP:
 *val = cur->arch.vmce.mcg_cap;
-mce_printk(MCE_VERBOSE, "MCE: rd MCG_CAP %#"PRIx64"\n", *val);
+mce_printk(MCE_VERBOSE, "MCE: %pv: rd MCG_CAP %#"PRIx64"\n", cur, *val);
 break;
 case MSR_IA32_MCG_CTL:
 if ( cur->arch.vmce.mcg_cap & MCG_CTL_P )
 *val = ~0ULL;
-mce_printk(MCE_VERBOSE, "MCE: rd MCG_CTL %#"PRIx64"\n", *val);
+mce_printk(MCE_VERBOSE, "MCE: %pv: rd MCG_CTL %#"PRIx64"\n", cur, *val);
 break;
 default:
 ret = mce_bank_msr(cur, msr) ? bank_mce_rdmsr(cur, msr, val) : 0;
@@ -217,21 +218,24 @@ static int bank_mce_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
  */
 break;
 case MSR_IA32_MC0_STATUS:
-mce_printk(MCE_VERBOSE, "MCE: wr MC%u_STATUS %#"PRIx64"\n", bank, val);
+mce_printk(MCE_VERBOSE, "MCE: %pv: wr MC%u_STATUS %#"PRIx64"\n",
+   v, bank, val);
 if ( val )
 ret = -1;
 else if ( bank < GUEST_MC_BANK_NUM )
 v->arch.vmce.bank[bank].mci_status = val;
 break;
 case MSR_IA32_MC0_ADDR:
-mce_printk(MCE_VERBOSE, "MCE: wr MC%u_ADDR %#"PRIx64"\n", bank, val);
+mce_printk(MCE_VERBOSE, "MCE: %pv: wr MC%u_ADDR %#"PRIx64"\n",
+   v, bank, val);
 if ( val )
 ret = -1;
 else if ( bank < GUEST_MC_BANK_NUM )
 v->arch.vmce.bank[bank].mci_addr = val;
 break;
 case MSR_IA32_MC0_MISC:
-mce_printk(MCE_VERBOSE, "MCE: wr MC%u_MISC %#"PRIx64"\n", bank, val);
+mce_printk(MCE_VERBOSE, "MCE: %pv: wr MC%u_MISC %#"PRIx64"\n",
+   v, bank, val);
 if ( val )
 ret = -1;
 else if ( bank < GUEST_MC_BANK_NUM )
@@ -275,7 +279,8 @@ int vmce_wrmsr(uint32_t msr, uint64_t val)
 break;
 case MSR_IA32_MCG_STATUS:
 cur->arch.vmce.mcg_status = val;
-mce_printk(MCE_VERBOSE, "MCE: wr MCG_STATUS %"PRIx64"\n", val);
+mce_printk(MCE_VERBOSE, "MCE: %pv: wr MCG_STATUS %"PRIx64"\n",
+   cur, val);
 break;
 case MSR_IA32_MCG_CAP:
 /*
@@ -283,7 +288,7 @@ int vmce_wrmsr(uint32_t msr, uint64_t val)
  * the effect of writing to the IA32_MCG_CAP is undefined. Here we
  * treat 

[Xen-devel] [PATCH 16/19] x86/vmce: enable injecting LMCE to guest on Intel host

2017-02-16 Thread Haozhong Zhang
Inject LMCE to guest if the host MCE is LMCE and the affected vcpu is
known. Otherwise, broadcast MCE to all vcpus on Intel host.

Signed-off-by: Haozhong Zhang 
---
Cc: Christoph Egger 
Cc: Liu Jinsong 
Cc: Jan Beulich 
Cc: Andrew Cooper 
---
 xen/arch/x86/cpu/mcheck/mcaction.c | 14 --
 xen/arch/x86/cpu/mcheck/vmce.c |  9 -
 xen/arch/x86/cpu/mcheck/vmce.h |  2 +-
 3 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/xen/arch/x86/cpu/mcheck/mcaction.c b/xen/arch/x86/cpu/mcheck/mcaction.c
index 90c68ff..3410bfd 100644
--- a/xen/arch/x86/cpu/mcheck/mcaction.c
+++ b/xen/arch/x86/cpu/mcheck/mcaction.c
@@ -88,17 +88,19 @@ mc_memerr_dhandler(struct mca_binfo *binfo,
 goto vmce_failed;
 }
 
-if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL )
+vmce_vcpuid = global->mc_vcpuid;
+if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL &&
+ (vmce_vcpuid == -1 ||
+  global->mc_domid != bank->mc_domid ||
+  !(global->mc_gstatus & MCG_STATUS_LMCE) ||
+  !d->vcpu[vmce_vcpuid]->arch.vmce.lmce_enabled) )
 vmce_vcpuid = VMCE_INJECT_BROADCAST;
-else
-vmce_vcpuid = global->mc_vcpuid;
 
 bank->mc_addr = gfn << PAGE_SHIFT |
   (bank->mc_addr & (PAGE_SIZE -1 ));
-/* TODO: support injecting LMCE */
 if ( fill_vmsr_data(bank, d,
-global->mc_gstatus & ~MCG_STATUS_LMCE,
-vmce_vcpuid == VMCE_INJECT_BROADCAST) == -1 )
+global->mc_gstatus,
+vmce_vcpuid) == -1 )
 {
 mce_printk(MCE_QUIET, "Fill vMCE# data for DOM%d "
   "failed\n", bank->mc_domid);
diff --git a/xen/arch/x86/cpu/mcheck/vmce.c b/xen/arch/x86/cpu/mcheck/vmce.c
index 1278839..2a4d3f0 100644
--- a/xen/arch/x86/cpu/mcheck/vmce.c
+++ b/xen/arch/x86/cpu/mcheck/vmce.c
@@ -444,14 +444,21 @@ static int vcpu_fill_mc_msrs(struct vcpu *v, uint64_t mcg_status,
 }
 
 int fill_vmsr_data(struct mcinfo_bank *mc_bank, struct domain *d,
-   uint64_t gstatus, bool broadcast)
+   uint64_t gstatus, int vmce_vcpuid)
 {
 struct vcpu *v = d->vcpu[0];
+bool broadcast = (vmce_vcpuid == VMCE_INJECT_BROADCAST);
 int ret;
 
 if ( mc_bank->mc_domid == (uint16_t)~0 )
 return -EINVAL;
 
+if ( (gstatus & MCG_STATUS_LMCE) && !broadcast )
+v = d->vcpu[vmce_vcpuid];
+
+if ( broadcast )
+gstatus &= ~MCG_STATUS_LMCE;
+
 ret = vcpu_fill_mc_msrs(v, gstatus, mc_bank->mc_status,
 mc_bank->mc_addr, mc_bank->mc_misc);
 if ( ret || !broadcast )
diff --git a/xen/arch/x86/cpu/mcheck/vmce.h b/xen/arch/x86/cpu/mcheck/vmce.h
index 74f6381..2797e00 100644
--- a/xen/arch/x86/cpu/mcheck/vmce.h
+++ b/xen/arch/x86/cpu/mcheck/vmce.h
@@ -17,7 +17,7 @@ int vmce_amd_rdmsr(const struct vcpu *, uint32_t msr, uint64_t *val);
 int vmce_amd_wrmsr(struct vcpu *, uint32_t msr, uint64_t val);
 
 int fill_vmsr_data(struct mcinfo_bank *mc_bank, struct domain *d,
-   uint64_t gstatus, bool broadcast);
+   uint64_t gstatus, int vmce_vcpuid);
 
 #define VMCE_INJECT_BROADCAST (-1)
 int inject_vmce(struct domain *d, int vcpu);
-- 
2.10.1
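
For illustration, the target selection in mc_memerr_dhandler() after this
patch can be read as a single predicate. The sketch below mirrors the
condition from the hunk above on a flattened input structure; struct
mce_ctx and pick_vmce_target() are hypothetical names, and lmce_enabled
stands for the per-vcpu arch.vmce.lmce_enabled flag of the candidate vcpu.

```c
#include <stdbool.h>
#include <stdint.h>

#define MCG_STATUS_LMCE       0x8ULL
#define VMCE_INJECT_BROADCAST (-1)

/* Flattened, illustrative view of the state the handler consults. */
struct mce_ctx {
    bool     intel_host;     /* host CPU vendor is Intel */
    int      vcpuid;         /* impacted vcpu, or -1 if unknown */
    uint16_t global_domid;   /* domain recorded in the global info */
    uint16_t bank_domid;     /* domain owning the affected page */
    uint64_t gstatus;        /* host MCG_STATUS at the time of the MCE */
    bool     lmce_enabled;   /* candidate vcpu has LMCE enabled */
};

/*
 * Return the vcpu to inject into, or VMCE_INJECT_BROADCAST. On Intel
 * hosts a targeted injection requires a known vcpu in the affected
 * domain, a host LMCE, and LMCE enabled on that vcpu.
 */
static int pick_vmce_target(const struct mce_ctx *c)
{
    if (c->intel_host &&
        (c->vcpuid == -1 ||
         c->global_domid != c->bank_domid ||
         !(c->gstatus & MCG_STATUS_LMCE) ||
         !c->lmce_enabled))
        return VMCE_INJECT_BROADCAST;

    return c->vcpuid;
}
```

The vcpuid == -1 test comes first, so in the real code the vcpu array is
never indexed with an unknown id.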




[Xen-devel] [PATCH 04/19] xen/mce: remove unused x86_mcinfo_add()

2017-02-16 Thread Haozhong Zhang
Signed-off-by: Haozhong Zhang 
---
Cc: Christoph Egger 
Cc: Liu Jinsong 
Cc: Jan Beulich 
Cc: Andrew Cooper 
---
 xen/arch/x86/cpu/mcheck/mce.c | 16 
 xen/arch/x86/cpu/mcheck/mce.h |  1 -
 2 files changed, 17 deletions(-)

diff --git a/xen/arch/x86/cpu/mcheck/mce.c b/xen/arch/x86/cpu/mcheck/mce.c
index e327eab..f682520 100644
--- a/xen/arch/x86/cpu/mcheck/mce.c
+++ b/xen/arch/x86/cpu/mcheck/mce.c
@@ -834,22 +834,6 @@ void *x86_mcinfo_reserve(struct mc_info *mi, int size)
 return memset(mic_index, 0, size);
 }
 
-void *x86_mcinfo_add(struct mc_info *mi, void *mcinfo)
-{
-struct mcinfo_common *mic, *buf;
-
-mic = (struct mcinfo_common *)mcinfo;
-buf = x86_mcinfo_reserve(mi, mic->size);
-
-if ( !buf )
-mce_printk(MCE_CRITICAL,
-   "mcinfo_add: No space left in mc_info\n");
-else
-memcpy(buf, mic, mic->size);
-
-return buf;
-}
-
 static void x86_mcinfo_apei_save(
 struct mcinfo_global *mc_global, struct mcinfo_bank *mc_bank)
 {
diff --git a/xen/arch/x86/cpu/mcheck/mce.h b/xen/arch/x86/cpu/mcheck/mce.h
index e697780..56877c1 100644
--- a/xen/arch/x86/cpu/mcheck/mce.h
+++ b/xen/arch/x86/cpu/mcheck/mce.h
@@ -146,7 +146,6 @@ typedef struct mcinfo_extended *(*x86_mce_callback_t)
 (struct mc_info *, uint16_t, uint64_t);
 extern void x86_mce_callback_register(x86_mce_callback_t);
 
-void *x86_mcinfo_add(struct mc_info *mi, void *mcinfo);
 void *x86_mcinfo_reserve(struct mc_info *mi, int size);
 void x86_mcinfo_dump(struct mc_info *mi);
 
-- 
2.10.1




[Xen-devel] [PATCH 05/19] x86/mce: merge loops to get Intel extended MC MSR

2017-02-16 Thread Haozhong Zhang
The second loop that gets MSR_IA32_MCG_R8 to MSR_IA32_MCG_R15 was
surrounded by '#ifdef __X86_64__ ... #endif' and had to be separated
from the first loop that gets MSR_IA32_MCG_EAX to MSR_IA32_MCG_MISC.
Because Xen has dropped support for 32-bit x86 hosts, these two loops
can be merged now.

Signed-off-by: Haozhong Zhang 
---
Cc: Christoph Egger 
Cc: Liu Jinsong 
Cc: Jan Beulich 
Cc: Andrew Cooper 
---
 xen/arch/x86/cpu/mcheck/mce_intel.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/xen/arch/x86/cpu/mcheck/mce_intel.c b/xen/arch/x86/cpu/mcheck/mce_intel.c
index 005e41d..498e8e4 100644
--- a/xen/arch/x86/cpu/mcheck/mce_intel.c
+++ b/xen/arch/x86/cpu/mcheck/mce_intel.c
@@ -211,10 +211,7 @@ intel_get_extended_msrs(struct mcinfo_global *mig, struct mc_info *mi)
 mc_ext->common.type = MC_TYPE_EXTENDED;
 mc_ext->common.size = sizeof(struct mcinfo_extended);
 
-for (i = MSR_IA32_MCG_EAX; i <= MSR_IA32_MCG_MISC; i++)
-intel_get_extended_msr(mc_ext, i);
-
-for (i = MSR_IA32_MCG_R8; i <= MSR_IA32_MCG_R15; i++)
+for (i = MSR_IA32_MCG_EAX; i <= MSR_IA32_MCG_R15; i++)
 intel_get_extended_msr(mc_ext, i);
 
 return mc_ext;
-- 
2.10.1




[Xen-devel] [PATCH 18/19] xen/mce: add support of vLMCE injection to XEN_MC_inject_v2

2017-02-16 Thread Haozhong Zhang
Signed-off-by: Haozhong Zhang 
---
Cc: Christoph Egger 
Cc: Liu Jinsong 
Cc: Jan Beulich 
Cc: Andrew Cooper 
---
 xen/arch/x86/cpu/mcheck/mce.c | 16 
 xen/include/public/arch-x86/xen-mca.h |  1 +
 2 files changed, 17 insertions(+)

diff --git a/xen/arch/x86/cpu/mcheck/mce.c b/xen/arch/x86/cpu/mcheck/mce.c
index 2d69222..56c1f5e 100644
--- a/xen/arch/x86/cpu/mcheck/mce.c
+++ b/xen/arch/x86/cpu/mcheck/mce.c
@@ -1510,6 +1510,7 @@ long do_mca(XEN_GUEST_HANDLE_PARAM(xen_mc_t) u_xen_mc)
 {
 const cpumask_t *cpumap;
 cpumask_var_t cmv;
+int cpu_nr;
 
 if (nr_mce_banks == 0)
 return x86_mcerr("do_mca #MC", -ENODEV);
@@ -1552,6 +1553,21 @@ long do_mca(XEN_GUEST_HANDLE_PARAM(xen_mc_t) u_xen_mc)
 send_IPI_mask(cpumap, cmci_apic_vector);
 }
 break;
+case XEN_MC_INJECT_TYPE_LMCE:
+if ( !lmce_support )
+{
+ret = x86_mcerr("No LMCE support in platform", -EINVAL);
+break;
+}
+/* ensure at most one CPU is specified */
+cpu_nr = cpumask_next(cpumask_first(cpumap), cpumap);
+if ( cpu_nr < nr_cpu_ids )
+{
+ret = x86_mcerr("More than one CPU specified", -EINVAL);
+break;
+}
+on_selected_cpus(cpumap, x86_mc_mceinject, NULL, 1);
+break;
 default:
 ret = x86_mcerr("Wrong mca type\n", -EINVAL);
 break;
diff --git a/xen/include/public/arch-x86/xen-mca.h b/xen/include/public/arch-x86/xen-mca.h
index 9566a33..037a174 100644
--- a/xen/include/public/arch-x86/xen-mca.h
+++ b/xen/include/public/arch-x86/xen-mca.h
@@ -412,6 +412,7 @@ struct xen_mc_mceinject {
 #define XEN_MC_INJECT_TYPE_MASK 0x7
 #define XEN_MC_INJECT_TYPE_MCE  0x0
 #define XEN_MC_INJECT_TYPE_CMCI 0x1
+#define XEN_MC_INJECT_TYPE_LMCE 0x2
 
 #define XEN_MC_INJECT_CPU_BROADCAST 0x8
 
-- 
2.10.1
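
The "at most one CPU" test in the XEN_MC_INJECT_TYPE_LMCE case above can
be illustrated outside Xen. What follows is only a sketch, assuming a
64-bit toy mask in place of cpumask_t; NR_CPU_IDS, mask_next_from() and
at_most_one_cpu() are hypothetical names, not Xen APIs. Unlike the hunk
above, the sketch handles an empty mask explicitly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPU_IDS 64u /* hypothetical: width of the toy CPU mask */

/* Lowest set bit at or above 'start', or NR_CPU_IDS if there is none. */
static inline unsigned int mask_next_from(uint64_t mask, unsigned int start)
{
    assert(start <= NR_CPU_IDS);

    for ( unsigned int cpu = start; cpu < NR_CPU_IDS; cpu++ )
        if ( mask & (UINT64_C(1) << cpu) )
            return cpu;

    return NR_CPU_IDS;
}

/* True if the mask names zero or one CPU, false if it names several. */
static inline bool at_most_one_cpu(uint64_t mask)
{
    unsigned int first = mask_next_from(mask, 0);

    if ( first == NR_CPU_IDS )
        return true; /* empty mask (sketch-only convention) */

    /* A second set bit after the first one means more than one CPU. */
    return mask_next_from(mask, first + 1) == NR_CPU_IDS;
}
```

The idea mirrors the patch: find the first CPU, then look for another
one after it; finding one means the request must be rejected.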


[Xen-devel] [PATCH 01/19] x86/mce: fix indentation style in xen-mca.h and mce.h

2017-02-16 Thread Haozhong Zhang
Replace tab indentation with spaces.

Signed-off-by: Haozhong Zhang 
---
Cc: Christoph Egger 
Cc: Liu Jinsong 
Cc: Jan Beulich 
Cc: Andrew Cooper 
---
 xen/arch/x86/cpu/mcheck/mce.h | 40 +--
 xen/include/public/arch-x86/xen-mca.h | 24 ++---
 2 files changed, 32 insertions(+), 32 deletions(-)

diff --git a/xen/arch/x86/cpu/mcheck/mce.h b/xen/arch/x86/cpu/mcheck/mce.h
index e83d431..2f0fb07 100644
--- a/xen/arch/x86/cpu/mcheck/mce.h
+++ b/xen/arch/x86/cpu/mcheck/mce.h
@@ -30,11 +30,11 @@ extern int mce_verbosity;
 } while (0)
 
 enum mcheck_type {
-   mcheck_unset = -1,
-   mcheck_none,
-   mcheck_amd_famXX,
-   mcheck_amd_k8,
-   mcheck_intel
+mcheck_unset = -1,
+mcheck_none,
+mcheck_amd_famXX,
+mcheck_amd_k8,
+mcheck_intel
 };
 
 extern uint8_t cmci_apic_vector;
@@ -59,7 +59,7 @@ unsigned int mce_firstbank(struct cpuinfo_x86 *c);
 struct mc_info *x86_mcinfo_getptr(void);
 void noreturn mc_panic(char *s);
 void x86_mc_get_cpu_info(unsigned, uint32_t *, uint16_t *, uint16_t *,
-uint32_t *, uint32_t *, uint32_t *, uint32_t *);
+ uint32_t *, uint32_t *, uint32_t *, uint32_t *);
 
 /* Register a handler for machine check exceptions. */
 typedef void (*x86_mce_vector_t)(const struct cpu_user_regs *regs);
@@ -80,10 +80,10 @@ extern bool_t intpose_inval(unsigned int, uint64_t);
 
 static inline uint64_t mca_rdmsr(unsigned int msr)
 {
-   uint64_t val;
-   if (intpose_lookup(smp_processor_id(), msr, &val) == NULL)
-   rdmsrl(msr, val);
-   return val;
+uint64_t val;
+if (intpose_lookup(smp_processor_id(), msr, &val) == NULL)
+rdmsrl(msr, val);
+return val;
 }
 
 /* Write an MSR, invalidating any interposed value */
@@ -101,19 +101,19 @@ static inline uint64_t mca_rdmsr(unsigned int msr)
  * of the MCA data observed in the logout operation. */
 
 enum mca_source {
-   MCA_POLLER,
-   MCA_CMCI_HANDLER,
-   MCA_RESET,
-   MCA_MCE_SCAN
+MCA_POLLER,
+MCA_CMCI_HANDLER,
+MCA_RESET,
+MCA_MCE_SCAN
 };
 
 struct mca_summary {
-   uint32_t errcnt; /* number of banks with valid errors */
-   int ripv;   /* meaningful on #MC */
-   int eipv;   /* meaningful on #MC */
-   bool_t  uc; /* UC flag */
-   bool_t  pcc;/* PCC flag */
-   bool_t  recoverable; /* software error recoverable flag */
+uint32_t errcnt; /* number of banks with valid errors */
+int ripv;   /* meaningful on #MC */
+int eipv;   /* meaningful on #MC */
+bool_t  uc; /* UC flag */
+bool_t  pcc;/* PCC flag */
+bool_t  recoverable; /* software error recoverable flag */
 };
 
 DECLARE_PER_CPU(struct mca_banks *, poll_bankmask);
diff --git a/xen/include/public/arch-x86/xen-mca.h b/xen/include/public/arch-x86/xen-mca.h
index a97e821..9566a33 100644
--- a/xen/include/public/arch-x86/xen-mca.h
+++ b/xen/include/public/arch-x86/xen-mca.h
@@ -312,8 +312,8 @@ DEFINE_XEN_GUEST_HANDLE(xen_mc_logical_cpu_t);
 struct mcinfo_common *_mic; \
 \
 found = 0;  \
-   (_ret) = NULL;  \
-   if (_mi == NULL) break; \
+(_ret) = NULL;  \
+if (_mi == NULL) break; \
 _mic = x86_mcinfo_first(_mi);   \
 for (i = 0; i < x86_mcinfo_nentries(_mi); i++) {\
 if (_mic->type == (_type)) {\
@@ -345,8 +345,8 @@ struct xen_mc_fetch {
 /* IN/OUT variables. */
 uint32_t flags;/* IN: XEN_MC_NONURGENT, XEN_MC_URGENT,
XEN_MC_ACK if ack'ing an earlier fetch */
-   /* OUT: XEN_MC_OK, XEN_MC_FETCHFAILED,
-  XEN_MC_NODATA, XEN_MC_NOMATCH */
+/* OUT: XEN_MC_OK, XEN_MC_FETCHFAILED,
+XEN_MC_NODATA, XEN_MC_NOMATCH */
 uint32_t _pad0;
 uint64_t fetch_id; /* OUT: id for ack, IN: id we are ack'ing */
 
@@ -378,11 +378,11 @@ DEFINE_XEN_GUEST_HANDLE(xen_mc_notifydomain_t);
 
 #define XEN_MC_physcpuinfo 3
 struct xen_mc_physcpuinfo {
-   /* IN/OUT */
-   uint32_t ncpus;
-   uint32_t _pad0;
-   /* OUT */
-   XEN_GUEST_HANDLE(xen_mc_logical_cpu_t) info;
+/* IN/OUT */
+uint32_t ncpus;
+uint32_t _pad0;
+/* OUT */
+XEN_GUEST_HANDLE(xen_mc_logical_cpu_t) info;
 };
 
 #define XEN_MC_msrinject    4
@@ -404,7 +404,7 @@ struct xen_mc_msrinject {
 
 #define XEN_MC_mceinject    5
 struct xen_mc_mceinject {

[Xen-devel] [PATCH 17/19] x86/vmce, tools/libxl: expose LMCE capability in guest MSR_IA32_MCG_CAP

2017-02-16 Thread Haozhong Zhang
If LMCE is supported by the host and "lmce = 1" is present in the xl
config, the LMCE capability will be exposed in guest MSR_IA32_MCG_CAP.
By default, LMCE is not exposed to the guest, so as to preserve
backwards migration compatibility.

Signed-off-by: Haozhong Zhang 
---
Cc: Ian Jackson 
Cc: Wei Liu 
Cc: Christoph Egger 
Cc: Liu Jinsong 
Cc: Jan Beulich 
Cc: Andrew Cooper 
---
 docs/man/xl.cfg.pod.5.in| 18 ++
 tools/libxl/libxl_create.c  |  1 +
 tools/libxl/libxl_dom.c |  2 ++
 tools/libxl/libxl_types.idl |  1 +
 tools/libxl/xl_cmdimpl.c|  3 +++
 xen/arch/x86/cpu/mcheck/vmce.c  | 14 +-
 xen/arch/x86/hvm/hvm.c  |  7 +++
 xen/include/asm-x86/mce.h   |  1 +
 xen/include/public/hvm/params.h |  5 -
 9 files changed, 50 insertions(+), 2 deletions(-)

diff --git a/docs/man/xl.cfg.pod.5.in b/docs/man/xl.cfg.pod.5.in
index 46f9caf..1cdf372 100644
--- a/docs/man/xl.cfg.pod.5.in
+++ b/docs/man/xl.cfg.pod.5.in
@@ -2021,6 +2021,24 @@ natively or via hardware backwards compatibility support.
 
 =back
 
+=head3 Intel
+
+=over 4
+
+=item 

[Xen-devel] [PATCH 09/19] x86/vmce: fill MSR_IA32_MCG_STATUS on all vcpus in broadcast case

2017-02-16 Thread Haozhong Zhang
The current implementation only fills MC MSRs on vcpu0 and leaves MC
MSRs on other vcpus empty in the broadcast case. When the guest reads 0
from MSR_IA32_MCG_STATUS on vcpuN (N > 0), it may conclude that
execution cannot be recovered on that vcpu and panic, although
MSR_IA32_MCG_STATUS filled on vcpu0 may imply the injected vMCE is
actually recoverable. To avoid such an unnecessary guest panic,
set MSR_IA32_MCG_STATUS on vcpuN (N > 0) to MCG_STATUS_MCIP |
MCG_STATUS_RIPV.

Signed-off-by: Haozhong Zhang 
---
Cc: Christoph Egger 
Cc: Liu Jinsong 
Cc: Jan Beulich 
Cc: Andrew Cooper 
---
 xen/arch/x86/cpu/mcheck/mcaction.c | 14 
 xen/arch/x86/cpu/mcheck/vmce.c | 67 +-
 xen/arch/x86/cpu/mcheck/vmce.h |  2 +-
 3 files changed, 53 insertions(+), 30 deletions(-)

diff --git a/xen/arch/x86/cpu/mcheck/mcaction.c b/xen/arch/x86/cpu/mcheck/mcaction.c
index cc90e7c..8b2b834 100644
--- a/xen/arch/x86/cpu/mcheck/mcaction.c
+++ b/xen/arch/x86/cpu/mcheck/mcaction.c
@@ -88,21 +88,21 @@ mc_memerr_dhandler(struct mca_binfo *binfo,
 goto vmce_failed;
 }
 
+if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL )
+vmce_vcpuid = VMCE_INJECT_BROADCAST;
+else
+vmce_vcpuid = global->mc_vcpuid;
+
 bank->mc_addr = gfn << PAGE_SHIFT |
   (bank->mc_addr & (PAGE_SIZE -1 ));
-if ( fill_vmsr_data(bank, d,
-  global->mc_gstatus) == -1 )
+if ( fill_vmsr_data(bank, d, global->mc_gstatus,
+vmce_vcpuid == VMCE_INJECT_BROADCAST) == -1 )
 {
 mce_printk(MCE_QUIET, "Fill vMCE# data for DOM%d "
   "failed\n", bank->mc_domid);
 goto vmce_failed;
 }
 
-if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL )
-vmce_vcpuid = VMCE_INJECT_BROADCAST;
-else
-vmce_vcpuid = global->mc_vcpuid;
-
 /* We will inject vMCE to DOMU*/
 if ( inject_vmce(d, vmce_vcpuid) < 0 )
 {
diff --git a/xen/arch/x86/cpu/mcheck/vmce.c b/xen/arch/x86/cpu/mcheck/vmce.c
index d83a3f2..456d6f3 100644
--- a/xen/arch/x86/cpu/mcheck/vmce.c
+++ b/xen/arch/x86/cpu/mcheck/vmce.c
@@ -386,36 +386,59 @@ int inject_vmce(struct domain *d, int vcpu)
 return ret;
 }
 
+static int vcpu_fill_mc_msrs(struct vcpu *v, uint64_t mcg_status,
+ uint64_t mci_status, uint64_t mci_addr,
+ uint64_t mci_misc)
+{
+if ( v->arch.vmce.mcg_status & MCG_STATUS_MCIP )
+{
+mce_printk(MCE_QUIET, "MCE: %pv: guest has not handled previous"
+   " vMCE yet!\n", v);
+return -EBUSY;
+}
+
+spin_lock(&v->arch.vmce.lock);
+
+v->arch.vmce.mcg_status = mcg_status;
+/*
+ * 1. Skip bank 0 to avoid 'bank 0 quirk' of old processors
+ * 2. Filter MCi_STATUS MSCOD model specific error code to guest
+ */
+v->arch.vmce.bank[1].mci_status = mci_status & MCi_STATUS_MSCOD_MASK;
+v->arch.vmce.bank[1].mci_addr = mci_addr;
+v->arch.vmce.bank[1].mci_misc = mci_misc;
+
+spin_unlock(&v->arch.vmce.lock);
+
+return 0;
+}
+
 int fill_vmsr_data(struct mcinfo_bank *mc_bank, struct domain *d,
-   uint64_t gstatus)
+   uint64_t gstatus, bool broadcast)
 {
 struct vcpu *v = d->vcpu[0];
+int ret;
 
-if ( mc_bank->mc_domid != (uint16_t)~0 )
-{
-if ( v->arch.vmce.mcg_status & MCG_STATUS_MCIP )
-{
-mce_printk(MCE_QUIET, "MCE: guest has not handled previous"
-   " vMCE yet!\n");
-return -1;
-}
-
-spin_lock(&v->arch.vmce.lock);
+if ( mc_bank->mc_domid == (uint16_t)~0 )
+return -EINVAL;
 
-v->arch.vmce.mcg_status = gstatus;
-/*
- * 1. Skip bank 0 to avoid 'bank 0 quirk' of old processors
- * 2. Filter MCi_STATUS MSCOD model specific error code to guest
- */
-v->arch.vmce.bank[1].mci_status = mc_bank->mc_status &
-  MCi_STATUS_MSCOD_MASK;
-v->arch.vmce.bank[1].mci_addr = mc_bank->mc_addr;
-v->arch.vmce.bank[1].mci_misc = mc_bank->mc_misc;
+ret = vcpu_fill_mc_msrs(v, gstatus, mc_bank->mc_status,
+mc_bank->mc_addr, mc_bank->mc_misc);
+if ( ret || !broadcast )
+goto out;
 
-spin_unlock(&v->arch.vmce.lock);
+for_each_vcpu ( d, v )
+{
+if ( v == d->vcpu[0] )
+continue;
+ret = vcpu_fill_mc_msrs(v, MCG_STATUS_MCIP | MCG_STATUS_RIPV,
+0, 0, 0);
+if ( ret )
+

[Xen-devel] [PATCH 00/19] MCE code cleanup and add LMCE support

2017-02-16 Thread Haozhong Zhang
This patch series adds LMCE support to Xen, although more than half of
the patches are code cleanup and bug fixes.

LMCE
--
Intel Local MCE (LMCE) is a feature on Intel Skylake Server CPU that
can deliver MCE to a single processor thread instead of broadcasting
to all threads, which can reduce software's load when processing MCE
on machines with a large number of processor threads.

The technical details of LMCE can be found in Intel SDM Vol 3, Chapter
"Machine-Check Architecture" (search for 'LMCE'). Basically,
 * The capability of LMCE is indicated by bit 27 (MCG_LMCE_P) of
   MSR_IA32_MCG_CAP.
 * LMCE is enabled by setting bit 20 (MSR_IA32_FEATURE_CONTROL_LMCE)
   of MSR_IA32_FEATURE_CONTROL and bit 0 (MCG_EXT_CTL_LMCE_EN) of
   MSR_IA32_MCG_EXT_CTL.
 * Software can determine if a MCE is local to the current processor
   thread by checking bit 3 (MCG_STATUS_LMCE) of MSR_IA32_MCG_STATUS.
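
As a rough illustration of the capability and enable bits listed above,
the checks can be written against plain 64-bit values instead of real
MSR reads. This is a sketch only: all toy_* names are hypothetical, and
the requirement that the LOCK bit (bit 0 of MSR_IA32_FEATURE_CONTROL)
is also set is taken from the comment in patch 13, not from the list
above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_MCG_LMCE_P           (UINT64_C(1) << 27) /* MSR_IA32_MCG_CAP */
#define TOY_FEATURE_CONTROL_LOCK (UINT64_C(1) << 0)  /* MSR_IA32_FEATURE_CONTROL */
#define TOY_FEATURE_CONTROL_LMCE (UINT64_C(1) << 20) /* MSR_IA32_FEATURE_CONTROL */
#define TOY_MCG_EXT_CTL_LMCE_EN  (UINT64_C(1) << 0)  /* MSR_IA32_MCG_EXT_CTL */

/* LMCE may be enabled only if it is advertised and both bits are set. */
static inline bool toy_lmce_can_enable(uint64_t mcg_cap,
                                       uint64_t feature_control)
{
    const uint64_t need = TOY_FEATURE_CONTROL_LOCK | TOY_FEATURE_CONTROL_LMCE;

    return (mcg_cap & TOY_MCG_LMCE_P) != 0 &&
           (feature_control & need) == need;
}

/* New MSR_IA32_MCG_EXT_CTL value; unchanged if LMCE cannot be enabled. */
static inline uint64_t toy_lmce_ext_ctl(uint64_t old_ext_ctl,
                                        uint64_t mcg_cap,
                                        uint64_t feature_control)
{
    if ( !toy_lmce_can_enable(mcg_cap, feature_control) )
        return old_ext_ctl;

    return old_ext_ctl | TOY_MCG_EXT_CTL_LMCE_EN;
}

/* Usage example for the two helpers. */
static inline void toy_lmce_example(void)
{
    uint64_t fc = TOY_FEATURE_CONTROL_LOCK | TOY_FEATURE_CONTROL_LMCE;

    assert(toy_lmce_ext_ctl(0, TOY_MCG_LMCE_P, fc) == TOY_MCG_EXT_CTL_LMCE_EN);
    assert(toy_lmce_ext_ctl(0, 0, fc) == 0);
}
```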

Patch Overview
--
In this patch series,
 * Xen enables LMCE by default if it's supported by host CPU unless Xen
   boot parameter "mce_fb=1" is present.
 * Xen handles LMCE only on the affected CPU and does not need all CPUs
   to enter MCE handler.
 * A new xl config "lmce=BOOLEAN" is added to control whether LMCE is
   supported for the HVM domain. It's disabled by default. If the host
   CPU does not support LMCE, this config will be ignored.
 * For HVM domain with LMCE support, if the vcpu affected by a host
   LMCE is known, Xen will inject a vLMCE to that vcpu. If the affected
   vcpu is unknown or LMCE support is disabled for a HVM domain, a MCE
   will be broadcast to all vcpus of that domain as before.  
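
The injection rule in the last bullet can be summarised as a small
decision function. This is a sketch under stated assumptions, not the
code from the series: toy_vmce_target(), TOY_INJECT_BROADCAST and the
convention that a negative vcpu id means "unknown" are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_INJECT_BROADCAST (-1) /* hypothetical stand-in for a broadcast target */

/*
 * Pick the vMCE target: the affected vcpu if the guest has LMCE support
 * and that vcpu is known (>= 0), otherwise broadcast to all vcpus.
 */
static inline int toy_vmce_target(bool guest_lmce, int affected_vcpu)
{
    if ( guest_lmce && affected_vcpu >= 0 )
        return affected_vcpu;

    return TOY_INJECT_BROADCAST;
}

/* Usage example. */
static inline void toy_vmce_target_example(void)
{
    assert(toy_vmce_target(true, 3) == 3);
    assert(toy_vmce_target(false, 3) == TOY_INJECT_BROADCAST);
}
```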

This patch series is organized as below:
 * Patch 1 - 8 clean up existing MCE code and make one improvement to
   debugging messages. No functional change is introduced.
 * Patch 9 - 11 fix two bugs in vMCE injection and MCE handling.
 * Patch 12 & 13 add host-side LMCE support, including detecting,
   enabling LMCE feature and handling LMCE.
 * Patch 14 - 17 add guest-side LMCE support (only HVM domain so far),
   including emulating LMCE feature and injecting LMCE to HVM domain.
 * Patch 18 & 19 add xen-mceinj support to inject LMCE for test purpose.

How to Test
--
0. This patch series can be tested either on Intel CPU w/ LMCE support
   (Skylake-EX), or in the nested virtualization environment on
   KVM/QEMU (i.e. Xen as L1 hypervisor).

   QEMU 2.7.0 and later with KVM in Linux kernel 4.8 and later can
   emulate LMCE and does not require LMCE support in the host hardware.
   You can start a nested virtualization environment with LMCE support
   with the following command:
qemu-system-x86_64 -enable-kvm \
   -smp 32 -cpu kvm64,lmce=on,+vmx \
   -hda PATH-TO-DISK-IMG -m 2048

1. Build, install and boot Xen with this patch series. You can include
   "mce_verbosity=verbose" in Xen boot parameters to get more detailed
   debugging messages about MCE.

2. At boot time, if the Xen boot parameter 'mce_fb=1' is not
   present, Xen hypervisor should be able to detect and enable LMCE,
   and print the following message:

(XEN) mce_intel.c:737: MCA Capability: BCAST 1 SER 1 CMCI 1 firstbank 0 
extended MCE MSR LMCE 1

   If 'mce_fb=1' is specified, the last segment of above message will
   be "LMCE 0" which indicates Xen does not enable LMCE support.

3. Start a HVM domain with the attached config file xl.cfg. In the
   config,
* "lmce = 1" enables LMCE for the domain.
* "cpus = [ ... ]" helps the following steps figure out which CPU
  to inject to; it is not strictly necessary.

   Run Linux kernel 4.2 or later (which has LMCE support) in the
   domain.

   Run the latest mcelog (https://www.mcelog.org/) in the domain as
   well to log MCEs injected in later steps. Depending on the guest
   Linux distro, the log can be in /var/log/mcelog, syslog or systemd
   journal.

   Compile and run the attached claim_page.c in the domain. claim_page.c
   allocates a page of memory, prints its base (guest) physical address
   and enters an infinite loop. For example, it may print a message like

Physical address of mmaped page = 0x36d4d000
   
4. Use "xl vcpu-list" to figure out the cpu number on which
   claim_page is running. For example, xl vcpu-list may output

Name ID  VCPU   CPU State   Time(s) Affinity (Hard / Soft)
lmce-l2   1 04   r-- 

[Xen-devel] [PATCH 14/19] x86/vmx: expose LMCE feature via guest MSR_IA32_FEATURE_CONTROL

2017-02-16 Thread Haozhong Zhang
If MCG_LMCE_P is present in guest MSR_IA32_MCG_CAP, then set LMCE and
LOCK bits in guest MSR_IA32_FEATURE_CONTROL. Intel SDM requires those
bits to be set before SW can enable LMCE.

Signed-off-by: Haozhong Zhang 
---
Cc: Christoph Egger 
Cc: Liu Jinsong 
Cc: Jan Beulich 
Cc: Andrew Cooper 
Cc: Jun Nakajima 
Cc: Kevin Tian 
---
 xen/arch/x86/cpu/mcheck/mce_intel.c |  4 
 xen/arch/x86/hvm/vmx/vmx.c  | 10 ++
 xen/arch/x86/hvm/vmx/vvmx.c |  4 
 xen/include/asm-x86/mce.h   |  1 +
 4 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/cpu/mcheck/mce_intel.c b/xen/arch/x86/cpu/mcheck/mce_intel.c
index b4cc41a..81507d3 100644
--- a/xen/arch/x86/cpu/mcheck/mce_intel.c
+++ b/xen/arch/x86/cpu/mcheck/mce_intel.c
@@ -916,3 +916,7 @@ int vmce_intel_rdmsr(const struct vcpu *v, uint32_t msr, uint64_t *val)
 return 1;
 }
 
+bool vmce_support_lmce(const struct vcpu *v)
+{
+return !!(v->arch.vmce.mcg_cap & MCG_LMCE_P);
+}
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 42f4fbd..6947af0 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -56,6 +56,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -2634,6 +2635,9 @@ static int is_last_branch_msr(u32 ecx)
 
 static int vmx_msr_read_intercept(unsigned int msr, uint64_t *msr_content)
 {
+struct vcpu *v = current;
+struct domain *d = v->domain;
+
 HVM_DBG_LOG(DBG_LEVEL_MSR, "ecx=%#x", msr);
 
 switch ( msr )
@@ -2651,6 +2655,12 @@ static int vmx_msr_read_intercept(unsigned int msr, uint64_t *msr_content)
 __vmread(GUEST_IA32_DEBUGCTL, msr_content);
 break;
 case MSR_IA32_FEATURE_CONTROL:
+*msr_content = IA32_FEATURE_CONTROL_LOCK;
+if ( vmce_support_lmce(v) )
+*msr_content |= IA32_FEATURE_CONTROL_LMCE_ON;
+if ( nestedhvm_enabled(d) && d->arch.cpuid->basic.vmx )
+*msr_content |= IA32_FEATURE_CONTROL_ENABLE_VMXON_OUTSIDE_SMX;
+break;
 case MSR_IA32_VMX_BASIC...MSR_IA32_VMX_VMFUNC:
 if ( !nvmx_msr_read_intercept(msr, msr_content) )
 goto gp_fault;
diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
index ec3b946..0060723 100644
--- a/xen/arch/x86/hvm/vmx/vvmx.c
+++ b/xen/arch/x86/hvm/vmx/vvmx.c
@@ -2020,10 +2020,6 @@ int nvmx_msr_read_intercept(unsigned int msr, u64 *msr_content)
 data = gen_vmx_msr(data, VMX_ENTRY_CTLS_DEFAULT1, host_data);
 break;
 
-case MSR_IA32_FEATURE_CONTROL:
-data = IA32_FEATURE_CONTROL_LOCK |
-   IA32_FEATURE_CONTROL_ENABLE_VMXON_OUTSIDE_SMX;
-break;
 case MSR_IA32_VMX_VMCS_ENUM:
 /* The max index of VVMCS encoding is 0x1f. */
 data = 0x1f << 1;
diff --git a/xen/include/asm-x86/mce.h b/xen/include/asm-x86/mce.h
index 549bef3..6b827ef 100644
--- a/xen/include/asm-x86/mce.h
+++ b/xen/include/asm-x86/mce.h
@@ -36,6 +36,7 @@ extern void vmce_init_vcpu(struct vcpu *);
 extern int vmce_restore_vcpu(struct vcpu *, const struct hvm_vmce_vcpu *);
 extern int vmce_wrmsr(uint32_t msr, uint64_t val);
 extern int vmce_rdmsr(uint32_t msr, uint64_t *val);
+extern bool vmce_support_lmce(const struct vcpu *v);
 
 extern unsigned int nr_mce_banks;
 
-- 
2.10.1
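
The value that the vmx.c hunk above returns for a guest read of
MSR_IA32_FEATURE_CONTROL can be sketched as a pure function of two
inputs. This is an illustration only: the toy_* names are hypothetical,
and the bit positions assumed here are bit 0 for LOCK, bit 2 for
"VMXON outside SMX" and bit 20 for LMCE_ON.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_FC_LOCK              (UINT64_C(1) << 0)
#define TOY_FC_VMXON_OUTSIDE_SMX (UINT64_C(1) << 2)
#define TOY_FC_LMCE_ON           (UINT64_C(1) << 20)

/*
 * LOCK is always reported; LMCE_ON only if the guest's MCG_CAP has
 * MCG_LMCE_P; the VMXON bit only if nested VMX is available to the guest.
 */
static inline uint64_t toy_guest_feature_control(bool guest_lmce,
                                                 bool nested_vmx)
{
    uint64_t val = TOY_FC_LOCK;

    if ( guest_lmce )
        val |= TOY_FC_LMCE_ON;
    if ( nested_vmx )
        val |= TOY_FC_VMXON_OUTSIDE_SMX;

    return val;
}

/* Usage example. */
static inline void toy_guest_feature_control_example(void)
{
    assert(toy_guest_feature_control(false, false) == TOY_FC_LOCK);
    assert(toy_guest_feature_control(true, false) & TOY_FC_LMCE_ON);
}
```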


[Xen-devel] [PATCH 15/19] x86/vmce: emulate MSR_IA32_MCG_EXT_CTL

2017-02-16 Thread Haozhong Zhang
If MCG_LMCE_P is present in guest MSR_IA32_MCG_CAP, then allow guest
to read/write MSR_IA32_MCG_EXT_CTL.

Signed-off-by: Haozhong Zhang 
---
Cc: Christoph Egger 
Cc: Liu Jinsong 
Cc: Jan Beulich 
Cc: Andrew Cooper 
---
 xen/arch/x86/cpu/mcheck/vmce.c | 32 +++-
 xen/include/asm-x86/mce.h  |  1 +
 xen/include/public/arch-x86/hvm/save.h |  2 ++
 3 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/cpu/mcheck/vmce.c b/xen/arch/x86/cpu/mcheck/vmce.c
index 456d6f3..1278839 100644
--- a/xen/arch/x86/cpu/mcheck/vmce.c
+++ b/xen/arch/x86/cpu/mcheck/vmce.c
@@ -90,6 +90,7 @@ int vmce_restore_vcpu(struct vcpu *v, const struct hvm_vmce_vcpu *ctxt)
 v->arch.vmce.mcg_cap = ctxt->caps;
 v->arch.vmce.bank[0].mci_ctl2 = ctxt->mci_ctl2_bank0;
 v->arch.vmce.bank[1].mci_ctl2 = ctxt->mci_ctl2_bank1;
+v->arch.vmce.lmce_enabled = ctxt->lmce_enabled;
 
 return 0;
 }
@@ -190,6 +191,25 @@ int vmce_rdmsr(uint32_t msr, uint64_t *val)
 *val = ~0ULL;
 mce_printk(MCE_VERBOSE, "MCE: %pv: rd MCG_CTL %#"PRIx64"\n", cur, *val);
 break;
+case MSR_IA32_MCG_EXT_CTL:
+/*
+ * If MCG_LMCE_P is present in guest MSR_IA32_MCG_CAP, the LMCE and LOCK
+ * bits are always set in guest MSR_IA32_FEATURE_CONTROL by Xen, so it
+ * does not need to check them here.
+ */
+if ( !(cur->arch.vmce.mcg_cap & MCG_LMCE_P) )
+{
+ret = -1;
+mce_printk(MCE_VERBOSE, "MCE: %pv: rd MCG_EXT_CTL, not supported\n",
+   cur);
+}
+else
+{
+*val = cur->arch.vmce.lmce_enabled ? MCG_EXT_CTL_LMCE_EN : 0;
+mce_printk(MCE_VERBOSE, "MCE: %pv: rd MCG_EXT_CTL %#"PRIx64"\n",
+   cur, *val);
+}
+break;
 default:
 ret = mce_bank_msr(cur, msr) ? bank_mce_rdmsr(cur, msr, val) : 0;
 break;
@@ -290,6 +310,15 @@ int vmce_wrmsr(uint32_t msr, uint64_t val)
  */
 mce_printk(MCE_VERBOSE, "MCE: %pv: MCG_CAP is r/o\n", cur);
 break;
+case MSR_IA32_MCG_EXT_CTL:
+if ( !(cur->arch.vmce.mcg_cap & MCG_LMCE_P) ||
+ (val & ~MCG_EXT_CTL_LMCE_EN) )
+ret = -1;
+else
+cur->arch.vmce.lmce_enabled = !!(val & MCG_EXT_CTL_LMCE_EN);
+mce_printk(MCE_VERBOSE, "MCE: %pv: wr MCG_EXT_CTL %"PRIx64"%s\n",
+   cur, val, (ret == -1) ? ", not supported" : "");
+break;
 default:
 ret = mce_bank_msr(cur, msr) ? bank_mce_wrmsr(cur, msr, val) : 0;
 break;
@@ -308,7 +337,8 @@ static int vmce_save_vcpu_ctxt(struct domain *d, hvm_domain_context_t *h)
 struct hvm_vmce_vcpu ctxt = {
 .caps = v->arch.vmce.mcg_cap,
 .mci_ctl2_bank0 = v->arch.vmce.bank[0].mci_ctl2,
-.mci_ctl2_bank1 = v->arch.vmce.bank[1].mci_ctl2
+.mci_ctl2_bank1 = v->arch.vmce.bank[1].mci_ctl2,
+.lmce_enabled = v->arch.vmce.lmce_enabled,
 };
 
 err = hvm_save_entry(VMCE_VCPU, v->vcpu_id, h, &ctxt);
diff --git a/xen/include/asm-x86/mce.h b/xen/include/asm-x86/mce.h
index 6b827ef..525a9e8 100644
--- a/xen/include/asm-x86/mce.h
+++ b/xen/include/asm-x86/mce.h
@@ -29,6 +29,7 @@ struct vmce {
 uint64_t mcg_status;
 spinlock_t lock;
 struct vmce_bank bank[GUEST_MC_BANK_NUM];
+bool lmce_enabled;
 };
 
 /* Guest vMCE MSRs virtualization */
diff --git a/xen/include/public/arch-x86/hvm/save.h b/xen/include/public/arch-x86/hvm/save.h
index 8d73b51..2d62ec3 100644
--- a/xen/include/public/arch-x86/hvm/save.h
+++ b/xen/include/public/arch-x86/hvm/save.h
@@ -599,6 +599,8 @@ struct hvm_vmce_vcpu {
 uint64_t caps;
 uint64_t mci_ctl2_bank0;
 uint64_t mci_ctl2_bank1;
+uint8_t  lmce_enabled;
+uint8_t  _pad[7];
 };
 
 DECLARE_HVM_SAVE_TYPE(VMCE_VCPU, 18, struct hvm_vmce_vcpu);
-- 
2.10.1
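
The read/write rules for the emulated MSR_IA32_MCG_EXT_CTL in the
vmce.c hunks above can be sketched on a toy structure. This is not the
Xen code: struct toy_vmce and the toy_* functions are hypothetical, and
the return convention (0 on success, -1 on rejection) is chosen for the
sketch only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_MCG_LMCE_P          (UINT64_C(1) << 27)
#define TOY_MCG_EXT_CTL_LMCE_EN (UINT64_C(1) << 0)

struct toy_vmce {
    uint64_t mcg_cap;   /* guest-visible MSR_IA32_MCG_CAP */
    bool lmce_enabled;  /* guest has set LMCE_EN */
};

/* Read: rejected unless the guest's MCG_CAP advertises LMCE. */
static inline int toy_rd_ext_ctl(const struct toy_vmce *v, uint64_t *val)
{
    if ( !(v->mcg_cap & TOY_MCG_LMCE_P) )
        return -1;

    *val = v->lmce_enabled ? TOY_MCG_EXT_CTL_LMCE_EN : 0;
    return 0;
}

/* Write: rejected without LMCE support or if any reserved bit is set. */
static inline int toy_wr_ext_ctl(struct toy_vmce *v, uint64_t val)
{
    if ( !(v->mcg_cap & TOY_MCG_LMCE_P) ||
         (val & ~TOY_MCG_EXT_CTL_LMCE_EN) )
        return -1;

    v->lmce_enabled = (val & TOY_MCG_EXT_CTL_LMCE_EN) != 0;
    return 0;
}

/* Usage example. */
static inline void toy_ext_ctl_example(void)
{
    struct toy_vmce v = { .mcg_cap = TOY_MCG_LMCE_P, .lmce_enabled = false };

    assert(toy_wr_ext_ctl(&v, TOY_MCG_EXT_CTL_LMCE_EN) == 0);
    assert(v.lmce_enabled);
}
```

A rejected write leaves lmce_enabled untouched, which matches the
if/else structure of the wrmsr hunk.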


[Xen-devel] [PATCH 06/19] x86/mce: merge intel_default_mce_dhandler/uhandler()

2017-02-16 Thread Haozhong Zhang
The implementations of these two functions are effectively the same, so
unify them into a common intel_default_mce_handler().

Signed-off-by: Haozhong Zhang 
---
Cc: Christoph Egger 
Cc: Liu Jinsong 
Cc: Jan Beulich 
Cc: Andrew Cooper 
---
 xen/arch/x86/cpu/mcheck/mce_intel.c | 27 +++
 1 file changed, 3 insertions(+), 24 deletions(-)

diff --git a/xen/arch/x86/cpu/mcheck/mce_intel.c b/xen/arch/x86/cpu/mcheck/mce_intel.c
index 498e8e4..b5ee8b8 100644
--- a/xen/arch/x86/cpu/mcheck/mce_intel.c
+++ b/xen/arch/x86/cpu/mcheck/mce_intel.c
@@ -342,7 +342,7 @@ static int intel_default_check(uint64_t status)
 return 1;
 }
 
-static void intel_default_mce_dhandler(
+static void intel_default_mce_handler(
  struct mca_binfo *binfo,
  enum mce_result *result,
  const struct cpu_user_regs * regs)
@@ -361,32 +361,11 @@ static void intel_default_mce_dhandler(
 static const struct mca_error_handler intel_mce_dhandlers[] = {
 {intel_srao_check, intel_srao_dhandler},
 {intel_srar_check, intel_srar_dhandler},
-{intel_default_check, intel_default_mce_dhandler}
+{intel_default_check, intel_default_mce_handler}
 };
 
-static void intel_default_mce_uhandler(
- struct mca_binfo *binfo,
- enum mce_result *result,
- const struct cpu_user_regs *regs)
-{
-uint64_t status = binfo->mib->mc_status;
-enum intel_mce_type type;
-
-type = intel_check_mce_type(status);
-
-switch (type)
-{
-case intel_mce_fatal:
-*result = MCER_RESET;
-break;
-default:
-*result = MCER_CONTINUE;
-break;
-}
-}
-
 static const struct mca_error_handler intel_mce_uhandlers[] = {
-{intel_default_check, intel_default_mce_uhandler}
+{intel_default_check, intel_default_mce_handler}
 };
 
 /* According to MCA OS writer guide, CMCI handler need to clear bank when
-- 
2.10.1


[Xen-devel] [PATCH 13/19] x86/mce_intel: detect and enable LMCE on Intel host

2017-02-16 Thread Haozhong Zhang
Enable LMCE if it's supported by the host CPU. If Xen boot parameter
"mce_fb = 1" is present, LMCE will be disabled forcibly.

Signed-off-by: Haozhong Zhang 
---
Cc: Christoph Egger 
Cc: Liu Jinsong 
Cc: Jan Beulich 
Cc: Andrew Cooper 
---
 xen/arch/x86/cpu/mcheck/mce.h   |  1 +
 xen/arch/x86/cpu/mcheck/mce_intel.c | 44 -
 xen/arch/x86/cpu/mcheck/x86_mca.h   |  5 +
 xen/include/asm-x86/msr-index.h |  2 ++
 4 files changed, 46 insertions(+), 6 deletions(-)

diff --git a/xen/arch/x86/cpu/mcheck/mce.h b/xen/arch/x86/cpu/mcheck/mce.h
index 2c033af..461141a 100644
--- a/xen/arch/x86/cpu/mcheck/mce.h
+++ b/xen/arch/x86/cpu/mcheck/mce.h
@@ -38,6 +38,7 @@ enum mcheck_type {
 };
 
 extern uint8_t cmci_apic_vector;
+extern bool lmce_support;
 
 /* Init functions */
 enum mcheck_type amd_mcheck_init(struct cpuinfo_x86 *c);
diff --git a/xen/arch/x86/cpu/mcheck/mce_intel.c b/xen/arch/x86/cpu/mcheck/mce_intel.c
index 9e5ee3d..b4cc41a 100644
--- a/xen/arch/x86/cpu/mcheck/mce_intel.c
+++ b/xen/arch/x86/cpu/mcheck/mce_intel.c
@@ -29,6 +29,9 @@ boolean_param("mce_fb", mce_force_broadcast);
 
 static int __read_mostly nr_intel_ext_msrs;
 
+/* If mce_force_broadcast == 1, lmce_support will be disabled forcibly. */
+bool __read_mostly lmce_support = 0;
+
 /* Intel SDM define bit15~bit0 of IA32_MCi_STATUS as the MC error code */
 #define INTEL_MCCOD_MASK 0xFFFF
 
@@ -677,10 +680,34 @@ static int mce_is_broadcast(struct cpuinfo_x86 *c)
 return 0;
 }
 
+static bool intel_enable_lmce(void)
+{
+uint64_t msr_content;
+
+/*
+ * Section "Enabling Local Machine Check" in Intel SDM Vol 3
+ * requires software must ensure the LOCK bit and LMCE_ON bit
+ * of MSR_IA32_FEATURE_CONTROL are set before setting
+ * MSR_IA32_MCG_EXT_CTL.LMCE_EN.
+ */
+
+if ( rdmsr_safe(MSR_IA32_FEATURE_CONTROL, msr_content) )
+return 0;
+
+if ( (msr_content & IA32_FEATURE_CONTROL_LOCK) &&
+ (msr_content & IA32_FEATURE_CONTROL_LMCE_ON) )
+{
+wrmsrl(MSR_IA32_MCG_EXT_CTL, MCG_EXT_CTL_LMCE_EN);
+return 1;
+}
+
+return 0;
+}
+
 /* Check and init MCA */
 static void intel_init_mca(struct cpuinfo_x86 *c)
 {
-bool_t broadcast, cmci = 0, ser = 0;
+bool_t broadcast, cmci = 0, ser = 0, lmce = 0;
 int ext_num = 0, first;
 uint64_t msr_content;
 
@@ -700,26 +727,31 @@ static void intel_init_mca(struct cpuinfo_x86 *c)
 
 first = mce_firstbank(c);
 
+if ( !mce_force_broadcast && (msr_content & MCG_LMCE_P) )
+lmce = intel_enable_lmce();
+
 if (smp_processor_id() == 0)
 {
 dprintk(XENLOG_INFO, "MCA Capability: BCAST %x SER %x"
-" CMCI %x firstbank %x extended MCE MSR %x\n",
-broadcast, ser, cmci, first, ext_num);
+" CMCI %x firstbank %x extended MCE MSR %x LMCE %x\n",
+broadcast, ser, cmci, first, ext_num, lmce);
 
 mce_broadcast = broadcast;
 cmci_support = cmci;
 ser_support = ser;
 nr_intel_ext_msrs = ext_num;
 firstbank = first;
+lmce_support = lmce;
 }
 else if (cmci != cmci_support || ser != ser_support ||
  broadcast != mce_broadcast ||
- first != firstbank || ext_num != nr_intel_ext_msrs)
+ first != firstbank || ext_num != nr_intel_ext_msrs ||
+ lmce != lmce_support)
 {
 dprintk(XENLOG_WARNING,
-"CPU %u has different MCA capability (%x,%x,%x,%x,%x)"
+"CPU %u has different MCA capability (%x,%x,%x,%x,%x,%x)"
 " than BSP, may cause undetermined result!!!\n",
-smp_processor_id(), broadcast, ser, cmci, first, ext_num);
+smp_processor_id(), broadcast, ser, cmci, first, ext_num, lmce);
 }
 }
 
diff --git a/xen/arch/x86/cpu/mcheck/x86_mca.h b/xen/arch/x86/cpu/mcheck/x86_mca.h
index 322b7d4..3b5060e 100644
--- a/xen/arch/x86/cpu/mcheck/x86_mca.h
+++ b/xen/arch/x86/cpu/mcheck/x86_mca.h
@@ -36,6 +36,7 @@
 #define MCG_TES_P   (1ULL<<11) /* Intel specific */
 #define MCG_EXT_CNT 16 /* Intel specific */
 #define MCG_SER_P   (1ULL<<24) /* Intel specific */
+#define MCG_LMCE_P  (1ULL<<27) /* Intel specific */
 /* Other bits are reserved */
 
 /* Bitfield of the MSR_IA32_MCG_STATUS register */
@@ -46,6 +47,10 @@
 /* Bits 3-63 are reserved on CPU not supporting LMCE */
 /* Bits 4-63 are reserved on CPU supporting LMCE */
 
+/* Bitfield of MSR_IA32_MCG_EXT_CTL register (Intel Specific) */
+#define MCG_EXT_CTL_LMCE_EN (1ULL<<0)
+/* Other bits are reserved */
+
 /* Bitfield of MSR_K8_MCi_STATUS registers */
 /* MCA error code */
 #define MCi_STATUS_MCA  0xULL
diff --git a/xen/include/asm-x86/msr-index.h b/xen/include/asm-x86/msr-index.h
index 98dbff1..f0bc574 100644

[Xen-devel] [PATCH 12/19] x86/mce: handle LMCE locally

2017-02-16 Thread Haozhong Zhang
LMCE is sent to only one CPU thread, so the MCE handler, barriers and
softirq handler should proceed without waiting for other CPUs when
handling LMCE. Note that LMCE is still broadcast to all vcpus as a
regular MCE on Intel CPUs right now.

Signed-off-by: Haozhong Zhang 
---
Cc: Christoph Egger 
Cc: Liu Jinsong 
Cc: Jan Beulich 
Cc: Andrew Cooper 
---
 xen/arch/x86/cpu/mcheck/barrier.c  |  4 ++--
 xen/arch/x86/cpu/mcheck/mcaction.c |  4 +++-
 xen/arch/x86/cpu/mcheck/mce.c  | 25 ++---
 xen/arch/x86/cpu/mcheck/mce.h  |  3 +++
 xen/arch/x86/cpu/mcheck/x86_mca.h  |  4 +++-
 5 files changed, 33 insertions(+), 7 deletions(-)

diff --git a/xen/arch/x86/cpu/mcheck/barrier.c b/xen/arch/x86/cpu/mcheck/barrier.c
index 5dce1fb..869fd20 100644
--- a/xen/arch/x86/cpu/mcheck/barrier.c
+++ b/xen/arch/x86/cpu/mcheck/barrier.c
@@ -20,7 +20,7 @@ void mce_barrier_enter(struct mce_softirq_barrier *bar)
 {
 int gen;
 
-if (!mce_broadcast)
+if ( !mce_broadcast || __get_cpu_var(lmce_in_process) )
 return;
 atomic_inc(&bar->ingen);
 gen = atomic_read(&bar->outgen);
@@ -38,7 +38,7 @@ void mce_barrier_exit(struct mce_softirq_barrier *bar)
 {
 int gen;
 
-if ( !mce_broadcast )
+if ( !mce_broadcast || __get_cpu_var(lmce_in_process) )
 return;
 atomic_inc(&bar->outgen);
 gen = atomic_read(&bar->ingen);
diff --git a/xen/arch/x86/cpu/mcheck/mcaction.c b/xen/arch/x86/cpu/mcheck/mcaction.c
index 8b2b834..90c68ff 100644
--- a/xen/arch/x86/cpu/mcheck/mcaction.c
+++ b/xen/arch/x86/cpu/mcheck/mcaction.c
@@ -95,7 +95,9 @@ mc_memerr_dhandler(struct mca_binfo *binfo,
 
 bank->mc_addr = gfn << PAGE_SHIFT |
   (bank->mc_addr & (PAGE_SIZE -1 ));
-if ( fill_vmsr_data(bank, d, global->mc_gstatus,
+/* TODO: support injecting LMCE */
+if ( fill_vmsr_data(bank, d,
+global->mc_gstatus & ~MCG_STATUS_LMCE,
 vmce_vcpuid == VMCE_INJECT_BROADCAST) == -1 )
 {
 mce_printk(MCE_QUIET, "Fill vMCE# data for DOM%d "
diff --git a/xen/arch/x86/cpu/mcheck/mce.c b/xen/arch/x86/cpu/mcheck/mce.c
index 95a9da3..2d69222 100644
--- a/xen/arch/x86/cpu/mcheck/mce.c
+++ b/xen/arch/x86/cpu/mcheck/mce.c
@@ -42,6 +42,17 @@ DEFINE_PER_CPU_READ_MOSTLY(struct mca_banks *, poll_bankmask);
 DEFINE_PER_CPU_READ_MOSTLY(struct mca_banks *, no_cmci_banks);
 DEFINE_PER_CPU_READ_MOSTLY(struct mca_banks *, mce_clear_banks);
 
+/*
+ * Flag to indicate whether the current MCE on this CPU is a LMCE.
+ *
+ * The MCE handler should set/clear this flag before entering any MCE
+ * barriers and raising MCE softirq. MCE barriers rely on this flag to
+ * decide whether they need to wait for other CPUs. MCE softirq handler
+ * relies on this flag to decide whether it needs to handle pending
+ * MCEs on other CPUs.
+ */
+DEFINE_PER_CPU(bool, lmce_in_process);
+
 static void intpose_init(void);
 static void mcinfo_clear(struct mc_info *);
 struct mca_banks *mca_allbanks;
@@ -399,6 +410,7 @@ mcheck_mca_logout(enum mca_source who, struct mca_banks *bankmask,
 sp->errcnt = errcnt;
 sp->ripv = (gstatus & MCG_STATUS_RIPV) != 0;
 sp->eipv = (gstatus & MCG_STATUS_EIPV) != 0;
+sp->lmce = (gstatus & MCG_STATUS_LMCE) != 0;
 sp->uc = uc;
 sp->pcc = pcc;
 sp->recoverable = recover;
@@ -462,6 +474,7 @@ void mcheck_cmn_handler(const struct cpu_user_regs *regs)
 uint64_t gstatus;
 mctelem_cookie_t mctc = NULL;
 struct mca_summary bs;
+bool *lmce_in_process = &__get_cpu_var(lmce_in_process);
 
 mce_spin_lock(&mce_logout_lock);
 
@@ -505,6 +518,8 @@ void mcheck_cmn_handler(const struct cpu_user_regs *regs)
 }
 mce_spin_unlock(&mce_logout_lock);
 
+*lmce_in_process = bs.lmce;
+
 mce_barrier_enter(&mce_trap_bar);
 if ( mctc != NULL && mce_urgent_action(regs, mctc))
 cpumask_set_cpu(smp_processor_id(), &mce_fatal_cpus);
@@ -1709,6 +1724,7 @@ static void mce_softirq(void)
 {
 int cpu = smp_processor_id();
 unsigned int workcpu;
+bool lmce = per_cpu(lmce_in_process, cpu);
 
 mce_printk(MCE_VERBOSE, "CPU%d enter softirq\n", cpu);
 
@@ -1738,9 +1754,12 @@ static void mce_softirq(void)
 /* Step1: Fill DOM0 LOG buffer, vMCE injection buffer and
  * vMCE MSRs virtualization buffer
  */
-for_each_online_cpu(workcpu) {
-mctelem_process_deferred(workcpu, mce_delayed_action);
-}
+if ( lmce )
+mctelem_process_deferred(cpu, mce_delayed_action);
+else
+for_each_online_cpu(workcpu) {
+mctelem_process_deferred(workcpu, mce_delayed_action);
+}
 
 /* Step2: Send Log to DOM0 through vIRQ */
 if (dom0_vmce_enabled()) {
diff --git 
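
As a reading aid for the LMCE hunks above, here is a minimal standalone sketch of the decision the patch adds to mce_softirq(): when the per-CPU flag says the current MCE is local, only the local CPU's deferred telemetry is processed; otherwise every CPU's is. This is not the Xen code. NR_CPUS, the plain arrays standing in for per-CPU variables, process_deferred() and softirq_step1() are assumptions made up for the illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4u /* hypothetical CPU count, fixed for the sketch */

/* Stand-in for the per-CPU lmce_in_process flag from the patch. */
static bool lmce_in_process[NR_CPUS];

/* Counts how often each CPU's deferred telemetry was processed. */
static unsigned int processed[NR_CPUS];

/* Hypothetical stand-in for mctelem_process_deferred(). */
static void process_deferred(unsigned int cpu)
{
    assert(cpu < NR_CPUS);
    processed[cpu]++;
}

/* Models the branch added in "Step1": local CPU only for an LMCE. */
static void softirq_step1(unsigned int cpu)
{
    unsigned int workcpu;

    if ( lmce_in_process[cpu] )
        process_deferred(cpu);
    else
        for ( workcpu = 0; workcpu < NR_CPUS; workcpu++ )
            process_deferred(workcpu);
}
```

With the flag set for one CPU, a call to softirq_step1() on that CPU touches only its own counter; with the flag clear, all counters advance, which matches the broadcast case the original loop handled.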

[Xen-devel] [PATCH 03/19] x86/mce: remove unnecessary braces around intel_get_extended_msrs()

2017-02-16 Thread Haozhong Zhang
Signed-off-by: Haozhong Zhang 
---
Cc: Christoph Egger 
Cc: Liu Jinsong 
Cc: Jan Beulich 
Cc: Andrew Cooper 
---
 xen/arch/x86/cpu/mcheck/mce.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/xen/arch/x86/cpu/mcheck/mce.c b/xen/arch/x86/cpu/mcheck/mce.c
index 2695b0c..e327eab 100644
--- a/xen/arch/x86/cpu/mcheck/mce.c
+++ b/xen/arch/x86/cpu/mcheck/mce.c
@@ -356,11 +356,8 @@ mcheck_mca_logout(enum mca_source who, struct mca_banks *bankmask,
 ASSERT(mig);
 mca_init_global(mc_flags, mig);
 /* A hook here to get global extended msrs */
-{
-if (boot_cpu_data.x86_vendor ==
-X86_VENDOR_INTEL)
-intel_get_extended_msrs(mig, mci);
-}
+if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)
+intel_get_extended_msrs(mig, mci);
 }
 }
 
-- 
2.10.1


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH 02/19] x86/mce: remove declarations of non-existing functions in mce.h

2017-02-16 Thread Haozhong Zhang
Remove declarations of the functions
    intel_mcheck_timer()
    mce_intel_feature_init()
    mce_cap_init()
    x86_mcinfo_getptr()
whose definitions were removed a long time ago.

Signed-off-by: Haozhong Zhang 
---
Cc: Christoph Egger 
Cc: Liu Jinsong 
Cc: Jan Beulich 
Cc: Andrew Cooper 
---
 xen/arch/x86/cpu/mcheck/mce.h | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/xen/arch/x86/cpu/mcheck/mce.h b/xen/arch/x86/cpu/mcheck/mce.h
index 2f0fb07..e697780 100644
--- a/xen/arch/x86/cpu/mcheck/mce.h
+++ b/xen/arch/x86/cpu/mcheck/mce.h
@@ -43,11 +43,8 @@ extern uint8_t cmci_apic_vector;
 enum mcheck_type amd_mcheck_init(struct cpuinfo_x86 *c);
 enum mcheck_type intel_mcheck_init(struct cpuinfo_x86 *c, bool_t bsp);
 
-void intel_mcheck_timer(struct cpuinfo_x86 *c);
-void mce_intel_feature_init(struct cpuinfo_x86 *c);
 void amd_nonfatal_mcheck_init(struct cpuinfo_x86 *c);
 
-uint64_t mce_cap_init(void);
 extern unsigned int firstbank;
 
 struct mcinfo_extended *intel_get_extended_msrs(
@@ -56,7 +53,6 @@ struct mcinfo_extended *intel_get_extended_msrs(
 int mce_available(struct cpuinfo_x86 *c);
 unsigned int mce_firstbank(struct cpuinfo_x86 *c);
 /* Helper functions used for collecting error telemetry */
-struct mc_info *x86_mcinfo_getptr(void);
 void noreturn mc_panic(char *s);
 void x86_mc_get_cpu_info(unsigned, uint32_t *, uint16_t *, uint16_t *,
  uint32_t *, uint32_t *, uint32_t *, uint32_t *);
-- 
2.10.1


[Xen-devel] JSON format of >xl list< does not provide state information

2017-02-16 Thread Robert Urban
Hello Folks,

Sorry, forgot to mention the version:

> # xl info
> [...]
> release: 4.9.3-200.186.fc25.x86_64


plain "xl list" produces:

> # xl list
> Name        ID   Mem VCPUs  State  Time(s)
> Domain-0     0  2048     1  r-     108.2

and "xl list -l" produces:

> # xl list -l
> [
>     {
>         "domid": 0,
>         "config": {
>             "c_info": {
>                 "type": "pv",
>                 "name": "Domain-0"
>             },
>             "b_info": {
>                 "max_memkb": 2097152,
>                 "target_memkb": 2097151,
>                 "sched_params": {
>
>                 },
>                 "type.pv": {
>
>                 },
>                 "arch_arm": {
>
>                 }
>             }
>         }
>     }
> ]

Why doesn't "xl list -l" provide the "State" information that the plain "xl
list" provides? I would have thought the JSON output would be complete.

cheers,

Robert Urban


[Xen-devel] [xen-unstable-smoke test] 105866: regressions - trouble: broken/fail/pass

2017-02-16 Thread osstest service owner
flight 105866 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/105866/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-libvirt 14 guest-saverestore fail REGR. vs. 105852
 test-armhf-armhf-xl 15 guest-start/debian.repeat fail REGR. vs. 105852
 test-amd64-amd64-xl-qemuu-debianhvm-i386 12 guest-saverestore fail REGR. vs. 105852

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-xl-xsm   1 build-check(1)   blocked  n/a
 build-arm64   5 xen-build fail   never pass
 build-arm64-pvops 5 kernel-build fail   never pass
 test-armhf-armhf-xl  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl  13 saverestore-support-check fail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-check fail   never pass

version targeted for testing:
 xen  378384399ed661bed711221a5d8dbdac66b8e851
baseline version:
 xen  7127d53fe891f9ea67357587a33a7aaba4b55f45

Last test of basis   105852  2017-02-16 14:01:33 Z    0 days
Failing since        105857  2017-02-16 16:01:30 Z    0 days    6 attempts
Testing same since   105862  2017-02-16 22:01:53 Z    0 days    3 attempts


People who touched revisions under test:
  Andrew Cooper 
  Daniel Kiper 
  Jan Beulich 
  Julien Grall 
  Stefano Stabellini 

jobs:
 build-amd64  pass
 build-arm64  fail
 build-armhf  pass
 build-amd64-libvirt  pass
 build-arm64-pvops fail
 test-armhf-armhf-xl  fail
 test-arm64-arm64-xl-xsm  broken  
 test-amd64-amd64-xl-qemuu-debianhvm-i386 fail
 test-amd64-amd64-libvirt fail



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.


commit 378384399ed661bed711221a5d8dbdac66b8e851
Author: Stefano Stabellini 
Date:   Fri Feb 10 18:05:22 2017 -0800

arm: read/write rank->vcpu atomically

We don't need a lock in vgic_get_target_vcpu anymore, solving the
following lock inversion bug: the rank lock should be taken first, then
the vgic lock. However, gic_update_one_lr is called with the vgic lock
held, and it calls vgic_get_target_vcpu, which tries to obtain the rank
lock.

Coverity-ID: 1381855
Coverity-ID: 1381853

Signed-off-by: Stefano Stabellini 
Reviewed-by: Julien Grall 

commit 79903e50dba9e7442c9b7ca424661bb020e9dbf2
Author: Jan Beulich 
Date:   Thu Feb 16 18:11:42 2017 +0100

x86emul: catch exceptions occurring in stubs

Before adding more use of stubs cloned from decoded guest insns, guard
ourselves against mistakes there: Should an exception (with the
noteworthy exception of #PF) occur inside the stub, forward it to the
guest.

Since the exception fixup table entry can't encode the address of the
faulting insn itself, attach it to the return address instead. This at
once provides a convenient place to hand the exception information
back: The return address is being overwritten by it before branching to
the recovery code.

Take the opportunity and (finally!) add symbol resolution to the
respective log messages (the new one is intentionally not being coded
that way, as it covers stub addresses only, which don't have symbols
associated).

Also take the opportunity and make search_one_extable() static again.

Suggested-by: Andrew Cooper 
Signed-off-by: Jan Beulich 
Reviewed-by: Andrew Cooper 

commit 8c935f5ff1cac422b4de21cbab69e13d2ebb25be
Author: Daniel Kiper 
Date:   Thu Feb 16 18:10:04 2017 +0100

[Xen-devel] [xen-4.7-testing test] 105855: tolerable FAIL - PUSHED

2017-02-16 Thread osstest service owner
flight 105855 xen-4.7-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/105855/

Failures :-/ but no regressions.

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-rtds  9 debian-install   fail REGR. vs. 105661
 test-armhf-armhf-libvirt 13 saverestore-support-check fail  like 105661
 test-armhf-armhf-libvirt-xsm 13 saverestore-support-check fail  like 105661
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail like 105661
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop fail like 105661
 test-armhf-armhf-libvirt-raw 12 saverestore-support-check fail  like 105661
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 105661

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl   1 build-check(1)   blocked  n/a
 build-arm64-libvirt   1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-credit2   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-rtds  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-xsm   1 build-check(1)   blocked  n/a
 build-arm64   5 xen-build fail   never pass
 build-arm64-pvops 5 kernel-build fail   never pass
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-check fail   never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-check fail   never pass
 test-amd64-i386-libvirt  12 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-check fail   never pass
 build-arm64-xsm   5 xen-build fail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-arndale  13 saverestore-support-check fail   never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-check fail   never pass
 test-armhf-armhf-libvirt 12 migrate-support-check fail   never pass
 test-armhf-armhf-xl  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl  13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-xsm  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-xsm  13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-check fail  never pass
 test-armhf-armhf-xl-multivcpu 13 saverestore-support-check fail  never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-credit2  13 saverestore-support-check fail   never pass
 test-armhf-armhf-libvirt-xsm 12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 13 saverestore-support-check fail never pass
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail never pass
 test-armhf-armhf-libvirt-raw 11 migrate-support-check fail   never pass
 test-armhf-armhf-xl-rtds 12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-rtds 13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-vhd  11 migrate-support-check fail   never pass
 test-armhf-armhf-xl-vhd  12 saverestore-support-check fail   never pass

version targeted for testing:
 xen  758378233b0b5d79a29735d95dc72410ef2f19aa
baseline version:
 xen  d31a0a29f5d7563b7361f7096316fd9e611d8673

Last test of basis   105661  2017-02-09 09:43:17 Z    7 days
Testing same since   105819  2017-02-15 12:44:07 Z    1 days    3 attempts


People who touched revisions under test:
  Jan Beulich 
  Julien Grall 
  Oleksandr Tyshchenko 

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  fail
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64-xtf 

Re: [Xen-devel] [PATCH v2 2/2] x86: package up context switch hook pointers

2017-02-16 Thread Tian, Kevin
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: Thursday, February 16, 2017 7:16 PM
> 
> They're all solely dependent on guest type, so we don't need to repeat
> all the same three pointers in every vCPU control structure. Instead use
> static const structures, and store pointers to them in the domain
> control structure.
> 
> Since touching it anyway, take the opportunity and expand
> schedule_tail() in the only two places invoking it, allowing the macro
> to be dropped.
> 
> Signed-off-by: Jan Beulich 
> Reviewed-by: Boris Ostrovsky 

Reviewed-by: Kevin Tian 


Re: [Xen-devel] [PATCH v2 1/2] VMX: fix VMCS race on context-switch paths

2017-02-16 Thread Tian, Kevin
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: Thursday, February 16, 2017 8:36 PM
> 
> >>> On 16.02.17 at 13:27,  wrote:
> > On 16/02/17 11:15, Jan Beulich wrote:
> >> When __context_switch() is being bypassed during original context
> >> switch handling, the vCPU "owning" the VMCS partially loses control of
> >> it: It will appear non-running to remote CPUs, and hence their attempt
> >> to pause the owning vCPU will have no effect on it (as it already
> >> looks to be paused). At the same time the "owning" CPU will re-enable
> >> interrupts eventually (at the latest when entering the idle loop) and
> >> hence becomes subject to IPIs from other CPUs requesting access to the
> >> VMCS. As a result, when __context_switch() finally gets run, the CPU
> >> may no longer have the VMCS loaded, and hence any accesses to it would
> >> fail. Hence we may need to re-load the VMCS in vmx_ctxt_switch_from().
> >>
> >> Similarly, when __context_switch() is being bypassed also on the second
> >> (switch-in) path, VMCS ownership may have been lost and hence needs
> >> re-establishing. Since there's no existing hook to put this in, add a
> >> new one.
> >>
> >> Reported-by: Kevin Mayer 
> >> Reported-by: Anshul Makkar 
> >> Signed-off-by: Jan Beulich 
> >
> > Reviewed-by: Andrew Cooper 
> >
> > Although I would certainly prefer if we can get another round of testing
> > on this series for confidence.
> 
> Sure, I'd certainly like to stick a Tested-by on it. Plus VMX maintainer
> feedback will need waiting for anyway.
> 

logic looks clean to me:

Acked-by: Kevin Tian 


Re: [Xen-devel] Unable to boot Xen 4.8 with iommu=0

2017-02-16 Thread Tian, Kevin
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: Thursday, February 16, 2017 5:32 PM
> 
> >>> On 15.02.17 at 20:30,  wrote:
> > On Wed, Feb 15, 2017 at 3:03 AM, Jan Beulich  wrote:
> > On 15.02.17 at 00:21,  wrote:
> >>> On 14/02/2017 22:47, Tamas K Lengyel wrote:
>  (XEN) Switched to APIC driver x2apic_cluster.
>  (XEN) XSM Framework v1.0.0 initialized
>  (XEN) Flask: 128 avtab hash slots, 394 rules.
>  (XEN) Flask: 128 avtab hash slots, 394 rules.
>  (XEN) Flask:  5 users, 3 roles, 43 types, 2 bools
>  (XEN) Flask:  12 classes, 394 rules
>  (XEN) Flask:  Starting in enforcing mode.
>  (XEN) xstate: size: 0x340 and states: 0x7
>  (XEN) Intel machine check reporting enabled
>  (XEN) Using scheduler: SMP Credit Scheduler (credit)
>  (XEN) Platform timer is 14.318MHz HPET
>  (XEN) Detected 3392.326 MHz processor.
>  (XEN) Initing memory sharing.
>  (XEN) alt table 82d0802d3f38 -> 82d0802d5564
>  (XEN) PCI: MCFG configuration 0: base e000 segment  buses 00 - 3f
>  (XEN) PCI: Not using MCFG for segment  bus 00-3f
>  (XEN)
>  (XEN) 
>  (XEN) Panic on CPU 0:
>  (XEN) Couldn't enable IOMMU and iommu=required/force
>  (XEN) 
>  (XEN)
>  (XEN) Reboot in five seconds...
>  (XEN) Resetting with ACPI MEMORY or I/O RESET_REG.
> 
>  As seen in the command line iommu is not set to required or forced.
>  Yet Xen thinks it is set to required/force. Has anyone else run into
>  this issue?
> >>>
> >>> This area is a rat's nest :(
> >>>
> >>> The problem is that the APIC setup has chosen to use the x2apic_cluster
> >>> driver, which in the Xen code depends on x2APIC, which depends on
> >>> interrupt remapping, which depends on an IOMMU.
> >>>
> >>> (I could have sworn we fixed this before), but the bug is that the APIC
> >>> setup shouldn't choose any of the x2apic modes if there isn't an
> >>> interrupt remapping capable IOMMU.
> >>
> >> And from going over the code I can't see how this would happen,
> >> so Tamas, it would be nice if you could add some verbosity to the
> >> functions involved. In particular x2apic_bsp_setup() must be
> >> getting success (zero) from iommu_enable_x2apic_IR(), yet that
> >> function calls iommu_supports_eim(), which ought to produce
> >> false through its very first exit path in your case.
> >>
> >> Or wait - do you have the same issue if you use
> >> "iommu=no,no-intremap"? In which case the problem would be
> >> that "iommu=no" should clear more than just "iommu_enable", or
> >> code checking iommu_intremap early (before iommu_setup()
> >> manages to clear it in the case here) would need to be made to look at
> >> both variables. Oddly enough acpi_parse_dmar() only bails if
> >> both variables are clear, which suggests to me that
> >> iommu_enable is intended to have two different meanings in
> >> different contexts (master flag vs. controlling just DMA
> >> remapping). Kevin, Feng - any thoughts here?
> >
> > iommu=no,no-intremap boots fine with "(XEN) Using APIC driver default"
> 
> Thanks for confirming.
> 
> Kevin, Feng, we now depend on your input regarding the intentions
> with the two variables.
> 

Feng just left Intel. Let me take a look at the code to understand the
rationale behind it.

Thanks
Kevin
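
For readers following the two-variable question quoted above, here is a minimal sketch of the hypothesis Jan describes: "iommu=no" clears only the enable flag, so an early check that looks at the interrupt-remapping flag alone still sees it set. This is an illustration of the reasoning in the thread, not the Xen implementation. The struct and all three function names are hypothetical; only the roles of iommu_enable and iommu_intremap are taken from the discussion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the two command-line controlled flags. */
struct iommu_opts {
    bool enable;    /* models iommu_enable */
    bool intremap;  /* models iommu_intremap */
};

/* Suspected behaviour: the early x2APIC check looks at one flag only. */
static bool x2apic_ok_one_flag(const struct iommu_opts *o)
{
    assert(o != NULL);
    return o->intremap;
}

/* Alternative 1 from the thread: early checks consult both flags. */
static bool x2apic_ok_both_flags(const struct iommu_opts *o)
{
    assert(o != NULL);
    return o->enable && o->intremap;
}

/* Alternative 2 from the thread: "iommu=no" acts as a master switch. */
static void parse_iommu_no_as_master(struct iommu_opts *o)
{
    assert(o != NULL);
    o->enable = false;
    o->intremap = false;
}
```

Starting from enable=false, intremap=true (the state the hypothesis assigns to a plain "iommu=0" boot), the one-flag check still permits x2APIC while either alternative refuses it. That is consistent with "iommu=no,no-intremap" booting with the default APIC driver, as Tamas reported.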


Re: [Xen-devel] [PATCH v3] x86/shadow: Correct guest behaviour when creating PTEs above maxphysaddr

2017-02-16 Thread Tian, Kevin
> From: Andrew Cooper [mailto:andrew.coop...@citrix.com]
> Sent: Thursday, February 16, 2017 11:46 PM
> 
> XSA-173 (c/s 8b1764833) introduces gfn_bits, and an upper limit which might be
> lower than the real maxphysaddr, to avoid overflowing the superpage shadow
> backpointer.
> 
> However, plenty of hardware has a physical address width less than 44 bits,
> and the code added in shadow_domain_init() is a straight assignment.  This
> causes gfn_bits to be increased beyond the physical address width on most
> Intel consumer hardware (typically a width of 39, which is the number reported
> to the guest via CPUID).
> 
> If the guest intentionally creates a PTE referencing a physical address
> between 39 and 44 bits, the result should be #PF[RSVD] for using the virtual
> address.  However, the shadow code accepts the PTE, shadows it, and the
> virtual address works normally.
> 
> Introduce paging_max_paddr_bits() to calculate the largest guest physical
> address supportable by the paging infrastructure, and update
> recalculate_cpuid_policy() to take this into account when clamping the guest's
> cpuid_policy to reality.
> 
> There is an existing gfn_valid() in guest_pt.h but it is unused in the
> codebase.  Repurpose it to perform a guest-specific maxphysaddr check, which
> replaces the users of gfn_bits.
> 
> Signed-off-by: Andrew Cooper 

Reviewed-by: Kevin Tian 
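
The clamping described in the commit message can be sketched as follows. This is a standalone illustration under stated assumptions, not the code from the patch: max_paddr_bits() and paddr_is_reserved() are hypothetical names, and the 44-bit shadow limit is inferred from the "between 39 and 44 bits" wording in the message rather than taken from the source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed shadow-paging limit, inferred from the commit message. */
#define SHADOW_PADDR_LIMIT_BITS 44u

/*
 * Take the minimum of the host width and the paging-mode limit,
 * rather than assigning the limit unconditionally.
 */
static unsigned int max_paddr_bits(unsigned int host_bits, bool shadow)
{
    assert(host_bits > 0 && host_bits < 64);
    if ( shadow && host_bits > SHADOW_PADDR_LIMIT_BITS )
        return SHADOW_PADDR_LIMIT_BITS;
    return host_bits;
}

/* True if paddr has any bit set at or above the given width. */
static bool paddr_is_reserved(uint64_t paddr, unsigned int bits)
{
    assert(bits > 0 && bits < 64);
    return (paddr >> bits) != 0;
}
```

On a 39-bit host the result stays at 39, so an address with bit 40 set is treated as reserved, which is the case the message says should raise #PF[RSVD] instead of being shadowed.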


[Xen-devel] [qemu-mainline test] 105853: regressions - FAIL

2017-02-16 Thread osstest service owner
flight 105853 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/105853/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 9 windows-install fail REGR. vs. 105796

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-libvirt 13 saverestore-support-check fail  like 105796
 test-armhf-armhf-libvirt-xsm 13 saverestore-support-check fail  like 105796
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 105796
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail like 105796
 test-armhf-armhf-libvirt-raw 12 saverestore-support-check fail  like 105796
 test-amd64-amd64-xl-rtds  9 debian-install   fail  like 105796

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl   1 build-check(1)   blocked  n/a
 build-arm64-libvirt   1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-credit2   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-rtds  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-xsm   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-check fail   never pass
 test-amd64-i386-libvirt  12 migrate-support-check fail   never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 build-arm64-xsm   5 xen-build fail   never pass
 build-arm64   5 xen-build fail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-arndale  13 saverestore-support-check fail   never pass
 build-arm64-pvops 5 kernel-build fail   never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2  fail never pass
 test-armhf-armhf-xl  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl  13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-check fail  never pass
 test-armhf-armhf-xl-multivcpu 13 saverestore-support-check fail  never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-check fail never pass
 test-armhf-armhf-xl-xsm  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-xsm  13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-cubietruck 13 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt 12 migrate-support-check fail   never pass
 test-armhf-armhf-libvirt-xsm 12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-credit2  13 saverestore-support-check fail   never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-check fail   never pass
 test-armhf-armhf-libvirt-raw 11 migrate-support-check fail   never pass
 test-armhf-armhf-xl-rtds 12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-rtds 13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-vhd  11 migrate-support-check fail   never pass
 test-armhf-armhf-xl-vhd  12 saverestore-support-check fail   never pass

version targeted for testing:
 qemuu                ca5266de6cde132b647b44a108f6c0c009785fdd
baseline version:
 qemuu                5dae13cd71f0755a1395b5a4cde635b8a6ee3f58

Last test of basis   105796  2017-02-14 16:14:51 Z    2 days
Testing same since   105853  2017-02-16 14:12:58 Z    0 days    1 attempts


People who touched revisions under test:
  Jason Wang 
  Li Qiang 
  Li Qiang 
  Li Zhijian 
  Paolo Bonzini 
  Peter Maydell 
  Prasad J Pandit 
  Thomas Huth 
  Zhang Chen 

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  

[Xen-devel] [linux-linus test] 105845: regressions - FAIL

2017-02-16 Thread osstest service owner
flight 105845 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/105845/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-armhf-armhf-xl-multivcpu  6 xen-boot fail REGR. vs. 59254
 test-armhf-armhf-xl-arndale   6 xen-boot  fail REGR. vs. 59254
 test-armhf-armhf-libvirt  6 xen-boot  fail REGR. vs. 59254
 test-armhf-armhf-xl   6 xen-boot  fail REGR. vs. 59254
 test-armhf-armhf-xl-credit2   6 xen-boot  fail REGR. vs. 59254
 test-armhf-armhf-xl-xsm   6 xen-boot  fail REGR. vs. 59254
 test-armhf-armhf-libvirt-xsm  6 xen-boot  fail REGR. vs. 59254

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 15 guest-localmigrate/x10 fail in 105795 pass in 105845
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm 15 guest-localmigrate/x10 fail in 105824 pass in 105795
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm 13 guest-localmigrate fail pass in 105824

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl-rtds  6 xen-boot  fail REGR. vs. 59254
 test-amd64-amd64-xl-rtds  9 debian-install fail REGR. vs. 59254
 test-armhf-armhf-libvirt-raw  6 xen-boot fail baseline untested
 test-armhf-armhf-xl-vhd   6 xen-boot fail baseline untested
 test-amd64-amd64-rumprun-amd64 16 rumprun-demo-xenstorels/xenstorels.repeat fail baseline untested
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 59254
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail like 59254
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail like 59254

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl   1 build-check(1)   blocked  n/a
 build-arm64-libvirt   1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-credit2   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-rtds  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-xsm   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-amd64-xl-pvh-intel 14 guest-saverestore fail  never pass
 test-amd64-amd64-libvirt 12 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-check fail   never pass
 build-arm64-xsm   5 xen-build fail   never pass
 build-arm64   5 xen-build fail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-check fail   never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2  fail never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 13 saverestore-support-check fail never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-check fail   never pass
 test-amd64-i386-libvirt  12 migrate-support-check fail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass

version targeted for testing:
 linux                747ae0a96f1a78b35c5a3d93ad37a16655e16340
baseline version:
 linux                45820c294fe1b1a9df495d57f40585ef2d069a39

Last test of basis    59254  2015-07-09 04:20:48 Z  588 days
Failing since         59348  2015-07-10 04:24:05 Z  587 days  278 attempts
Testing same since   105795  2017-02-14 15:48:07 Z    2 days    5 attempts


7580 people touched revisions under test,
not listing them all

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  fail
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-arm64  fail
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-arm64-libvirt  blocked 
 build-armhf-libvirt  

[Xen-devel] [xen-4.4-testing baseline-only test] 68569: regressions - FAIL

2017-02-16 Thread Platform Team regression test user
This run is configured for baseline tests only.

flight 68569 xen-4.4-testing real [real]
http://osstest.xs.citrite.net/~osstest/testlogs/logs/68569/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-pygrub  13 guest-saverestore fail REGR. vs. 68467
 test-amd64-amd64-amd64-pvgrub 10 guest-start  fail REGR. vs. 68467
 test-amd64-amd64-xl-qemut-winxpsp3  9 windows-install fail REGR. vs. 68467
 test-amd64-amd64-xl-qemuu-winxpsp3 15 guest-localmigrate/x10 fail REGR. vs. 68467

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-xend-qemut-winxpsp3  9 windows-install fail like 68265
 test-xtf-amd64-amd64-4   53 leak-check/check fail   like 68467
 test-amd64-amd64-xl  19 guest-start/debian.repeat fail   like 68467
 test-xtf-amd64-amd64-3   53 leak-check/check fail   like 68467
 test-xtf-amd64-amd64-5   53 leak-check/check fail   like 68467
 test-xtf-amd64-amd64-2   53 leak-check/check fail   like 68467
 test-xtf-amd64-amd64-1   53 leak-check/check fail   like 68467
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 68467
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop  fail like 68467
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail like 68467

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-rumprun-amd64  1 build-check(1)   blocked  n/a
 test-amd64-i386-rumprun-i386  1 build-check(1)   blocked  n/a
 test-xtf-amd64-amd64-4   10 xtf-fep  fail   never pass
 test-armhf-armhf-xl-vhd   9 debian-di-install fail   never pass
 test-armhf-armhf-libvirt-raw  9 debian-di-install fail   never pass
 test-xtf-amd64-amd64-3   10 xtf-fep  fail   never pass
 test-xtf-amd64-amd64-5   10 xtf-fep  fail   never pass
 test-xtf-amd64-amd64-2   10 xtf-fep  fail   never pass
 test-xtf-amd64-amd64-1   10 xtf-fep  fail   never pass
 test-armhf-armhf-libvirt 11 guest-start  fail   never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-check fail  never pass
 test-armhf-armhf-xl-multivcpu 13 saverestore-support-check fail  never pass
 test-armhf-armhf-xl-midway   12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-midway   13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl  13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-credit2  13 saverestore-support-check fail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-check fail   never pass
 test-amd64-i386-libvirt  12 migrate-support-check fail   never pass
 test-xtf-amd64-amd64-4   52 xtf/test-hvm64-xsa-195   fail   never pass
 build-amd64-rumprun   7 xen-build fail   never pass
 build-i386-rumprun 7 xen-build fail   never pass
 test-amd64-amd64-qemuu-nested-intel 16 debian-hvm-install/l1/l2 fail never pass
 test-xtf-amd64-amd64-3   52 xtf/test-hvm64-xsa-195   fail   never pass
 test-xtf-amd64-amd64-5   52 xtf/test-hvm64-xsa-195   fail   never pass
 test-xtf-amd64-amd64-2   52 xtf/test-hvm64-xsa-195   fail   never pass
 test-xtf-amd64-amd64-1   52 xtf/test-hvm64-xsa-195   fail   never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-check fail   never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2  fail never pass
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail never pass

version targeted for testing:
 xen  b648113f8a77b09c7c52cfd6f1594987e0f33d22
baseline version:
 xen  394ddc2de62cbcaf9d28cc7373fde175e6ba3a5d

Last test of basis    68467  2017-01-25 03:20:23 Z   22 days
Testing same since    68569  2017-02-16 15:43:36 Z    0 days    1 attempts


People who touched revisions under test:
  Jan Beulich 
  Julien Grall 
  Oleksandr Tyshchenko 

jobs:
 build-amd64-xend pass
 build-i386-xend  pass
 build-amd64-xtf  pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass

[Xen-devel] [xtf test] 105859: all pass - PUSHED

2017-02-16 Thread osstest service owner
flight 105859 xtf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/105859/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 xtf  2b8c78575cb534908ccc8824d76904376b9c38a5
baseline version:
 xtf  01b0192030c01dc8af02dca6b92d720cf3908b80

Last test of basis   105823  2017-02-15 16:48:40 Z    1 days
Testing same since   105859  2017-02-16 18:13:16 Z    0 days    1 attempts


People who touched revisions under test:
  Andrew Cooper 

jobs:
 build-amd64-xtf  pass
 build-amd64  pass
 build-amd64-pvops  pass
 test-xtf-amd64-amd64-1   pass
 test-xtf-amd64-amd64-2   pass
 test-xtf-amd64-amd64-3   pass
 test-xtf-amd64-amd64-4   pass
 test-xtf-amd64-amd64-5   pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

+ branch=xtf
+ revision=2b8c78575cb534908ccc8824d76904376b9c38a5
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
++++ getconfig Repos
++++ perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x '!=' x/home/osstest/repos/lock ']'
++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock
++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push xtf 2b8c78575cb534908ccc8824d76904376b9c38a5
+ branch=xtf
+ revision=2b8c78575cb534908ccc8824d76904376b9c38a5
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
++++ getconfig Repos
++++ perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']'
+ . ./cri-common
++ . ./cri-getconfig
++ umask 002
+ select_xenbranch
+ case "$branch" in
+ tree=xtf
+ xenbranch=xen-unstable
+ '[' xxtf = xlinux ']'
+ linuxbranch=
+ '[' x = x ']'
+ qemuubranch=qemu-upstream-unstable
+ select_prevxenbranch
++ ./cri-getprevxenbranch xen-unstable
+ prevxenbranch=xen-4.8-testing
+ '[' x2b8c78575cb534908ccc8824d76904376b9c38a5 = x ']'
+ : tested/2.6.39.x
+ . ./ap-common
++ : osst...@xenbits.xen.org
+++ getconfig OsstestUpstream
+++ perl -e '
use Osstest;
readglobalconfig();
print $c{"OsstestUpstream"} or die $!;
'
++ :
++ : git://xenbits.xen.org/xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/xen.git
++ : git://xenbits.xen.org/qemu-xen-traditional.git
++ : git://git.kernel.org
++ : git://git.kernel.org/pub/scm/linux/kernel/git
++ : git
++ : git://xenbits.xen.org/xtf.git
++ : osst...@xenbits.xen.org:/home/xen/git/xtf.git
++ : git://xenbits.xen.org/xtf.git
++ : git://xenbits.xen.org/libvirt.git
++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git
++ : git://xenbits.xen.org/libvirt.git
++ : git://xenbits.xen.org/osstest/rumprun.git
++ : git
++ : git://xenbits.xen.org/osstest/rumprun.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/rumprun.git
++ : git://git.seabios.org/seabios.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/seabios.git
++ : git://xenbits.xen.org/osstest/seabios.git
++ : https://github.com/tianocore/edk2.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/ovmf.git
++ : git://xenbits.xen.org/osstest/ovmf.git
++ : git://xenbits.xen.org/osstest/linux-firmware.git
++ : osst...@xenbits.xen.org:/home/osstest/ext/linux-firmware.git
++ : git://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git
++ : osst...@xenbits.xen.org:/home/xen/git/linux-pvops.git
++ : git://xenbits.xen.org/linux-pvops.git
++ : tested/linux-3.14
++ : tested/linux-arm-xen
++ '[' 

[Xen-devel] [PATCH V3 0/7] COLO-Proxy: Make Xen COLO use userspace colo-proxy

2017-02-16 Thread Zhang Chen
The COLO kernel proxy is no longer supported, so this patch set makes
Xen use the userspace colo-proxy in qemu.

Below is an outline of the COLO userspace proxy ASCII figure:

  [Figure: Primary qemu and Secondary qemu, each containing a guest and a
   netfilter chain, with a tap device below carrying "guest receive" and
   "guest send" traffic.

   Primary netfilter chain (in filter execute order):
     filter mirror (tx), filter redirector (rx), filter redirector (rx),
     colo compare (pri in, sec in, out)

   Secondary netfilter chain (in filter execute order):
     filter redirector, TCP rewriter (adjust ack, adjust seq),
     filter redirector]

  NOTE: filter direction is rx/tx/all
  rx: receive packets sent to the netdev
  tx: receive packets sent by the netdev

Details are available here:

http://wiki.qemu.org/Features/COLO
https://github.com/qemu/qemu/blob/master/docs/colo-proxy.txt

V3:
   - remove the 'RFC' tag.
   - fix some bug in patch 7/7.
   - fix codestyle.

V2:
   - 

[Xen-devel] [PATCH V3 1/7] COLO-Proxy: Add remus command to open userspace proxy

2017-02-16 Thread Zhang Chen
Add the remus '-p' option to enable the userspace COLO proxy (in qemu).
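
As a rough illustration of the option dependency this patch introduces
(-p is only valid together with -c), the check can be sketched as below.
The struct and function names are hypothetical stand-ins, not the xl code;
only the error message text is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the two remus flags relevant here. */
struct remus_opts {
    bool colo;                 /* set by -c */
    bool userspace_colo_proxy; /* set by -p */
};

/* Returns 0 when the combination is valid, -1 when -p lacks -c. */
static int check_remus_opts(const struct remus_opts *o)
{
    assert(o != NULL);
    if (o->userspace_colo_proxy && !o->colo) {
        fprintf(stderr, "Option -p must be used in conjunction with -c\n");
        return -1;
    }
    return 0;
}
```

A caller would presumably run such a check once all options are parsed and
defaults applied, and exit on a non-zero result.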

Signed-off-by: Zhang Chen 
---
 docs/man/xl.pod.1.in  |  5 +
 tools/libxl/libxl.h   |  6 ++
 tools/libxl/libxl_colo.h  |  5 +
 tools/libxl/libxl_colo_save.c |  2 ++
 tools/libxl/libxl_types.idl   | 17 +
 tools/libxl/xl_cmdimpl.c  | 13 -
 tools/libxl/xl_cmdtable.c |  3 ++-
 7 files changed, 41 insertions(+), 10 deletions(-)

diff --git a/docs/man/xl.pod.1.in b/docs/man/xl.pod.1.in
index 09c1faa..4260777 100644
--- a/docs/man/xl.pod.1.in
+++ b/docs/man/xl.pod.1.in
@@ -553,6 +553,11 @@ Disable disk replication. Requires enabling unsafe mode.
 Enable COLO HA. This conflicts with B<-i> and B<-b>, and memory
 checkpoint compression must be disabled.
 
+=item B<-p>
+
+Use userspace COLO Proxy. This option must be used in conjunction
+with B<-c>.
+
 =back
 
 =item B I
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 3924464..fce7fab 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -870,6 +870,12 @@ typedef struct libxl__ctx libxl_ctx;
  */
 #define LIBXL_HAVE_REMUS 1
 
+/*
+ * LIBXL_HAVE_COLO_USERSPACE_PROXY
+ * If this is defined, then libxl supports COLO userspace proxy.
+ */
+#define LIBXL_HAVE_COLO_USERSPACE_PROXY 1
+
 typedef uint8_t libxl_mac[6];
 #define LIBXL_MAC_FMT "%02hhx:%02hhx:%02hhx:%02hhx:%02hhx:%02hhx"
 #define LIBXL_MAC_FMTLEN ((2*6)+5) /* 6 hex bytes plus 5 colons */
diff --git a/tools/libxl/libxl_colo.h b/tools/libxl/libxl_colo.h
index 682275c..4746d8c 100644
--- a/tools/libxl/libxl_colo.h
+++ b/tools/libxl/libxl_colo.h
@@ -64,6 +64,11 @@ struct libxl__colo_proxy_state {
 
 int sock_fd;
 int index;
+/*
+ * Private, True means use userspace colo proxy
+ *  False means use kernel colo proxy.
+ */
+bool is_userspace_proxy;
 };
 
 struct libxl__colo_save_state {
diff --git a/tools/libxl/libxl_colo_save.c b/tools/libxl/libxl_colo_save.c
index 620..eb8336c 100644
--- a/tools/libxl/libxl_colo_save.c
+++ b/tools/libxl/libxl_colo_save.c
@@ -101,6 +101,8 @@ void libxl__colo_save_setup(libxl__egc *egc, libxl__colo_save_state *css)
 css->qdisk_setuped = false;
 css->qdisk_used = false;
 libxl__ev_child_init(&css->child);
+css->cps.is_userspace_proxy =
+libxl_defbool_val(dss->remus->userspace_colo_proxy);
 
 if (dss->remus->netbufscript)
 css->colo_proxy_script = libxl__strdup(gc, dss->remus->netbufscript);
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index a612d1f..1bd2057 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -844,14 +844,15 @@ libxl_sched_credit2_params = Struct("sched_credit2_params", [
 ], dispose_fn=None)
 
 libxl_domain_remus_info = Struct("domain_remus_info",[
-("interval", integer),
-("allow_unsafe", libxl_defbool),
-("blackhole",libxl_defbool),
-("compression",  libxl_defbool),
-("netbuf",   libxl_defbool),
-("netbufscript", string),
-("diskbuf",  libxl_defbool),
-("colo", libxl_defbool)
+("interval", integer),
+("allow_unsafe", libxl_defbool),
+("blackhole",libxl_defbool),
+("compression",  libxl_defbool),
+("netbuf",   libxl_defbool),
+("netbufscript", string),
+("diskbuf",  libxl_defbool),
+("colo", libxl_defbool),
+("userspace_colo_proxy", libxl_defbool)
 ])
 
 libxl_event_type = Enumeration("event_type", [
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 7e8a8ae..99baeef 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -8893,7 +8893,7 @@ int main_remus(int argc, char **argv)
 
 memset(&r_info, 0, sizeof(libxl_domain_remus_info));
 
-SWITCH_FOREACH_OPT(opt, "Fbundi:s:N:ec", NULL, "remus", 2) {
+SWITCH_FOREACH_OPT(opt, "Fbundi:s:N:ecp", NULL, "remus", 2) {
 case 'i':
 r_info.interval = atoi(optarg);
 break;
@@ -8923,6 +8923,9 @@ int main_remus(int argc, char **argv)
 break;
 case 'c':
 libxl_defbool_set(&r_info.colo, true);
+break;
+case 'p':
+libxl_defbool_set(&r_info.userspace_colo_proxy, true);
 }
 
 domid = find_domain(argv[optind]);
@@ -8931,9 +8934,17 @@ int main_remus(int argc, char **argv)
 /* Defaults */
 libxl_defbool_setdefault(&r_info.blackhole, false);
 libxl_defbool_setdefault(&r_info.colo, false);
+libxl_defbool_setdefault(&r_info.userspace_colo_proxy, false);
+
 if (!libxl_defbool_val(r_info.colo) && !r_info.interval)
 r_info.interval = 200;
 
+if (libxl_defbool_val(r_info.userspace_colo_proxy) &&
+!libxl_defbool_val(r_info.colo)) {
+fprintf(stderr, "Option -p must be used in conjunction with -c");
+exit(-1);
+}
+
 if (libxl_defbool_val(r_info.colo)) {
 if (r_info.interval || 

[Xen-devel] [PATCH V3 3/7] tools/libxl: refactor do_domain_create()

2017-02-16 Thread Zhang Chen
Use params->colo_proxy_script so that do_domain_create() no longer
takes a separate "colo_proxy_script" argument.
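
To make the intent concrete, here is a minimal sketch of the selection
logic: the script is taken from the restore parameters only for a COLO
checkpointed stream, and is NULL otherwise. The enum, struct and function
names are reduced, hypothetical stand-ins for the libxl types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced stand-ins for the libxl types. */
enum checkpointed_stream { STREAM_NONE, STREAM_REMUS, STREAM_COLO };

struct restore_params {
    enum checkpointed_stream checkpointed_stream;
    const char *colo_proxy_script;
};

/* Only a COLO stream carries a proxy script; everything else gets NULL. */
static const char *pick_colo_proxy_script(const struct restore_params *p)
{
    assert(p != NULL);
    return p->checkpointed_stream == STREAM_COLO ? p->colo_proxy_script
                                                 : NULL;
}
```

Keeping the decision next to the parameters means callers do not have to
pass the script separately.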

Signed-off-by: Zhang Chen 
---
 tools/libxl/libxl_create.c | 17 ++---
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index e3bc257..e741b9a 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -1608,7 +1608,6 @@ static void domain_create_cb(libxl__egc *egc,
 static int do_domain_create(libxl_ctx *ctx, libxl_domain_config *d_config,
 uint32_t *domid, int restore_fd, int send_back_fd,
 const libxl_domain_restore_params *params,
-const char *colo_proxy_script,
 const libxl_asyncop_how *ao_how,
 const libxl_asyncprogress_how *aop_console_how)
 {
@@ -1632,7 +1631,14 @@ static int do_domain_create(libxl_ctx *ctx, libxl_domain_config *d_config,
 }
 cdcs->dcs.callback = domain_create_cb;
 cdcs->dcs.domid_soft_reset = INVALID_DOMID;
-cdcs->dcs.colo_proxy_script = colo_proxy_script;
+
+if (cdcs->dcs.restore_params.checkpointed_stream ==
+LIBXL_CHECKPOINTED_STREAM_COLO)
+cdcs->dcs.colo_proxy_script =
+cdcs->dcs.restore_params.colo_proxy_script;
+else
+cdcs->dcs.colo_proxy_script = NULL;
+
 libxl__ao_progress_gethow(&cdcs->dcs.aop_console_how, aop_console_how);
 cdcs->domid_out = domid;
 
@@ -1820,7 +1826,7 @@ int libxl_domain_create_new(libxl_ctx *ctx, libxl_domain_config *d_config,
 const libxl_asyncprogress_how *aop_console_how)
 {
 unset_disk_colo_restore(d_config);
-return do_domain_create(ctx, d_config, domid, -1, -1, NULL, NULL,
+return do_domain_create(ctx, d_config, domid, -1, -1, NULL,
 ao_how, aop_console_how);
 }
 
@@ -1831,17 +1837,14 @@ int libxl_domain_create_restore(libxl_ctx *ctx, libxl_domain_config *d_config,
 const libxl_asyncop_how *ao_how,
 const libxl_asyncprogress_how *aop_console_how)
 {
-char *colo_proxy_script = NULL;
-
 if (params->checkpointed_stream == LIBXL_CHECKPOINTED_STREAM_COLO) {
-colo_proxy_script = params->colo_proxy_script;
 set_disk_colo_restore(d_config);
 } else {
 unset_disk_colo_restore(d_config);
 }
 
 return do_domain_create(ctx, d_config, domid, restore_fd, send_back_fd,
-params, colo_proxy_script, ao_how, aop_console_how);
+params, ao_how, aop_console_how);
 }
 
 int libxl_domain_soft_reset(libxl_ctx *ctx,
-- 
2.7.4




___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH V3 4/7] COLO-Proxy: Setup userspace colo-proxy on secondary side

2017-02-16 Thread Zhang Chen
This patch adds a way to bypass the kernel COLO proxy on the secondary side.
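
The core of the change is which device kinds take part in checkpointing.
A minimal sketch, assuming hypothetical enum values (libxl defines its own
device-kind numbering): VBD is always included, VIF only when the kernel
proxy is in use.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device-kind numbering; libxl has its own enum. */
enum device_kind { KIND_VIF = 1, KIND_VBD = 2 };

static_assert(KIND_VIF < 32 && KIND_VBD < 32,
              "each kind must fit as a bit in an unsigned int");

/* Build the bitmask of device kinds that need checkpoint handling. */
static unsigned int checkpoint_device_flags(bool is_userspace_proxy)
{
    unsigned int flags = 1u << KIND_VBD;

    /* With the userspace proxy, qemu handles the network side itself. */
    if (!is_userspace_proxy)
        flags |= 1u << KIND_VIF;
    return flags;
}
```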

Signed-off-by: Zhang Chen 
---
 tools/libxl/libxl_colo_restore.c |  8 ++--
 tools/libxl/libxl_create.c   |  8 ++--
 tools/libxl/libxl_types.idl  |  1 +
 tools/libxl/xl_cmdimpl.c | 18 +++---
 4 files changed, 28 insertions(+), 7 deletions(-)

diff --git a/tools/libxl/libxl_colo_restore.c b/tools/libxl/libxl_colo_restore.c
index 6a96328..c6d239a 100644
--- a/tools/libxl/libxl_colo_restore.c
+++ b/tools/libxl/libxl_colo_restore.c
@@ -774,8 +774,12 @@ static void colo_setup_checkpoint_devices(libxl__egc *egc,
 
 STATE_AO_GC(crs->ao);
 
-cds->device_kind_flags = (1 << LIBXL__DEVICE_KIND_VIF) |
- (1 << LIBXL__DEVICE_KIND_VBD);
+if (crs->cps.is_userspace_proxy)
+cds->device_kind_flags = (1 << LIBXL__DEVICE_KIND_VBD);
+else
+cds->device_kind_flags = (1 << LIBXL__DEVICE_KIND_VIF) |
+ (1 << LIBXL__DEVICE_KIND_VBD);
+
 cds->callback = colo_restore_setup_cds_done;
 cds->ao = ao;
 cds->domid = crs->domid;
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index e741b9a..409945a 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -1633,11 +1633,15 @@ static int do_domain_create(libxl_ctx *ctx, libxl_domain_config *d_config,
 cdcs->dcs.domid_soft_reset = INVALID_DOMID;
 
 if (cdcs->dcs.restore_params.checkpointed_stream ==
-LIBXL_CHECKPOINTED_STREAM_COLO)
+LIBXL_CHECKPOINTED_STREAM_COLO) {
 cdcs->dcs.colo_proxy_script =
 cdcs->dcs.restore_params.colo_proxy_script;
-else
+cdcs->dcs.crs.cps.is_userspace_proxy =
+libxl_defbool_val(cdcs->dcs.restore_params.userspace_colo_proxy);
+} else {
 cdcs->dcs.colo_proxy_script = NULL;
+cdcs->dcs.crs.cps.is_userspace_proxy = false;
+}
 
 libxl__ao_progress_gethow(&cdcs->dcs.aop_console_how, aop_console_how);
 cdcs->domid_out = domid;
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 1bd2057..89c2c9d 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -390,6 +390,7 @@ libxl_domain_restore_params = Struct("domain_restore_params", [
 ("checkpointed_stream", integer),
 ("stream_version", uint32, {'init_val': '1'}),
 ("colo_proxy_script", string),
+("userspace_colo_proxy", libxl_defbool),
 ])
 
 libxl_sched_params = Struct("sched_params",[
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 99baeef..b286d47 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -162,6 +162,7 @@ struct domain_create {
 char *extra_config; /* extra config string */
 const char *restore_file;
 char *colo_proxy_script;
+bool userspace_colo_proxy;
 int migrate_fd; /* -1 means none */
 int send_back_fd; /* -1 means none */
 char **migration_domname_r; /* from malloc */
@@ -3024,6 +3025,8 @@ start:
 params.stream_version =
 (hdr.mandatory_flags & XL_MANDATORY_FLAG_STREAMv2) ? 2 : 1;
 params.colo_proxy_script = dom_info->colo_proxy_script;
+libxl_defbool_set(&params.userspace_colo_proxy,
+  dom_info->userspace_colo_proxy);
 
 ret = libxl_domain_create_restore(ctx, &d_config,
   &domid, restore_fd,
@@ -4824,7 +4827,8 @@ static void migrate_receive(int debug, int daemonize, int monitor,
 int pause_after_migration,
 int send_fd, int recv_fd,
 libxl_checkpointed_stream checkpointed,
-char *colo_proxy_script)
+char *colo_proxy_script,
+bool userspace_colo_proxy)
 {
 uint32_t domid;
 int rc, rc2;
@@ -4852,6 +4856,7 @@ static void migrate_receive(int debug, int daemonize, int monitor,
 dom_info.migration_domname_r = &migration_domname;
 dom_info.checkpointed_stream = checkpointed;
 dom_info.colo_proxy_script = colo_proxy_script;
+dom_info.userspace_colo_proxy = userspace_colo_proxy;
 
 rc = create_domain(&dom_info);
 if (rc < 0) {
@@ -5051,11 +5056,13 @@ int main_migrate_receive(int argc, char **argv)
 int debug = 0, daemonize = 1, monitor = 1, pause_after_migration = 0;
 libxl_checkpointed_stream checkpointed = LIBXL_CHECKPOINTED_STREAM_NONE;
 int opt;
+bool userspace_colo_proxy = false;
 char *script = NULL;
 static struct option opts[] = {
 {"colo", 0, 0, 0x100},
 /* It is a shame that the management code for disk is not here. */
 {"coloft-script", 1, 0, 0x200},
+{"userspace-colo-proxy", 0, 0, 0x300},
 COMMON_LONG_OPTS
 };
 
@@ -5079,6 +5086,9 @@ int main_migrate_receive(int argc, char **argv)
 case 0x200:
 script = optarg;
 break;
+case 0x300:
+  

[Xen-devel] [PATCH V3 7/7] COLO-Proxy: Use socket to get checkpoint event.

2017-02-16 Thread Zhang Chen
Get the checkpoint event from qemu colo-compare in the same way the
kernel colo proxy does.
Qemu colo-compare needs a new API to support this (I will add it in qemu).
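
For illustration, waiting for a checkpoint request on a socket with a
receive timeout might look like the sketch below. The function name and
the 64-byte buffer are assumptions for this example; "DO_CHECKPOINT" is the
message string used in this patch. The sketch also assumes one recv()
returns a whole message, which a TCP stream does not guarantee in general.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

/*
 * Wait up to timeout_us microseconds for one message on fd.
 * Returns 1 if it is "DO_CHECKPOINT", 0 on timeout, error, or any
 * other message. A timeout of 0 means "no timeout" for SO_RCVTIMEO,
 * so recv() would then block until data arrives.
 */
static int wait_for_checkpoint(int fd, unsigned int timeout_us)
{
    char buf[64];
    struct timeval tv;
    ssize_t n;

    assert(fd >= 0);

    /* Split microseconds into whole seconds plus the remainder. */
    tv.tv_sec = timeout_us / 1000000;
    tv.tv_usec = timeout_us % 1000000;
    if (setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) < 0)
        return 0;

    /* Use the array size here; sizeof on a pointer would be wrong. */
    n = recv(fd, buf, sizeof(buf) - 1, 0);
    if (n <= 0)
        return 0;

    buf[n] = '\0'; /* recv() does not NUL-terminate */
    return strcmp(buf, "DO_CHECKPOINT") == 0;
}
```

Terminating the buffer before strcmp() matters because the peer sends the
string without its trailing NUL when it uses strlen() as the length.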

Signed-off-by: Zhang Chen 
---
 tools/libxl/libxl_colo.h |  2 +
 tools/libxl/libxl_colo_proxy.c   | 85 +---
 tools/libxl/libxl_colo_restore.c | 11 --
 tools/libxl/libxl_colo_save.c| 22 +++
 tools/libxl/libxl_nic.c  |  4 ++
 tools/libxl/libxl_types.idl  |  4 +-
 tools/libxl/xl_cmdimpl.c |  4 ++
 7 files changed, 114 insertions(+), 18 deletions(-)

diff --git a/tools/libxl/libxl_colo.h b/tools/libxl/libxl_colo.h
index 4746d8c..6c01b55 100644
--- a/tools/libxl/libxl_colo.h
+++ b/tools/libxl/libxl_colo.h
@@ -69,6 +69,8 @@ struct libxl__colo_proxy_state {
  *  False means use kernel colo proxy.
  */
 bool is_userspace_proxy;
+const char *checkpoint_host;
+const char *checkpoint_port;
 };
 
 struct libxl__colo_save_state {
diff --git a/tools/libxl/libxl_colo_proxy.c b/tools/libxl/libxl_colo_proxy.c
index dd902fc..a6436b0 100644
--- a/tools/libxl/libxl_colo_proxy.c
+++ b/tools/libxl/libxl_colo_proxy.c
@@ -18,9 +18,13 @@
 #include "libxl_internal.h"
 
 #include 
+#include 
+#include 
+#include 
 
 /* Consistent with the new COLO netlink channel in kernel side */
 #define NETLINK_COLO 28
+#define COLO_DEFAULT_WAIT_TIME 50
 
 enum colo_netlink_op {
 COLO_QUERY_CHECKPOINT = (NLMSG_MIN_TYPE + 1),
@@ -76,6 +80,26 @@ static int colo_proxy_send(libxl__colo_proxy_state *cps, uint8_t *buff,
 return ret;
 }
 
+static int colo_userspace_proxy_recv(libxl__colo_proxy_state *cps,
+ char *buff,
+ unsigned int timeout_us)
+{
+struct timeval tv;
+int ret;
+
+STATE_AO_GC(cps->ao);
+
+if (timeout_us) {
+tv.tv_sec = timeout_us / 1000000;
+tv.tv_usec = timeout_us % 1000000;
+setsockopt(cps->sock_fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));
+}
+
+ret = recv(cps->sock_fd, buff, sizeof(buff),0);
+
+return ret;
+}
+
 /* error: return -1, otherwise return 0 */
 static int64_t colo_proxy_recv(libxl__colo_proxy_state *cps, uint8_t **buff,
unsigned int timeout_us)
@@ -153,8 +177,45 @@ int colo_proxy_setup(libxl__colo_proxy_state *cps)
 STATE_AO_GC(cps->ao);
 
 /* If enable userspace proxy mode, we don't need setup kernel proxy */
-if (cps->is_userspace_proxy)
+if (cps->is_userspace_proxy) {
+struct sockaddr_in addr;
+int port;
+char recvbuff[1024];
+char sendbuf[] = "COLO_USERSPACE_PROXY_INIT";
+
+memset(&addr, 0, sizeof(addr));
+port = atoi(cps->checkpoint_port);
+addr.sin_family = AF_INET;
+addr.sin_port = htons(port);
+addr.sin_addr.s_addr = inet_addr(cps->checkpoint_host);
+
+skfd = socket(AF_INET, SOCK_STREAM, 0);
+if (skfd < 0) {
+LOGD(ERROR, ao->domid, "can not create a TCP socket: %s",
+ strerror(errno));
+goto out;
+}
+
+cps->sock_fd = skfd;
+
+if (connect(skfd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
+LOGD(ERROR, ao->domid, "connect error");
+goto out;
+}
+
+ret = send(skfd, sendbuf, strlen(sendbuf),0);
+if (ret < 0)
+goto out;
+
+ret = colo_userspace_proxy_recv(cps, recvbuff, COLO_DEFAULT_WAIT_TIME);
+if (ret < 0) {
+LOGD(ERROR, ao->domid, "Can't recv msg from qemu colo-compare: %s",
+ strerror(errno));
+goto out;
+}
+
 return 0;
+}
 
 skfd = socket(PF_NETLINK, SOCK_RAW, NETLINK_COLO);
 if (skfd < 0) {
@@ -247,8 +308,11 @@ void colo_proxy_preresume(libxl__colo_proxy_state *cps)
  * If enable userspace proxy mode,
  * we don't need preresume kernel proxy
  */
-if (cps->is_userspace_proxy)
+if (cps->is_userspace_proxy) {
+char sendbuf[] = "COLO_CHECKPOINT";
+send(cps->sock_fd, sendbuf, strlen(sendbuf),0);
 return;
+}
 
 colo_proxy_send(cps, NULL, 0, COLO_CHECKPOINT);
 /* TODO: need to handle if the call fails... */
@@ -277,16 +341,25 @@ int colo_proxy_checkpoint(libxl__colo_proxy_state *cps,
 struct nlmsghdr *h;
 struct colo_msg *m;
 int ret = -1;
+char recvbuff[1024];
 
 STATE_AO_GC(cps->ao);
 
 /*
- * enable userspace proxy mode, tmp sleep.
- * then we will add qemu API support this func.
+ * enable userspace proxy mode.
+ * Then we will add qemu API support for this func.
  */
 if (cps->is_userspace_proxy) {
-sleep(timeout_us / 1000000);
-return 0;
+ret = colo_userspace_proxy_recv(cps, recvbuff, timeout_us);
+if (ret <= 0)
+return 0;
+
+if (!strcmp(recvbuff, "DO_CHECKPOINT")) {
+return 1;
+} else 

[Xen-devel] [PATCH V3 5/7] COLO-Proxy: Add primary userspace colo proxy start args

2017-02-16 Thread Zhang Chen
Qemu needs these arguments to start the userspace colo-proxy.
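
As a sketch of what the generated qemu arguments look like, the helper
below formats the value passed after "-chardev" for a server or client
socket. The helper name and its signature are hypothetical; the format
strings follow the ones used in this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Format a qemu "-chardev socket" value into out.
 * Returns the number of characters written, or -1 if a field is
 * missing or out is too small.
 */
static int format_colo_chardev(char *out, size_t outlen, const char *id,
                               const char *host, const char *port,
                               bool server)
{
    int n;

    assert(out != NULL && outlen > 0);

    /* Like the patch, emit nothing unless id, host and port are all set. */
    if (!id || !host || !port)
        return -1;

    n = snprintf(out, outlen, "socket,id=%s,host=%s,port=%s%s",
                 id, host, port, server ? ",server,nowait" : "");
    return (n < 0 || (size_t)n >= outlen) ? -1 : n;
}
```

The server variant appends ",server,nowait" so that qemu listens without
blocking startup; the client variant connects to an existing listener.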

Signed-off-by: Zhang Chen 
---
 tools/libxl/libxl_dm.c  | 98 +
 tools/libxl/libxl_nic.c | 78 
 tools/libxl/libxl_types.idl | 31 +-
 tools/libxl/xl_cmdimpl.c| 58 +++
 4 files changed, 264 insertions(+), 1 deletion(-)

diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index 281058d..abd4edd 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -1244,7 +1244,105 @@ static int libxl__build_device_model_args_new(libxl__gc *gc,
nics[i].devid, ifname,
libxl_tapif_script(gc),
libxl_tapif_script(gc)));
+
+/* Userspace COLO Proxy need this */
+#define APPEND_COLO_SOCK_SERVER(sock_id, sock_ip, sock_port) ({ \
+if (nics[i].colo_##sock_id &&   \
+nics[i].colo_##sock_ip &&   \
+nics[i].colo_##sock_port) { \
+flexarray_append(dm_args, "-chardev");  \
+flexarray_append(dm_args,   \
+GCSPRINTF("socket,id=%s,host=%s,port=%s,server,nowait", \
+  nics[i].colo_##sock_id,   \
+  nics[i].colo_##sock_ip,   \
+  nics[i].colo_##sock_port));   \
+}   \
+})
+
+#define APPEND_COLO_SOCK_CLIENT(sock_id, sock_ip, sock_port) ({ \
+if (nics[i].colo_##sock_id &&   \
+nics[i].colo_##sock_ip &&   \
+nics[i].colo_##sock_port) { \
+flexarray_append(dm_args, "-chardev");  \
+flexarray_append(dm_args,   \
+GCSPRINTF("socket,id=%s,host=%s,port=%s",   \
+  nics[i].colo_##sock_id,   \
+  nics[i].colo_##sock_ip,   \
+  nics[i].colo_##sock_port));   \
+}   \
+})
+
+if (state->saved_state) {
+/* secondary colo run */
+} else {
+/* primary colo run */
+
+APPEND_COLO_SOCK_SERVER(sock_mirror_id,
+sock_mirror_ip,
+sock_mirror_port);
+
+APPEND_COLO_SOCK_SERVER(sock_compare_pri_in_id,
+sock_compare_pri_in_ip,
+sock_compare_pri_in_port);
+
+APPEND_COLO_SOCK_SERVER(sock_compare_sec_in_id,
+sock_compare_sec_in_ip,
+sock_compare_sec_in_port);
+
+APPEND_COLO_SOCK_SERVER(sock_redirector0_id,
+sock_redirector0_ip,
+sock_redirector0_port);
+
+APPEND_COLO_SOCK_CLIENT(sock_redirector1_id,
+sock_redirector1_ip,
+sock_redirector1_port);
+
+APPEND_COLO_SOCK_CLIENT(sock_redirector2_id,
+sock_redirector2_ip,
+sock_redirector2_port);
+
+if (nics[i].colo_filter_mirror_queue &&
+nics[i].colo_filter_mirror_outdev) {
+flexarray_append(dm_args, "-object");
+flexarray_append(dm_args,
+GCSPRINTF("filter-mirror,id=m1,netdev=net%d,queue=%s,outdev=%s",
+ nics[i].devid,
+ nics[i].colo_filter_mirror_queue,
+ nics[i].colo_filter_mirror_outdev));
+}
+if (nics[i].colo_filter_redirector0_queue &&
+nics[i].colo_filter_redirector0_indev) {
+flexarray_append(dm_args, "-object");
+flexarray_append(dm_args,
+GCSPRINTF("filter-redirector,id=r1,netdev=net%d,queue=%s,indev=%s",
+ 

[Xen-devel] [PATCH V3 6/7] COLO-Proxy: Add secondary userspace colo-proxy start args

2017-02-16 Thread Zhang Chen
Qemu needs these arguments to start the userspace colo-proxy.

Signed-off-by: Zhang Chen 
---
 tools/libxl/libxl_dm.c  | 34 ++
 tools/libxl/libxl_nic.c | 27 +++
 tools/libxl/libxl_types.idl | 15 ++-
 tools/libxl/xl_cmdimpl.c| 27 +++
 4 files changed, 102 insertions(+), 1 deletion(-)

diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index abd4edd..0fabd64 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -1274,6 +1274,40 @@ static int libxl__build_device_model_args_new(libxl__gc *gc,
 
 if (state->saved_state) {
 /* secondary colo run */
+
+APPEND_COLO_SOCK_CLIENT(sock_sec_redirector0_id,
+sock_sec_redirector0_ip,
+sock_sec_redirector0_port);
+
+APPEND_COLO_SOCK_CLIENT(sock_sec_redirector1_id,
+sock_sec_redirector1_ip,
+sock_sec_redirector1_port);
+
+if (nics[i].colo_filter_sec_redirector0_queue &&
+nics[i].colo_filter_sec_redirector0_indev) {
+flexarray_append(dm_args, "-object");
+flexarray_append(dm_args,
+GCSPRINTF("filter-redirector,id=rs1,netdev=net%d,queue=%s,indev=%s",
+ nics[i].devid,
+ nics[i].colo_filter_sec_redirector0_queue,
+ nics[i].colo_filter_sec_redirector0_indev));
+}
+if (nics[i].colo_filter_sec_redirector1_queue &&
+nics[i].colo_filter_sec_redirector1_indev) {
+flexarray_append(dm_args, "-object");
+flexarray_append(dm_args,
+GCSPRINTF("filter-redirector,id=rs2,netdev=net%d,queue=%s,outdev=%s",
+ nics[i].devid,
+ nics[i].colo_filter_sec_redirector1_queue,
+ nics[i].colo_filter_sec_redirector1_outdev));
+}
+if (nics[i].colo_filter_sec_rewriter0_queue) {
+flexarray_append(dm_args, "-object");
+flexarray_append(dm_args,
+GCSPRINTF("filter-rewriter,id=rs3,netdev=net%d,queue=%s",
+ nics[i].devid,
+ nics[i].colo_filter_sec_rewriter0_queue));
+}
 } else {
 /* primary colo run */
 
diff --git a/tools/libxl/libxl_nic.c b/tools/libxl/libxl_nic.c
index 7c57bcf..5e1fecd 100644
--- a/tools/libxl/libxl_nic.c
+++ b/tools/libxl/libxl_nic.c
@@ -233,6 +233,20 @@ static void libxl__device_nic_add(libxl__egc *egc, uint32_t domid,
 MAYBE_ADD_COLO_ARGS(compare_sec_in);
 MAYBE_ADD_COLO_ARGS(compare_out);
 
+MAYBE_ADD_COLO_ARGS(sock_sec_redirector0_id);
+MAYBE_ADD_COLO_ARGS(sock_sec_redirector0_ip);
+MAYBE_ADD_COLO_ARGS(sock_sec_redirector0_port);
+MAYBE_ADD_COLO_ARGS(sock_sec_redirector1_id);
+MAYBE_ADD_COLO_ARGS(sock_sec_redirector1_ip);
+MAYBE_ADD_COLO_ARGS(sock_sec_redirector1_port);
+MAYBE_ADD_COLO_ARGS(filter_sec_redirector0_queue);
+MAYBE_ADD_COLO_ARGS(filter_sec_redirector0_indev);
+MAYBE_ADD_COLO_ARGS(filter_sec_redirector0_outdev);
+MAYBE_ADD_COLO_ARGS(filter_sec_redirector1_queue);
+MAYBE_ADD_COLO_ARGS(filter_sec_redirector1_indev);
+MAYBE_ADD_COLO_ARGS(filter_sec_redirector1_outdev);
+MAYBE_ADD_COLO_ARGS(filter_sec_rewriter0_queue);
+
 #undef MAYBE_ADD_COLO_ARGS
 
 flexarray_append(back, "mac");
@@ -424,6 +438,19 @@ static int libxl__device_nic_from_xenstore(libxl__gc *gc,
 CHECK_COLO_ARGS(compare_pri_in);
 CHECK_COLO_ARGS(compare_sec_in);
 CHECK_COLO_ARGS(compare_out);
+CHECK_COLO_ARGS(sock_sec_redirector0_id);
+CHECK_COLO_ARGS(sock_sec_redirector0_ip);
+CHECK_COLO_ARGS(sock_sec_redirector0_port);
+CHECK_COLO_ARGS(sock_sec_redirector1_id);
+CHECK_COLO_ARGS(sock_sec_redirector1_ip);
+CHECK_COLO_ARGS(sock_sec_redirector1_port);
+CHECK_COLO_ARGS(filter_sec_redirector0_queue);
+CHECK_COLO_ARGS(filter_sec_redirector0_indev);
+CHECK_COLO_ARGS(filter_sec_redirector0_outdev);
+CHECK_COLO_ARGS(filter_sec_redirector1_queue);
+CHECK_COLO_ARGS(filter_sec_redirector1_indev);
+CHECK_COLO_ARGS(filter_sec_redirector1_outdev);
+CHECK_COLO_ARGS(filter_sec_rewriter0_queue);
 
 #undef CHECK_COLO_ARGS
 
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 07ce345..47e96b1 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -658,7 

[Xen-devel] [PATCH V3 2/7] COLO-Proxy: Setup userspace colo-proxy on primary side

2017-02-16 Thread Zhang Chen
This patch bypasses the kernel COLO-Proxy on the primary side when the
userspace proxy is enabled.

Signed-off-by: Zhang Chen 
---
 tools/libxl/libxl_colo_proxy.c | 27 +++
 tools/libxl/libxl_colo_save.c  |  9 +++--
 2 files changed, 34 insertions(+), 2 deletions(-)

diff --git a/tools/libxl/libxl_colo_proxy.c b/tools/libxl/libxl_colo_proxy.c
index 0983f42..dd902fc 100644
--- a/tools/libxl/libxl_colo_proxy.c
+++ b/tools/libxl/libxl_colo_proxy.c
@@ -152,6 +152,10 @@ int colo_proxy_setup(libxl__colo_proxy_state *cps)
 
 STATE_AO_GC(cps->ao);
 
+/* If enable userspace proxy mode, we don't need setup kernel proxy */
+if (cps->is_userspace_proxy)
+return 0;
+
 skfd = socket(PF_NETLINK, SOCK_RAW, NETLINK_COLO);
 if (skfd < 0) {
 LOGD(ERROR, ao->domid, "can not create a netlink socket: %s", strerror(errno));
@@ -222,6 +226,13 @@ out:
 
 void colo_proxy_teardown(libxl__colo_proxy_state *cps)
 {
+/*
+ * If enable userspace proxy mode,
+ * we don't need teardown kernel proxy
+ */
+if (cps->is_userspace_proxy)
+return;
+
 if (cps->sock_fd >= 0) {
 close(cps->sock_fd);
 cps->sock_fd = -1;
@@ -232,6 +243,13 @@ void colo_proxy_teardown(libxl__colo_proxy_state *cps)
 
 void colo_proxy_preresume(libxl__colo_proxy_state *cps)
 {
+/*
+ * If enable userspace proxy mode,
+ * we don't need preresume kernel proxy
+ */
+if (cps->is_userspace_proxy)
+return;
+
 colo_proxy_send(cps, NULL, 0, COLO_CHECKPOINT);
 /* TODO: need to handle if the call fails... */
 }
@@ -262,6 +280,15 @@ int colo_proxy_checkpoint(libxl__colo_proxy_state *cps,
 
 STATE_AO_GC(cps->ao);
 
+/*
+ * enable userspace proxy mode, tmp sleep.
+ * then we will add qemu API support this func.
+ */
+if (cps->is_userspace_proxy) {
+sleep(timeout_us / 1000000);
+return 0;
+}
+
 size = colo_proxy_recv(cps, &buff, timeout_us);
 
 /* timeout, return no checkpoint message. */
diff --git a/tools/libxl/libxl_colo_save.c b/tools/libxl/libxl_colo_save.c
index eb8336c..91e3fce 100644
--- a/tools/libxl/libxl_colo_save.c
+++ b/tools/libxl/libxl_colo_save.c
@@ -110,8 +110,13 @@ void libxl__colo_save_setup(libxl__egc *egc, libxl__colo_save_state *css)
 css->colo_proxy_script = GCSPRINTF("%s/colo-proxy-setup",
libxl__xen_script_dir_path());
 
-cds->device_kind_flags = (1 << LIBXL__DEVICE_KIND_VIF) |
- (1 << LIBXL__DEVICE_KIND_VBD);
+/* If enable userspace proxy mode, we don't need VIF */
+if (css->cps.is_userspace_proxy)
+cds->device_kind_flags = (1 << LIBXL__DEVICE_KIND_VBD);
+else
+cds->device_kind_flags = (1 << LIBXL__DEVICE_KIND_VIF) |
+ (1 << LIBXL__DEVICE_KIND_VBD);
+
 cds->ops = colo_ops;
 cds->callback = colo_save_setup_done;
 cds->ao = ao;
-- 
2.7.4






[Xen-devel] [xen-4.5-testing test] 105847: tolerable FAIL - PUSHED

2017-02-16 Thread osstest service owner
flight 105847 xen-4.5-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/105847/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-armhf-armhf-xl  7 host-ping-check-xen fail in 105827 pass in 105847
 test-amd64-i386-xl-qemuu-win7-amd64 15 guest-localmigrate/x10 fail in 105827 pass in 105847
 test-amd64-amd64-xl-qemuu-winxpsp3  9 windows-install  fail pass in 105827

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-rtds  6 xen-boot fail  like 104590
 test-xtf-amd64-amd64-2   53 leak-check/check fail  like 104590
 test-xtf-amd64-amd64-3   53 leak-check/check fail  like 104590
 test-xtf-amd64-amd64-4   53 leak-check/check fail  like 104590
 test-xtf-amd64-amd64-5   53 leak-check/check fail  like 104590
 test-xtf-amd64-amd64-1   53 leak-check/check fail  like 104590
 test-armhf-armhf-libvirt 13 saverestore-support-check fail  like 104590
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 15 guest-localmigrate/x10 fail like 104590
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop fail like 104590
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail like 104590
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail like 104590

Tests which did not succeed, but are not blocking:
 test-xtf-amd64-amd64-2   18 xtf/test-hvm32-cpuid-faulting fail  never pass
 test-xtf-amd64-amd64-2 30 xtf/test-hvm32pae-cpuid-faulting fail never pass
 test-xtf-amd64-amd64-2 36 xtf/test-hvm32pse-cpuid-faulting fail never pass
 test-xtf-amd64-amd64-2   40 xtf/test-hvm64-cpuid-faulting fail  never pass
 test-xtf-amd64-amd64-4   18 xtf/test-hvm32-cpuid-faulting fail  never pass
 test-xtf-amd64-amd64-5   18 xtf/test-hvm32-cpuid-faulting fail  never pass
 test-xtf-amd64-amd64-1   18 xtf/test-hvm32-cpuid-faulting fail  never pass
 test-xtf-amd64-amd64-4 30 xtf/test-hvm32pae-cpuid-faulting fail never pass
 test-xtf-amd64-amd64-4 36 xtf/test-hvm32pse-cpuid-faulting fail never pass
 test-xtf-amd64-amd64-4   40 xtf/test-hvm64-cpuid-faulting fail  never pass
 test-xtf-amd64-amd64-5 30 xtf/test-hvm32pae-cpuid-faulting fail never pass
 test-xtf-amd64-amd64-1 30 xtf/test-hvm32pae-cpuid-faulting fail never pass
 test-xtf-amd64-amd64-5 36 xtf/test-hvm32pse-cpuid-faulting fail never pass
 test-xtf-amd64-amd64-1 36 xtf/test-hvm32pse-cpuid-faulting fail never pass
 test-xtf-amd64-amd64-5   40 xtf/test-hvm64-cpuid-faulting fail  never pass
 test-xtf-amd64-amd64-1   40 xtf/test-hvm64-cpuid-faulting fail  never pass
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 test-amd64-amd64-libvirt 12 migrate-support-check fail   never pass
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-i386-libvirt  12 migrate-support-check fail   never pass
 test-xtf-amd64-amd64-2   52 xtf/test-hvm64-xsa-195   fail   never pass
 test-xtf-amd64-amd64-3   52 xtf/test-hvm64-xsa-195   fail   never pass
 test-xtf-amd64-amd64-4   52 xtf/test-hvm64-xsa-195   fail   never pass
 test-xtf-amd64-amd64-5   52 xtf/test-hvm64-xsa-195   fail   never pass
 test-xtf-amd64-amd64-1   52 xtf/test-hvm64-xsa-195   fail   never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-arndale  13 saverestore-support-check fail   never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-check fail   never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2  fail never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 13 saverestore-support-check fail never pass
 test-armhf-armhf-xl  12 migrate-support-check fail   never pass
 test-armhf-armhf-libvirt 12 migrate-support-check fail   never pass
 test-armhf-armhf-xl  13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-check fail  never pass
 test-armhf-armhf-xl-multivcpu 13 saverestore-support-check fail  never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-credit2  13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-vhd  10 guest-start  fail   never pass
 test-armhf-armhf-libvirt-raw 10 guest-start  fail   never pass
 test-armhf-armhf-xl-rtds 12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-rtds 13 saverestore-support-check fail   never pass

version targeted for testing:
 xen  43d06efb724d32a70b1cc9973d7cdcbbb5d96105
baseline version:
 xen  2b17bf45470bf742d78a22116e3b7ec1a3213c45

Last test of basis   

[Xen-devel] [xen-unstable-smoke test] 105864: regressions - trouble: broken/fail/pass

2017-02-16 Thread osstest service owner
flight 105864 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/105864/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-libvirt 14 guest-saverestore fail REGR. vs. 105852
 test-armhf-armhf-xl 15 guest-start/debian.repeat fail REGR. vs. 105852
 test-amd64-amd64-xl-qemuu-debianhvm-i386 12 guest-saverestore fail REGR. vs. 105852

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-xl-xsm   1 build-check(1)   blocked  n/a
 build-arm64   5 xen-build fail   never pass
 build-arm64-pvops 5 kernel-build fail   never pass
 test-armhf-armhf-xl  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl  13 saverestore-support-check fail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-check fail   never pass

version targeted for testing:
 xen  378384399ed661bed711221a5d8dbdac66b8e851
baseline version:
 xen  7127d53fe891f9ea67357587a33a7aaba4b55f45

Last test of basis   105852  2017-02-16 14:01:33 Z    0 days
Failing since        105857  2017-02-16 16:01:30 Z    0 days    5 attempts
Testing same since   105862  2017-02-16 22:01:53 Z    0 days    2 attempts


People who touched revisions under test:
  Andrew Cooper 
  Daniel Kiper 
  Jan Beulich 
  Julien Grall 
  Stefano Stabellini 

jobs:
 build-amd64  pass
 build-arm64  fail
 build-armhf  pass
 build-amd64-libvirt  pass
 build-arm64-pvops  fail
 test-armhf-armhf-xl  fail
 test-arm64-arm64-xl-xsm  broken  
 test-amd64-amd64-xl-qemuu-debianhvm-i386 fail
 test-amd64-amd64-libvirt fail



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.


commit 378384399ed661bed711221a5d8dbdac66b8e851
Author: Stefano Stabellini 
Date:   Fri Feb 10 18:05:22 2017 -0800

arm: read/write rank->vcpu atomically

We don't need a lock in vgic_get_target_vcpu anymore, solving the
following lock inversion bug: the rank lock should be taken first, then
the vgic lock. However, gic_update_one_lr is called with the vgic lock
held, and it calls vgic_get_target_vcpu, which tries to obtain the rank
lock.

Coverity-ID: 1381855
Coverity-ID: 1381853

Signed-off-by: Stefano Stabellini 
Reviewed-by: Julien Grall 
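
The idea behind the commit above can be sketched in portable C11: if each
rank->vcpu entry is read and written with a single atomic access, a reader
needs no lock to see a consistent value. This is only an illustration; the
struct layout, the 32-entry size and the helper names are assumptions, and
Xen has its own atomic accessors rather than <stdatomic.h>.

```c
#include <stdatomic.h>
#include <stdint.h>

#define IRQS_PER_RANK 32   /* assumed rank size, for illustration only */

struct irq_rank {
    _Atomic uint8_t vcpu[IRQS_PER_RANK];   /* target vcpu per interrupt */
};

/* Lock-free read: one atomic load returns either the old or the new target. */
static uint8_t rank_get_target(struct irq_rank *rank, unsigned int irq)
{
    return atomic_load_explicit(&rank->vcpu[irq % IRQS_PER_RANK],
                                memory_order_relaxed);
}

/* Writer side: one atomic store, so readers never observe a torn value. */
static void rank_set_target(struct irq_rank *rank, unsigned int irq,
                            uint8_t vcpu_id)
{
    atomic_store_explicit(&rank->vcpu[irq % IRQS_PER_RANK], vcpu_id,
                          memory_order_relaxed);
}
```

Relaxed ordering is enough for the value itself; any ordering against
other state, such as the inflight list, would need separate barriers.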

commit 79903e50dba9e7442c9b7ca424661bb020e9dbf2
Author: Jan Beulich 
Date:   Thu Feb 16 18:11:42 2017 +0100

x86emul: catch exceptions occurring in stubs

Before adding more use of stubs cloned from decoded guest insns, guard
ourselves against mistakes there: Should an exception (with the
noteworthy exception of #PF) occur inside the stub, forward it to the
guest.

Since the exception fixup table entry can't encode the address of the
faulting insn itself, attach it to the return address instead. This at
once provides a convenient place to hand the exception information
back: The return address is being overwritten by it before branching to
the recovery code.

Take the opportunity and (finally!) add symbol resolution to the
respective log messages (the new one is intentionally not being coded
that way, as it covers stub addresses only, which don't have symbols
associated).

Also take the opportunity and make search_one_extable() static again.

Suggested-by: Andrew Cooper 
Signed-off-by: Jan Beulich 
Reviewed-by: Andrew Cooper 

commit 8c935f5ff1cac422b4de21cbab69e13d2ebb25be
Author: Daniel Kiper 
Date:   Thu Feb 16 18:10:04 2017 +0100

Re: [Xen-devel] [PATCH] ACPICA: ACPI 6.0: Add support for IORT table.

2017-02-16 Thread Stefano Stabellini
On Wed, 15 Feb 2017, Sameer Goel wrote:
> From: Lv Zheng 
> 
> ACPICA commit 5de82757aef5d6163e37064033aacbce193abbca
> 
> This patch adds support for IORT (IO Remapping Table) in iasl.
> 
> Note that some field names are modified to shrink their length or the
> decompiled IORT ASL will contain fields with ugly ":" alignment.
> 
> The IORT contains field definitions around "Memory Access Properties". This
> patch also adds support to encode/decode it using inline table.
> 
> This patch doesn't add inline table support for the SMMU interrupt fields
> due to a limitation in current ACPICA data table support. Lv Zheng.
> 
> Link: https://github.com/acpica/acpica/commit/5de82757
> Signed-off-by: Lv Zheng 
> Signed-off-by: Bob Moore 
> Signed-off-by: Rafael J. Wysocki 
> [Linux commit 874f6a723e56d0da9e481629b17482bcd3801ecf]
> [only port the IORT changes]
> Signed-off-by: Sameer Goel 

If this is a straight port from Linux, then please don't make other
changes on top of it, such as:


> @@ -5,7 +5,7 @@
>   
> */
>  
>  /*
> - * Copyright (C) 2000 - 2011, Intel Corp.
> + * Copyright (C) 2000 - 2016, Intel Corp.
>   * All rights reserved.
>   *
>   * Redistribution and use in source and binary forms, with or without
> @@ -67,7 +67,8 @@
>  #define ACPI_SIG_DBGP   "DBGP"   /* Debug Port table */
>  #define ACPI_SIG_DMAR   "DMAR"   /* DMA Remapping table */
> #define ACPI_SIG_HPET   "HPET"   /* High Precision Event Timer table */
> -#define ACPI_SIG_IBFT   "IBFT"   /* i_sCSI Boot Firmware Table */
> +#define ACPI_SIG_IBFT   "IBFT"   /* iSCSI Boot Firmware Table */

[...]

> +enum acpi_iort_node_type {
> + ACPI_IORT_NODE_ITS_GROUP = 0x00,
> + ACPI_IORT_NODE_NAMED_COMPONENT = 0x01,
> + ACPI_IORT_NODE_PCI_ROOT_COMPLEX = 0x02,
> + ACPI_IORT_NODE_SMMU = 0x03,
> + ACPI_IORT_NODE_SMMU_V3 = 0x04

SMMU_V3 wasn't present in 874f6a723e56d0da9e481629b17482bcd3801ecf 


> +struct acpi_iort_smmu_v3 {
> + u64 base_address;   /* SMMUv3 base address */
> + u32 flags;
> + u32 reserved;
> + u64 vatos_address;
> + u32 model;  /* O: generic SMMUv3 */
> + u32 event_gsiv;
> + u32 pri_gsiv;
> + u32 gerr_gsiv;
> + u32 sync_gsiv;
> +};
> +
> +/* Masks for Flags field above */
> +
> +#define ACPI_IORT_SMMU_V3_COHACC_OVERRIDE   (1)
> +#define ACPI_IORT_SMMU_V3_HTTU_OVERRIDE (1<<1)

Same here.

I think you also need 4ac78baf88d85c49883fcc87d31198ebe408e54d



Re: [Xen-devel] [PATCH v4 1/4] x86/mm: Adapt MODULES_END based on Fixmap section size

2017-02-16 Thread kbuild test robot
Hi Thomas,

[auto build test ERROR on next-20170216]
[also build test ERROR on v4.10-rc8]
[cannot apply to tip/x86/core kvm/linux-next tip/auto-latest v4.9-rc8 v4.9-rc7 v4.9-rc6]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:
https://github.com/0day-ci/linux/commits/Thomas-Garnier/x86-mm-Adapt-MODULES_END-based-on-Fixmap-section-size/20170217-072759
config: x86_64-randconfig-s4-02170325 (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64 

All errors (new ones prefixed by >>):

   In file included from arch/x86/include/asm/pgtable_types.h:240:0,
from arch/x86/include/asm/paravirt_types.h:44,
from arch/x86/include/asm/ptrace.h:71,
from arch/x86/include/asm/math_emu.h:4,
from arch/x86/include/asm/processor.h:11,
from arch/x86/include/asm/cpufeature.h:4,
from arch/x86/include/asm/thread_info.h:52,
from include/linux/thread_info.h:25,
from arch/x86/include/asm/preempt.h:6,
from include/linux/preempt.h:59,
from include/linux/spinlock.h:50,
from include/linux/wait.h:8,
from include/linux/fs.h:5,
from include/linux/debugfs.h:18,
from arch/x86/mm/dump_pagetables.c:15:
>> arch/x86/include/asm/pgtable_64_types.h:71:23: error: implicit declaration of function '__fix_to_virt' [-Werror=implicit-function-declaration]
#define MODULES_END   __fix_to_virt(__end_of_fixed_addresses + 1)
  ^
   arch/x86/mm/dump_pagetables.c:87:4: note: in expansion of macro 'MODULES_END'
 { MODULES_END,  "End Modules" },
   ^~~
   arch/x86/include/asm/pgtable_64_types.h:71:37: error: '__end_of_fixed_addresses' undeclared here (not in a function)
#define MODULES_END   __fix_to_virt(__end_of_fixed_addresses + 1)
^
   arch/x86/mm/dump_pagetables.c:87:4: note: in expansion of macro 'MODULES_END'
 { MODULES_END,  "End Modules" },
   ^~~
   cc1: some warnings being treated as errors

vim +/__fix_to_virt +71 arch/x86/include/asm/pgtable_64_types.h

65  #define VMALLOC_START   __VMALLOC_BASE
66  #define VMEMMAP_START   __VMEMMAP_BASE
67  #endif /* CONFIG_RANDOMIZE_MEMORY */
68  #define VMALLOC_END (VMALLOC_START + _AC((VMALLOC_SIZE_TB << 40) - 1, UL))
69  #define MODULES_VADDR (__START_KERNEL_map + KERNEL_IMAGE_SIZE)
70  /* The module sections ends with the start of the fixmap */
  > 71  #define MODULES_END   __fix_to_virt(__end_of_fixed_addresses + 1)
72  #define MODULES_LEN   (MODULES_END - MODULES_VADDR)
73  #define ESPFIX_PGD_ENTRY _AC(-2, UL)
74  #define ESPFIX_BASE_ADDR (ESPFIX_PGD_ENTRY << PGDIR_SHIFT)
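
The errors above follow from how the preprocessor works: MODULES_END is
only expanded where it is used, so __fix_to_virt and
__end_of_fixed_addresses must be visible at that point, here line 87 of
dump_pagetables.c. The stand-alone sketch below reproduces the pattern with
made-up names (REGION_END, slot_to_addr, LAST_SLOT and REGION_TOP are all
hypothetical). Moving the two definitions below region_end() would trigger
the same kind of diagnostics.

```c
/* Like MODULES_END: defined first, but not expanded until it is used. */
#define REGION_END  slot_to_addr(LAST_SLOT + 1)

/* In the kernel case the equivalents would come from another header. */
enum slots { FIRST_SLOT, LAST_SLOT };
#define REGION_TOP        0xff000000UL
#define slot_to_addr(x)   (REGION_TOP - ((unsigned long)(x) << 12))

/* The use site: both helper definitions are visible here, so this compiles. */
static unsigned long region_end(void)
{
    return REGION_END;
}
```

In other words, a file that uses the macro has to include, directly or
indirectly, the header that provides the helpers it expands to.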

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation




Re: [Xen-devel] [PATCH v4 1/4] x86/mm: Adapt MODULES_END based on Fixmap section size

2017-02-16 Thread kbuild test robot
Hi Thomas,

[auto build test ERROR on next-20170216]
[also build test ERROR on v4.10-rc8]
[cannot apply to tip/x86/core kvm/linux-next tip/auto-latest v4.9-rc8 v4.9-rc7 v4.9-rc6]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:
https://github.com/0day-ci/linux/commits/Thomas-Garnier/x86-mm-Adapt-MODULES_END-based-on-Fixmap-section-size/20170217-072759
config: x86_64-randconfig-h0-02170640 (attached as .config)
compiler: gcc-4.9 (Debian 4.9.4-2) 4.9.4
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64 

All error/warnings (new ones prefixed by >>):

>> arch/x86/mm/dump_pagetables.c:87:2: error: implicit declaration of function '__fix_to_virt' [-Werror=implicit-function-declaration]
 { MODULES_END,  "End Modules" },
 ^
   In file included from arch/x86/include/asm/pgtable_types.h:240:0,
from arch/x86/include/asm/paravirt_types.h:44,
from arch/x86/include/asm/ptrace.h:71,
from arch/x86/include/asm/math_emu.h:4,
from arch/x86/include/asm/processor.h:11,
from arch/x86/include/asm/cpufeature.h:4,
from arch/x86/include/asm/thread_info.h:52,
from include/linux/thread_info.h:25,
from arch/x86/include/asm/preempt.h:6,
from include/linux/preempt.h:59,
from include/linux/spinlock.h:50,
from include/linux/wait.h:8,
from include/linux/fs.h:5,
from include/linux/debugfs.h:18,
from arch/x86/mm/dump_pagetables.c:15:
>> arch/x86/include/asm/pgtable_64_types.h:71:37: error: '__end_of_fixed_addresses' undeclared here (not in a function)
#define MODULES_END   __fix_to_virt(__end_of_fixed_addresses + 1)
^
>> arch/x86/mm/dump_pagetables.c:87:4: note: in expansion of macro 'MODULES_END'
 { MODULES_END,  "End Modules" },
   ^
   cc1: some warnings being treated as errors

vim +/__fix_to_virt +87 arch/x86/mm/dump_pagetables.c

926e5392b Arjan van de Ven 2008-04-17   9   * This program is free software; you can redistribute it and/or
926e5392b Arjan van de Ven 2008-04-17  10   * modify it under the terms of the GNU General Public License
926e5392b Arjan van de Ven 2008-04-17  11   * as published by the Free Software Foundation; version 2
926e5392b Arjan van de Ven 2008-04-17  12   * of the License.
926e5392b Arjan van de Ven 2008-04-17  13   */
926e5392b Arjan van de Ven 2008-04-17  14  
fe770bf03 H. Peter Anvin   2008-04-17 @15  #include 
fe770bf03 H. Peter Anvin   2008-04-17  16  #include 
84e629b66 Paul Gortmaker   2016-07-13  17  #include 
146fbb766 Andrey Ryabinin  2017-02-10  18  #include 
926e5392b Arjan van de Ven 2008-04-17  19  #include 
926e5392b Arjan van de Ven 2008-04-17  20  
926e5392b Arjan van de Ven 2008-04-17  21  #include 
926e5392b Arjan van de Ven 2008-04-17  22  
926e5392b Arjan van de Ven 2008-04-17  23  /*
926e5392b Arjan van de Ven 2008-04-17  24   * The dumper groups pagetable entries of the same type into one, and for
926e5392b Arjan van de Ven 2008-04-17  25   * that it needs to keep some state when walking, and flush this state
926e5392b Arjan van de Ven 2008-04-17  26   * when a "break" in the continuity is found.
926e5392b Arjan van de Ven 2008-04-17  27   */
926e5392b Arjan van de Ven 2008-04-17  28  struct pg_state {
926e5392b Arjan van de Ven 2008-04-17  29   int level;
926e5392b Arjan van de Ven 2008-04-17  30   pgprot_t current_prot;
926e5392b Arjan van de Ven 2008-04-17  31   unsigned long start_address;
926e5392b Arjan van de Ven 2008-04-17  32   unsigned long current_address;
fe770bf03 H. Peter Anvin   2008-04-17  33   const struct addr_marker *marker;
3891a04aa H. Peter Anvin   2014-04-29  34   unsigned long lines;
ef6bea6dd Borislav Petkov  2014-01-18  35   bool to_dmesg;
e1a58320a Stephen Smalley  2015-10-05  36   bool check_wx;
e1a58320a Stephen Smalley  2015-10-05  37   unsigned long wx_pages;
926e5392b Arjan van de Ven 2008-04-17  38  };
926e5392b Arjan van de Ven 2008-04-17  39  
fe770bf03 H. Peter Anvin   2008-04-17  40  struct addr_marker {
fe770bf03 H. Peter Anvin   2008-04-17  41   unsigned long start_address;
fe770bf03 H. Peter Anvin   2008-04-17  42   const char *name;
3891a04aa H. Peter Anvin   2014-04-29  43   unsigned long max_lines;
fe770bf03 H. Peter Anvin   2008-04-17  44  };
926e5392b Arjan van de Ven 2008-04-17  45  
92851e2fc Andres Salomon   2010-07-20  46  /* indices for address_markers; keep sync'd w/ address_markers below */
92851e2fc Andres Salomon   2010-07-20  47  enum address_markers_idx {
92851e2fc Andres Salomon   2010-07-20  48   USER_SPACE_NR = 0,
92851e2fc Andres Salomon   20

[Xen-devel] [ovmf test] 105854: all pass - PUSHED

2017-02-16 Thread osstest service owner
flight 105854 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/105854/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf b173ad78519b2ade309019614b52e1453727e20d
baseline version:
 ovmf fd12acdeff7a04ad34ccb95103eb6204b8901749

Last test of basis   105837  2017-02-16 03:14:57 Z    0 days
Testing same since   105854  2017-02-16 15:15:35 Z    0 days    1 attempts


People who touched revisions under test:
  Haojian Zhuang 
  Jiaxin Wu 
  Ruiyu Ni 
  Wu Jiaxin 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvops  pass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

+ branch=ovmf
+ revision=b173ad78519b2ade309019614b52e1453727e20d
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x '!=' x/home/osstest/repos/lock ']'
++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock
++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push ovmf b173ad78519b2ade309019614b52e1453727e20d
+ branch=ovmf
+ revision=b173ad78519b2ade309019614b52e1453727e20d
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']'
+ . ./cri-common
++ . ./cri-getconfig
++ umask 002
+ select_xenbranch
+ case "$branch" in
+ tree=ovmf
+ xenbranch=xen-unstable
+ '[' xovmf = xlinux ']'
+ linuxbranch=
+ '[' x = x ']'
+ qemuubranch=qemu-upstream-unstable
+ select_prevxenbranch
++ ./cri-getprevxenbranch xen-unstable
+ prevxenbranch=xen-4.8-testing
+ '[' xb173ad78519b2ade309019614b52e1453727e20d = x ']'
+ : tested/2.6.39.x
+ . ./ap-common
++ : osst...@xenbits.xen.org
+++ getconfig OsstestUpstream
+++ perl -e '
use Osstest;
readglobalconfig();
print $c{"OsstestUpstream"} or die $!;
'
++ :
++ : git://xenbits.xen.org/xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/xen.git
++ : git://xenbits.xen.org/qemu-xen-traditional.git
++ : git://git.kernel.org
++ : git://git.kernel.org/pub/scm/linux/kernel/git
++ : git
++ : git://xenbits.xen.org/xtf.git
++ : osst...@xenbits.xen.org:/home/xen/git/xtf.git
++ : git://xenbits.xen.org/xtf.git
++ : git://xenbits.xen.org/libvirt.git
++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git
++ : git://xenbits.xen.org/libvirt.git
++ : git://xenbits.xen.org/osstest/rumprun.git
++ : git
++ : git://xenbits.xen.org/osstest/rumprun.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/rumprun.git
++ : git://git.seabios.org/seabios.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/seabios.git
++ : git://xenbits.xen.org/osstest/seabios.git
++ : https://github.com/tianocore/edk2.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/ovmf.git
++ : git://xenbits.xen.org/osstest/ovmf.git
++ : git://xenbits.xen.org/osstest/linux-firmware.git
++ : osst...@xenbits.xen.org:/home/osstest/ext/linux-firmware.git
++ 

[Xen-devel] [xen-unstable-smoke test] 105862: regressions - trouble: broken/fail/pass

2017-02-16 Thread osstest service owner
flight 105862 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/105862/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-libvirt 14 guest-saverestore fail REGR. vs. 105852
 test-armhf-armhf-xl 15 guest-start/debian.repeat fail REGR. vs. 105852
 test-amd64-amd64-xl-qemuu-debianhvm-i386 12 guest-saverestore fail REGR. vs. 105852

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-xl-xsm   1 build-check(1)   blocked  n/a
 build-arm64   5 xen-build fail   never pass
 build-arm64-pvops 5 kernel-build fail   never pass
 test-armhf-armhf-xl  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl  13 saverestore-support-check fail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-check fail   never pass

version targeted for testing:
 xen  378384399ed661bed711221a5d8dbdac66b8e851
baseline version:
 xen  7127d53fe891f9ea67357587a33a7aaba4b55f45

Last test of basis   105852  2017-02-16 14:01:33 Z    0 days
Failing since        105857  2017-02-16 16:01:30 Z    0 days    4 attempts
Testing same since   105862  2017-02-16 22:01:53 Z    0 days    1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Daniel Kiper 
  Jan Beulich 
  Julien Grall 
  Stefano Stabellini 

jobs:
 build-amd64  pass
 build-arm64  fail
 build-armhf  pass
 build-amd64-libvirt  pass
 build-arm64-pvops  fail
 test-armhf-armhf-xl  fail
 test-arm64-arm64-xl-xsm  broken  
 test-amd64-amd64-xl-qemuu-debianhvm-i386 fail
 test-amd64-amd64-libvirt fail



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.


commit 378384399ed661bed711221a5d8dbdac66b8e851
Author: Stefano Stabellini 
Date:   Fri Feb 10 18:05:22 2017 -0800

arm: read/write rank->vcpu atomically

We don't need a lock in vgic_get_target_vcpu anymore, solving the
following lock inversion bug: the rank lock should be taken first, then
the vgic lock. However, gic_update_one_lr is called with the vgic lock
held, and it calls vgic_get_target_vcpu, which tries to obtain the rank
lock.

Coverity-ID: 1381855
Coverity-ID: 1381853

Signed-off-by: Stefano Stabellini 
Reviewed-by: Julien Grall 

commit 79903e50dba9e7442c9b7ca424661bb020e9dbf2
Author: Jan Beulich 
Date:   Thu Feb 16 18:11:42 2017 +0100

x86emul: catch exceptions occurring in stubs

Before adding more use of stubs cloned from decoded guest insns, guard
ourselves against mistakes there: Should an exception (with the
noteworthy exception of #PF) occur inside the stub, forward it to the
guest.

Since the exception fixup table entry can't encode the address of the
faulting insn itself, attach it to the return address instead. This at
once provides a convenient place to hand the exception information
back: The return address is being overwritten by it before branching to
the recovery code.

Take the opportunity and (finally!) add symbol resolution to the
respective log messages (the new one is intentionally not being coded
that way, as it covers stub addresses only, which don't have symbols
associated).

Also take the opportunity and make search_one_extable() static again.

Suggested-by: Andrew Cooper 
Signed-off-by: Jan Beulich 
Reviewed-by: Andrew Cooper 

commit 8c935f5ff1cac422b4de21cbab69e13d2ebb25be
Author: Daniel Kiper 
Date:   Thu Feb 16 18:10:04 2017 +0100

Re: [Xen-devel] [PATCH v4 2/2] arm: proper ordering for correct execution of gic_update_one_lr and vgic_store_itargetsr

2017-02-16 Thread Stefano Stabellini
On Thu, 16 Feb 2017, Julien Grall wrote:
> On 16/02/2017 22:10, Stefano Stabellini wrote:
> > On Thu, 16 Feb 2017, Julien Grall wrote:
> > > Hi Stefano,
> > > 
> > > On 11/02/17 02:05, Stefano Stabellini wrote:
> > > > Concurrent execution of gic_update_one_lr and vgic_store_itargetsr can
> > > > result in the wrong pcpu being set as irq target, see
> > > > http://marc.info/?l=xen-devel&m=148218667104072.
> > > > 
> > > > To solve the issue, add barriers, remove an irq from the inflight
> > > > queue, only after the affinity has been set. On the other end, write the
> > > > new vcpu target, before checking GIC_IRQ_GUEST_MIGRATING and inflight.
> > > > 
> > > > Signed-off-by: Stefano Stabellini 
> > > > ---
> > > >  xen/arch/arm/gic.c | 3 ++-
> > > >  xen/arch/arm/vgic-v2.c | 4 ++--
> > > >  xen/arch/arm/vgic-v3.c | 4 +++-
> > > >  3 files changed, 7 insertions(+), 4 deletions(-)
> > > > 
> > > > diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
> > > > index a5348f2..bb52959 100644
> > > > --- a/xen/arch/arm/gic.c
> > > > +++ b/xen/arch/arm/gic.c
> > > > @@ -503,12 +503,13 @@ static void gic_update_one_lr(struct vcpu *v, int i)
> > > >   !test_bit(GIC_IRQ_GUEST_MIGRATING, &p->status) )
> > > >  gic_raise_guest_irq(v, irq, p->priority);
> > > >  else {
> > > > -list_del_init(&p->inflight);
> > > >  if ( test_and_clear_bit(GIC_IRQ_GUEST_MIGRATING, &p->status) )
> > > >  {
> > > >  struct vcpu *v_target = vgic_get_target_vcpu(v, irq);
> > > >  irq_set_affinity(p->desc, cpumask_of(v_target->processor));
> > > >  }
> > > > +smp_mb();
> > > > +list_del_init(&p->inflight);
> > > 
> > > I don't understand why you remove from the inflight list afterwards. If
> > > you do that you introduce the same problem as discussed in
> > > <7a78c859-fa6f-ba10-b574-d8edd46ea...@arm.com>
> > > 
> > > As long as the interrupt is routed to the pCPU running gic_update_one_lr,
> > > the interrupt cannot fire because the interrupts are masked.
> > 
> > This is not accurate: it is possible to receive a second interrupt
> > notification while the first one is still active.
> 
> Can you detail here? Because if you look at how gic_update_one_lr is called
> from gic_clear_lrs, interrupts are masked.
> 
> So it cannot be received by Xen while you are in gic_update_one_lr and before
> irq_set_affinity is called.

Yes, you are right, I meant generally. In this case, it is as you say.


> > > However, as soon as irq_set_affinity is called the interrupt may fire
> > > on the other pCPU.
> > 
> > This is true.
> > 
> > 
> > > However, list_del_init is not atomic and not protected by any lock. So
> > > vgic_vcpu_inject_irq may see a corrupted version of {p,n}->inflight.
> > > 
> > > Did I miss anything?
> > 
> > Moving list_del_init later ensures that there are no conflicts between
> > gic_update_one_lr and vgic_store_itargetsr (more specifically,
> > vgic_migrate_irq). If you look at the implementation of
> > vgic_migrate_irq, all checks depend on list_empty(&p->inflight). If we
> > don't move list_del_init later, given that vgic_migrate_irq can be
> > called with a different vgic lock taken than gic_update_one_lr, the
> > following scenario can happen:
> > 
> > 
> > 
> >   CPU0: gic_update_one_lr               CPU1: vgic_store_itargetsr
> >   ----------------------------------------------------------------
> >   remove from inflight
> >   clear GIC_IRQ_GUEST_MIGRATING
> >   read rank->vcpu (intermediate)
> 
> It is only true if vgic_store_itargetsr is testing GIC_IRQ_GUEST_MIGRATING
> here and it was clear.

Right, that's the scenario, see the right column. If you meant to say
something else, I couldn't understand, sorry.

I have been playing with rearranging these few lines of code in
gic_update_one_lr and vgic_store_itargetsr/vgic_migrate_irq quite a bit,
but I couldn't figure out a way to solve all races. This patch is one of
the best outcomes I found. If you can figure out a way to rearrange this
code to not be racy, but still lockless, let me know!
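
The ordering the patch aims for can be modelled outside Xen with C11
atomics. The sketch below is a simplified model, not Xen code: struct
irq_state and both function names are hypothetical, the inflight list is
reduced to a flag, and atomic_thread_fence(memory_order_seq_cst) stands in
for smp_mb(). It shows only the order of the accesses on each side and does
not demonstrate that every race is closed.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct irq_state {
    atomic_int  target_cpu;   /* stand-in for rank->vcpu */
    atomic_bool migrating;    /* stand-in for GIC_IRQ_GUEST_MIGRATING */
    atomic_bool inflight;     /* stand-in for !list_empty(&p->inflight) */
    atomic_int  affinity;     /* stand-in for the physical irq affinity */
};

/* CPU0 side, following the patched order: affinity first, barrier,
 * and only then leave the inflight state. */
static void update_one_lr(struct irq_state *s)
{
    if (atomic_exchange(&s->migrating, false))
        atomic_store(&s->affinity, atomic_load(&s->target_cpu));
    atomic_thread_fence(memory_order_seq_cst);   /* plays the role of smp_mb() */
    atomic_store(&s->inflight, false);
}

/* CPU1 side: publish the new target first, barrier, then check inflight. */
static void store_target(struct irq_state *s, int new_cpu)
{
    atomic_store(&s->target_cpu, new_cpu);
    atomic_thread_fence(memory_order_seq_cst);
    if (!atomic_load(&s->inflight))
        atomic_store(&s->affinity, new_cpu);     /* not inflight: move it now */
    else
        atomic_store(&s->migrating, true);       /* defer to update_one_lr */
}
```

Run one after the other, store_target() on an inflight irq only sets the
migrating flag, and the following update_one_lr() applies the new target
before clearing inflight.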


> >                                         set rank->vcpu (final)
> >                                         vgic_migrate_irq
> >                                           if (!inflight) irq_set_affinity (final)
> >   irq_set_affinity (intermediate)
> > 
> > 
> > As a result, the irq affinity is set to the wrong cpu. With this patch,
> > this problem doesn't occur.
> > 
> > However, you are right that both in the case of gic_update_one_lr and
> > vgic_migrate_irq, as well as the case of gic_update_one_lr and
> > vgic_vcpu_inject_irq that you mentioned, list_del_init (from
> > gic_update_one_lr) is potentially run at the same time as list_empty
> > (from vgic_migrate_irq or from vgic_vcpu_inject_irq), and they are not
> > atomic.
> > 
> > Also see this other potential issue:
> > 

Re: [Xen-devel] [PATCH v4 2/2] arm: proper ordering for correct execution of gic_update_one_lr and vgic_store_itargetsr

2017-02-16 Thread Julien Grall



On 16/02/2017 22:10, Stefano Stabellini wrote:

On Thu, 16 Feb 2017, Julien Grall wrote:

Hi Stefano,

On 11/02/17 02:05, Stefano Stabellini wrote:

Concurrent execution of gic_update_one_lr and vgic_store_itargetsr can
result in the wrong pcpu being set as irq target, see
http://marc.info/?l=xen-devel&m=148218667104072.

To solve the issue, add barriers, remove an irq from the inflight
queue, only after the affinity has been set. On the other end, write the
new vcpu target, before checking GIC_IRQ_GUEST_MIGRATING and inflight.

Signed-off-by: Stefano Stabellini 
---
 xen/arch/arm/gic.c | 3 ++-
 xen/arch/arm/vgic-v2.c | 4 ++--
 xen/arch/arm/vgic-v3.c | 4 +++-
 3 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index a5348f2..bb52959 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -503,12 +503,13 @@ static void gic_update_one_lr(struct vcpu *v, int i)
  !test_bit(GIC_IRQ_GUEST_MIGRATING, &p->status) )
 gic_raise_guest_irq(v, irq, p->priority);
 else {
-list_del_init(&p->inflight);
 if ( test_and_clear_bit(GIC_IRQ_GUEST_MIGRATING, &p->status) )
 {
 struct vcpu *v_target = vgic_get_target_vcpu(v, irq);
 irq_set_affinity(p->desc, cpumask_of(v_target->processor));
 }
+smp_mb();
+list_del_init(&p->inflight);


I don't understand why you remove from the inflight list afterwards. If you do
that you introduce the same problem as discussed in
<7a78c859-fa6f-ba10-b574-d8edd46ea...@arm.com>

As long as the interrupt is routed to the pCPU running gic_update_one_lr, the
interrupt cannot fire because the interrupts are masked.


This is not accurate: it is possible to receive a second interrupt
notification while the first one is still active.


Can you detail here? Because if you look at how gic_update_one_lr is 
called from gic_clear_lrs, interrupts are masked.


So it cannot be received by Xen while you are in gic_update_one_lr and 
before irq_set_affinity is called.






However, as soon as irq_set_affinity is called the interrupt may fire
on the other pCPU.


This is true.



However, list_del_init is not atomic and not protected by any lock. So
vgic_vcpu_inject_irq may see a corrupted version of {p,n}->inflight.

Did I miss anything?


Moving list_del_init later ensures that there are no conflicts between
gic_update_one_lr and vgic_store_itargetsr (more specifically,
vgic_migrate_irq). If you look at the implementation of
vgic_migrate_irq, all checks depend on list_empty(&p->inflight). If we
don't move list_del_init later, given that vgic_migrate_irq can be
called with a different vgic lock taken than gic_update_one_lr, the
following scenario can happen:



  CPU0: gic_update_one_lr               CPU1: vgic_store_itargetsr
  ----------------------------------------------------------------
  remove from inflight
  clear GIC_IRQ_GUEST_MIGRATING
  read rank->vcpu (intermediate)


It is only true if vgic_store_itargetsr is testing 
GIC_IRQ_GUEST_MIGRATING here and it was clear.



                                        set rank->vcpu (final)
                                        vgic_migrate_irq
                                          if (!inflight) irq_set_affinity (final)
  irq_set_affinity (intermediate)


As a result, the irq affinity is set to the wrong cpu. With this patch,
this problem doesn't occur.

However, you are right that both in the case of gic_update_one_lr and
vgic_migrate_irq, as well as the case of gic_update_one_lr and
vgic_vcpu_inject_irq that you mentioned, list_del_init (from
gic_update_one_lr) is potentially run at the same time as list_empty
(from vgic_migrate_irq or from vgic_vcpu_inject_irq), and they are not
atomic.

Also see this other potential issue:
http://marc.info/?l=xen-devel&m=148703220714075

All these concurrent accesses are difficult to understand and to deal
with. This is why my original suggestion was to use the old vcpu vgic
lock, rather than try to ensure safe concurrent accesses everywhere.
That option is still open and would solve both problems.
We only need to:

- store the vcpu to which an irq is currently injected
http://marc.info/?l=xen-devel&m=148237295020488
- check the new irq->vcpu field, and take the right vgic lock
something like http://marc.info/?l=xen-devel&m=148237295920492&w=2, but
would need improvements

Much simpler, right?


Wouldn't it be easier to just take the desc->lock to protect the
concurrent access?


Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [xen-4.6-testing baseline-only test] 68567: regressions - FAIL

2017-02-16 Thread Platform Team regression test user
This run is configured for baseline tests only.

flight 68567 xen-4.6-testing real [real]
http://osstest.xs.citrite.net/~osstest/testlogs/logs/68567/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-xtf-amd64-amd64-3   20 xtf/test-hvm32-invlpg~shadow fail REGR. vs. 68545
 test-xtf-amd64-amd64-3   32 xtf/test-hvm32pae-invlpg~shadow fail REGR. vs. 68545
 test-xtf-amd64-amd64-3   43 xtf/test-hvm64-invlpg~shadow fail REGR. vs. 68545
 test-amd64-amd64-xl-qemuu-ovmf-amd64  6 xen-boot  fail REGR. vs. 68545
 test-armhf-armhf-libvirt 15 guest-start/debian.repeat fail REGR. vs. 68545

Regressions which are regarded as allowable (not blocking):
 test-xtf-amd64-amd64-1   20 xtf/test-hvm32-invlpg~shadow fail   like 68545
 test-xtf-amd64-amd64-1  32 xtf/test-hvm32pae-invlpg~shadow fail like 68545
 test-xtf-amd64-amd64-1   43 xtf/test-hvm64-invlpg~shadow fail   like 68545
 test-armhf-armhf-libvirt-xsm 13 saverestore-support-check fail   like 68545
 test-armhf-armhf-libvirt 13 saverestore-support-check fail   like 68545
 test-armhf-armhf-libvirt-raw 12 saverestore-support-check fail   like 68545
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 68545
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop  fail like 68545
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail like 68545
 test-amd64-amd64-qemuu-nested-intel 16 debian-hvm-install/l1/l2 fail like 68545

Tests which did not succeed, but are not blocking:
 test-xtf-amd64-amd64-4   62 xtf/test-pv32pae-xsa-194 fail   never pass
 test-xtf-amd64-amd64-5   62 xtf/test-pv32pae-xsa-194 fail   never pass
 test-xtf-amd64-amd64-1   62 xtf/test-pv32pae-xsa-194 fail   never pass
 test-armhf-armhf-xl-xsm  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl  12 migrate-support-check fail   never pass
 test-armhf-armhf-libvirt-xsm 12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-xsm  13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl  13 saverestore-support-check fail   never pass
 test-armhf-armhf-libvirt 12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-midway   12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-midway   13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-check fail  never pass
 test-armhf-armhf-xl-multivcpu 13 saverestore-support-check fail  never pass
 test-xtf-amd64-amd64-2   62 xtf/test-pv32pae-xsa-194 fail   never pass
 test-xtf-amd64-amd64-3   62 xtf/test-pv32pae-xsa-194 fail   never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-check fail   never pass
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-rtds 12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-rtds 13 saverestore-support-check fail   never pass
 test-amd64-i386-libvirt  12 migrate-support-check fail   never pass
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-armhf-armhf-xl-vhd  11 migrate-support-check fail   never pass
 test-armhf-armhf-xl-vhd  12 saverestore-support-check fail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-credit2  13 saverestore-support-check fail   never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-check fail   never pass
 test-armhf-armhf-libvirt-raw 11 migrate-support-check fail   never pass
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail never pass

version targeted for testing:
 xen  8e04cb25d9634995fe9b37d05c63cdb4ce8c205e
baseline version:
 xen  576f319a804bce8c9a7fb70a042f873f5eaf0151

Last test of basis    68545  2017-02-10 20:13:19 Z    6 days
Testing same since    68567  2017-02-16 13:47:22 Z    0 days    1 attempts


People who touched revisions under test:
  Jan Beulich 
  Julien Grall 
  Oleksandr Tyshchenko 

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 

Re: [Xen-devel] Xen on ARM IRQ latency and scheduler overhead

2017-02-16 Thread Stefano Stabellini
On Thu, 16 Feb 2017, Stefano Stabellini wrote:
> > > > >     AVG MIN MAX WARM MAX
> > > > > 
> > > > > NODEBUG no WFI  1890    1800    3170    2070
> > > > > NODEBUG WFI 4850    4810    7030    4980
> > > > > NODEBUG no WFI credit2  2217    2090    3420    2650
> > > > > NODEBUG WFI credit2 8080    7890    10320   8300
> > > > > 
> > > > > DEBUG no WFI    2252    2080    3320    2650
> > > > > DEBUG WFI   6500    6140    8520    8130
> > > > > DEBUG WFI, credit2  8050    7870    10680   8450
> > > > > 
> > > > > DEBUG means Xen DEBUG build.
> > > > > 
> > [...]
> > > > > As you can see, depending on whether the guest issues a WFI or
> > > > > not
> > > > > while
> > > > > waiting for interrupts, the results change significantly.
> > > > > Interestingly,
> > > > > credit2 does worse than credit1 in this area.
> > > > > 
> > > > This is with current staging right? 
> > > 
> > > That's right.
> > > 
> > So, when you have the chance, can I see the output of
> > 
> >  xl debug-key r
> >  xl dmesg
> > 
> > Both under Credit1 and Credit2?
> 
> I'll see what I can do.

See attached.

(XEN) Xen version 4.9-unstable (sstabellini@) (aarch64-linux-gnu-gcc (Linaro GCC 2014.05) 4.9.1 20140422 (prerelease)) debug=y  Thu Feb 16 14:39:21 PST 2017
(XEN) Latest ChangeSet: Thu Feb 16 14:39:16 2017 -0800 git:4e0ef4d
(XEN) Processor: 410fd034: "ARM Limited", variant: 0x0, part 0xd03, rev 0x4
(XEN) 64-bit Execution:
(XEN)   Processor Features:  
(XEN) Exception Levels: EL3:64+32 EL2:64+32 EL1:64+32 EL0:64+32
(XEN) Extensions: FloatingPoint AdvancedSIMD
(XEN)   Debug Features: 10305106 
(XEN)   Auxiliary Features:  
(XEN)   Memory Model Features: 1122 
(XEN)   ISA Features:  00011120 
(XEN) 32-bit Execution:
(XEN)   Processor Features: 0131:00011011
(XEN) Instruction Sets: AArch32 A32 Thumb Thumb-2 Jazelle
(XEN) Extensions: GenericTimer Security
(XEN)   Debug Features: 03010066
(XEN)   Auxiliary Features: 
(XEN)   Memory Model Features: 10201105 4000 0126 02102211
(XEN)  ISA Features: 02101110 13112111 21232042 01112131 00011142 00011121
(XEN) Using PSCI-1.0 for SMP bringup
(XEN) Generic Timer IRQ: phys=30 hyp=26 virt=27 Freq: 8 KHz
(XEN) GICv2 initialization:
(XEN) gic_dist_addr=f901
(XEN) gic_cpu_addr=f902
(XEN) gic_hyp_addr=f904
(XEN) gic_vcpu_addr=f906
(XEN) gic_maintenance_irq=25
(XEN) GICv2: Adjusting CPU interface base to 0xf902f000
(XEN) GICv2: 192 lines, 4 cpus, secure (IID 0200143b).
(XEN) GICv2: WARNING: CPU0: Failed to configure IRQ26 as Edge-triggered. H/w forces to Level-triggered.
(XEN) GICv2: WARNING: CPU0: Failed to configure IRQ27 as Edge-triggered. H/w forces to Level-triggered.
(XEN) GICv2: WARNING: CPU0: Failed to configure IRQ30 as Edge-triggered. H/w forces to Level-triggered.
(XEN) Could not find scheduler: credit1
(XEN) Using 'SMP Credit Scheduler' (credit)
(XEN) Using scheduler: SMP Credit Scheduler (credit)
(XEN) Allocated console ring of 32 KiB.
(XEN) Bringing up CPU1
(XEN) GICv2: WARNING: CPU1: Failed to configure IRQ26 as Edge-triggered. H/w forces to Level-triggered.
(XEN) GICv2: WARNING: CPU1: Failed to configure IRQ27 as Edge-triggered. H/w forces to Level-triggered.
(XEN) GICv2: WARNING: CPU1: Failed to configure IRQ30 as Edge-triggered. H/w forces to Level-triggered.
(XEN) CPU 1 booted.
(XEN) Bringing up CPU2
(XEN) GICv2: WARNING: CPU2: Failed to configure IRQ26 as Edge-triggered. H/w forces to Level-triggered.
(XEN) GICv2: WARNING: CPU2: Failed to configure IRQ27 as Edge-triggered. H/w forces to Level-triggered.
(XEN) GICv2: WARNING: CPU2: Failed to configure IRQ30 as Edge-triggered. H/w forces to Level-triggered.
(XEN) CPU 2 booted.
(XEN) Bringing up CPU3
(XEN) GICv2: WARNING: CPU3: Failed to configure IRQ26 as Edge-triggered. H/w forces to Level-triggered.
(XEN) GICv2: WARNING: CPU3: Failed to configure IRQ27 as Edge-triggered. H/w forces to Level-triggered.
(XEN) GICv2: WARNING: CPU3: Failed to configure IRQ30 as Edge-triggered. H/w forces to Level-triggered.
(XEN) CPU 3 booted.
(XEN) Brought up 4 CPUs
(XEN) P2M: 40-bit IPA with 40-bit PA and 8-bit VMID
(XEN) P2M: 3 levels with order-1 root, VTCR 0x80023558
(XEN) smmu: /amba/smmu@fd80: probing hardware configuration...
(XEN) smmu: /amba/smmu@fd80: SMMUv2 with:
(XEN) smmu: /amba/smmu@fd80:stage 2 translation
(XEN) smmu: /amba/smmu@fd80:stream matching with 48 register groups, mask 0x7fff
(XEN) smmu: /amba/smmu@fd80:16 context banks (0 stage-2 only)
(XEN) smmu: /amba/smmu@fd80:Stage-2: 40-bit IPA -> 48-bit PA
(XEN) smmu: /amba/smmu@fd80: registered 26 master devices
(XEN) I/O virtualisation enabled
(XEN)  - Dom0 mode: Relaxed

[Xen-devel] [PATCH v2] xen/arm: increase default dom0_mem to 512M

2017-02-16 Thread Stefano Stabellini
The default dom0_mem is 128M which is not sufficient to boot an Ubuntu
based Dom0. Increase it to 512M.

Signed-off-by: Stefano Stabellini 

---

Changes in v2: use MB macro

---
 xen/arch/arm/domain_build.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index c97a1f5..210cb98 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -31,7 +31,7 @@ integer_param("dom0_max_vcpus", opt_dom0_max_vcpus);
 
 int dom0_11_mapping = 1;
 
-#define DOM0_MEM_DEFAULT 0x8000000 /* 128 MiB */
+#define DOM0_MEM_DEFAULT MB(512) /* 512 MiB */
 static u64 __initdata dom0_mem = DOM0_MEM_DEFAULT;
 
 static void __init parse_dom0_mem(const char *s)
-- 
1.9.1
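
For reference, the arithmetic behind the change can be checked with a small sketch. The `MB` macro below is a local stand-in written for this example, on the assumption that the hypervisor's own helper expands to a byte count; `dom0_mem_default_old` and `dom0_mem_default_new` are hypothetical names that do not appear in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for the MB() helper (assumption: n * 2^20 bytes).
 * Not copied from the Xen headers. */
#define MB(n) ((uint64_t)(n) << 20)

/* Old default as a raw hex literal: 128 MiB. */
static uint64_t dom0_mem_default_old(void)
{
    return UINT64_C(0x8000000);
}

/* New default expressed with the macro: 512 MiB. */
static uint64_t dom0_mem_default_new(void)
{
    uint64_t bytes = MB(512);

    /* A value built with MB() is a whole number of MiB by construction. */
    assert(bytes % MB(1) == 0);
    return bytes;
}
```

Writing the size with a macro keeps the value and its comment from drifting apart, which is easy to get wrong with a bare hex literal.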




Re: [Xen-devel] (resend) qemu crashes during VCPU hotplug

2017-02-16 Thread Stefano Stabellini
On Thu, 16 Feb 2017, Boris Ostrovsky wrote:
> On 02/16/2017 04:19 PM, Stefano Stabellini wrote:
> > On Thu, 16 Feb 2017, Boris Ostrovsky wrote:
> > > On 02/15/2017 11:20 PM, Boris Ostrovsky wrote:
> > > > (Now with correct address for Stefano)
> > > > 
> > > > Upstream qemu appears to be crashing during VCPU hotplug. I think this
> > > > is something relatively new since I was doing this a few weeks ago.
> > > > 
> > > > I reproduced this on two different setups. Haven't had a chance to look
> > > > any further but e3cadac073 looks suspicious.
> > > 
> > > Yes, this is the offending commit.
> > > 
> > > For Xen guests qemu never sets pcms->fw_cfg.
> > 
> > Thanks for narrowing it down. Are you using qemu-xen/staging?
> 
> 
> Yes.
> 
> 
> > It looks
> > like it has been fixed in qemu.org by
> > 
> > commit 26ef65beab852caf2b1ef4976e3473f2d525164d
> > Author: Igor Mammedov 
> > Date:   Fri Dec 30 15:33:11 2016 +0100
> > 
> > pc: fix crash in rtc_set_memory() if initial cpu is marked as hotplugged
> > 
> > can you confirm?
> 
> 
> Yes, this fixes it.

I backported it to qemu-xen/staging


> -boris
> 
> > 
> > 
> > 
> > > -boris
> > > 
> > > > 
> > > > The crash happens in fw_cfg_modify_bytes_read() when we pass in NULL
> > > > pointer as first argument. The stack is below:
> > > > 
> > > > 
> > > > (gdb) where
> > > > #0  0x561d762d64d4 in fw_cfg_modify_bytes_read (s=0x0, key=5,
> > > > data=0x561d787031d0, len=2) at hw/nvram/fw_cfg.c:614
> > > > #1  0x561d762d6730 in fw_cfg_modify_i16 (s=0x0, key=5, value=2) at
> > > > hw/nvram/fw_cfg.c:656
> > > > #2  0x561d761195b3 in pc_cpu_plug (hotplug_dev=0x561d770f9810,
> > > > dev=0x561d7712a7e0, errp=0x7ffe8f75f2b0) at
> > > > /root/xen/tools/qemu-xen-dir/hw/i386/pc.c:1823
> > > > #3  0x561d76119fc0 in pc_machine_device_plug_cb
> > > > (hotplug_dev=0x561d770f9810, dev=0x561d7712a7e0, errp=0x7ffe8f75f2b0) at
> > > > /root/xen/tools/qemu-xen-dir/hw/i386/pc.c:1993
> > > > #4  0x561d76239cba in hotplug_handler_plug
> > > > (plug_handler=0x561d770f9810, plugged_dev=0x561d7712a7e0,
> > > > errp=0x7ffe8f75f2b0) at hw/core/hotplug.c:34
> > > > #5  0x561d7623584d in device_set_realized (obj=0x561d7712a7e0,
> > > > value=true, errp=0x7ffe8f75f468) at hw/core/qdev.c:928
> > > > #6  0x561d763e22a3 in property_set_bool (obj=0x561d7712a7e0,
> > > > v=0x561d78702090, name=0x561d764fd9d0 "realized", opaque=0x561d785aea00,
> > > > errp=0x7ffe8f75f468) at qom/object.c:1854
> > > > #7  0x561d763e07aa in object_property_set (obj=0x561d7712a7e0,
> > > > v=0x561d78702090, name=0x561d764fd9d0 "realized", errp=0x7ffe8f75f468)
> > > > at qom/object.c:1088
> > > > #8  0x561d763e3609 in object_property_set_qobject
> > > > (obj=0x561d7712a7e0, value=0x561d773869c0, name=0x561d764fd9d0
> > > > "realized", errp=0x7ffe8f75f468) at qom/qom-qobject.c:27
> > > > #9  0x561d763e0a40 in object_property_set_bool (obj=0x561d7712a7e0,
> > > > value=true, name=0x561d764fd9d0 "realized", errp=0x7ffe8f75f468) at
> > > > qom/object.c:1157
> > > > #10 0x561d76117304 in pc_new_cpu (typename=0x561d7707c880
> > > > "qemu32-i386-cpu", apic_id=1, errp=0x7ffe8f75f4c0) at
> > > > /root/xen/tools/qemu-xen-dir/hw/i386/pc.c:1099
> > > > #11 0x561d761174cc in pc_hot_add_cpu (id=1, errp=0x7ffe8f75f558) at
> > > > /root/xen/tools/qemu-xen-dir/hw/i386/pc.c:1131
> > > > #12 0x561d761cb7b3 in qmp_cpu_add (id=1, errp=0x7ffe8f75f558) at
> > > > qmp.c:126
> > > > #13 0x561d761bdc60 in qmp_marshal_cpu_add (args=0x561d7711a1b0,
> > > > ret=0x7ffe8f75f5b0, errp=0x7ffe8f75f5a8) at qmp-marshal.c:1274
> > > > #14 0x561d764b2f13 in do_qmp_dispatch (request=0x561d77129360,
> > > > errp=0x7ffe8f75f610) at qapi/qmp-dispatch.c:98
> > > > #15 0x561d764b3042 in qmp_dispatch (request=0x561d77129360) at
> > > > qapi/qmp-dispatch.c:125
> > > > #16 0x561d76084d39 in handle_qmp_command (parser=0x561d771288b0,
> > > > tokens=0x561d770f8cc0) at /root/xen/tools/qemu-xen-dir/monitor.c:3758
> > > > #17 0x561d764ba402 in json_message_process_token
> > > > (lexer=0x561d771288b8, input=0x561d770f9040, type=JSON_RCURLY, x=1,
> > > > y=11) at qobject/json-streamer.c:105
> > > > #18 0x561d764dd5dc in json_lexer_feed_char (lexer=0x561d771288b8,
> > > > ch=125 '}', flush=false) at qobject/json-lexer.c:319
> > > > #19 0x561d764dd71c in json_lexer_feed (lexer=0x561d771288b8,
> > > > buffer=0x7ffe8f75f880 "}\224Dx\035V", size=1) at
> > > > qobject/json-lexer.c:369
> > > > #20 0x561d764ba4a2 in json_message_parser_feed
> > > > (parser=0x561d771288b0, buffer=0x7ffe8f75f880 "}\224Dx\035V", size=1) at
> > > > qobject/json-streamer.c:124
> > > > #21 0x561d76084e53 in monitor_qmp_read (opaque=0x561d77128830,
> > > > buf=0x7ffe8f75f880 "}\224Dx\035V", size=1) at
> > > > /root/xen/tools/qemu-xen-dir/monitor.c:3788
> > > > #22 0x561d761a3b2d in qemu_chr_be_write_impl (s=0x561d77107020,
> > > > buf=0x7ffe8f75f880 "}\224Dx\035V", len=1) at 
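
The backtrace above shows fw_cfg_modify_i16 being reached with s=0x0 from the CPU hot-plug path. A minimal sketch of the kind of guard that avoids such a dereference follows. The types and functions here (`FwCfgSketch`, `MachineSketch`, `fw_cfg_set_cpus`, `cpu_plug_sketch`) are hypothetical stand-ins for illustration only; they are not QEMU's API and are not claimed to match what the upstream fix does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a firmware-config device. */
typedef struct {
    uint16_t cpu_count;
} FwCfgSketch;

/* Hypothetical machine state. fw_cfg may legitimately be NULL
 * (the thread notes it is never set for Xen guests). */
typedef struct {
    FwCfgSketch *fw_cfg;
    uint16_t boot_cpus;
} MachineSketch;

/* Dereferences s unconditionally, so callers must not pass NULL. */
static void fw_cfg_set_cpus(FwCfgSketch *s, uint16_t value)
{
    assert(s != NULL);
    s->cpu_count = value;
}

/* Hot-plug path: count the new CPU and update fw_cfg only if present.
 * Returns the new CPU count. */
static uint16_t cpu_plug_sketch(MachineSketch *m)
{
    m->boot_cpus++;
    if (m->fw_cfg != NULL) {
        fw_cfg_set_cpus(m->fw_cfg, m->boot_cpus);
    }
    return m->boot_cpus;
}
```

With this shape, a machine without a firmware-config device still counts the hot-plugged CPU, and one with the device sees the updated count.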

Re: [Xen-devel] [PATCH v15 05/10] x86: add multiboot2 protocol support for EFI platforms

2017-02-16 Thread Daniel Kiper
On Thu, Feb 16, 2017 at 03:56:21PM -0600, Doug Goldstein wrote:
> On 2/16/17 3:49 PM, Daniel Kiper wrote:
> > On Thu, Feb 16, 2017 at 02:29:45AM -0700, Jan Beulich wrote:
> > On 15.02.17 at 22:53,  wrote:
> >>> On Wed, Feb 15, 2017 at 03:22:02AM -0700, Jan Beulich wrote:
> >>> On 14.02.17 at 19:38,  wrote:
> > --- a/xen/arch/x86/boot/head.S
> > +++ b/xen/arch/x86/boot/head.S
> > @@ -394,10 +394,18 @@ __start:
> >
> >  /* EFI IA-32 platforms are not supported. */
> >  cmpl$MULTIBOOT2_TAG_TYPE_EFI32,MB2_tag_type(%ecx)
> > +/*
> > + * Here we should zap vga_text_buffer. However, we can disable
> > + * VGA updates in simpler and more reliable way later.
> > + */
> >  je  .Lmb2_efi_ia_32
> >
> >  /* Bootloader shutdown EFI x64 boot services. */
> >  cmpl$MULTIBOOT2_TAG_TYPE_EFI64,MB2_tag_type(%ecx)
> > +/*
> > + * Here we should zap vga_text_buffer. However, we can disable
> > + * VGA updates in simpler and more reliable way later.
> > + */
> >  je  .Lmb2_no_bs
> 
>  I'm afraid I don't view these comments as helpful in understanding
>  the whole situation. That's partly because I don't follow both the
>  "simpler" and "more reliable" parts (using just the information here,
> >>>
> >>> OK, I will clarify it.
> >>>
>  i.e. leaving aside what you've given as explanation earlier, albeit I
>  don't think that was fully clarifying things either), and partly
>  because I continue to think that the explanation should go where
>  the labels are (which is what I had meant to suggest with my
>  comment placement in reply to v14). Nor does the adjustment
> >>>
> >>> OK.
> >>>
>  above help (me) understand the correctness of the dual use of
>  .Lmb2_no_bs.
> >>>
> >>> What do you mean by "dual use of .Lmb2_no_bs."? I would like to be sure.
> >>
> >> As said in v14 review, it's being jumped to from two rather different
> >> places, and hence the VGA aspect isn't obviously the same for both.
> >
> > OK, I will try to clarify. If a bootloader called us using __efi64_mb2_start
> > we are sure that we are running on EFI platform and there is no VGA there.
> > It means that we can safely zap vga_text_buffer unconditionally in first steps
> > (we do that in second instruction). Then we do not need to take care about
> > that in case of error. And one of these errors is lack of MULTIBOOT2_TAG_TYPE_EFI_BS
> > tag. It means that EFI boot services are shutdown. So, we are in black hole.
> > We have to inform user about that and halt the system. And that is why we
>
> Not looking at the code but the words here. If ExitBootServices() has
> been called we should be able to still boot if the memory map was passed
> along. Are we deferring that use case to a follow on?

In theory yes but, IIRC, we would then have to significantly refactor the Xen
EFI boot code due to the lack of EFI boot services. I tried to do that once but
quickly realized that it does not pay off.

Daniel



[Xen-devel] [ovmf baseline-only test] 68568: all pass

2017-02-16 Thread Platform Team regression test user
This run is configured for baseline tests only.

flight 68568 ovmf real [real]
http://osstest.xs.citrite.net/~osstest/testlogs/logs/68568/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf fd12acdeff7a04ad34ccb95103eb6204b8901749
baseline version:
 ovmf cb8674999c6bf94cdb3be18df3746131aac6386b

Last test of basis    68564  2017-02-15 14:20:57 Z    1 days
Testing same since    68568  2017-02-16 15:17:39 Z    0 days    1 attempts


People who touched revisions under test:
  Liming Gao 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvops    pass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass



sg-report-flight on osstest.xs.citrite.net
logs: /home/osstest/logs
images: /home/osstest/images

Logs, config files, etc. are available at
http://osstest.xs.citrite.net/~osstest/testlogs/logs

Test harness code can be found at
http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary


Push not applicable.


commit fd12acdeff7a04ad34ccb95103eb6204b8901749
Author: Liming Gao 
Date:   Wed Feb 15 16:40:07 2017 +0800

MdeModulePkg UefiBootManagerLib: Correct usages of GUID and Protocol

https://bugzilla.tianocore.org/show_bug.cgi?id=316

Cc: Ruiyu Ni 
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Liming Gao 
Reviewed-by: Ruiyu Ni 



Re: [Xen-devel] [PATCH v4 2/2] arm: proper ordering for correct execution of gic_update_one_lr and vgic_store_itargetsr

2017-02-16 Thread Stefano Stabellini
On Thu, 16 Feb 2017, Julien Grall wrote:
> Hi Stefano,
> 
> On 11/02/17 02:05, Stefano Stabellini wrote:
> > Concurrent execution of gic_update_one_lr and vgic_store_itargetsr can
> > result in the wrong pcpu being set as irq target, see
> > http://marc.info/?l=xen-devel&m=148218667104072.
> > 
> > To solve the issue, add barriers, remove an irq from the inflight
> > queue, only after the affinity has been set. On the other end, write the
> > new vcpu target, before checking GIC_IRQ_GUEST_MIGRATING and inflight.
> > 
> > Signed-off-by: Stefano Stabellini 
> > ---
> >  xen/arch/arm/gic.c | 3 ++-
> >  xen/arch/arm/vgic-v2.c | 4 ++--
> >  xen/arch/arm/vgic-v3.c | 4 +++-
> >  3 files changed, 7 insertions(+), 4 deletions(-)
> > 
> > diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
> > index a5348f2..bb52959 100644
> > --- a/xen/arch/arm/gic.c
> > +++ b/xen/arch/arm/gic.c
> > @@ -503,12 +503,13 @@ static void gic_update_one_lr(struct vcpu *v, int i)
> >   !test_bit(GIC_IRQ_GUEST_MIGRATING, &p->status) )
> >  gic_raise_guest_irq(v, irq, p->priority);
> >  else {
> > -list_del_init(&p->inflight);
> >  if ( test_and_clear_bit(GIC_IRQ_GUEST_MIGRATING, &p->status) )
> >  {
> >  struct vcpu *v_target = vgic_get_target_vcpu(v, irq);
> >  irq_set_affinity(p->desc, cpumask_of(v_target->processor));
> >  }
> > +smp_mb();
> > +list_del_init(&p->inflight);
> 
> I don't understand why you remove from the inflight list afterwards. If you do
> that you introduce the same problem as discussed in
> <7a78c859-fa6f-ba10-b574-d8edd46ea...@arm.com>
>
> As long as the interrupt is routed to the pCPU running gic_update_one_lr, the
> interrupt cannot fire because the interrupts are masked.

This is not accurate: it is possible to receive a second interrupt
notification while the first one is still active.


> However, as soon as irq_set_affinity is called the interrupt may fire
> on the other pCPU.

This is true.


> However, list_del_init is not atomic and not protected by any lock. So
> vgic_vcpu_inject_irq may see a corrupted version of {p,n}->inflight.
> 
> Did I miss anything?

Moving list_del_init later ensures that there are no conflicts between
gic_update_one_lr and vgic_store_itargetsr (more specifically,
vgic_migrate_irq). If you look at the implementation of
vgic_migrate_irq, all checks depend on list_empty(&p->inflight). If we
don't move list_del_init later, given that vgic_migrate_irq can be
called with a different vgic lock taken than gic_update_one_lr, the
following scenario can happen:



  CPU0: gic_update_one_lr               CPU1: vgic_store_itargetsr
  ----------------------------------------------------------------
  remove from inflight
  clear GIC_IRQ_GUEST_MIGRATING
  read rank->vcpu (intermediate)
                                        set rank->vcpu (final)
                                        vgic_migrate_irq
                                          if (!inflight) irq_set_affinity (final)
  irq_set_affinity (intermediate)


As a result, the irq affinity is set to the wrong cpu. With this patch,
this problem doesn't occur.
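
The interleaving in the table can be replayed step by step in a small sketch. This is a single-threaded replay under simplifying assumptions, not a concurrent test: `struct irq_model`, the `VCPU_*` values, and `replay_unpatched_race` are hypothetical names for illustration, and the model covers only the pre-patch ordering, with no barriers and no locking.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified, hypothetical model of the state involved; not Xen's structs. */
struct irq_model {
    bool inflight;    /* irq is on the inflight list */
    bool migrating;   /* stands in for GIC_IRQ_GUEST_MIGRATING */
    int rank_vcpu;    /* target recorded in the rank */
    int hw_affinity;  /* what irq_set_affinity last programmed */
};

enum { VCPU_OLD = 0, VCPU_INTERMEDIATE = 1, VCPU_FINAL = 2 };

/* Replays the table with the pre-patch ordering (inflight removal first)
 * and returns the affinity left programmed at the end. */
static int replay_unpatched_race(void)
{
    struct irq_model irq = {
        .inflight = true,
        .migrating = true,            /* an earlier target write is pending */
        .rank_vcpu = VCPU_INTERMEDIATE,
        .hw_affinity = VCPU_OLD,
    };

    /* CPU0: remove from inflight */
    irq.inflight = false;

    /* CPU0: clear GIC_IRQ_GUEST_MIGRATING (it was set) */
    bool was_migrating = irq.migrating;
    irq.migrating = false;
    assert(was_migrating);

    /* CPU0: read rank->vcpu (intermediate) */
    int cpu0_target = irq.rank_vcpu;

    /* CPU1: set rank->vcpu (final) */
    irq.rank_vcpu = VCPU_FINAL;

    /* CPU1: vgic_migrate_irq: if (!inflight) irq_set_affinity (final) */
    if (!irq.inflight)
        irq.hw_affinity = irq.rank_vcpu;

    /* CPU0: irq_set_affinity (intermediate), overwriting CPU1's value */
    irq.hw_affinity = cpu0_target;

    return irq.hw_affinity;
}
```

The replay ends with the intermediate target programmed even though the rank holds the final one, which is the wrong-cpu outcome described above.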

However, you are right that both in the case of gic_update_one_lr and
vgic_migrate_irq, as well as the case of gic_update_one_lr and
vgic_vcpu_inject_irq that you mentioned, list_del_init (from
gic_update_one_lr) is potentially run at the same time as list_empty
(from vgic_migrate_irq or from vgic_vcpu_inject_irq), and they are not
atomic.

Also see this other potential issue:
http://marc.info/?l=xen-devel&m=148703220714075

All these concurrent accesses are difficult to understand and to deal
with. This is why my original suggestion was to use the old vcpu vgic
lock, rather than try to ensure safe concurrent accesses everywhere.
That option is still open and would solve both problems.

We only need to:

- store the vcpu to which an irq is currently injected
http://marc.info/?l=xen-devel&m=148237295020488
- check the new irq->vcpu field, and take the right vgic lock
something like http://marc.info/?l=xen-devel&m=148237295920492&w=2, but
would need improvements

Much simpler, right?



Re: [Xen-devel] [PATCH v15 05/10] x86: add multiboot2 protocol support for EFI platforms

2017-02-16 Thread Doug Goldstein
On 2/16/17 3:49 PM, Daniel Kiper wrote:
> On Thu, Feb 16, 2017 at 02:29:45AM -0700, Jan Beulich wrote:
> On 15.02.17 at 22:53,  wrote:
>>> On Wed, Feb 15, 2017 at 03:22:02AM -0700, Jan Beulich wrote:
>>> On 14.02.17 at 19:38,  wrote:
> --- a/xen/arch/x86/boot/head.S
> +++ b/xen/arch/x86/boot/head.S
> @@ -394,10 +394,18 @@ __start:
>
>  /* EFI IA-32 platforms are not supported. */
>  cmpl$MULTIBOOT2_TAG_TYPE_EFI32,MB2_tag_type(%ecx)
> +/*
> + * Here we should zap vga_text_buffer. However, we can disable
> + * VGA updates in simpler and more reliable way later.
> + */
>  je  .Lmb2_efi_ia_32
>
>  /* Bootloader shutdown EFI x64 boot services. */
>  cmpl$MULTIBOOT2_TAG_TYPE_EFI64,MB2_tag_type(%ecx)
> +/*
> + * Here we should zap vga_text_buffer. However, we can disable
> + * VGA updates in simpler and more reliable way later.
> + */
>  je  .Lmb2_no_bs

 I'm afraid I don't view these comments as helpful in understanding
 the whole situation. That's partly because I don't follow both the
 "simpler" and "more reliable" parts (using just the information here,
>>>
>>> OK, I will clarify it.
>>>
 i.e. leaving aside what you've given as explanation earlier, albeit I
 don't think that was fully clarifying things either), and partly
 because I continue to think that the explanation should go where
 the labels are (which is what I had meant to suggest with my
 comment placement in reply to v14). Nor does the adjustment
>>>
>>> OK.
>>>
 above help (me) understand the correctness of the dual use of
 .Lmb2_no_bs.
>>>
>>> What do you mean by "dual use of .Lmb2_no_bs."? I would like to be sure.
>>
>> As said in v14 review, it's being jumped to from two rather different
>> places, and hence the VGA aspect isn't obviously the same for both.
> 
> OK, I will try to clarify. If a bootloader called us using __efi64_mb2_start
> we are sure that we are running on EFI platform and there is no VGA there.
> It means that we can safely zap vga_text_buffer unconditionally in first steps
> (we do that in second instruction). Then we do not need to take care about
> that in case of error. And one of these errors is lack of MULTIBOOT2_TAG_TYPE_EFI_BS
> tag. It means that EFI boot services are shutdown. So, we are in black hole.
> We have to inform user about that and halt the system. And that is why we

Not looking at the code but the words here. If ExitBootServices() has
been called we should be able to still boot if the memory map was passed
along. Are we deferring that use case to a follow on?

-- 
Doug Goldstein





[Xen-devel] [PATCH v4 4/4] KVM: VMX: Simplify segment_base

2017-02-16 Thread Thomas Garnier
The KVM segment_base function is confusing. This patch replaces integers
with appropriate flags, simplifies constructs and adds comments.

Signed-off-by: Thomas Garnier 
---
Based on next-20170213
---
 arch/x86/kvm/vmx.c | 30 --
 1 file changed, 20 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 99167f20bc34..91e619269128 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2060,27 +2060,37 @@ static bool update_transition_efer(struct vcpu_vmx *vmx, int efer_offset)
 static unsigned long segment_base(u16 selector)
 {
struct desc_struct *d;
-   unsigned long table_base;
+   struct desc_struct *table_base;
unsigned long v;
+   u32 high32;
 
-   if (!(selector & ~3))
+   if (!(selector & ~SEGMENT_RPL_MASK))
return 0;
 
-   table_base = get_current_gdt_rw_vaddr();
-
-   if (selector & 4) {   /* from ldt */
+   /* LDT selector */
+   if ((selector & SEGMENT_TI_MASK) == SEGMENT_LDT) {
u16 ldt_selector = kvm_read_ldt();
 
-   if (!(ldt_selector & ~3))
+   if (!(ldt_selector & ~SEGMENT_RPL_MASK))
return 0;
 
-   table_base = segment_base(ldt_selector);
+   table_base = (struct desc_struct *)segment_base(ldt_selector);
+   } else {
+   table_base = get_current_gdt_rw();
}
-   d = (struct desc_struct *)(table_base + (selector & ~7));
+
+   d = table_base + (selector >> 3);
v = get_desc_base(d);
 #ifdef CONFIG_X86_64
-   if (d->s == 0 && (d->type == 2 || d->type == 9 || d->type == 11))
-   v |= ((unsigned long)((struct ldttss_desc64 *)d)->base3) << 32;
+   /*
+* Extend the virtual address if we have a system descriptor entry for
+* LDT or TSS (available or busy).
+*/
+   if (d->s == 0 && (d->type == DESC_LDT || d->type == DESC_TSS ||
+ d->type == 11/*Busy TSS */)) {
+   high32 = ((struct ldttss_desc64 *)d)->base3;
+   v |= (u64)high32 << 32;
+   }
 #endif
return v;
 }
-- 
2.11.0.483.g087da7b7c-goog
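
Editorial note: the index arithmetic in this patch may be easier to follow with
a small user-space sketch. It is not part of the patch. It assumes the usual
x86 selector layout (bits 0-1 RPL, bit 2 table indicator, bits 3-15 index) and
8-byte legacy descriptors; the SEL_* macros, struct desc8 and the helper names
are made up for this illustration and are not kernel identifiers.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed selector layout: bits 0-1 RPL, bit 2 TI, bits 3-15 index. */
#define SEL_RPL_MASK 0x3u
#define SEL_TI_MASK  0x4u

/* Hypothetical stand-in for an 8-byte legacy descriptor. */
struct desc8 { uint8_t raw[8]; };

static unsigned sel_rpl(uint16_t sel)    { return sel & SEL_RPL_MASK; }
static int      sel_is_ldt(uint16_t sel) { return (sel & SEL_TI_MASK) != 0; }
static unsigned sel_index(uint16_t sel)  { return sel >> 3; }

/* Index and TI bits all zero: this is what the "return 0" early exit tests. */
static int sel_is_null(uint16_t sel) { return !(sel & ~SEL_RPL_MASK); }

/* Old form: byte offset added to an integer table address. */
static size_t offset_by_mask(uint16_t sel) { return sel & ~7u; }

/* New form: pointer arithmetic on descriptors, expressed here in bytes. */
static size_t offset_by_index(uint16_t sel)
{
    return sel_index(sel) * sizeof(struct desc8);
}
```

Under these assumptions both offset helpers agree for every 16-bit selector,
which is why switching table_base from an integer to a descriptor pointer does
not change which entry is read.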




[Xen-devel] [PATCH v4 2/4] x86: Remap GDT tables in the Fixmap section

2017-02-16 Thread Thomas Garnier
Each processor holds a GDT in its per-cpu structure. The sgdt
instruction gives the base address of the current GDT. This address can
be used to bypass KASLR memory randomization. With another bug, an
attacker could target other per-cpu structures or deduce the base of
the main memory section (PAGE_OFFSET).

This patch relocates the GDT table for each processor inside the
Fixmap section. The space is reserved based on the number of supported
processors.

For consistency, the remapping is done by default on 32 and 64-bit.

Each processor switches to its remapped GDT at the end of
initialization. For hibernation, the main processor returns with the
original GDT and switches back to the remapping at completion.

This patch was tested on both architectures. Hibernation and KVM were
both tested specially for their usage of the GDT.
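
Editorial note: a rough user-space sketch of the per-CPU slot computation
described above. It assumes fixmap addresses grow downward from a fixed top
address, one page per index. The SK_* constants are illustrative values, not
the ones a real kernel build would use, and the helper names are hypothetical.

```c
#include <stdint.h>

#define SK_PAGE_SHIFT          12
#define SK_FIXADDR_TOP         0xffffffffff7ff000ULL /* assumed, config dependent */
#define SK_FIX_GDT_REMAP_BEGIN 16u                   /* hypothetical first GDT slot */

/* One fixmap index per processor, starting at the first reserved slot. */
static unsigned sk_gdt_ro_index(unsigned cpu)
{
    return SK_FIX_GDT_REMAP_BEGIN + cpu;
}

/* Fixmap indices map to pages counted downward from the top address. */
static uint64_t sk_fix_to_virt(unsigned idx)
{
    return SK_FIXADDR_TOP - ((uint64_t)idx << SK_PAGE_SHIFT);
}

/* Address that sgdt would report for a CPU running on the remapped GDT. */
static uint64_t sk_gdt_ro_vaddr(unsigned cpu)
{
    return sk_fix_to_virt(sk_gdt_ro_index(cpu));
}
```

In this model the address depends only on the CPU number and build-time
constants, so it carries no information about where the per-cpu area landed.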

Signed-off-by: Thomas Garnier 
---
Based on next-20170213
---
 arch/x86/entry/vdso/vma.c |  2 +-
 arch/x86/include/asm/desc.h   | 33 +
 arch/x86/include/asm/fixmap.h |  4 
 arch/x86/include/asm/processor.h  |  1 +
 arch/x86/include/asm/stackprotector.h |  2 +-
 arch/x86/kernel/acpi/sleep.c  |  2 +-
 arch/x86/kernel/apm_32.c  |  6 +++---
 arch/x86/kernel/cpu/common.c  | 26 --
 arch/x86/kernel/setup_percpu.c|  2 +-
 arch/x86/kernel/smpboot.c |  2 +-
 arch/x86/platform/efi/efi_32.c|  4 ++--
 arch/x86/power/cpu.c  |  7 +--
 arch/x86/xen/enlighten.c  |  2 +-
 arch/x86/xen/smp.c|  2 +-
 drivers/lguest/x86/core.c |  6 +++---
 drivers/pnp/pnpbios/bioscalls.c   | 10 +-
 16 files changed, 83 insertions(+), 28 deletions(-)

diff --git a/arch/x86/entry/vdso/vma.c b/arch/x86/entry/vdso/vma.c
index 572cee3fccff..9c8bd4cfcc6e 100644
--- a/arch/x86/entry/vdso/vma.c
+++ b/arch/x86/entry/vdso/vma.c
@@ -353,7 +353,7 @@ static void vgetcpu_cpu_init(void *arg)
d.p = 1;/* Present */
d.d = 1;/* 32-bit */
 
-   write_gdt_entry(get_cpu_gdt_table(cpu), GDT_ENTRY_PER_CPU, &d, DESCTYPE_S);
+   write_gdt_entry(get_cpu_gdt_rw(cpu), GDT_ENTRY_PER_CPU, &d, DESCTYPE_S);
 }
 
 static int vgetcpu_online(unsigned int cpu)
diff --git a/arch/x86/include/asm/desc.h b/arch/x86/include/asm/desc.h
index 12080d87da3b..5d4ba1311737 100644
--- a/arch/x86/include/asm/desc.h
+++ b/arch/x86/include/asm/desc.h
@@ -4,6 +4,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -45,11 +46,35 @@ struct gdt_page {
 
 DECLARE_PER_CPU_PAGE_ALIGNED(struct gdt_page, gdt_page);
 
-static inline struct desc_struct *get_cpu_gdt_table(unsigned int cpu)
+/* Provide the original GDT */
+static inline struct desc_struct *get_cpu_gdt_rw(unsigned int cpu)
 {
return per_cpu(gdt_page, cpu).gdt;
 }
 
+static inline unsigned long get_cpu_gdt_rw_vaddr(unsigned int cpu)
+{
+   return (unsigned long)get_cpu_gdt_rw(cpu);
+}
+
+/* Get the fixmap index for a specific processor */
+static inline unsigned int get_cpu_gdt_ro_index(int cpu)
+{
+   return FIX_GDT_REMAP_BEGIN + cpu;
+}
+
+/* Provide the fixmap address of the remapped GDT */
+static inline struct desc_struct *get_cpu_gdt_ro(int cpu)
+{
+   unsigned int idx = get_cpu_gdt_ro_index(cpu);
+   return (struct desc_struct *)__fix_to_virt(idx);
+}
+
+static inline unsigned long get_cpu_gdt_ro_vaddr(int cpu)
+{
+   return (unsigned long)get_cpu_gdt_ro(cpu);
+}
+
 #ifdef CONFIG_X86_64
 
 static inline void pack_gate(gate_desc *gate, unsigned type, unsigned long func,
@@ -174,7 +199,7 @@ static inline void set_tssldt_descriptor(void *d, unsigned long addr, unsigned t
 
 static inline void __set_tss_desc(unsigned cpu, unsigned int entry, void *addr)
 {
-   struct desc_struct *d = get_cpu_gdt_table(cpu);
+   struct desc_struct *d = get_cpu_gdt_rw(cpu);
tss_desc tss;
 
/*
@@ -202,7 +227,7 @@ static inline void native_set_ldt(const void *addr, unsigned int entries)
 
set_tssldt_descriptor(&ldt, (unsigned long)addr, DESC_LDT,
  entries * LDT_ENTRY_SIZE - 1);
-   write_gdt_entry(get_cpu_gdt_table(cpu), GDT_ENTRY_LDT,
+   write_gdt_entry(get_cpu_gdt_rw(cpu), GDT_ENTRY_LDT,
&ldt, DESC_LDT);
asm volatile("lldt %w0"::"q" (GDT_ENTRY_LDT*8));
}
@@ -244,7 +269,7 @@ static inline unsigned long native_store_tr(void)
 
 static inline void native_load_tls(struct thread_struct *t, unsigned int cpu)
 {
-   struct desc_struct *gdt = get_cpu_gdt_table(cpu);
+   struct desc_struct *gdt = get_cpu_gdt_rw(cpu);
unsigned int i;
 
for (i = 0; i < GDT_ENTRY_TLS_ENTRIES; i++)
diff --git a/arch/x86/include/asm/fixmap.h b/arch/x86/include/asm/fixmap.h
index 8554f960e21b..b65155cc3760 100644
--- a/arch/x86/include/asm/fixmap.h
+++ 

[Xen-devel] [PATCH v4 1/4] x86/mm: Adapt MODULES_END based on Fixmap section size

2017-02-16 Thread Thomas Garnier
This patch aligns MODULES_END to the beginning of the Fixmap section.
It optimizes the space available for both sections. The address is
pre-computed based on the number of pages required by the Fixmap
section.

It will allow GDT remapping in the Fixmap section. The current
MODULES_END static address does not provide enough space for the kernel
to support a large number of processors.
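
Editorial note: the trade-off described above can be sketched in user space.
The sketch assumes the module area ends one page below the last fixmap entry
and that each fixmap entry occupies one page. SK_FIXADDR_TOP and
SK_MODULES_VADDR are assumed example values, and the function names are
hypothetical.

```c
#include <stdint.h>

#define SK_PAGE_SHIFT    12
#define SK_FIXADDR_TOP   0xffffffffff7ff000ULL /* assumed, config dependent */
#define SK_MODULES_VADDR 0xffffffffa0000000ULL /* assumed start of module space */

/* End of the module area for a fixmap holding n_fixed entries. */
static uint64_t sk_modules_end(unsigned n_fixed)
{
    return SK_FIXADDR_TOP - ((uint64_t)(n_fixed + 1) << SK_PAGE_SHIFT);
}

/* Space left for modules once the fixmap has taken its pages. */
static uint64_t sk_modules_len(unsigned n_fixed)
{
    return sk_modules_end(n_fixed) - SK_MODULES_VADDR;
}
```

Each extra fixmap entry, for example one more per-CPU GDT slot, costs the
module area exactly one page in this model.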

Signed-off-by: Thomas Garnier 
---
Based on next-20170213
---
 Documentation/x86/x86_64/mm.txt | 5 -
 arch/x86/include/asm/pgtable_64_types.h | 3 ++-
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/Documentation/x86/x86_64/mm.txt b/Documentation/x86/x86_64/mm.txt
index 5724092db811..ee3f9c30957c 100644
--- a/Documentation/x86/x86_64/mm.txt
+++ b/Documentation/x86/x86_64/mm.txt
@@ -19,7 +19,7 @@ ff00 - ff7f (=39 bits) %esp fixup stacks
 ffef - fffe (=64 GB) EFI region mapping space
 ... unused hole ...
 8000 - 9fff (=512 MB)  kernel text mapping, from phys 0
-a000 - ff5f (=1526 MB) module mapping space
+a000 - ff5f (=1526 MB) module mapping space (variable)
 ff60 - ffdf (=8 MB) vsyscalls
 ffe0 -  (=2 MB) unused hole
 
@@ -39,6 +39,9 @@ memory window (this size is arbitrary, it can be raised later if needed).
 The mappings are not part of any other kernel PGD and are only available
 during EFI runtime calls.
 
+The module mapping space size changes based on the CONFIG requirements for the
+following fixmap section.
+
 Note that if CONFIG_RANDOMIZE_MEMORY is enabled, the direct mapping of all
 physical memory, vmalloc/ioremap space and virtual memory map are randomized.
 Their order is preserved but their base will be offset early at boot time.
diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
index 3a264200c62f..bb05e21cf3c7 100644
--- a/arch/x86/include/asm/pgtable_64_types.h
+++ b/arch/x86/include/asm/pgtable_64_types.h
@@ -67,7 +67,8 @@ typedef struct { pteval_t pte; } pte_t;
 #endif /* CONFIG_RANDOMIZE_MEMORY */
 #define VMALLOC_END(VMALLOC_START + _AC((VMALLOC_SIZE_TB << 40) - 1, UL))
 #define MODULES_VADDR(__START_KERNEL_map + KERNEL_IMAGE_SIZE)
-#define MODULES_END  _AC(0xff00, UL)
+/* The module sections ends with the start of the fixmap */
+#define MODULES_END   __fix_to_virt(__end_of_fixed_addresses + 1)
 #define MODULES_LEN   (MODULES_END - MODULES_VADDR)
 #define ESPFIX_PGD_ENTRY _AC(-2, UL)
 #define ESPFIX_BASE_ADDR (ESPFIX_PGD_ENTRY << PGDIR_SHIFT)
-- 
2.11.0.483.g087da7b7c-goog


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH v4 3/4] x86: Make the GDT remapping read-only on 64-bit

2017-02-16 Thread Thomas Garnier
This patch makes the GDT remapped pages read-only to prevent corruption.
This change is done only on 64-bit.

The native_load_tr_desc function was adapted to correctly handle a
read-only GDT. The LTR instruction always writes to the GDT TSS entry.
This generates a page fault if the GDT is read-only. This change checks
if the current GDT is a remap and swaps GDTs as needed. This function was
tested by booting multiple machines and checking that hibernation works
properly.
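
Editorial note: the swap-around-LTR idea can be modelled in user space. This is
a toy state machine, not kernel code: struct sk_cpu, the enum and the function
names are invented for the illustration, and a "fault" is just a counter.

```c
#include <stdbool.h>

/* Which mapping of the GDT is currently loaded (toy model). */
enum sk_gdt { SK_GDT_DIRECT, SK_GDT_FIXMAP_RO };

struct sk_cpu {
    enum sk_gdt loaded; /* models the GDT register */
    bool tss_busy;      /* models the busy bit in the TSS descriptor */
    unsigned faults;    /* writes attempted through the read-only mapping */
};

/* Models LTR: it writes the busy bit through whichever mapping is loaded. */
static void sk_ltr(struct sk_cpu *c)
{
    if (c->loaded == SK_GDT_FIXMAP_RO) {
        c->faults++;    /* a write to a read-only page would fault */
        return;
    }
    c->tss_busy = true;
}

/* Models the patched load: swap to the writable GDT, load, swap back. */
static void sk_load_tr_desc(struct sk_cpu *c)
{
    bool restore = false;

    if (c->loaded == SK_GDT_FIXMAP_RO) {
        c->loaded = SK_GDT_DIRECT;
        restore = true;
    }
    sk_ltr(c);
    if (restore)
        c->loaded = SK_GDT_FIXMAP_RO;
}
```

Calling sk_ltr() directly on the read-only mapping records a fault in this
model, while sk_load_tr_desc() sets the busy bit and leaves the read-only
mapping loaded afterwards.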

KVM SVM and VMX were adapted to use the writeable GDT. On VMX, the
per-cpu variable was removed for functions to fetch the original GDT.
Instead of reloading the previous GDT, VMX will reload the fixmap GDT as
expected. For testing, VMs were started and restored on multiple
configurations.

Signed-off-by: Thomas Garnier 
---
Based on next-20170213
---
 arch/x86/include/asm/desc.h  | 51 
 arch/x86/include/asm/processor.h |  1 +
 arch/x86/kernel/cpu/common.c | 28 +-
 arch/x86/kvm/svm.c   |  4 +---
 arch/x86/kvm/vmx.c   | 15 
 5 files changed, 75 insertions(+), 24 deletions(-)

diff --git a/arch/x86/include/asm/desc.h b/arch/x86/include/asm/desc.h
index 5d4ba1311737..15b2a86c9267 100644
--- a/arch/x86/include/asm/desc.h
+++ b/arch/x86/include/asm/desc.h
@@ -57,6 +57,17 @@ static inline unsigned long get_cpu_gdt_rw_vaddr(unsigned int cpu)
return (unsigned long)get_cpu_gdt_rw(cpu);
 }
 
+/* Provide the current original GDT */
+static inline struct desc_struct *get_current_gdt_rw(void)
+{
+   return this_cpu_ptr(&gdt_page)->gdt;
+}
+
+static inline unsigned long get_current_gdt_rw_vaddr(void)
+{
+   return (unsigned long)get_current_gdt_rw();
+}
+
 /* Get the fixmap index for a specific processor */
 static inline unsigned int get_cpu_gdt_ro_index(int cpu)
 {
@@ -233,11 +244,6 @@ static inline void native_set_ldt(const void *addr, unsigned int entries)
}
 }
 
-static inline void native_load_tr_desc(void)
-{
-   asm volatile("ltr %w0"::"q" (GDT_ENTRY_TSS*8));
-}
-
 static inline void native_load_gdt(const struct desc_ptr *dtr)
 {
asm volatile("lgdt %0"::"m" (*dtr));
@@ -258,6 +264,41 @@ static inline void native_store_idt(struct desc_ptr *dtr)
asm volatile("sidt %0":"=m" (*dtr));
 }
 
+/*
+ * The LTR instruction marks the TSS GDT entry as busy. On 64-bit, the GDT is
+ * a read-only remapping. To prevent a page fault, the GDT is switched to the
+ * original writeable version when needed.
+ */
+#ifdef CONFIG_X86_64
+static inline void native_load_tr_desc(void)
+{
+   struct desc_ptr gdt;
+   int cpu = raw_smp_processor_id();
+   bool restore = 0;
+   struct desc_struct *fixmap_gdt;
+
+   native_store_gdt(&gdt);
+   fixmap_gdt = get_cpu_gdt_ro(cpu);
+
+   /*
+* If the current GDT is the read-only fixmap, swap to the original
+* writeable version. Swap back at the end.
+*/
+   if (gdt.address == (unsigned long)fixmap_gdt) {
+   load_direct_gdt(cpu);
+   restore = 1;
+   }
+   asm volatile("ltr %w0"::"q" (GDT_ENTRY_TSS*8));
+   if (restore)
+   load_fixmap_gdt(cpu);
+}
+#else
+static inline void native_load_tr_desc(void)
+{
+   asm volatile("ltr %w0"::"q" (GDT_ENTRY_TSS*8));
+}
+#endif
+
 static inline unsigned long native_store_tr(void)
 {
unsigned long tr;
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index c441d1f7e275..6ea9e419a856 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -706,6 +706,7 @@ extern struct desc_ptr  early_gdt_descr;
 
 extern void cpu_set_gdt(int);
 extern void switch_to_new_gdt(int);
+extern void load_direct_gdt(int);
 extern void load_fixmap_gdt(int);
 extern void load_percpu_segment(int);
 extern void cpu_init(void);
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 2853a42ded2d..bdf521383900 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -444,13 +444,31 @@ void load_percpu_segment(int cpu)
load_stack_canary_segment();
 }
 
+/* On 64-bit the GDT remapping is read-only */
+#ifdef CONFIG_X86_64
+#define PAGE_FIXMAP_GDT PAGE_KERNEL_RO
+#else
+#define PAGE_FIXMAP_GDT PAGE_KERNEL
+#endif
+
 /* Setup the fixmap mapping only once per-processor */
 static inline void setup_fixmap_gdt(int cpu)
 {
__set_fixmap(get_cpu_gdt_ro_index(cpu),
-__pa(get_cpu_gdt_rw(cpu)), PAGE_KERNEL);
+__pa(get_cpu_gdt_rw(cpu)), PAGE_FIXMAP_GDT);
 }
 
+/* Load the original GDT from the per-cpu structure */
+void load_direct_gdt(int cpu)
+{
+   struct desc_ptr gdt_descr;
+
+   gdt_descr.address = (long)get_cpu_gdt_rw(cpu);
+   gdt_descr.size = GDT_SIZE - 1;
+   load_gdt(&gdt_descr);
+}
+EXPORT_SYMBOL_GPL(load_direct_gdt);
+
 /* Load a fixmap remapping of the per-cpu GDT */
 

Re: [Xen-devel] (resend) qemu crashes during VCPU hotplug

2017-02-16 Thread Boris Ostrovsky



On 02/16/2017 04:19 PM, Stefano Stabellini wrote:
> On Thu, 16 Feb 2017, Boris Ostrovsky wrote:
>> On 02/15/2017 11:20 PM, Boris Ostrovsky wrote:
>>> (Now with correct address for Stefano)
>>>
>>> Upstream qemu appears to be crashing during VCPU hotplug. I think this
>>> is something relatively new since I was doing this a few weeks ago.
>>>
>>> I reproduced this on two different setups. Haven't had a chance to look
>>> any further but e3cadac073 looks suspicious.
>>
>> Yes, this is the offending commit.
>>
>> For Xen guests qemu never sets pcms->fw_cfg.
>
> Thanks for narrowing it down. Are you using qemu-xen/staging?

Yes.

> It looks
> like it has been fixed in qemu.org by
>
> commit 26ef65beab852caf2b1ef4976e3473f2d525164d
> Author: Igor Mammedov
> Date:   Fri Dec 30 15:33:11 2016 +0100
>
> pc: fix crash in rtc_set_memory() if initial cpu is marked as hotplugged
>
> can you confirm?

Yes, this fixes it.

-boris

>> -boris


The crash happens in fw_cfg_modify_bytes_read() when we pass in NULL
pointer as first argument. The stack is below:


(gdb) where
#0  0x561d762d64d4 in fw_cfg_modify_bytes_read (s=0x0, key=5,
data=0x561d787031d0, len=2) at hw/nvram/fw_cfg.c:614
#1  0x561d762d6730 in fw_cfg_modify_i16 (s=0x0, key=5, value=2) at
hw/nvram/fw_cfg.c:656
#2  0x561d761195b3 in pc_cpu_plug (hotplug_dev=0x561d770f9810,
dev=0x561d7712a7e0, errp=0x7ffe8f75f2b0) at
/root/xen/tools/qemu-xen-dir/hw/i386/pc.c:1823
#3  0x561d76119fc0 in pc_machine_device_plug_cb
(hotplug_dev=0x561d770f9810, dev=0x561d7712a7e0, errp=0x7ffe8f75f2b0) at
/root/xen/tools/qemu-xen-dir/hw/i386/pc.c:1993
#4  0x561d76239cba in hotplug_handler_plug
(plug_handler=0x561d770f9810, plugged_dev=0x561d7712a7e0,
errp=0x7ffe8f75f2b0) at hw/core/hotplug.c:34
#5  0x561d7623584d in device_set_realized (obj=0x561d7712a7e0,
value=true, errp=0x7ffe8f75f468) at hw/core/qdev.c:928
#6  0x561d763e22a3 in property_set_bool (obj=0x561d7712a7e0,
v=0x561d78702090, name=0x561d764fd9d0 "realized", opaque=0x561d785aea00,
errp=0x7ffe8f75f468) at qom/object.c:1854
#7  0x561d763e07aa in object_property_set (obj=0x561d7712a7e0,
v=0x561d78702090, name=0x561d764fd9d0 "realized", errp=0x7ffe8f75f468)
at qom/object.c:1088
#8  0x561d763e3609 in object_property_set_qobject
(obj=0x561d7712a7e0, value=0x561d773869c0, name=0x561d764fd9d0
"realized", errp=0x7ffe8f75f468) at qom/qom-qobject.c:27
#9  0x561d763e0a40 in object_property_set_bool (obj=0x561d7712a7e0,
value=true, name=0x561d764fd9d0 "realized", errp=0x7ffe8f75f468) at
qom/object.c:1157
#10 0x561d76117304 in pc_new_cpu (typename=0x561d7707c880
"qemu32-i386-cpu", apic_id=1, errp=0x7ffe8f75f4c0) at
/root/xen/tools/qemu-xen-dir/hw/i386/pc.c:1099
#11 0x561d761174cc in pc_hot_add_cpu (id=1, errp=0x7ffe8f75f558) at
/root/xen/tools/qemu-xen-dir/hw/i386/pc.c:1131
#12 0x561d761cb7b3 in qmp_cpu_add (id=1, errp=0x7ffe8f75f558) at
qmp.c:126
#13 0x561d761bdc60 in qmp_marshal_cpu_add (args=0x561d7711a1b0,
ret=0x7ffe8f75f5b0, errp=0x7ffe8f75f5a8) at qmp-marshal.c:1274
#14 0x561d764b2f13 in do_qmp_dispatch (request=0x561d77129360,
errp=0x7ffe8f75f610) at qapi/qmp-dispatch.c:98
#15 0x561d764b3042 in qmp_dispatch (request=0x561d77129360) at
qapi/qmp-dispatch.c:125
#16 0x561d76084d39 in handle_qmp_command (parser=0x561d771288b0,
tokens=0x561d770f8cc0) at /root/xen/tools/qemu-xen-dir/monitor.c:3758
#17 0x561d764ba402 in json_message_process_token
(lexer=0x561d771288b8, input=0x561d770f9040, type=JSON_RCURLY, x=1,
y=11) at qobject/json-streamer.c:105
#18 0x561d764dd5dc in json_lexer_feed_char (lexer=0x561d771288b8,
ch=125 '}', flush=false) at qobject/json-lexer.c:319
#19 0x561d764dd71c in json_lexer_feed (lexer=0x561d771288b8,
buffer=0x7ffe8f75f880 "}\224Dx\035V", size=1) at qobject/json-lexer.c:369
#20 0x561d764ba4a2 in json_message_parser_feed
(parser=0x561d771288b0, buffer=0x7ffe8f75f880 "}\224Dx\035V", size=1) at
qobject/json-streamer.c:124
#21 0x561d76084e53 in monitor_qmp_read (opaque=0x561d77128830,
buf=0x7ffe8f75f880 "}\224Dx\035V", size=1) at
/root/xen/tools/qemu-xen-dir/monitor.c:3788
#22 0x561d761a3b2d in qemu_chr_be_write_impl (s=0x561d77107020,
buf=0x7ffe8f75f880 "}\224Dx\035V", len=1) at qemu-char.c:419
#23 0x561d761a3b8f in qemu_chr_be_write (s=0x561d77107020,
buf=0x7ffe8f75f880 "}\224Dx\035V", len=1) at qemu-char.c:431
#24 0x561d761a83d0 in tcp_chr_read (chan=0x561d785ae8a0,
cond=G_IO_IN, opaque=0x561d77107020) at qemu-char.c:3145
#25 0x561d76475a36 in qio_channel_fd_source_dispatch
(source=0x561d77cbe7c0, callback=0x561d761a8279 ,
user_data=0x561d77107020) at io/channel-watch.c:84
#26 0x7f77f3e407aa in g_main_context_dispatch () from
/lib64/libglib-2.0.so.0
#27 0x561d763f03ee in glib_pollfds_poll () at main-loop.c:259
#28 0x561d763f04dc in os_host_main_loop_wait (timeout=15045517) at
main-loop.c:306
#29 0x561d763f058c in main_loop_wait (nonblocking=0) at main-loop.c:556
#30 0x561d761b1cb5 in main_loop () at 

Re: [Xen-devel] [dpdk-dev] [PATCH] maintainers: claim responsability for xen

2017-02-16 Thread Vincent JARDIN

On 16/02/2017 at 14:36, Konrad Rzeszutek Wilk wrote:
>> Is it time now to officially remove Dom0 support?
> So we do have a prototype implementation of netback but it is waiting
> for review of xen-devel to the spec.
>
> And I believe the implementation does utilize some of the dom0
> parts of code in DPDK.

Please, do you have URLs/pointers about it? It would be interesting to
share it with the DPDK community too.


Best regards,
  Vincent



Re: [Xen-devel] [PATCH v15 05/10] x86: add multiboot2 protocol support for EFI platforms

2017-02-16 Thread Daniel Kiper
On Thu, Feb 16, 2017 at 02:29:45AM -0700, Jan Beulich wrote:
> >>> On 15.02.17 at 22:53,  wrote:
> > On Wed, Feb 15, 2017 at 03:22:02AM -0700, Jan Beulich wrote:
> >> >>> On 14.02.17 at 19:38,  wrote:
> >> > --- a/xen/arch/x86/boot/head.S
> >> > +++ b/xen/arch/x86/boot/head.S
> >> > @@ -394,10 +394,18 @@ __start:
> >> >
> >> >  /* EFI IA-32 platforms are not supported. */
> >> >  cmpl$MULTIBOOT2_TAG_TYPE_EFI32,MB2_tag_type(%ecx)
> >> > +/*
> >> > + * Here we should zap vga_text_buffer. However, we can disable
> >> > + * VGA updates in simpler and more reliable way later.
> >> > + */
> >> >  je  .Lmb2_efi_ia_32
> >> >
> >> >  /* Bootloader shutdown EFI x64 boot services. */
> >> >  cmpl$MULTIBOOT2_TAG_TYPE_EFI64,MB2_tag_type(%ecx)
> >> > +/*
> >> > + * Here we should zap vga_text_buffer. However, we can disable
> >> > + * VGA updates in simpler and more reliable way later.
> >> > + */
> >> >  je  .Lmb2_no_bs
> >>
> >> I'm afraid I don't view these comments as helpful in understanding
> >> the whole situation. That's partly because I don't follow both the
> >> "simpler" and "more reliable" parts (using just the information here,
> >
> > OK, I will clarify it.
> >
> >> i.e. leaving aside what you've given as explanation earlier, albeit I
> >> don't think that was fully clarifying things either), and partly
> >> because I continue to think that the explanation should go where
> >> the labels are (which is what I had meant to suggest with my
> >> comment placement in reply to v14). Nor does the adjustment
> >
> > OK.
> >
> >> above help (me) understand the correctness of the dual use of
> >> .Lmb2_no_bs.
> >
> > What do you mean by "dual use of .Lmb2_no_bs."? I would like to be sure.
>
> As said in v14 review, it's being jumped to from two rather different
> places, and hence the VGA aspect isn't obviously the same for both.

OK, I will try to clarify. If a bootloader calls us via __efi64_mb2_start, we
are sure that we are running on an EFI platform and that there is no VGA there.
This means we can safely zap vga_text_buffer unconditionally in the first steps
(we do that in the second instruction), so we do not need to take care of it in
the error paths. One of those errors is a missing MULTIBOOT2_TAG_TYPE_EFI_BS
tag, which means that EFI boot services have been shut down. We are then in a
black hole: we have to inform the user and halt the system. That is why we jump
to .Lmb2_no_bs here.

On the other hand, if the bootloader calls us via the start label, then in most
cases we are running on a legacy BIOS platform. However, if the bootloader also
provides the MULTIBOOT2_TAG_TYPE_EFI64 tag, we are sure that we are running on
an EFI platform with EFI boot services shut down. This happens when we are
loaded by an old bootloader which does not understand the
MULTIBOOT2_HEADER_TAG_EFI_BS and MULTIBOOT2_HEADER_TAG_ENTRY_ADDRESS_EFI64
tags. So, as above, we can jump to .Lmb2_no_bs here too.
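
Editorial note: the two cases above can be condensed into a small decision
function. This only models the reasoning in this message, not the actual
head.S control flow; the enum, struct and function names are invented for the
illustration.

```c
#include <stdbool.h>

/* Which entry point the bootloader used (names are illustrative). */
enum sk_entry { SK_ENTRY_START, SK_ENTRY_EFI64_MB2_START };

/* Which multiboot2 tags the bootloader passed along. */
struct sk_tags {
    bool efi64;  /* stands for MULTIBOOT2_TAG_TYPE_EFI64 */
    bool efi_bs; /* stands for MULTIBOOT2_TAG_TYPE_EFI_BS */
};

/* True when, per the explanation above, boot must stop at .Lmb2_no_bs. */
static bool sk_jumps_to_no_bs(enum sk_entry entry, struct sk_tags t)
{
    if (entry == SK_ENTRY_EFI64_MB2_START)
        return !t.efi_bs; /* EFI entry, but boot services already gone */

    return t.efi64;       /* legacy entry, yet the EFI64 tag is present */
}
```

Both paths end at the same label because both mean the same thing: an EFI
platform whose boot services are no longer available.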

I hope that helps.

Daniel



[Xen-devel] [xen-unstable-smoke test] 105860: regressions - trouble: broken/fail/pass

2017-02-16 Thread osstest service owner
flight 105860 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/105860/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-libvirt     14 guest-saverestore         fail REGR. vs. 105852
 test-armhf-armhf-xl          15 guest-start/debian.repeat fail REGR. vs. 105852
 test-amd64-amd64-xl-qemuu-debianhvm-i386 12 guest-saverestore fail REGR. vs. 105852

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-xl-xsm       1 build-check(1)            blocked  n/a
 build-arm64                   5 xen-build                 fail   never pass
 build-arm64-pvops             5 kernel-build              fail   never pass
 test-armhf-armhf-xl          12 migrate-support-check     fail   never pass
 test-armhf-armhf-xl          13 saverestore-support-check fail   never pass
 test-amd64-amd64-libvirt     12 migrate-support-check     fail   never pass

version targeted for testing:
 xen  79903e50dba9e7442c9b7ca424661bb020e9dbf2
baseline version:
 xen  7127d53fe891f9ea67357587a33a7aaba4b55f45

Last test of basis   105852  2017-02-16 14:01:33 Z    0 days
Failing since        105857  2017-02-16 16:01:30 Z    0 days    3 attempts
Testing same since   105858  2017-02-16 18:02:44 Z    0 days    2 attempts


People who touched revisions under test:
  Andrew Cooper 
  Daniel Kiper 
  Jan Beulich 
  Julien Grall 

jobs:
 build-amd64                                   pass
 build-arm64                                   fail
 build-armhf                                   pass
 build-amd64-libvirt                           pass
 build-arm64-pvops                             fail
 test-armhf-armhf-xl                           fail
 test-arm64-arm64-xl-xsm                       broken
 test-amd64-amd64-xl-qemuu-debianhvm-i386      fail
 test-amd64-amd64-libvirt                      fail



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.


commit 79903e50dba9e7442c9b7ca424661bb020e9dbf2
Author: Jan Beulich 
Date:   Thu Feb 16 18:11:42 2017 +0100

x86emul: catch exceptions occurring in stubs

Before adding more use of stubs cloned from decoded guest insns, guard
ourselves against mistakes there: Should an exception (with the
noteworthy exception of #PF) occur inside the stub, forward it to the
guest.

Since the exception fixup table entry can't encode the address of the
faulting insn itself, attach it to the return address instead. This at
once provides a convenient place to hand the exception information
back: The return address is being overwritten by it before branching to
the recovery code.

Take the opportunity and (finally!) add symbol resolution to the
respective log messages (the new one is intentionally not being coded
that way, as it covers stub addresses only, which don't have symbols
associated).

Also take the opportunity and make search_one_extable() static again.

Suggested-by: Andrew Cooper 
Signed-off-by: Jan Beulich 
Reviewed-by: Andrew Cooper 

commit 8c935f5ff1cac422b4de21cbab69e13d2ebb25be
Author: Daniel Kiper 
Date:   Thu Feb 16 18:10:04 2017 +0100

x86: add "w" flag to .init.data section definition

init.data section is clearly writable, so, add "w" flag to its
definition in xen/arch/x86/boot/x86_64.S.

Signed-off-by: Daniel Kiper 
Reviewed-by: Andrew Cooper 

commit e004384bb371f5ab76a79b83da79981f4c400b83
Author: Andrew Cooper 
Date:   Wed Feb 15 19:15:41 2017 +

x86/hypercall: Move hypercall continuation logic

The newly-repurposed arch/x86/hypercall.c is a more appropriate place for the
hypercall continuation logic to live.

This is purely code motion.

Signed-off-by: Andrew Cooper 

Re: [Xen-devel] (resend) qemu crashes during VCPU hotplug

2017-02-16 Thread Stefano Stabellini
On Thu, 16 Feb 2017, Boris Ostrovsky wrote:
> On 02/15/2017 11:20 PM, Boris Ostrovsky wrote:
> > (Now with correct address for Stefano)
> > 
> > Upstream qemu appears to be crashing during VCPU hotplug. I think this
> > is something relatively new since I was doing this a few weeks ago.
> > 
> > I reproduced this on two different setups. Haven't had a chance to look
> > any further but e3cadac073 looks suspicious.
> 
> Yes, this is the offending commit.
> 
> For Xen guests qemu never sets pcms->fw_cfg.

Thanks for narrowing it down. Are you using qemu-xen/staging? It looks
like it has been fixed in qemu.org by

commit 26ef65beab852caf2b1ef4976e3473f2d525164d
Author: Igor Mammedov 
Date:   Fri Dec 30 15:33:11 2016 +0100

pc: fix crash in rtc_set_memory() if initial cpu is marked as hotplugged

can you confirm?



> -boris
> 
> > 
> > The crash happens in fw_cfg_modify_bytes_read() when we pass in NULL
> > pointer as first argument. The stack is below:
> > 
> > 
> > (gdb) where
> > #0  0x561d762d64d4 in fw_cfg_modify_bytes_read (s=0x0, key=5,
> > data=0x561d787031d0, len=2) at hw/nvram/fw_cfg.c:614
> > #1  0x561d762d6730 in fw_cfg_modify_i16 (s=0x0, key=5, value=2) at
> > hw/nvram/fw_cfg.c:656
> > #2  0x561d761195b3 in pc_cpu_plug (hotplug_dev=0x561d770f9810,
> > dev=0x561d7712a7e0, errp=0x7ffe8f75f2b0) at
> > /root/xen/tools/qemu-xen-dir/hw/i386/pc.c:1823
> > #3  0x561d76119fc0 in pc_machine_device_plug_cb
> > (hotplug_dev=0x561d770f9810, dev=0x561d7712a7e0, errp=0x7ffe8f75f2b0) at
> > /root/xen/tools/qemu-xen-dir/hw/i386/pc.c:1993
> > #4  0x561d76239cba in hotplug_handler_plug
> > (plug_handler=0x561d770f9810, plugged_dev=0x561d7712a7e0,
> > errp=0x7ffe8f75f2b0) at hw/core/hotplug.c:34
> > #5  0x561d7623584d in device_set_realized (obj=0x561d7712a7e0,
> > value=true, errp=0x7ffe8f75f468) at hw/core/qdev.c:928
> > #6  0x561d763e22a3 in property_set_bool (obj=0x561d7712a7e0,
> > v=0x561d78702090, name=0x561d764fd9d0 "realized", opaque=0x561d785aea00,
> > errp=0x7ffe8f75f468) at qom/object.c:1854
> > #7  0x561d763e07aa in object_property_set (obj=0x561d7712a7e0,
> > v=0x561d78702090, name=0x561d764fd9d0 "realized", errp=0x7ffe8f75f468)
> > at qom/object.c:1088
> > #8  0x561d763e3609 in object_property_set_qobject
> > (obj=0x561d7712a7e0, value=0x561d773869c0, name=0x561d764fd9d0
> > "realized", errp=0x7ffe8f75f468) at qom/qom-qobject.c:27
> > #9  0x561d763e0a40 in object_property_set_bool (obj=0x561d7712a7e0,
> > value=true, name=0x561d764fd9d0 "realized", errp=0x7ffe8f75f468) at
> > qom/object.c:1157
> > #10 0x561d76117304 in pc_new_cpu (typename=0x561d7707c880
> > "qemu32-i386-cpu", apic_id=1, errp=0x7ffe8f75f4c0) at
> > /root/xen/tools/qemu-xen-dir/hw/i386/pc.c:1099
> > #11 0x561d761174cc in pc_hot_add_cpu (id=1, errp=0x7ffe8f75f558) at
> > /root/xen/tools/qemu-xen-dir/hw/i386/pc.c:1131
> > #12 0x561d761cb7b3 in qmp_cpu_add (id=1, errp=0x7ffe8f75f558) at
> > qmp.c:126
> > #13 0x561d761bdc60 in qmp_marshal_cpu_add (args=0x561d7711a1b0,
> > ret=0x7ffe8f75f5b0, errp=0x7ffe8f75f5a8) at qmp-marshal.c:1274
> > #14 0x561d764b2f13 in do_qmp_dispatch (request=0x561d77129360,
> > errp=0x7ffe8f75f610) at qapi/qmp-dispatch.c:98
> > #15 0x561d764b3042 in qmp_dispatch (request=0x561d77129360) at
> > qapi/qmp-dispatch.c:125
> > #16 0x561d76084d39 in handle_qmp_command (parser=0x561d771288b0,
> > tokens=0x561d770f8cc0) at /root/xen/tools/qemu-xen-dir/monitor.c:3758
> > #17 0x561d764ba402 in json_message_process_token
> > (lexer=0x561d771288b8, input=0x561d770f9040, type=JSON_RCURLY, x=1,
> > y=11) at qobject/json-streamer.c:105
> > #18 0x561d764dd5dc in json_lexer_feed_char (lexer=0x561d771288b8,
> > ch=125 '}', flush=false) at qobject/json-lexer.c:319
> > #19 0x561d764dd71c in json_lexer_feed (lexer=0x561d771288b8,
> > buffer=0x7ffe8f75f880 "}\224Dx\035V", size=1) at qobject/json-lexer.c:369
> > #20 0x561d764ba4a2 in json_message_parser_feed
> > (parser=0x561d771288b0, buffer=0x7ffe8f75f880 "}\224Dx\035V", size=1) at
> > qobject/json-streamer.c:124
> > #21 0x561d76084e53 in monitor_qmp_read (opaque=0x561d77128830,
> > buf=0x7ffe8f75f880 "}\224Dx\035V", size=1) at
> > /root/xen/tools/qemu-xen-dir/monitor.c:3788
> > #22 0x561d761a3b2d in qemu_chr_be_write_impl (s=0x561d77107020,
> > buf=0x7ffe8f75f880 "}\224Dx\035V", len=1) at qemu-char.c:419
> > #23 0x561d761a3b8f in qemu_chr_be_write (s=0x561d77107020,
> > buf=0x7ffe8f75f880 "}\224Dx\035V", len=1) at qemu-char.c:431
> > #24 0x561d761a83d0 in tcp_chr_read (chan=0x561d785ae8a0,
> > cond=G_IO_IN, opaque=0x561d77107020) at qemu-char.c:3145
> > #25 0x561d76475a36 in qio_channel_fd_source_dispatch
> > (source=0x561d77cbe7c0, callback=0x561d761a8279 ,
> > user_data=0x561d77107020) at io/channel-watch.c:84
> > #26 0x7f77f3e407aa in g_main_context_dispatch () from
> > /lib64/libglib-2.0.so.0
> > #27 0x561d763f03ee in glib_pollfds_poll 

Re: [Xen-devel] [PATCH v4 1/2] x86/paravirt: Change vcp_is_preempted() arg type to long

2017-02-16 Thread Waiman Long
On 02/16/2017 11:09 AM, Peter Zijlstra wrote:
> On Wed, Feb 15, 2017 at 04:37:49PM -0500, Waiman Long wrote:
>> The cpu argument in the function prototype of vcpu_is_preempted()
>> is changed from int to long. That makes it easier to provide a better
>> optimized assembly version of that function.
>>
>> For Xen, vcpu_is_preempted(long) calls xen_vcpu_stolen(int), the
>> downcast from long to int is not a problem as vCPU number won't exceed
>> 32 bits.
>>
> Note that because of the cast in PVOP_CALL_ARG1() this patch is
> pointless.
>
> Then again, it doesn't seem to affect code generation, so why not. Takes
> away the reliance on that weird cast.

I added this patch because I am a bit uneasy about clearing the upper 32
bits of rdi and assuming that the compiler won't have a previous use of
those bits. It gives me peace of mind.

Cheers,
Longman




Re: [Xen-devel] [PATCH v4 2/2] x86/kvm: Provide optimized version of vcpu_is_preempted() for x86-64

2017-02-16 Thread Waiman Long
On 02/16/2017 11:48 AM, Peter Zijlstra wrote:
> On Wed, Feb 15, 2017 at 04:37:50PM -0500, Waiman Long wrote:
>> +/*
>> + * Hand-optimize version for x86-64 to avoid 8 64-bit register saving and
>> + * restoring to/from the stack. It is assumed that the preempted value
>> + * is at an offset of 16 from the beginning of the kvm_steal_time structure
>> + * which is verified by the BUILD_BUG_ON() macro below.
>> + */
>> +#define PREEMPTED_OFFSET16
> As per Andrew's suggestion, the 'right' way is something like so.

Thanks for the tip. I was not aware of the asm-offsets stuff. I will
update the patch to use it.
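
Editorial note: both the hardcoded constant guarded by BUILD_BUG_ON() and the
asm-offsets approach come down to tying a number used by assembly to the C
structure layout. A user-space sketch of that idea follows. struct
sk_steal_time is a local mirror written from memory of the steal-time record,
so its layout is an assumption, and the SK_ and sk_ names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed mirror of the steal-time record shared with the guest. */
struct sk_steal_time {
    uint64_t steal;
    uint32_t version;
    uint32_t flags;
    uint8_t  preempted;
    uint8_t  u8_pad[3];
    uint32_t pad[11];
};

/* The offset that hand-written assembly would have to use. */
#define SK_PREEMPTED_OFFSET 16

/* Compile-time guard: the build breaks if the field ever moves. */
_Static_assert(offsetof(struct sk_steal_time, preempted) == SK_PREEMPTED_OFFSET,
               "preempted moved; assembly would read the wrong byte");

/* Reads the flag through the raw offset, as the assembly version would. */
static int sk_is_preempted(const struct sk_steal_time *st)
{
    const uint8_t *base = (const uint8_t *)st;

    return base[SK_PREEMPTED_OFFSET] != 0;
}
```

Generating the constant from offsetof(), which is what an asm-offsets style
mechanism does, removes the need for the guard because the number can no
longer drift from the structure.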

Cheers,
Longman




Re: [Xen-devel] [PATCH] xen/arm: increase default dom0_mem to 512M

2017-02-16 Thread Stefano Stabellini
On Thu, 16 Feb 2017, Julien Grall wrote:
> Hi Stefano,
> 
> On 15/02/2017 23:05, Stefano Stabellini wrote:
> > The default dom0_mem is 128M which is not sufficient to boot a Ubuntu
> > based Dom0. Increase it to 512M.
> > 
> > Signed-off-by: Stefano Stabellini 
> 
> I am not a big fan of increasing the default value. 128M is plenty enough if
> you use a small DOM0 (e.g buildroot or yocto) and people may rely on it
> because it is the default value in the documentation
> (see docs/misc/xen-command-line.markdown).
> 
> Also, 512M may boot Ubuntu for you but it might not be the case in all
> configurations. There is no perfect default value, but I think smaller is
> better. Looking at the documentation, it looks like x86 is using 128MB or 1/16
> of the memory (whichever is smaller).
> 
> But to be fair, I am not even sure why there is a default value, it is quite
> easy to specify the amount of memory used by DOM0 on the command line.

This is a topic particularly prone to bike-shedding :-)

Like you wrote, there is no perfect default value. The problem with
128M is that Dom0 will fail to boot without any meaningful errors. I
think it makes for a poor out of the box experience: the user is trying
to boot Xen for the first time on her board, she hasn't customized
much yet, and she has to waste a couple of hours to figure out why Dom0
is crashing.

On the other hand, people who are trying to use as little memory as
possible are well past the first Xen boot, and they are almost
certainly aware of the dom0_mem parameter.

In other words, setting dom0_mem to 128M by default hurts first time
users without helping seasoned users very much.

Rather than having dom0_mem=128M by default, causing a dom0 crash
without any obvious errors, I would rather crash Xen explicitly if
dom0_mem is not set. That way, the user is forced to type in the
dom0_mem parameter and could more easily guess why dom0 is crashing.


> > diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> > index c97a1f5..f4612a2 100644
> > --- a/xen/arch/arm/domain_build.c
> > +++ b/xen/arch/arm/domain_build.c
> > @@ -31,7 +31,7 @@ integer_param("dom0_max_vcpus", opt_dom0_max_vcpus);
> > 
> >  int dom0_11_mapping = 1;
> > 
> > -#define DOM0_MEM_DEFAULT 0x8000000 /* 128 MiB */
> > +#define DOM0_MEM_DEFAULT 0x20000000 /* 512 MiB */
> 
> I would use the MB(..) macro here to make the code more readable.

I'll do that
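
As a rough sketch of why a size macro reads better than a raw hex constant: the names `MB_EXAMPLE`, `DOM0_MEM_OLD`, `DOM0_MEM_NEW` and `bytes_to_mib()` below are hypothetical and defined locally for the example; they are not Xen's own definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for a size helper; the name is hypothetical. */
#define MB_EXAMPLE(x) ((uint64_t)(x) << 20)

#define DOM0_MEM_OLD MB_EXAMPLE(128)  /* 128 MiB, equal to 0x8000000 */
#define DOM0_MEM_NEW MB_EXAMPLE(512)  /* 512 MiB, equal to 0x20000000 */

/* Convert a byte count back to whole MiB, e.g. for a log message. */
static uint64_t bytes_to_mib(uint64_t bytes)
{
    assert((bytes & ((UINT64_C(1) << 20) - 1)) == 0); /* MiB multiple */
    return bytes >> 20;
}
```

With the macro, the intended size is visible at the definition and a dropped or extra zero in a hex literal cannot creep in.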


> >  static u64 __initdata dom0_mem = DOM0_MEM_DEFAULT;
> > 
> >  static void __init parse_dom0_mem(const char *s)
> > 




[Xen-devel] [PATCH] x86/mm: Swap mfn_valid() to use mfn_t

2017-02-16 Thread Andrew Cooper
Replace one opencoded mfn_eq() and some coding style issues on altered lines.
Swap __mfn_valid() to being bool, although it can't be updated to take mfn_t
because of include dependencies.

No functional change.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Stefano Stabellini 
CC: Julien Grall 
CC: Tim Deegan 
CC: George Dunlap 
---
 xen/arch/arm/mem_access.c   |  2 +-
 xen/arch/arm/mm.c   |  2 +-
 xen/arch/arm/p2m.c  |  6 +++---
 xen/arch/arm/setup.c|  3 ++-
 xen/arch/x86/cpu/mcheck/mce.c   |  2 +-
 xen/arch/x86/cpu/mcheck/vmce.c  |  2 +-
 xen/arch/x86/cpu/vpmu.c |  2 +-
 xen/arch/x86/debug.c|  2 +-
 xen/arch/x86/domctl.c   |  2 +-
 xen/arch/x86/hvm/mtrr.c |  2 +-
 xen/arch/x86/mm.c   | 20 ++--
 xen/arch/x86/mm/hap/guest_walk.c|  2 +-
 xen/arch/x86/mm/hap/hap.c   |  2 --
 xen/arch/x86/mm/hap/nested_hap.c|  2 --
 xen/arch/x86/mm/mem_access.c|  4 ++--
 xen/arch/x86/mm/mem_sharing.c   |  4 +---
 xen/arch/x86/mm/p2m-ept.c   |  4 ++--
 xen/arch/x86/mm/p2m-pod.c   |  2 --
 xen/arch/x86/mm/p2m-pt.c|  2 --
 xen/arch/x86/mm/p2m.c   |  2 --
 xen/arch/x86/mm/paging.c|  2 --
 xen/arch/x86/mm/shadow/private.h|  2 --
 xen/arch/x86/tboot.c|  4 ++--
 xen/arch/x86/x86_64/mm.c| 16 
 xen/arch/x86/x86_64/traps.c | 14 +++---
 xen/common/grant_table.c| 10 +-
 xen/common/memory.c |  8 
 xen/common/page_alloc.c | 12 ++--
 xen/common/pdx.c|  2 +-
 xen/drivers/passthrough/amd/iommu_guest.c   | 10 +-
 xen/drivers/passthrough/amd/pci_amd_iommu.c |  2 +-
 xen/drivers/passthrough/vtd/dmar.c  |  2 +-
 xen/drivers/passthrough/vtd/x86/vtd.c   |  2 +-
 xen/include/asm-arm/mm.h|  4 ++--
 xen/include/asm-arm/p2m.h   |  2 +-
 xen/include/asm-x86/p2m.h   |  2 +-
 xen/include/asm-x86/page.h  |  2 +-
 xen/include/xen/pdx.h   |  2 +-
 xen/include/xen/tmem_xen.h  |  2 +-
 39 files changed, 77 insertions(+), 92 deletions(-)

diff --git a/xen/arch/arm/mem_access.c b/xen/arch/arm/mem_access.c
index 03b20c4..04b1506 100644
--- a/xen/arch/arm/mem_access.c
+++ b/xen/arch/arm/mem_access.c
@@ -172,7 +172,7 @@ p2m_mem_access_check_and_get_page(vaddr_t gva, unsigned 
long flag,
 if ( mfn_eq(mfn, INVALID_MFN) )
 goto err;
 
-if ( !mfn_valid(mfn_x(mfn)) )
+if ( !mfn_valid(mfn) )
 goto err;
 
 /*
diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index 2d96423..f0a2edd 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -1350,7 +1350,7 @@ int replace_grant_host_mapping(unsigned long addr, 
unsigned long mfn,
 
 bool is_iomem_page(mfn_t mfn)
 {
-return !mfn_valid(mfn_x(mfn));
+return !mfn_valid(mfn);
 }
 
 void clear_and_clean_page(struct page_info *page)
diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 5e8f6cd..e36d075 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -648,7 +648,7 @@ static void p2m_put_l3_page(const lpae_t pte)
 {
 unsigned long mfn = pte.p2m.base;
 
-ASSERT(mfn_valid(mfn));
+ASSERT(mfn_valid(_mfn(mfn)));
 put_page(mfn_to_page(mfn));
 }
 }
@@ -695,7 +695,7 @@ static void p2m_free_entry(struct p2m_domain *p2m,
 p2m_flush_tlb_sync(p2m);
 
 mfn = _mfn(entry.p2m.base);
-ASSERT(mfn_valid(mfn_x(mfn)));
+ASSERT(mfn_valid(mfn));
 
 free_domheap_page(mfn_to_page(mfn_x(mfn)));
 }
@@ -1412,7 +1412,7 @@ struct page_info *get_page_from_gva(struct vcpu *v, 
vaddr_t va,
 if ( rc )
 goto err;
 
-if ( !mfn_valid(maddr >> PAGE_SHIFT) )
+if ( !mfn_valid(_mfn(maddr >> PAGE_SHIFT)) )
 goto err;
 
 page = mfn_to_page(maddr >> PAGE_SHIFT);
diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index 2bf4363..b25ad80 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -268,7 +268,8 @@ void __init discard_initial_modules(void)
 if ( mi->module[i].kind == BOOTMOD_XEN )
 continue;
 
-if ( !mfn_valid(paddr_to_pfn(s)) || !mfn_valid(paddr_to_pfn(e)))
+if ( !mfn_valid(_mfn(paddr_to_pfn(s))) ||
+ !mfn_valid(_mfn(paddr_to_pfn(e))) )
 continue;
 
 dt_unreserved_regions(s, e, init_domheap_pages, 0);
diff --git a/xen/arch/x86/cpu/mcheck/mce.c b/xen/arch/x86/cpu/mcheck/mce.c

Re: [Xen-devel] [PATCH v4 1/2] arm: read/write rank->vcpu atomically

2017-02-16 Thread Stefano Stabellini
On Thu, 16 Feb 2017, Julien Grall wrote:
> Hi Stefano,
> 
> On 11/02/17 02:05, Stefano Stabellini wrote:
> > We don't need a lock in vgic_get_target_vcpu anymore, solving the
> > following lock inversion bug: the rank lock should be taken first, then
> > the vgic lock. However, gic_update_one_lr is called with the vgic lock
> > held, and it calls vgic_get_target_vcpu, which tries to obtain the rank
> > lock.
> > 
> > Coverity-ID: 1381855
> > Coverity-ID: 1381853
> > 
> > Signed-off-by: Stefano Stabellini 
> > ---
> >  xen/arch/arm/vgic-v2.c |  6 +++---
> >  xen/arch/arm/vgic-v3.c |  6 +++---
> >  xen/arch/arm/vgic.c| 27 +--
> >  3 files changed, 11 insertions(+), 28 deletions(-)
> > 
> > diff --git a/xen/arch/arm/vgic-v2.c b/xen/arch/arm/vgic-v2.c
> > index 3dbcfe8..b30379e 100644
> > --- a/xen/arch/arm/vgic-v2.c
> > +++ b/xen/arch/arm/vgic-v2.c
> > @@ -79,7 +79,7 @@ static uint32_t vgic_fetch_itargetsr(struct vgic_irq_rank
> > *rank,
> >  offset &= ~(NR_TARGETS_PER_ITARGETSR - 1);
> > 
> >  for ( i = 0; i < NR_TARGETS_PER_ITARGETSR; i++, offset++ )
> > -reg |= (1 << rank->vcpu[offset]) << (i * NR_BITS_PER_TARGET);
> > +reg |= (1 << read_atomic(&rank->vcpu[offset])) << (i *
> > NR_BITS_PER_TARGET);
> 
> I was about to suggest turning vcpu into an atomic_t to catch potential
> misuse. But unfortunately atomic_t is int.

Indeed


> So I would probably add a comment on top of the field vcpu in vgic_irq_rank
> explaining that vcpu should be read using atomic.
>
> With that:
> 
> Reviewed-by: Julien Grall 

Thank you, I added:

+ * Use atomic operations to read/write the vcpu fields to avoid
+ * taking the rank lock.

and committed.
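
The lock-free access pattern discussed here can be sketched with C11 atomics. This is an assumption-laden illustration, not the Xen code: Xen uses its own read_atomic()/write_atomic() helpers, while the sketch below substitutes `<stdatomic.h>`, and `struct rank_example`, `set_target()` and `fetch_targets()` are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define TARGETS_PER_REG  4   /* four targets per 32-bit register */
#define BITS_PER_TARGET  8   /* one byte per target */

/* Hypothetical, simplified rank: one atomically accessed vCPU id per IRQ. */
struct rank_example {
    _Atomic uint8_t vcpu[32];
};

/* Writer side: publish a new target without taking a lock. */
static void set_target(struct rank_example *r, unsigned int irq, uint8_t vcpu)
{
    atomic_store_explicit(&r->vcpu[irq], vcpu, memory_order_relaxed);
}

/* Reader side: build the register value from four atomic loads. */
static uint32_t fetch_targets(struct rank_example *r, unsigned int offset)
{
    uint32_t reg = 0;

    offset &= ~(TARGETS_PER_REG - 1u);  /* align to the first target */
    for (unsigned int i = 0; i < TARGETS_PER_REG; i++, offset++) {
        uint8_t vcpu = atomic_load_explicit(&r->vcpu[offset],
                                            memory_order_relaxed);

        assert(vcpu < BITS_PER_TARGET);  /* one-hot bit must fit its byte */
        reg |= (UINT32_C(1) << vcpu) << (i * BITS_PER_TARGET);
    }
    return reg;
}
```

Each element is read or written in a single indivisible access, so a reader never observes a torn value even though no lock is held. Ordering between different elements is not guaranteed by relaxed accesses.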



Re: [Xen-devel] Xen on ARM IRQ latency and scheduler overhead

2017-02-16 Thread Stefano Stabellini
On Thu, 16 Feb 2017, Dario Faggioli wrote:
> On Fri, 2017-02-10 at 10:32 -0800, Stefano Stabellini wrote:
> > On Fri, 10 Feb 2017, Dario Faggioli wrote:
> > > Right, interesting use case. I'm glad to see there's some interest
> > > in
> > > it, and am happy to help investigating, and trying to make things
> > > better.
> > 
> > Thank you!
> > 
> Hey, FYI, I am looking into this. It's just that I've got a couple of
> other things in my plate right now.

OK


> > > Ok, do you (or anyone) mind explaining in a little bit more details
> > > what the app tries to measure and how it does that.
> > 
> > Give a look at app/xen/guest_irq_latency/apu.c:
> > 
> > https://github.com/edgarigl/tbm/blob/master/app/xen/guest_irq_latency
> > /apu.c
> > 
> > This is my version which uses the phys_timer (instead of the
> > virt_timer):
> > 
> > https://github.com/sstabellini/tbm/blob/phys-timer/app/xen/guest_irq_
> > latency/apu.c
> > 
> Yep, I did look at those.
> 
> > Edgar can jump in to add more info if needed (he is the author of the
> > app), but as you can see from the code, the app is very simple. It
> > sets
> > a timer event in the future, then, after receiving the event, it
> > checks
> > the current time and compare it with the deadline.
> > 
> Right, and you check the current time with:
> 
>   now = aarch64_irq_get_stamp(el);
> 
> which I guess is compatible with the values you use for the counter.

Yes


> > > > These are the results, in nanosec:
> > > > 
> > > >     AVG MIN MAX WARM MAX
> > > > 
> > > > NODEBUG no WFI  1890    1800    3170    2070
> > > > NODEBUG WFI 4850    4810    7030    4980
> > > > NODEBUG no WFI credit2  2217    2090    3420    2650
> > > > NODEBUG WFI credit2 8080    7890    10320   8300
> > > > 
> > > > DEBUG no WFI    2252    2080    3320    2650
> > > > DEBUG WFI   6500    6140    8520    8130
> > > > DEBUG WFI, credit2  8050    7870    10680   8450
> > > > 
> > > > DEBUG means Xen DEBUG build.
> > > > 
> [...]
> > > > As you can see, depending on whether the guest issues a WFI or
> > > > not
> > > > while
> > > > waiting for interrupts, the results change significantly.
> > > > Interestingly,
> > > > credit2 does worse than credit1 in this area.
> > > > 
> > > This is with current staging right? 
> > 
> > That's right.
> > 
> So, when you have the chance, can I see the output of
> 
>  xl debug-key r
>  xl dmesg
> 
> Both under Credit1 and Credit2?

I'll see what I can do.


> > > I can try sending a quick patch for disabling the tick when a CPU
> > > is
> > > idle, but I'd need your help in testing it.
> > 
> > That might be useful, however, if I understand this right, we don't
> > actually want a periodic timer in Xen just to make the system more
> > responsive, do we?
> > 
> IMO, no. I'd call that a hack, and don't think we should go that
> route.
> 
> Not until we have figured out and squeezed as much as possible all the
> other sources of latency, and that has proven not to be enough, at
> least.
> 
> I'll send the patch.
> 
> > > > Assuming that the problem is indeed the scheduler, one workaround
> > > > that
> > > > we could introduce today would be to avoid calling vcpu_unblock
> > > > on
> > > > guest
> > > > WFI and call vcpu_yield instead. This change makes things
> > > > significantly
> > > > better:
> > > > 
> > > >                                      AVG     MIN     MAX     WARM MAX
> > > > DEBUG WFI (yield, no block)  2900    2190    5130    5130
> > > > DEBUG WFI (yield, no block) credit2  3514    2280    6180    5430
> > > > 
> > > > Is that a reasonable change to make? Would it cause significantly
> > > > more
> > > > power consumption in Xen (because xen/arch/arm/domain.c:idle_loop
> > > > might
> > > > not be called anymore)?
> > > > 
> > > Exactly. So, I think that, as Linux has 'idle=poll', it is
> > > conceivable
> > > to have something similar in Xen, and if we do, I guess it can be
> > > implemented as you suggest.
> > > 
> > > But, no, I don't think this is satisfying as default, not before
> > > trying
> > > to figure out what is going on, and if we can improve things in
> > > other
> > > ways.
> > 
> > OK. Should I write a patch for that? I guess it would be arm specific
> > initially. What do you think it would be a good name for the option?
> > 
> Well, I think such an option may be useful on other arches too, but we
> better measure/verify that before. Therefore, I'd be ok for this to be
> only implemented on ARM for now.
> 
> As per the name, I actually like the 'idle=', and as values, what about
> 'sleep' or 'block' for the current default, and stick to 'poll' for the
> new behavior you'll implement? Or do you think it is at risk of
> confusion with Linux?
> 
> An alternative would be something like 'wfi=[sleep,idle]', or
> 'wfi=[block,poll]', but that is ARM specific, and it'd mean we will
> need another option for making x86 behave similarly.

That's a 

[Xen-devel] [xen-unstable-smoke test] 105858: regressions - trouble: broken/fail/pass

2017-02-16 Thread osstest service owner
flight 105858 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/105858/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-libvirt 14 guest-saverestore         fail REGR. vs. 105852
 test-armhf-armhf-xl      15 guest-start/debian.repeat fail REGR. vs. 105852
 test-amd64-amd64-xl-qemuu-debianhvm-i386 12 guest-saverestore fail REGR. vs. 105852

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-xl-xsm   1 build-check(1)            blocked n/a
 build-arm64               5 xen-build                 fail    never pass
 build-arm64-pvops         5 kernel-build              fail    never pass
 test-armhf-armhf-xl      12 migrate-support-check     fail    never pass
 test-armhf-armhf-xl      13 saverestore-support-check fail    never pass
 test-amd64-amd64-libvirt 12 migrate-support-check     fail    never pass

version targeted for testing:
 xen  79903e50dba9e7442c9b7ca424661bb020e9dbf2
baseline version:
 xen  7127d53fe891f9ea67357587a33a7aaba4b55f45

Last test of basis   105852  2017-02-16 14:01:33 Z    0 days
Failing since        105857  2017-02-16 16:01:30 Z    0 days    2 attempts
Testing same since   105858  2017-02-16 18:02:44 Z    0 days    1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Daniel Kiper 
  Jan Beulich 
  Julien Grall 

jobs:
 build-amd64                               pass
 build-arm64                               fail
 build-armhf                               pass
 build-amd64-libvirt                       pass
 build-arm64-pvops                         fail
 test-armhf-armhf-xl                       fail
 test-arm64-arm64-xl-xsm                   broken
 test-amd64-amd64-xl-qemuu-debianhvm-i386  fail
 test-amd64-amd64-libvirt                  fail



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.


commit 79903e50dba9e7442c9b7ca424661bb020e9dbf2
Author: Jan Beulich 
Date:   Thu Feb 16 18:11:42 2017 +0100

x86emul: catch exceptions occurring in stubs

Before adding more use of stubs cloned from decoded guest insns, guard
ourselves against mistakes there: Should an exception (with the
noteworthy exception of #PF) occur inside the stub, forward it to the
guest.

Since the exception fixup table entry can't encode the address of the
faulting insn itself, attach it to the return address instead. This at
once provides a convenient place to hand the exception information
back: The return address is being overwritten by it before branching to
the recovery code.

Take the opportunity and (finally!) add symbol resolution to the
respective log messages (the new one is intentionally not being coded
that way, as it covers stub addresses only, which don't have symbols
associated).

Also take the opportunity and make search_one_extable() static again.

Suggested-by: Andrew Cooper 
Signed-off-by: Jan Beulich 
Reviewed-by: Andrew Cooper 

commit 8c935f5ff1cac422b4de21cbab69e13d2ebb25be
Author: Daniel Kiper 
Date:   Thu Feb 16 18:10:04 2017 +0100

x86: add "w" flag to .init.data section definition

init.data section is clearly writable, so, add "w" flag to its
definition in xen/arch/x86/boot/x86_64.S.

Signed-off-by: Daniel Kiper 
Reviewed-by: Andrew Cooper 

commit e004384bb371f5ab76a79b83da79981f4c400b83
Author: Andrew Cooper 
Date:   Wed Feb 15 19:15:41 2017 +

x86/hypercall: Move hypercall continuation logic

The newly-repurposed arch/x86/hypercall.c is a more appropriate place for 
the
hypercall continuation logic to live.

This is purely code motion.

Signed-off-by: Andrew Cooper 

[Xen-devel] [xen-unstable test] 105840: tolerable FAIL - PUSHED

2017-02-16 Thread osstest service owner
flight 105840 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/105840/

Failures :-/ but no regressions.

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-libvirt             13 saverestore-support-check fail like 105821
 test-armhf-armhf-libvirt-xsm         13 saverestore-support-check fail like 105821
 test-amd64-i386-xl-qemuu-win7-amd64  16 guest-stop                fail like 105821
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop                fail like 105821
 test-amd64-i386-xl-qemut-win7-amd64  16 guest-stop                fail like 105821
 test-armhf-armhf-libvirt-raw         12 saverestore-support-check fail like 105821
 test-amd64-amd64-xl-rtds              9 debian-install            fail like 105821
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop                fail like 105821

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-libvirt-xsm       1 build-check(1)            blocked n/a
 test-arm64-arm64-xl                1 build-check(1)            blocked n/a
 build-arm64-libvirt                1 build-check(1)            blocked n/a
 test-arm64-arm64-libvirt-qcow2     1 build-check(1)            blocked n/a
 test-arm64-arm64-libvirt           1 build-check(1)            blocked n/a
 test-arm64-arm64-xl-credit2        1 build-check(1)            blocked n/a
 test-arm64-arm64-xl-rtds           1 build-check(1)            blocked n/a
 test-arm64-arm64-xl-multivcpu      1 build-check(1)            blocked n/a
 test-arm64-arm64-xl-xsm            1 build-check(1)            blocked n/a
 test-amd64-amd64-xl-pvh-amd       11 guest-start               fail    never pass
 test-amd64-i386-libvirt-xsm       12 migrate-support-check     fail    never pass
 test-amd64-i386-libvirt           12 migrate-support-check     fail    never pass
 test-amd64-amd64-libvirt-xsm      12 migrate-support-check     fail    never pass
 test-amd64-amd64-libvirt          12 migrate-support-check     fail    never pass
 test-amd64-amd64-xl-pvh-intel     11 guest-start               fail    never pass
 build-arm64-xsm                    5 xen-build                 fail    never pass
 build-arm64                        5 xen-build                 fail    never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 build-arm64-pvops                  5 kernel-build              fail    never pass
 test-amd64-amd64-libvirt-vhd      11 migrate-support-check     fail    never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2  fail    never pass
 test-armhf-armhf-xl-credit2       12 migrate-support-check     fail    never pass
 test-armhf-armhf-xl               12 migrate-support-check     fail    never pass
 test-armhf-armhf-xl-credit2       13 saverestore-support-check fail    never pass
 test-armhf-armhf-libvirt          12 migrate-support-check     fail    never pass
 test-armhf-armhf-xl               13 saverestore-support-check fail    never pass
 test-armhf-armhf-xl-cubietruck    12 migrate-support-check     fail    never pass
 test-armhf-armhf-xl-cubietruck    13 saverestore-support-check fail    never pass
 test-armhf-armhf-libvirt-xsm      12 migrate-support-check     fail    never pass
 test-armhf-armhf-xl-vhd           11 migrate-support-check     fail    never pass
 test-armhf-armhf-xl-vhd           12 saverestore-support-check fail    never pass
 test-armhf-armhf-xl-xsm           12 migrate-support-check     fail    never pass
 test-armhf-armhf-xl-xsm           13 saverestore-support-check fail    never pass
 test-armhf-armhf-xl-arndale       12 migrate-support-check     fail    never pass
 test-armhf-armhf-xl-arndale       13 saverestore-support-check fail    never pass
 test-armhf-armhf-xl-multivcpu     12 migrate-support-check     fail    never pass
 test-armhf-armhf-xl-multivcpu     13 saverestore-support-check fail    never pass
 test-armhf-armhf-xl-rtds          12 migrate-support-check     fail    never pass
 test-armhf-armhf-xl-rtds          13 saverestore-support-check fail    never pass
 test-armhf-armhf-libvirt-raw      11 migrate-support-check     fail    never pass

version targeted for testing:
 xen  1e88db4701d6e2d00c04795e6aacaea942b617e6
baseline version:
 xen  93e1435290867703c50acad1f54b9208df473562

Last test of basis   105821  2017-02-15 15:11:39 Z    1 days
Testing same since   105840  2017-02-16 07:20:11 Z    0 days    1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Dario Faggioli 
  George Dunlap 
  Ian Jackson 
  Jan Beulich 
  Julien Grall 
  Konrad Rzeszutek Wilk 
  Meng Xu 
  Olaf Hering 
  Oleksandr Andrushchenko 
  Stefano 

Re: [Xen-devel] [PATCH v4 2/2] arm: proper ordering for correct execution of gic_update_one_lr and vgic_store_itargetsr

2017-02-16 Thread Julien Grall

Hi Stefano,

On 11/02/17 02:05, Stefano Stabellini wrote:

Concurrent execution of gic_update_one_lr and vgic_store_itargetsr can
result in the wrong pcpu being set as irq target, see
http://marc.info/?l=xen-devel&m=148218667104072.

To solve the issue, add barriers, remove an irq from the inflight
queue, only after the affinity has been set. On the other end, write the
new vcpu target, before checking GIC_IRQ_GUEST_MIGRATING and inflight.

Signed-off-by: Stefano Stabellini 
---
 xen/arch/arm/gic.c | 3 ++-
 xen/arch/arm/vgic-v2.c | 4 ++--
 xen/arch/arm/vgic-v3.c | 4 +++-
 3 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index a5348f2..bb52959 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -503,12 +503,13 @@ static void gic_update_one_lr(struct vcpu *v, int i)
  !test_bit(GIC_IRQ_GUEST_MIGRATING, &p->status) )
 gic_raise_guest_irq(v, irq, p->priority);
 else {
-list_del_init(&p->inflight);
 if ( test_and_clear_bit(GIC_IRQ_GUEST_MIGRATING, &p->status) )
 {
 struct vcpu *v_target = vgic_get_target_vcpu(v, irq);
 irq_set_affinity(p->desc, cpumask_of(v_target->processor));
 }
+smp_mb();
+list_del_init(&p->inflight);


I don't understand why you remove from the inflight list afterwards. If 
you do that you introduce the same problem as discussed in

<7a78c859-fa6f-ba10-b574-d8edd46ea...@arm.com>

As long as the interrupt is routed to the pCPU running 
gic_update_one_lr, the interrupt cannot fire because interrupts are 
masked. However, as soon as irq_set_affinity is called, the interrupt may 
fire on the other pCPU.


However, list_del_init is not atomic and not protected by any lock. So 
vgic_vcpu_inject_irq may see a corrupted version of {p,n}->inflight.


Did I miss anything?

Cheers,

--
Julien Grall



Re: [Xen-devel] [PATCH 1/3] fuzz/x86emul: avoid race in link farm rune

2017-02-16 Thread Andrew Cooper
On 16/02/17 18:56, Wei Liu wrote:
> Several `ln -sf` can race with each other and cause error like:
>
> 14:43:56 00:07:06 O: ln: cannot remove 'asm': No such file or directory
>
> Provide dedicated targets for soft-linking directories.
>
> Reported-by: Andrew Cooper 
> Signed-off-by: Wei Liu 
> ---
>  tools/fuzz/x86_instruction_emulator/Makefile | 8 ++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/tools/fuzz/x86_instruction_emulator/Makefile 
> b/tools/fuzz/x86_instruction_emulator/Makefile
> index fede7e9afd..0cd49753cf 100644
> --- a/tools/fuzz/x86_instruction_emulator/Makefile
> +++ b/tools/fuzz/x86_instruction_emulator/Makefile
> @@ -8,12 +8,16 @@ else
>  x86-instruction-emulator-fuzzer-all:
>  endif
>  
> -x86_emulate/x86_emulate.c x86_emulate/x86_emulate.h:
> +x86_emulate:
>   [ -L x86_emulate ] || ln -sf $(XEN_ROOT)/xen/arch/x86/x86_emulate .
>  
> -asm/x86-vendors.h asm/x86-defns.h asm/msr-index.h:
> +x86_emulate/x86_emulate.c x86_emulate/x86_emulate.h: x86_emulate

You should be able to do this:

x86_emulate/%: x86_emulate

> +
> +asm:
>   [ -L asm ] || ln -sf $(XEN_ROOT)/xen/include/asm-x86 asm
>  
> +asm/x86-vendors.h asm/x86-defns.h asm/msr-index.h: asm

And this:

asm/%: asm

Otherwise, Reviewed-by: Andrew Cooper 

> +
>  x86_emulate.c x86_emulate.h: %:
>   [ -L $* ] || ln -sf $(XEN_ROOT)/tools/tests/x86_emulator/$*
>  




Re: [Xen-devel] [PATCH 04/28] ARM: GICv3 ITS: allocate device and collection table

2017-02-16 Thread Shanker Donthineni

Hi Andre,


On 01/30/2017 12:31 PM, Andre Przywara wrote:

Each ITS maps a pair of a DeviceID (usually the PCI b/d/f triplet) and
an EventID (the MSI payload or interrupt ID) to a pair of LPI number
and collection ID, which points to the target CPU.
This mapping is stored in the device and collection tables, which software
has to provide for the ITS to use.
Allocate the required memory and hand it to the ITS.
The maximum number of devices is limited to a compile-time constant
exposed in Kconfig.

Signed-off-by: Andre Przywara 
---
  xen/arch/arm/Kconfig |  14 +
  xen/arch/arm/gic-v3-its.c| 129
+++
  xen/arch/arm/gic-v3.c|   5 ++
  xen/include/asm-arm/gic_v3_its.h |  55 -
  4 files changed, 202 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
index 71734a1..81bc233 100644
--- a/xen/arch/arm/Kconfig
+++ b/xen/arch/arm/Kconfig
@@ -64,6 +64,20 @@ config MAX_PHYS_LPI_BITS
This can be overriden on the command line with the max_lpi_bits
parameter.

+config MAX_PHYS_ITS_DEVICE_BITS
+depends on HAS_ITS
+int "Number of device bits the ITS supports"
+range 1 32
+default "10"
+help
+  Specifies the maximum number of devices which want to use the
ITS.
+  Xen needs to allocates memory for the whole range very early.
+  The allocation scheme may be sparse, so a much larger number must
+  be supported to cover devices with a high bus number or those on
+  separate bus segments.
+  This can be overriden on the command line with the
max_its_device_bits
+  parameter.
+
  endmenu

  menu "ARM errata workaround via the alternative framework"
diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index ff0f571..c31fef6 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -20,9 +20,138 @@
  #include 
  #include 
  #include 
+#include 
+#include 
  #include 
  #include 
  #include 
+#include 
+
+#define BASER_ATTR_MASK   \
+((0x3UL << GITS_BASER_SHAREABILITY_SHIFT)   | \
+ (0x7UL << GITS_BASER_OUTER_CACHEABILITY_SHIFT) | \
+ (0x7UL << GITS_BASER_INNER_CACHEABILITY_SHIFT))
+#define BASER_RO_MASK   (GENMASK(58, 56) | GENMASK(52, 48))
+
+static uint64_t encode_phys_addr(paddr_t addr, int page_bits)
+{
+uint64_t ret;
+
+if ( page_bits < 16 )
+return (uint64_t)addr & GENMASK(47, page_bits);
+
+ret = addr & GENMASK(47, 16);
+return ret | (addr & GENMASK(51, 48)) >> (48 - 12);
+}
+
+#define PAGE_BITS(sz) ((sz) * 2 + PAGE_SHIFT)
+
+static int its_map_baser(void __iomem *basereg, uint64_t regc, int
nr_items)
+{
+uint64_t attr, reg;
+int entry_size = ((regc >> GITS_BASER_ENTRY_SIZE_SHIFT) & 0x1f) + 1;
+int pagesz = 0, order, table_size;
+void *buffer = NULL;
+
+attr  = GIC_BASER_InnerShareable << GITS_BASER_SHAREABILITY_SHIFT;
+attr |= GIC_BASER_CACHE_SameAsInner <<
GITS_BASER_OUTER_CACHEABILITY_SHIFT;
+attr |= GIC_BASER_CACHE_RaWaWb << GITS_BASER_INNER_CACHEABILITY_SHIFT;
+
+/*
+ * Setup the BASE register with the attributes that we like. Then read
+ * it back and see what sticks (page size, cacheability and
shareability
+ * attributes), retrying if necessary.
+ */
+while ( 1 )
+{
+table_size = ROUNDUP(nr_items * entry_size,
BIT(PAGE_BITS(pagesz)));
+order = get_order_from_bytes(table_size);
+
+if ( !buffer )
+buffer = alloc_xenheap_pages(order, 0);
+if ( !buffer )
+return -ENOMEM;
+
+reg  = attr;
+reg |= (pagesz << GITS_BASER_PAGE_SIZE_SHIFT);
+reg |= table_size >> PAGE_BITS(pagesz);
+reg |= regc & BASER_RO_MASK;
+reg |= GITS_VALID_BIT;
+reg |= encode_phys_addr(virt_to_maddr(buffer), PAGE_BITS(pagesz));
+
+writeq_relaxed(reg, basereg);
+regc = readl_relaxed(basereg);

Shouldn't this be regc = readq_relaxed(basereg)? The register was just
written with writeq_relaxed().


+
+/* The host didn't like our attributes, just use what it returned.
*/
+if ( (regc & BASER_ATTR_MASK) != attr )
+{
+/* If we can't map it shareable, drop cacheability as well. */
+if ( (regc & GITS_BASER_SHAREABILITY_MASK) ==
GIC_BASER_NonShareable )
+{
+regc &= ~GITS_BASER_INNER_CACHEABILITY_MASK;
+attr = regc & BASER_ATTR_MASK;
+continue;
+}
+attr = regc & BASER_ATTR_MASK;
+}
+
+/* If the host accepted our page size, we are done. */
+if ( (regc & (3UL << GITS_BASER_PAGE_SIZE_SHIFT)) == pagesz )
Invalid check, should be 'if ( ((regc >> GITS_BASER_PAGE_SIZE_SHIFT) & 
0x3) == pagesz)'
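
To make the difference between the two checks concrete, here is a minimal sketch. The field position (two bits starting at bit 8) and all names ending in `_EXAMPLE`, as well as both helper functions, are assumptions made for the illustration rather than definitions taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Field position assumed for this sketch: two bits starting at bit 8. */
#define PAGE_SIZE_SHIFT_EXAMPLE 8
#define PAGE_SIZE_MASK_EXAMPLE  UINT64_C(0x3)

/* Masks the field but leaves it at its register position. */
static int page_size_matches_unshifted(uint64_t regc, unsigned int pagesz)
{
    return (regc & (PAGE_SIZE_MASK_EXAMPLE << PAGE_SIZE_SHIFT_EXAMPLE))
           == pagesz;
}

/* Shifts the field down to bit 0 before comparing. */
static int page_size_matches(uint64_t regc, unsigned int pagesz)
{
    assert(pagesz <= PAGE_SIZE_MASK_EXAMPLE);  /* must fit the field */
    return ((regc >> PAGE_SIZE_SHIFT_EXAMPLE) & PAGE_SIZE_MASK_EXAMPLE)
           == pagesz;
}
```

The unshifted comparison can only succeed when the field is zero, so any non-zero page size encoding would never be recognised as accepted.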

+return 0;
+
+/* None of the page sizes was accepted, give up */
+if ( pagesz >= 2 )
+   

Re: [Xen-devel] Unshared IOMMU issues

2017-02-16 Thread Stefano Stabellini
On Thu, 16 Feb 2017, Julien Grall wrote:
> Hi Jan,
> 
> On 16/02/17 16:34, Jan Beulich wrote:
> > > > > On 16.02.17 at 17:11,  wrote:
> > > On 16/02/17 15:52, Jan Beulich wrote:
> > > > > > > On 16.02.17 at 16:02,  wrote:
> > > > > On Thu, Feb 16, 2017 at 11:36 AM, Jan Beulich 
> > > > > wrote:
> > > > > > > > > On 15.02.17 at 18:43,  wrote:
> > > > > > > 1.
> > > > > > > I need:
> > > > > > > Allow P2M core on ARM to update IOMMU mapping from the first
> > > > > > > "p2m_set_entry".
> > > > > > > I do:
> > > > > > > I explicitly set need_iommu flag for *every* guest domain during
> > > > > > > iommu_domain_init() on ARM in case if page table is not shared.
> > > > > > > At that moment I have no knowledge about will any device be
> > > > > > > assigned
> > > > > > > to this domain or not. I am just want to receive all mapping
> > > > > > > updates
> > > > > > > from P2M code. The P2M will update IOMMU mapping only when
> > > > > > > need_iommu
> > > > > > > is set and page table is not shared.
> > > > > > > I have doubts:
> > > > > > > Is it correct to just force need_iommu flag?
> > > > > > 
> > > > > > No, I don't think so. This is a waste of a measurable amount of
> > > > > > resources when page tables aren't shared.
> > > > > > 
> > > > > > > Or maybe another flag should be introduced?
> > > > > > 
> > > > > > Not sure what you think of here. Where's the problem with building
> > > > > > IOMMU page tables at the time the first device gets assigned, just
> > > > > > like x86 does?
> > > > > OK, I have already had a look at  arch_iommu_populate_page_table() for
> > > > > x86.
> > > > > I don't know at the moment how this solution can help me.
> > > > > > There are at least two points that prevent me from doing a similar
> > > > > > thing.
> > > > > > 1. To create an IOMMU mapping I need both mfn and gfn (+ flags).
> > > > > > I am able to get the mfn only. How can I find the corresponding gfn?
> > > > 
> > > > As the x86 one shows, via mfn_to_gmfn(). If ARM doesn't have
> > > > this, perhaps it needs to gain it?
> > > 
> > > Looking at the x86 implementation, mfn_to_gmfn is using a table for that
> > > indexed by the MFN. This is requiring virtual address space that is
> > > already scarce on ARM32 and also using physical memory.
> > > 
> > > I am not convinced this is the right things to do on ARM as the only
> > > user so far will be the IOMMU code.
> > > 
> > > Another solution would be to go through the stage-2 page table and
> > > replicate all the mappings.
> > 
> > That's certainly an option, if you want to save the memory (and
> > VA space on ARM32). It only makes the x86 model of establishing
> > the mappings slightly more compute intensive.
> 
> I made a quick calculation. ARM32 supports up to 40-bit PA and IPA (i.e. guest
> address), which means 28 bits of MFN/GFN. The GFN would have to be stored in a
> 32-bit field for alignment, so we would need 2^28 * 4 = 1GiB of virtual address
> space and potentially physical memory.
> We don't have 1GB of VA space free on 32-bit right now.
> 
> ARM64 currently supports up to 48-bit PA and 48-bit IPA, which means 36 bits
> of MFN/GFN. The GFN would have to be stored in a 64-bit field for alignment, so
> we would need 2^36 * 8 = 512GiB of virtual address space and potentially
> physical memory. While virtual address space is not a problem, the memory is a
> problem for embedded platforms. We want Xen to be as lean as possible.

I think you are right that it's best not to introduce mfn-to-gfn
tracking on ARM.


> I thought a bit more about the advantage of creating the IOMMU page tables
> later on.
> 
> For devices assigned at domain creation, we know that devices will be assigned,
> so we could let Xen populate the IOMMU page tables while allocating the memory
> for the domain.
> 
> For hotplug devices, this would only happen for PCI, as integrated devices
> cannot be hotplugged. As we go towards emulating a root complex in Xen rather
> than the PV approach, you would need the root complex to be instantiated when
> the domain is created (unless we want to hotplug it too?). IMHO, if you assign
> a root complex, it is likely that you will want to assign a PCI device
> afterwards. So allocating page tables at that time sounds sensible.
> 
> This would avoid walking the stage-2 page tables at runtime.
> 
> Any opinions?

Obviously, static device assignment is not a problem. The issue is only
hotplug, which today we don't support.

Like you say, hotplug by definition requires a discoverable bus of some
sort. For example PCI. When we introduce it in guests, we'll also
introduce IOMMU pagetables. The only downside of this idea, is that it
will require users to write something in the VM config file, for example
pci=[''], just to reserve the right to do pci hotplug at some point in
the future. This is not the case today on x86. It's not great, but I
cannot see a way around it, given that we probably don't want to
introduce a root complex in all ARM guests 

[Xen-devel] [PATCH 2/3] x86emul/test: avoid race in link farm rune

2017-02-16 Thread Wei Liu
Several `ln -sf` can race with each other.  Provide dedicated targets
for soft-linking directories.

Signed-off-by: Wei Liu 
---
 tools/tests/x86_emulator/Makefile | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/tools/tests/x86_emulator/Makefile b/tools/tests/x86_emulator/Makefile
index 9bf36947c0..860e3867d6 100644
--- a/tools/tests/x86_emulator/Makefile
+++ b/tools/tests/x86_emulator/Makefile
@@ -40,12 +40,16 @@ distclean: clean
 .PHONY: install
 install:
 
-x86_emulate/x86_emulate.c x86_emulate/x86_emulate.h:
+x86_emulate:
 	[ -L x86_emulate ] || ln -sf $(XEN_ROOT)/xen/arch/x86/x86_emulate .
 
-asm/x86-vendors.h asm/x86-defns.h asm/msr-index.h:
+x86_emulate/x86_emulate.c x86_emulate/x86_emulate.h: x86_emulate
+
+asm:
 	[ -L asm ] || ln -sf $(XEN_ROOT)/xen/include/asm-x86 asm
 
+asm/x86-vendors.h asm/x86-defns.h asm/msr-index.h: asm
+
 HOSTCFLAGS += $(CFLAGS_xeninclude) -I.
 
 x86.h := asm/x86-vendors.h asm/x86-defns.h asm/msr-index.h
-- 
2.11.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH 0/3] fuzz/x86emul and x86emul/test fixes

2017-02-16 Thread Wei Liu
Wei Liu (3):
  fuzz/x86emul: avoid race in link farm rune
  x86emul/test: avoid race in link farm rune
  gitignore: ignore asm soft link in fuzz and x86emul test

 .gitignore                                   | 2 ++
 tools/fuzz/x86_instruction_emulator/Makefile | 8 ++++++--
 tools/tests/x86_emulator/Makefile            | 8 ++++++--
 3 files changed, 14 insertions(+), 4 deletions(-)

-- 
2.11.0




[Xen-devel] [PATCH 3/3] gitignore: ignore asm soft link in fuzz and x86emul test

2017-02-16 Thread Wei Liu
Signed-off-by: Wei Liu 
---
 .gitignore | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/.gitignore b/.gitignore
index 8810c6975a..c8d56d1bdb 100644
--- a/.gitignore
+++ b/.gitignore
@@ -147,6 +147,7 @@ tools/flask/utils/flask-setenforce
 tools/flask/utils/flask-set-bool
 tools/flask/utils/flask-label-pci
 tools/fuzz/libelf/afl-libelf-fuzzer
+tools/fuzz/x86_instruction_emulator/asm
 tools/fuzz/x86_instruction_emulator/x86_emulate*
 tools/fuzz/x86_instruction_emulator/afl-x86-insn-emulator-fuzzer
 tools/helpers/_paths.h
@@ -209,6 +210,7 @@ tools/python/build/*
 tools/security/secpol_tool
 tools/security/xen/*
 tools/security/xensec_tool
+tools/tests/x86_emulator/asm
 tools/tests/x86_emulator/blowfish.bin
 tools/tests/x86_emulator/blowfish.h
 tools/tests/x86_emulator/test_x86_emulator
-- 
2.11.0




[Xen-devel] [PATCH 1/3] fuzz/x86emul: avoid race in link farm rune

2017-02-16 Thread Wei Liu
Several `ln -sf` can race with each other and cause error like:

14:43:56 00:07:06 O: ln: cannot remove 'asm': No such file or directory

Provide dedicated targets for soft-linking directories.

Reported-by: Andrew Cooper 
Signed-off-by: Wei Liu 
---
 tools/fuzz/x86_instruction_emulator/Makefile | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/tools/fuzz/x86_instruction_emulator/Makefile b/tools/fuzz/x86_instruction_emulator/Makefile
index fede7e9afd..0cd49753cf 100644
--- a/tools/fuzz/x86_instruction_emulator/Makefile
+++ b/tools/fuzz/x86_instruction_emulator/Makefile
@@ -8,12 +8,16 @@ else
 x86-instruction-emulator-fuzzer-all:
 endif
 
-x86_emulate/x86_emulate.c x86_emulate/x86_emulate.h:
+x86_emulate:
 	[ -L x86_emulate ] || ln -sf $(XEN_ROOT)/xen/arch/x86/x86_emulate .
 
-asm/x86-vendors.h asm/x86-defns.h asm/msr-index.h:
+x86_emulate/x86_emulate.c x86_emulate/x86_emulate.h: x86_emulate
+
+asm:
 	[ -L asm ] || ln -sf $(XEN_ROOT)/xen/include/asm-x86 asm
 
+asm/x86-vendors.h asm/x86-defns.h asm/msr-index.h: asm
+
 x86_emulate.c x86_emulate.h: %:
 	[ -L $* ] || ln -sf $(XEN_ROOT)/tools/tests/x86_emulator/$*
 
-- 
2.11.0
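
As a side note for readers of the archive: the failure mode described in this patch is a check-then-act race. Two parallel recipes both see that the link is missing or stale, and one of them then fails because the other got there first. The patch itself fixes this at the Makefile level by giving each link a single dedicated target. The sketch below is not that fix. It is a minimal, hypothetical Python illustration of the alternative idea of making link creation itself tolerant of a concurrent creator. The helper name `ensure_symlink` and the directory names are assumptions made up for the example.

```python
import os
import tempfile


def ensure_symlink(target: str, link_name: str) -> None:
    """Create link_name -> target, tolerating a concurrent creator.

    Hypothetical helper: instead of "test, then link" (which can race),
    attempt the link directly and treat "already exists" as success.
    """
    try:
        os.symlink(target, link_name)
    except FileExistsError:
        # Another process (or an earlier call) already created the entry.
        # Only accept that outcome if it really is a symlink.
        if not os.path.islink(link_name):
            raise


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "asm-x86")   # stand-in for the real include dir
    os.mkdir(src)
    link = os.path.join(tmp, "asm")

    ensure_symlink(src, link)
    ensure_symlink(src, link)            # second call is a no-op, not an error

    link_ok = os.path.islink(link) and os.path.samefile(link, src)
    print(link_ok)
```

The dedicated-target approach in the patch has the advantage that make runs the linking recipe at most once per directory, so no tolerance logic is needed in the recipe.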




Re: [Xen-devel] [PATCH v4 1/2] arm: read/write rank->vcpu atomically

2017-02-16 Thread Julien Grall

Hi Stefano,

On 11/02/17 02:05, Stefano Stabellini wrote:

We don't need a lock in vgic_get_target_vcpu anymore, solving the
following lock inversion bug: the rank lock should be taken first, then
the vgic lock. However, gic_update_one_lr is called with the vgic lock
held, and it calls vgic_get_target_vcpu, which tries to obtain the rank
lock.

Coverity-ID: 1381855
Coverity-ID: 1381853

Signed-off-by: Stefano Stabellini 
---
 xen/arch/arm/vgic-v2.c |  6 +++---
 xen/arch/arm/vgic-v3.c |  6 +++---
 xen/arch/arm/vgic.c    | 27 +++++----------------------
 3 files changed, 11 insertions(+), 28 deletions(-)

diff --git a/xen/arch/arm/vgic-v2.c b/xen/arch/arm/vgic-v2.c
index 3dbcfe8..b30379e 100644
--- a/xen/arch/arm/vgic-v2.c
+++ b/xen/arch/arm/vgic-v2.c
@@ -79,7 +79,7 @@ static uint32_t vgic_fetch_itargetsr(struct vgic_irq_rank *rank,
     offset &= ~(NR_TARGETS_PER_ITARGETSR - 1);
 
     for ( i = 0; i < NR_TARGETS_PER_ITARGETSR; i++, offset++ )
-        reg |= (1 << rank->vcpu[offset]) << (i * NR_BITS_PER_TARGET);
+        reg |= (1 << read_atomic(&rank->vcpu[offset])) << (i * NR_BITS_PER_TARGET);
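
For readers following the hunk above, here is a rough sketch of what the loop computes: it packs one-hot vCPU masks for consecutive interrupts into a single 32-bit register value. The constants are assumptions (four 8-bit target fields per register), since their definitions are not part of the quoted hunk. The sketch is plain Python and does not model the atomic read that the patch introduces.

```python
# Assumed constants (not shown in the quoted hunk): four 8-bit target
# fields per 32-bit ITARGETSR register.
NR_TARGETS_PER_ITARGETSR = 4
NR_BITS_PER_TARGET = 32 // NR_TARGETS_PER_ITARGETSR


def fetch_itargetsr(vcpu_targets, offset):
    """Pack one-hot vCPU masks for four consecutive IRQs into one value.

    vcpu_targets[n] is the vCPU id targeted by interrupt n in the rank
    (a stand-in for rank->vcpu[]); offset is any index in the register.
    """
    offset &= ~(NR_TARGETS_PER_ITARGETSR - 1)   # align down to the register
    reg = 0
    for i in range(NR_TARGETS_PER_ITARGETSR):
        # One-hot encode the vCPU id, then shift it into byte lane i.
        reg |= (1 << vcpu_targets[offset + i]) << (i * NR_BITS_PER_TARGET)
    return reg


# Hypothetical rank: IRQs 0..3 target vCPUs 0, 1, 0 and 2.
example = fetch_itargetsr([0, 1, 0, 2], offset=2)
print(hex(example))
```

In the real code each element is read while another CPU may be updating it, which is why the patch wraps the access in read_atomic().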


I was about to suggest turning vcpu into an atomic_t to catch 
potential misuse. But unfortunately atomic_t is an int.


So I would probably add a comment on top of the field vcpu in 
vgic_irq_rank explaining that vcpu should be read atomically.


With that:

Reviewed-by: Julien Grall 

Cheers,

--
Julien Grall



Re: [Xen-devel] Next Xen ARM community call

2017-02-16 Thread Stefano Stabellini
On Thu, 16 Feb 2017, Julien Grall wrote:
> Hello,
> 
> The last two community calls went really well, and I suggest having a
> new one on Wednesday 1st March at 4pm UTC. Any opinions?

Is it possible to change the time to 5pm?


> Also, do you have any specific topic you would like to talk about during the
> next call?

I would like to discuss progress on PV protocols and IRQ latency.



Re: [Xen-devel] qemu-upstream triggering OOM killer

2017-02-16 Thread Stefano Stabellini
On Thu, 16 Feb 2017, Jan Beulich wrote:
> >>> On 16.02.17 at 16:23,  wrote:
>  On 14.02.17 at 15:56,  wrote:
> >> On Fri, Feb 10, 2017 at 02:54:23AM -0700, Jan Beulich wrote:
> >>> Not so far. It appears to happen when grub clears the screen
> >>> before displaying its graphical menu, so I'd rather suspect an issue
> >>> with a graphics related change (the one you pointed out isn't).
> >> 
> >> I tried to reproduce this, by limiting the amount of memory available to
> >> qemu using cgroups, but about 44MB of memory is enough to boot a guest
> >> (tried Ubuntu and Debian).
> > 
> > Okay, not a qemuu regression after all, but a libxc one. It just so
> > happens that qemut tries to allocate a much larger amount, which
> > triggers mmap() failure earlier and hence doesn't manage to trigger
> > the oom killer. Patch (almost) on its way.
> 
> Patch sent, allowing that guest to get further (and Windows to
> properly boot). However, now the guest is stuck right at the point
> where X wants to switch to its designated video mode, with qemu
> (for somewhere between half a minute and a minute) consuming
> one full CPU's bandwidth. Once qemu's CPU consumption went
> down, no further progress is being made though.
> 
> Again I'd be thankful for hints on how to debug such a situation.

I would bisect it. It's probably due to a change in the cirrus vga code
or common vga code. It might be worth testing with stdvga=1 to narrow it
down.



Re: [Xen-devel] [PATCH 02/28] ARM: GICv3 ITS: parse and store ITS subnodes from hardware DT

2017-02-16 Thread Julien Grall

On 16/02/17 17:44, Andre Przywara wrote:

Hi Julien,


Hi Andre,


On 06/02/17 12:39, Julien Grall wrote:

On 30/01/17 18:31, Andre Przywara wrote:

+
+if ( !dt_device_is_compatible(its, "arm,gic-v3-its") )
+continue;
+
+if ( !dt_device_is_available(its) )
+continue;


Can an ITS really be disabled? Or is it just for debugging?


This was indeed introduced for debugging, but is useful with multiple
ITSes. Firmware could ship with a DT covering the maximum hardware
configuration, and then disable non-existent hardware at boot time.

And in general I consider it good style to support the status property.


I tend to agree here, however this will have a side-effect on the 
device-tree generated for DOM0. The ITS node will not be replicated and 
you will end up using a broken device-tree. While Linux will only 
complain, some other OS may just abort.


Cheers,

--
Julien Grall



Re: [Xen-devel] Unshared IOMMU issues

2017-02-16 Thread Julien Grall

Hi Jan,

On 16/02/17 16:34, Jan Beulich wrote:

On 16.02.17 at 17:11,  wrote:

On 16/02/17 15:52, Jan Beulich wrote:

On 16.02.17 at 16:02,  wrote:

On Thu, Feb 16, 2017 at 11:36 AM, Jan Beulich  wrote:

On 15.02.17 at 18:43,  wrote:

1.
I need:
Allow P2M core on ARM to update IOMMU mapping from the first "p2m_set_entry".
I do:
I explicitly set the need_iommu flag for *every* guest domain during
iommu_domain_init() on ARM if the page table is not shared.
At that moment I do not know whether any device will be assigned
to this domain. I just want to receive all mapping updates
from the P2M code. The P2M will update the IOMMU mapping only when need_iommu
is set and the page table is not shared.
I have doubts:
Is it correct to just force the need_iommu flag?


No, I don't think so. This is a waste of a measurable amount of
resources when page tables aren't shared.


Or maybe another flag should be introduced?


Not sure what you think of here. Where's the problem with building
IOMMU page tables at the time the first device gets assigned, just
like x86 does?

OK, I have already had a look at arch_iommu_populate_page_table() for x86.
I don't know at the moment how this solution can help me.
There are at least two points that prevent me from doing a similar thing.
1. To create an IOMMU mapping I need both the mfn and the gfn (+ flags).
I am able to get the mfn only. How can I find the corresponding gfn?


As the x86 one shows, via mfn_to_gmfn(). If ARM doesn't have
this, perhaps it needs to gain it?


Looking at the x86 implementation, mfn_to_gmfn uses a table indexed by
the MFN for that. This requires virtual address space, which is already
scarce on ARM32, and also uses physical memory.

I am not convinced this is the right thing to do on ARM, as the only
user so far would be the IOMMU code.

Another solution would be to go through the stage-2 page table and
replicate all the mappings.


That's certainly an option, if you want to save the memory (and
VA space on ARM32). It only makes the x86 model of establishing
the mappings slightly more compute intensive.


I made a quick calculation. ARM32 supports up to 40-bit PA and IPA (i.e. 
guest address), which means 28 bits of MFN/GFN. The GFN would have to be 
stored in a 32-bit field for alignment, so we would need 2^28 * 4 = 1GiB of 
virtual address space and potentially physical memory.

We don't have 1GB of VA space free on 32-bit right now.

ARM64 currently supports up to 48-bit PA and 48-bit IPA, which means 
36 bits of MFN/GFN. The GFN would have to be stored in a 64-bit field for 
alignment, so we would need 2^36 * 8 = 512GiB of virtual address space 
and potentially physical memory. While virtual address space is not a 
problem, the memory is a problem for embedded platforms. We want Xen to 
be as lean as possible.
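
The arithmetic above can be checked with a short sketch. It assumes 4 KiB pages (a 12-bit page offset), which is what turns 40 and 48 address bits into 28 and 36 frame-number bits. The function name `m2p_table_bytes` is made up for this illustration and is not a Xen symbol.

```python
# Sketch of the MFN -> GFN table sizing estimate discussed above.
# Assumption: 4 KiB pages, i.e. a 12-bit page offset.
PAGE_SHIFT = 12
GIB = 1 << 30


def m2p_table_bytes(pa_bits: int, entry_bytes: int) -> int:
    """Bytes needed for a flat table with one entry per possible MFN."""
    frame_bits = pa_bits - PAGE_SHIFT      # bits of MFN/GFN
    return (1 << frame_bits) * entry_bytes


# ARM32: 40-bit PA, 32-bit (4-byte) entries -> 2^28 * 4 bytes.
arm32 = m2p_table_bytes(pa_bits=40, entry_bytes=4)
# ARM64: 48-bit PA, 64-bit (8-byte) entries -> 2^36 * 8 bytes.
arm64 = m2p_table_bytes(pa_bits=48, entry_bytes=8)

print(arm32 // GIB, "GiB")
print(arm64 // GIB, "GiB")
```

These are worst-case figures for a flat table covering the whole physical address range. A table covering only populated RAM would be smaller, but it would still cost memory on every platform.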


I thought a bit more about the advantage of creating the IOMMU page tables 
later on.


For devices assigned at domain creation, we know that devices will be 
assigned, so we could let Xen populate the IOMMU page tables while 
allocating the memory for the domain.


For hotplug devices, this would only happen for PCI, as integrated 
devices cannot be hotplugged. As we go towards emulating a root complex in 
Xen rather than the PV approach, you would need the root complex to be 
instantiated when the domain is created (unless we want to hotplug it 
too?). IMHO, if you assign a root complex, it is likely that you will want 
to assign a PCI device afterwards. So allocating page tables at that time 
sounds sensible.


This would avoid walking the stage-2 page tables at runtime.

Any opinions?


2. The d->page_list seems to contain only domain RAM (not 100% sure).
Where can I get the other regions (MMIOs, etc.)?


These necessarily are being tracked for the domain, so you need to
take them from wherever they're stored on ARM.


Is there any reason why you don't seem to have such code on x86? AFAICT
only RAM is currently mapped.


Well, no-one cared so far, I would guess. Even runtime mappings of
MMIO space were made to work properly only very recently (by Roger).


Regarding ARM, we know whether a domain is allowed to access a certain
range of MMIO, but, similarly to above, we don't have the MFN -> GFN
conversion for them. However, in this case we would not be able to use an
M2P, as the same MFN may be mapped in multiple domains.


Mapped by multiple domains? If one DomU and Dom0, I can see
this as possible, but not a requirement. If multiple DomU-s I have
to raise the question of security.


The GICv2 interrupt controller supports virtualization and allows the 
guest to manage interrupts as if it were running on bare metal. There is a 
per-CPU interface that is mapped in every domain. Obviously, the state 
is saved/restored during vCPU context switch.


Cheers,

--
Julien Grall


