Re: [Xen-devel] [PATCH] Fix a panic in SPEC_CTRL_ENTRY_FROM_INTR_IST

2018-02-13 Thread Jan Beulich
>>> On 14.02.18 at 05:03,  wrote:
> On an IBRS-capable system, Xen panics at boot with bti=0, as shown below:
> 
> (XEN) Speculative mitigation facilities:
> (XEN)   Hardware features: SMEP IBRS/IBPB STIBP
> (XEN) BTI mitigations: Thunk N/A, Others: IBRS- SMEP
> (XEN) [ Xen-4.4.4OVM  x86_64  debug=n  Tainted:C ]
> (XEN) CPU:0
> (XEN) RIP:e008:[]
> entry.o#handle_ist_exception+0xd1/0x176
> (XEN) RFLAGS: 00010046   CONTEXT: hypervisor
> (XEN) rax:    rbx:    rcx: 0048
> (XEN) rdx: 0001   rsi:    rdi: 
> (XEN) rbp:    rsp: 82d080529f58   r8:  
> (XEN) r9:     r10:    r11: 
> (XEN) r12:    r13:    r14: 82d08052
> (XEN) r15:    cr0: 8005003b   cr4: 001506f0
> (XEN) cr3: 76fbe000   cr2: 
> (XEN) ds:    es:    fs:    gs:    ss:    cs: e008
> (XEN) Xen stack trace from rsp=82d080529f58:
> (XEN)0018  0002 82d080528000
> (XEN) 82d0802a50e0 82d08052fd98 82d08072fc00
> (XEN) 0001 0400 0830
> (XEN) 000a 82d0803f0fc0 0002
> (XEN)82d080298876 e008 0046 82d08052fdf8
> (XEN)
> (XEN) Xen call trace:
> (XEN)[] entry.o#handle_ist_exception+0xd1/0x176
> (XEN)
> (XEN)
> (XEN) 
> (XEN) Panic on CPU 0:
> (XEN) GENERAL PROTECTION FAULT
> (XEN) [error_code=]
> (XEN) 
> 
> This is because %edx isn't cleared to zero before wrmsr.
> 
> DO_OVERWRITE_RSB clobbers %eax and happened to mask the bug in certain cases, so
> we couldn't reproduce it without bti=0. Change to use %edx.
> 
> Reviewed-by: Boris Ostrovsky 
> Tested-by: Boris Ostrovsky 
> Signed-off-by: Zhenzhong Duan 

Two formal things: Please put tags in sequential (in time) order:
reporter, author(s), reviewers/testers. And please follow patch
submission rules - send them _to_ the list, with maintainers (and
other interested parties) on _cc_.

> --- a/xen/include/asm-x86/spec_ctrl_asm.h
> +++ b/xen/include/asm-x86/spec_ctrl_asm.h
> @@ -269,16 +269,16 @@
>   * This is logical merge of DO_OVERWRITE_RSB and DO_SPEC_CTRL_ENTRY
>   * maybexen=1, but with conditionals rather than alternatives.
>   */
> -movzbl STACK_CPUINFO_FIELD(bti_ist_info)(%r14), %eax
> +movzbl STACK_CPUINFO_FIELD(bti_ist_info)(%r14), %edx
>  
> -testb $BTI_IST_RSB, %al
> +testb $BTI_IST_RSB, %dl
>  jz .L\@_skip_rsb
>  
>  DO_OVERWRITE_RSB
>  
>  .L\@_skip_rsb:
>  
> -testb $BTI_IST_WRMSR, %al
> +testb $BTI_IST_WRMSR, %dl
>  jz .L\@_skip_wrmsr
>  
>  xor %edx, %edx
> @@ -291,6 +291,7 @@
>   * Load Xen's intended value.  SPEC_CTRL_IBRS vs 0 is encoded in the
>   * bottom bit of bti_ist_info, via a deliberate alias with BTI_IST_IBRS.
>   */
> +xor %edx, %edx
>  mov $MSR_SPEC_CTRL, %ecx
>  and $BTI_IST_IBRS, %eax
>  wrmsr

This is wrong now, and the comment very clearly states why. I think
rather than switching %eax to %edx it would be better to preserve
%eax (e.g. by saving into %edx) around DO_OVERWRITE_RSB. I'll
send an updated patch in a few minutes.
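
For readers following along, the idea of preserving %eax around DO_OVERWRITE_RSB can be modelled in plain C. This is only an illustrative sketch, not the updated patch: the struct, the helper names, the clobber value and the BTI_IST_WRMSR/BTI_IST_RSB bit positions are assumptions (only BTI_IST_IBRS being the bottom bit is stated in the comment quoted above).

```c
#include <stdint.h>

/* Bit positions are assumptions for illustration; only "IBRS is the
 * bottom bit" comes from the quoted comment. */
#define BTI_IST_IBRS  (1u << 0)
#define BTI_IST_WRMSR (1u << 1)
#define BTI_IST_RSB   (1u << 2)

/* Toy register file: just the two registers discussed in the thread. */
struct regs {
    uint32_t eax;
    uint32_t edx;
};

/* Stand-in for DO_OVERWRITE_RSB, which the thread says clobbers %eax.
 * The value written here is arbitrary. */
static void overwrite_rsb(struct regs *r)
{
    r->eax = 0xdeadbeefu;
}

/* Flags are kept in eax only, so they are lost once the RSB is overwritten. */
uint32_t flags_after_rsb_unsafe(uint32_t bti_ist_info)
{
    struct regs r = { .eax = bti_ist_info, .edx = 0 };

    if (r.eax & BTI_IST_RSB)
        overwrite_rsb(&r);
    return r.eax;
}

/* Flags are parked in edx around the clobbering step and restored after. */
uint32_t flags_after_rsb_preserved(uint32_t bti_ist_info)
{
    struct regs r = { .eax = bti_ist_info, .edx = 0 };

    if (r.eax & BTI_IST_RSB) {
        r.edx = r.eax;      /* save */
        overwrite_rsb(&r);
        r.eax = r.edx;      /* restore */
    }
    return r.eax;
}
```

In this model the later tests on %al (BTI_IST_WRMSR, BTI_IST_IBRS) only see the intended bits in the preserved variant.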

Jan


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] x86/xpti: Hide almost all of .text and all .data/.rodata/.bss mappings

2018-02-13 Thread Juergen Gross
On 13/02/18 20:45, Andrew Cooper wrote:
> The current XPTI implementation isolates the directmap (and therefore a lot of
> guest data), but a large quantity of CPU0's state (including its stack)
> remains visible.
> 
> Furthermore, an attacker able to read .text is in a vastly superior position
> to normal when it comes to fingerprinting Xen for known vulnerabilities, or
> scanning for ROP/Spectre gadgets.
> 
> Collect together the entrypoints in .text.entry (currently 3x4k frames, but
> can almost certainly be slimmed down), and create a common mapping which is
> inserted into each per-cpu shadow.  The stubs are also inserted into this
> mapping by pointing at the in-use L2.  This allows stubs allocated later (SMP
> boot, or CPU hotplug) to work without further changes to the common mappings.
> 
> Signed-off-by: Andrew Cooper 
> ---
> CC: Jan Beulich 
> CC: Wei Liu 
> CC: Juergen Gross 
> 
> RFC, because I don't think the stubs handling is particularly sensible.
> 
> We allocate 4k of virtual address space per CPU, but squash loads of CPUs
> together onto a single MFN.  The stubs ought to be isolated as well (as they
> leak the virtual addresses of each stack), which can be done by allocating an
> MFN per CPU (and simplifies cpu_smpboot_alloc() somewhat).  At this point, we
> can't use a common set of mappings, and will have to clone the single stub and
> .entry.text into each PCPUs copy of the pagetables.
> 
> Also, my plan to cause .text.entry to straddle a 512TB boundary (and therefore
> avoid any further pagetable allocations) has come a little unstuck because of
> CONFIG_BIGMEM.  I'm still working out whether there is a sensible way to
> rearrange the virtual layout for this plan to work.
> ---
>  xen/arch/x86/smpboot.c | 37 -
>  xen/arch/x86/x86_64/compat/entry.S |  2 ++
>  xen/arch/x86/x86_64/entry.S|  4 +++-
>  xen/arch/x86/xen.lds.S |  7 +++
>  4 files changed, 44 insertions(+), 6 deletions(-)
> 
> diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
> index 2ebef03..2519141 100644
> --- a/xen/arch/x86/smpboot.c
> +++ b/xen/arch/x86/smpboot.c
> @@ -622,6 +622,9 @@ unsigned long alloc_stub_page(unsigned int cpu, unsigned long *mfn)
>  unmap_domain_page(memset(__map_domain_page(pg), 0xcc, PAGE_SIZE));
>  }
>  
> +/* Confirm that all stubs fit in a single L2 pagetable. */
> +BUILD_BUG_ON(NR_CPUS * PAGE_SIZE > (1u << L2_PAGETABLE_SHIFT));

So we limit NR_CPUS to be max 512 now?

Maybe you should use STUB_BUF_SIZE instead of PAGE_SIZE?
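
To make the arithmetic behind the question explicit, here is a small sketch. The shift values are the usual x86-64 ones (4k pages, 2M covered by one L2 entry) and are assumptions here, as is the helper name.

```c
/* Assumed x86-64 values; not copied from the patch. */
#define PAGE_SHIFT          12
#define PAGE_SIZE           (1ul << PAGE_SHIFT)
#define L2_PAGETABLE_SHIFT  21

/* Largest NR_CPUS for which
 *   NR_CPUS * bytes_per_cpu > (1u << L2_PAGETABLE_SHIFT)
 * is still false, i.e. for which the BUILD_BUG_ON does not fire. */
unsigned long max_cpus_for(unsigned long bytes_per_cpu)
{
    return (1ul << L2_PAGETABLE_SHIFT) / bytes_per_cpu;
}
```

With one page of virtual address space per CPU, max_cpus_for(PAGE_SIZE) gives 512. A smaller per-CPU granule, as suggested, would raise that bound proportionally.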

BTW: Is there any reason we don't use a common stub page mapped to each
per-cpu stack area? The stack address can easily be obtained via %rip
relative addressing then (see my patch for the per-vcpu stacks:
https://lists.xen.org/archives/html/xen-devel/2018-02/msg00773.html )


Juergen

> +
>  stub_va = XEN_VIRT_END - (cpu + 1) * PAGE_SIZE;
>  if ( map_pages_to_xen(stub_va, mfn_x(page_to_mfn(pg)), 1,
>PAGE_HYPERVISOR_RX | MAP_SMALL_PAGES) )
> @@ -651,9 +654,6 @@ static int clone_mapping(const void *ptr, root_pgentry_t *rpt)
>  l2_pgentry_t *pl2e;
>  l1_pgentry_t *pl1e;
>  
> -if ( linear < DIRECTMAP_VIRT_START )
> -return 0;
> -
>  flags = l3e_get_flags(*pl3e);
>  ASSERT(flags & _PAGE_PRESENT);
>  if ( flags & _PAGE_PSE )
> @@ -744,6 +744,9 @@ static __read_mostly int8_t opt_xpti = -1;
>  boolean_param("xpti", opt_xpti);
>  DEFINE_PER_CPU(root_pgentry_t *, root_pgt);
>  
> +static root_pgentry_t common_pgt;
> +extern char _stextentry[], _etextentry[];
> +
>  static int setup_cpu_root_pgt(unsigned int cpu)
>  {
>  root_pgentry_t *rpt;
> @@ -764,8 +767,32 @@ static int setup_cpu_root_pgt(unsigned int cpu)
>  idle_pg_table[root_table_offset(RO_MPT_VIRT_START)];
>  /* SH_LINEAR_PT inserted together with guest mappings. */
>  /* PERDOMAIN inserted during context switch. */
> -rpt[root_table_offset(XEN_VIRT_START)] =
> -idle_pg_table[root_table_offset(XEN_VIRT_START)];
> +
> +/* One-time setup of common_pgt, which maps .text.entry and the stubs. */
> +if ( unlikely(!root_get_intpte(common_pgt)) )
> +{
> +unsigned long stubs_linear = XEN_VIRT_END - 1;
> +l3_pgentry_t *stubs_main, *stubs_shadow;
> +char *ptr;
> +
> +for ( rc = 0, ptr = _stextentry;
> +  !rc && ptr < _etextentry; ptr += PAGE_SIZE )
> +rc = clone_mapping(ptr, rpt);
> +
> +if ( rc )
> +return rc;
> +
> +stubs_main = l4e_to_l3e(idle_pg_table[l4_table_offset(stubs_linear)]);
> +stubs_shadow = l4e_to_l3e(rpt[l4_table_offset(stubs_linear)]);
> +
> +/* Splice into the regular L2 mapping the stubs. */
> +stubs_shadow[l3_table_offset(stubs_linear)] =
> +stubs_main[l3_table_offset(stubs_linear)];
> +
> + 

[Xen-devel] [seabios test] 119091: regressions - FAIL

2018-02-13 Thread osstest service owner
flight 119091 seabios real [real]
http://logs.test-lab.xenproject.org/osstest/logs/119091/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop fail in 119060 REGR. vs. 115539

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-xl-qemuu-ws16-amd64 16 guest-localmigrate/x10 fail pass in 119060

Tests which did not succeed, but are not blocking:
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 115539
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop fail like 115539
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop fail like 115539
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2 fail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-install fail never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass

version targeted for testing:
 seabios  4a6dbcea3e412fe12effa2f812f50dd7eae90955
baseline version:
 seabios  0ca6d6277dfafc671a5b3718cbeb5c78e2a888ea

Last test of basis   115539  2017-11-03 20:48:58 Z  102 days
Failing since        115733  2017-11-10 17:19:59 Z   95 days  120 attempts
Testing same since   118668  2018-02-08 04:50:43 Z    6 days    8 attempts


People who touched revisions under test:
  Kevin O'Connor 
  Marcel Apfelbaum 
  Michael S. Tsirkin 
  Nikolay Nikolov 
  Paul Menzel 
  Stefan Berger 

jobs:
 build-amd64-xsm                                      pass
 build-i386-xsm                                       pass
 build-amd64                                          pass
 build-i386                                           pass
 build-amd64-libvirt                                  pass
 build-i386-libvirt                                   pass
 build-amd64-pvops                                    pass
 build-i386-pvops                                     pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm   pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm    pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm        pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm         pass
 test-amd64-amd64-qemuu-nested-amd                    fail
 test-amd64-i386-qemuu-rhel6hvm-amd                   pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64            pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64             pass
 test-amd64-amd64-xl-qemuu-win7-amd64                 fail
 test-amd64-i386-xl-qemuu-win7-amd64                  fail
 test-amd64-amd64-xl-qemuu-ws16-amd64                 fail
 test-amd64-i386-xl-qemuu-ws16-amd64                  fail
 test-amd64-amd64-xl-qemuu-win10-i386                 fail
 test-amd64-i386-xl-qemuu-win10-i386                  fail
 test-amd64-amd64-qemuu-nested-intel                  pass
 test-amd64-i386-qemuu-rhel6hvm-intel                 pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.


commit 4a6dbcea3e412fe12effa2f812f50dd7eae90955
Author: Nikolay Nikolov 
Date:   Sun Feb 4 17:27:01 2018 +0200

floppy: Use timer_check() in floppy_wait_irq()

Use timer_check() instead of using floppy_motor_counter in BDA for the
timeout check in floppy_wait_irq().

The problem with using floppy_motor_counter was that, after it reaches
0, it immediately stops the floppy motors, which is not what is
supposed to happen on real hardware. Instead, after a timeout (like in
the end of every floppy operation, regardless of the result - success,
timeout or error), the floppy motors must be kept spinning for
additional 2 seconds (the FLOPPY_MOTOR_TICKS). So, now the

Re: [Xen-devel] [PATCH v4 2/7] xen: xsm: flask: introduce XENMAPSPACE_gmfn_share for memory sharing

2018-02-13 Thread Zhongze Liu
Hi Jan,

2018-02-13 23:26 GMT+08:00 Jan Beulich :
 On 13.02.18 at 16:15,  wrote:
>> I've updated the comments according to your previous suggestions,
>> do they look good to you?
>
> The one in the public header is way too verbose. I specifically don't
> see why you would need to spell out XSM privilege requirements
> there. Please make new comments match existing ones in style and
> verbosity if at all possible, while still conveying all necessary /
> relevant information.
>

I shortened it a little bit, and now it looks like:

#define XENMAPSPACE_gmfn_share   6 /* GMFN from another dom. Unlike gmfn_foreign,
                                      if (c) tries to map pages from (t) into
                                      (d), this doesn't require that (d) itself
                                      has the privilege to map the pages, but
                                      instead requires that (c) has the
                                      privilege to do so, as long as (d) and (t)
                                      are allowed to share memory pages.
                                      This is XENMEM_add_to_physmap_batch only,
                                      and currently ARM only. */
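
Read as a predicate, the rule in that comment might be sketched as below. This is only an illustration of the wording above; the struct, its field names and the function name are hypothetical and do not correspond to any XSM hook.

```c
#include <stdbool.h>

/* Hypothetical inputs: (c) is the caller, (d) the domain receiving the
 * mapping, (t) the domain owning the pages. */
struct map_request {
    bool c_may_map_t;    /* (c) has the privilege to map (t)'s pages */
    bool d_t_may_share;  /* (d) and (t) are allowed to share memory pages */
};

/* gmfn_share as described: (d) itself needs no mapping privilege, but
 * (c) must have it and (d)/(t) must be permitted to share. */
bool gmfn_share_allowed(const struct map_request *req)
{
    return req->c_may_map_t && req->d_t_may_share;
}
```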


Cheers,

Zhongze Liu

[Xen-devel] [linux-linus bisection] complete test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm

2018-02-13 Thread osstest service owner
branch xen-unstable
xenbranch xen-unstable
job test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm
testid xen-boot

Tree: linux git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git
Tree: qemu git://xenbits.xen.org/qemu-xen-traditional.git
Tree: qemuu git://xenbits.xen.org/qemu-xen.git
Tree: xen git://xenbits.xen.org/xen.git

*** Found and reproduced problem changeset ***

  Bug is in tree:  linux git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
  Bug introduced:  178e834c47b0d01352c48730235aae69898fbc02
  Bug not present: 13ddd1667e7f01071cdf120132238ffca004a88e
  Last fail repro: http://logs.test-lab.xenproject.org/osstest/logs/119147/


  (Revision log too long, omitted.)


For bisection revision-tuple graph see:
   http://logs.test-lab.xenproject.org/osstest/results/bisect/linux-linus/test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm.xen-boot.html
Revision IDs in each graph node refer, respectively, to the Trees above.


Running cs-bisection-step --graph-out=/home/logs/results/bisect/linux-linus/test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm.xen-boot --summary-out=tmp/119147.bisection-summary --basis-template=118324 --blessings=real,real-bisect linux-linus test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm xen-boot
Searching for failure / basis pass:
 119064 fail [host=huxelrebe0] / 118629 [host=pinot0] 118598 [host=baroque0] 118586 [host=pinot1] 118576 [host=baroque1] 118566 [host=italia1] 118556 [host=fiano0] 118538 [host=fiano0] 118501 [host=elbling1] 118464 ok.
Failure / basis pass flights: 119064 / 118464
(tree with no url: minios)
(tree with no url: ovmf)
(tree with no url: seabios)
Tree: linux git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git
Tree: qemu git://xenbits.xen.org/qemu-xen-traditional.git
Tree: qemuu git://xenbits.xen.org/qemu-xen.git
Tree: xen git://xenbits.xen.org/xen.git
Latest 178e834c47b0d01352c48730235aae69898fbc02 c530a75c1e6a472b0eb9558310b518f0dfcd8860 c8ea0457495342c417c3dc033bba25148b279f60 2b033e396f4fa0981bae1213cdacd15775655a97 c93014ad3aa6aa88dfa5e96f66e8adb561483b8d
Basis pass 13ddd1667e7f01071cdf120132238ffca004a88e c530a75c1e6a472b0eb9558310b518f0dfcd8860 c8ea0457495342c417c3dc033bba25148b279f60 2b033e396f4fa0981bae1213cdacd15775655a97 4c7e478d597b0346eef3a256cfd6794ac778b608
Generating revisions with ./adhoc-revtuple-generator  git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git#13ddd1667e7f01071cdf120132238ffca004a88e-178e834c47b0d01352c48730235aae69898fbc02 git://xenbits.xen.org/osstest/linux-firmware.git#c530a75c1e6a472b0eb9558310b518f0dfcd8860-c530a75c1e6a472b0eb9558310b518f0dfcd8860 git://xenbits.xen.org/qemu-xen-traditional.git#c8ea0457495342c417c3dc033bba25148b279f60-c8ea0457495342c417c3dc033bba25148b279f60 git://xenbits.xen.org/qemu-xen.git#2b033e396f4fa0981bae1213cdacd15775655a97-2b033e396f4fa0981bae1213cdacd15775655a97 git://xenbits.xen.org/xen.git#4c7e478d597b0346eef3a256cfd6794ac778b608-c93014ad3aa6aa88dfa5e96f66e8adb561483b8d
adhoc-revtuple-generator: tree discontiguous: linux-2.6
Loaded 1002 nodes in revision graph
Searching for test results:
 118445 [host=huxelrebe1]
 118362 [host=elbling0]
 118401 [host=chardonnay1]
 118428 [host=chardonnay0]
 118464 pass 13ddd1667e7f01071cdf120132238ffca004a88e c530a75c1e6a472b0eb9558310b518f0dfcd8860 c8ea0457495342c417c3dc033bba25148b279f60 2b033e396f4fa0981bae1213cdacd15775655a97 4c7e478d597b0346eef3a256cfd6794ac778b608
 118538 [host=fiano0]
 118561 [host=fiano0]
 118565 [host=fiano0]
 118501 [host=elbling1]
 118559 [host=fiano0]
 118560 [host=fiano0]
 118555 [host=fiano0]
 118556 [host=fiano0]
 118563 [host=fiano0]
 118566 [host=italia1]
 118576 [host=baroque1]
 118586 [host=pinot1]
 118629 [host=pinot0]
 118598 [host=baroque0]
 118638 fail irrelevant
 118672 fail irrelevant
 118775 fail irrelevant
 118893 fail irrelevant
 118968 fail irrelevant
 119064 fail 178e834c47b0d01352c48730235aae69898fbc02 c530a75c1e6a472b0eb9558310b518f0dfcd8860 c8ea0457495342c417c3dc033bba25148b279f60 2b033e396f4fa0981bae1213cdacd15775655a97 c93014ad3aa6aa88dfa5e96f66e8adb561483b8d
 119067 pass 13ddd1667e7f01071cdf120132238ffca004a88e c530a75c1e6a472b0eb9558310b518f0dfcd8860 c8ea0457495342c417c3dc033bba25148b279f60 2b033e396f4fa0981bae1213cdacd15775655a97 4c7e478d597b0346eef3a256cfd6794ac778b608
 119073 fail irrelevant
 119120 fail 178e834c47b0d01352c48730235aae69898fbc02 c530a75c1e6a472b0eb9558310b518f0dfcd8860 c8ea0457495342c417c3dc033bba25148b279f60 2b033e396f4fa0981bae1213cdacd15775655a97 c93014ad3aa6aa88dfa5e96f66e8adb561483b8d
 119082 pass 13ddd1667e7f01071cdf120132238ffca004a88e c530a75c1e6a472b0eb9558310b518f0dfcd8860 c8ea0457495342c417c3dc033bba25148b279f60 2b033e396f4fa0981bae1213cdacd15775655a97

[Xen-devel] [xen-unstable-smoke test] 119138: regressions - FAIL

2018-02-13 Thread osstest service owner
flight 119138 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/119138/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64   6 xen-build fail REGR. vs. 119098

Tests which did not succeed, but are not blocking:
 build-amd64-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-debianhvm-i386  1 build-check(1) blocked n/a
 test-arm64-arm64-xl-xsm  13 migrate-support-check fail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-check fail   never pass
 test-armhf-armhf-xl  13 migrate-support-check fail   never pass
 test-armhf-armhf-xl  14 saverestore-support-check fail   never pass

version targeted for testing:
 xen  e139d34a1c4b7775d5855458a325e0e4176bdf7e
baseline version:
 xen  3f491d6873be9caa77f02ad8d98f174f0152b819

Last test of basis   119098  2018-02-13 17:01:30 Z    0 days
Testing same since   119108  2018-02-13 20:01:33 Z    0 days    4 attempts


People who touched revisions under test:
  Jan Beulich 

jobs:
 build-arm64-xsm  pass
 build-amd64  fail
 build-armhf  pass
 build-amd64-libvirt  blocked 
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-i386 blocked 
 test-amd64-amd64-libvirt blocked 




Not pushing.


commit e139d34a1c4b7775d5855458a325e0e4176bdf7e
Author: Jan Beulich 
Date:   Tue Feb 13 18:19:33 2018 +0100

firmware/shim: correctly handle errors during Xen tree setup

"set -e" on a separate Makefile line is meaningless. Glue together all
the lines that this is supposed to cover.
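
For context: make runs each recipe line in its own shell, so a lone "set -e" line affects nothing that follows. The C sketch below imitates that with system(), which also hands each string to a separate sh. The helper name is made up for illustration, and a POSIX sh is assumed to be available.

```c
#include <stdlib.h>
#include <sys/wait.h>

/* Run one "recipe line" in a fresh shell, as make does by default,
 * and return its exit status (or -1 if it did not exit normally). */
int run_line(const char *line)
{
    int status = system(line);

    if (status == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Here run_line("set -e") followed by run_line("false; true") still reports success for the second line, because the failure of "false" is swallowed. Only the glued form run_line("set -e; false; true") returns a non-zero status.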

Signed-off-by: Jan Beulich 
Reviewed-by: Roger Pau Monné 
Reviewed-by: Wei Liu 
(qemu changes not included)

[Xen-devel] [PATCH] Fix a panic in SPEC_CTRL_ENTRY_FROM_INTR_IST

2018-02-13 Thread Zhenzhong Duan
On an IBRS-capable system, Xen panics at boot with bti=0, as shown below:

(XEN) Speculative mitigation facilities:
(XEN)   Hardware features: SMEP IBRS/IBPB STIBP
(XEN) BTI mitigations: Thunk N/A, Others: IBRS- SMEP
(XEN) [ Xen-4.4.4OVM  x86_64  debug=n  Tainted:C ]
(XEN) CPU:0
(XEN) RIP:e008:[]
entry.o#handle_ist_exception+0xd1/0x176
(XEN) RFLAGS: 00010046   CONTEXT: hypervisor
(XEN) rax:    rbx:    rcx: 0048
(XEN) rdx: 0001   rsi:    rdi: 
(XEN) rbp:    rsp: 82d080529f58   r8:  
(XEN) r9:     r10:    r11: 
(XEN) r12:    r13:    r14: 82d08052
(XEN) r15:    cr0: 8005003b   cr4: 001506f0
(XEN) cr3: 76fbe000   cr2: 
(XEN) ds:    es:    fs:    gs:    ss:    cs: e008
(XEN) Xen stack trace from rsp=82d080529f58:
(XEN)0018  0002 82d080528000
(XEN) 82d0802a50e0 82d08052fd98 82d08072fc00
(XEN) 0001 0400 0830
(XEN) 000a 82d0803f0fc0 0002
(XEN)82d080298876 e008 0046 82d08052fdf8
(XEN)
(XEN) Xen call trace:
(XEN)[] entry.o#handle_ist_exception+0xd1/0x176
(XEN)
(XEN)
(XEN) 
(XEN) Panic on CPU 0:
(XEN) GENERAL PROTECTION FAULT
(XEN) [error_code=]
(XEN) 

This is because %edx isn't cleared to zero before wrmsr.

DO_OVERWRITE_RSB clobbers %eax and happened to mask the bug in certain cases, so
we couldn't reproduce it without bti=0. Change to use %edx.
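
As a rough illustration of why a stale %edx matters: wrmsr takes the MSR index from %ecx and the 64-bit value from %edx:%eax, so a non-zero %edx sets high bits of the value being written. The C model below is a sketch only; the helper names and the set of bits treated as valid are assumptions, not taken from the Xen source.

```c
#include <stdbool.h>
#include <stdint.h>

#define MSR_SPEC_CTRL    0x48u          /* matches rcx: 0048 in the dump */
#define SPEC_CTRL_IBRS   (1ull << 0)
#define SPEC_CTRL_STIBP  (1ull << 1)
/* Assumption: anything outside these two bits is treated as reserved. */
#define SPEC_CTRL_VALID  (SPEC_CTRL_IBRS | SPEC_CTRL_STIBP)

/* The value wrmsr writes is the concatenation %edx:%eax. */
uint64_t wrmsr_operand(uint32_t edx, uint32_t eax)
{
    return ((uint64_t)edx << 32) | eax;
}

/* In this model a write faults when any bit outside SPEC_CTRL_VALID is set. */
bool wrmsr_would_fault(uint32_t edx, uint32_t eax)
{
    return (wrmsr_operand(edx, eax) & ~SPEC_CTRL_VALID) != 0;
}
```

With rdx = 1 as in the register dump, wrmsr_would_fault(1, 0) is true in this model, whereas with %edx cleared wrmsr_would_fault(0, SPEC_CTRL_IBRS) is false.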

Reviewed-by: Boris Ostrovsky 
Tested-by: Boris Ostrovsky 
Signed-off-by: Zhenzhong Duan 
---
 xen/include/asm-x86/spec_ctrl_asm.h |    7 ++++---
 1 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/xen/include/asm-x86/spec_ctrl_asm.h b/xen/include/asm-x86/spec_ctrl_asm.h
index 814f53d..55d87c0 100644
--- a/xen/include/asm-x86/spec_ctrl_asm.h
+++ b/xen/include/asm-x86/spec_ctrl_asm.h
@@ -269,16 +269,16 @@
  * This is logical merge of DO_OVERWRITE_RSB and DO_SPEC_CTRL_ENTRY
  * maybexen=1, but with conditionals rather than alternatives.
  */
-movzbl STACK_CPUINFO_FIELD(bti_ist_info)(%r14), %eax
+movzbl STACK_CPUINFO_FIELD(bti_ist_info)(%r14), %edx
 
-testb $BTI_IST_RSB, %al
+testb $BTI_IST_RSB, %dl
 jz .L\@_skip_rsb
 
 DO_OVERWRITE_RSB
 
 .L\@_skip_rsb:
 
-testb $BTI_IST_WRMSR, %al
+testb $BTI_IST_WRMSR, %dl
 jz .L\@_skip_wrmsr
 
 xor %edx, %edx
@@ -291,6 +291,7 @@
  * Load Xen's intended value.  SPEC_CTRL_IBRS vs 0 is encoded in the
  * bottom bit of bti_ist_info, via a deliberate alias with BTI_IST_IBRS.
  */
+xor %edx, %edx
 mov $MSR_SPEC_CTRL, %ecx
 and $BTI_IST_IBRS, %eax
 wrmsr
-- 
1.7.3

[Xen-devel] [xen-unstable test] 119072: regressions - FAIL

2018-02-13 Thread osstest service owner
flight 119072 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/119072/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-armhf-pvops     broken in 119023
 build-armhf-pvops     5 host-build-prep fail in 119023 REGR. vs. 118698
 build-i386-libvirt    6 libvirt-build   fail in 119023 REGR. vs. 118698

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-xl-qemuu-ovmf-amd64 16 guest-localmigrate/x10 fail in 119023 pass in 119072
 test-amd64-amd64-xl-qemut-ws16-amd64 16 guest-localmigrate/x10 fail pass in 119023

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-xl-credit2   1 build-check(1)   blocked in 119023 n/a
 test-armhf-armhf-xl-multivcpu  1 build-check(1)  blocked in 119023 n/a
 test-armhf-armhf-xl-arndale   1 build-check(1)   blocked in 119023 n/a
 test-armhf-armhf-libvirt  1 build-check(1)   blocked in 119023 n/a
 test-armhf-armhf-xl   1 build-check(1)   blocked in 119023 n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   blocked in 119023 n/a
 test-armhf-armhf-xl-cubietruck  1 build-check(1) blocked in 119023 n/a
 test-amd64-i386-libvirt-pair  1 build-check(1)   blocked in 119023 n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)   blocked in 119023 n/a
 test-armhf-armhf-examine  1 build-check(1)   blocked in 119023 n/a
 test-armhf-armhf-xl-xsm   1 build-check(1)   blocked in 119023 n/a
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked in 119023 n/a
 test-armhf-armhf-xl-vhd   1 build-check(1)   blocked in 119023 n/a
 test-armhf-armhf-xl-rtds  1 build-check(1)   blocked in 119023 n/a
 test-amd64-i386-libvirt   1 build-check(1)   blocked in 119023 n/a
 test-armhf-armhf-libvirt-xsm  1 build-check(1)   blocked in 119023 n/a
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stop  fail in 119023 like 118698
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-check fail  like 118698
 test-armhf-armhf-libvirt 14 saverestore-support-check fail  like 118698
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stop fail like 118698
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 118698
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop fail like 118698
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 118698
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop fail like 118698
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop fail like 118698
 test-armhf-armhf-libvirt-raw 13 saverestore-support-check fail  like 118698
 test-amd64-amd64-xl-pvhv2-intel 12 guest-start fail never pass
 test-amd64-i386-libvirt  13 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-check fail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-check fail   never pass
 test-arm64-arm64-xl  13 migrate-support-check fail   never pass
 test-arm64-arm64-xl  14 saverestore-support-check fail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-check fail   never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-check fail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-check fail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-check fail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-check fail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-check fail   never pass
 test-armhf-armhf-xl  13 migrate-support-check fail   never pass
 test-armhf-armhf-xl  14 saverestore-support-check fail   never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-check fail   never pass
 test-armhf-armhf-libvirt 13 migrate-support-check fail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-check fail never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-check fail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-check fail   never pass
 test-amd64-amd64-xl-pvhv2-amd 12 guest-start  fail  never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-check fail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-check fail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-check fail   never pass
 

[Xen-devel] [PATCH v2 05/16] Save/Restore Support: Add kernel shutdown logic to shutdown.c

2018-02-13 Thread Bruno Alvisio
Created shutdown.c for the shutdown thread and all the shutdown related
functions.
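
The string-to-reason mapping performed by the shutdown thread in this patch can be sketched standalone as follows. The numeric SHUTDOWN_* values are assumed to mirror Xen's public sched.h and should be checked against that header; the function name and the -1 return for an empty string are illustrative only.

```c
#include <string.h>

/* Assumed to match Xen's public sched.h; verify before relying on them. */
#define SHUTDOWN_poweroff 0
#define SHUTDOWN_reboot   1
#define SHUTDOWN_suspend  2
#define SHUTDOWN_crash    3

/* Map the value read from "control/shutdown" to a shutdown reason.
 * Returns -1 for the empty string, which the patch treats as a
 * spurious xenbus event and skips. */
int shutdown_reason_from_string(const char *s)
{
    if (!strcmp(s, ""))
        return -1;
    if (!strcmp(s, "poweroff"))
        return SHUTDOWN_poweroff;
    if (!strcmp(s, "reboot"))
        return SHUTDOWN_reboot;
    if (!strcmp(s, "suspend"))
        return SHUTDOWN_suspend;
    /* Anything unrecognised is treated as a crash, as in the patch. */
    return SHUTDOWN_crash;
}
```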

Signed-off-by: Bruno Alvisio 
---
Changes since v1:
   * Updated license to a BSD 3-clause. This license was taken
from the updated original file. (Repo: sysml/mini-os)
---
 Makefile   |   1 +
 include/shutdown.h |  11 
 shutdown.c | 188 +
 3 files changed, 200 insertions(+)
 create mode 100644 include/shutdown.h
 create mode 100644 shutdown.c

diff --git a/Makefile b/Makefile
index 88315c4..6a05de6 100644
--- a/Makefile
+++ b/Makefile
@@ -53,6 +53,7 @@ src-y += mm.c
 src-$(CONFIG_NETFRONT) += netfront.c
 src-$(CONFIG_PCIFRONT) += pcifront.c
 src-y += sched.c
+src-y += shutdown.c
 src-$(CONFIG_TEST) += test.c
 src-$(CONFIG_BALLOON) += balloon.c
 
diff --git a/include/shutdown.h b/include/shutdown.h
new file mode 100644
index 000..a5ec019
--- /dev/null
+++ b/include/shutdown.h
@@ -0,0 +1,11 @@
+#ifndef _SHUTDOWN_H_
+#define _SHUTDOWN_H_
+
+#include 
+
+void init_shutdown(start_info_t *si);
+
+void kernel_shutdown(int reason) __attribute__((noreturn));
+void kernel_suspend(void);
+
+#endif
diff --git a/shutdown.c b/shutdown.c
new file mode 100644
index 000..aba146e
--- /dev/null
+++ b/shutdown.c
@@ -0,0 +1,188 @@
+/*
+ *  MiniOS
+ *
+ *   file: fromdevice.cc
+ *
+ * Authors: Joao Martins 
+ *
+ *
+ * Copyright (c) 2014, NEC Europe Ltd., NEC Corporation. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright
+ *notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *notice, this list of conditions and the following disclaimer in the
+ *documentation and/or other materials provided with the distribution.
+ * 3. Neither the name of the copyright holder nor the names of its
+ *contributors may be used to endorse or promote products derived from
+ *this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+ * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+ * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ * THIS HEADER MAY NOT BE EXTRACTED OR MODIFIED IN ANY WAY.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+
+static start_info_t *start_info_ptr;
+
+static const char *path = "control/shutdown";
+static const char *token = "control/shutdown";
+static xenbus_event_queue events = NULL;
+static int end_shutdown_thread = 0;
+
+#ifdef CONFIG_XENBUS
+/* This should be overridden by the application we are linked against. */
+__attribute__((weak)) void app_shutdown(unsigned reason)
+{
+printk("Shutdown requested: %d\n", reason);
+if (reason == SHUTDOWN_suspend) {
+kernel_suspend();
+} else {
+struct sched_shutdown sched_shutdown = { .reason = reason };
+HYPERVISOR_sched_op(SCHEDOP_shutdown, _shutdown);
+}
+}
+
+static void shutdown_thread(void *p)
+{
+    char *shutdown, *err;
+    unsigned int shutdown_reason;
+
+    xenbus_watch_path_token(XBT_NIL, path, token, &events);
+
+    for ( ;; ) {
+        xenbus_wait_for_watch(&events);
+        if ((err = xenbus_read(XBT_NIL, path, &shutdown))) {
+            free(err);
+            do_exit();
+        }
+
+        if (end_shutdown_thread)
+            break;
+
+        if (!strcmp(shutdown, "")) {
+            /* Avoid spurious event on xenbus */
+            /* FIXME: investigate the reason of the spurious event */
+            free(shutdown);
+            continue;
+        } else if (!strcmp(shutdown, "poweroff")) {
+            shutdown_reason = SHUTDOWN_poweroff;
+        } else if (!strcmp(shutdown, "reboot")) {
+            shutdown_reason = SHUTDOWN_reboot;
+        } else if (!strcmp(shutdown, "suspend")) {
+            shutdown_reason = SHUTDOWN_suspend;
+        } else {
+            shutdown_reason = SHUTDOWN_crash;
+        }
+        free(shutdown);
+
+        /* Acknowledge shutdown request */
+        if ((err = xenbus_write(XBT_NIL, path, ""))) {
+            free(err);
+

[Xen-devel] [PATCH v2 03/16] Save/Restore Support: Declare kernel and arch pre/post suspend functions

2018-02-13 Thread Bruno Alvisio
For mini-OS to support suspend and restore, the kernel will have to suspend
different modules such as xenbus, console, irq, etc. During save/restore the
kernel and arch pre_suspend and post_suspend functions will be invoked to
suspend/resume each of the modules.
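
The description above can be illustrated with a small, self-contained sketch of how per-module hooks might be driven. This is not the patch's code (the patch only adds empty functions): the `suspend_hooks` table, the trace buffer and the stub modules are hypothetical names chosen for illustration. The only idea it borrows is that modules are suspended in one order and resumed in the reverse order.

```c
#include <stddef.h>

/* Hypothetical hook pair for one module (not part of the patch). */
struct suspend_hooks {
    const char *name;
    void (*suspend)(void);
    void (*resume)(int canceled);
};

/* Small trace buffer so the call order can be inspected afterwards. */
static char trace[16];
static size_t trace_len;

static void record(char c)
{
    if (trace_len + 1 < sizeof(trace)) {
        trace[trace_len++] = c;
        trace[trace_len] = '\0';
    }
}

/* Stub modules standing in for netfront and xenbus. */
static void net_suspend(void)        { record('n'); }
static void net_resume(int canceled) { (void)canceled; record('N'); }
static void bus_suspend(void)        { record('b'); }
static void bus_resume(int canceled) { (void)canceled; record('B'); }

static const struct suspend_hooks modules[] = {
    { "netfront", net_suspend, net_resume },
    { "xenbus",   bus_suspend, bus_resume },
};

/* Suspend the modules in table order. */
static void run_pre_suspend(const struct suspend_hooks *mods, size_t n)
{
    for (size_t i = 0; i < n; i++)
        mods[i].suspend();
}

/* Resume in reverse table order, so the last module suspended
 * is the first one brought back. */
static void run_post_suspend(const struct suspend_hooks *mods, size_t n,
                             int canceled)
{
    for (size_t i = n; i-- > 0; )
        mods[i].resume(canceled);
}
```

Calling `run_pre_suspend(modules, 2)` followed by `run_post_suspend(modules, 2, 0)` leaves the suspend calls and then the mirrored resume calls in `trace`. A table is only one possible design; explicit calls guarded by `#ifdef`, as used elsewhere in this series, achieve the same ordering.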

Signed-off-by: Bruno Alvisio 
Reviewed-by: Samuel Thibault 
---
 arch/x86/setup.c | 10 ++
 include/kernel.h |  2 ++
 include/x86/os.h |  4 ++--
 kernel.c | 10 ++
 4 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/arch/x86/setup.c b/arch/x86/setup.c
index 5278227..3dd86f9 100644
--- a/arch/x86/setup.c
+++ b/arch/x86/setup.c
@@ -204,6 +204,16 @@ arch_init(void *par)
start_kernel();
 }
 
+void arch_pre_suspend(void)
+{
+
+}
+
+void arch_post_suspend(int canceled)
+{
+
+}
+
 void
 arch_fini(void)
 {
diff --git a/include/kernel.h b/include/kernel.h
index d37ddda..161d757 100644
--- a/include/kernel.h
+++ b/include/kernel.h
@@ -5,6 +5,8 @@
 extern char cmdline[MAX_CMDLINE_SIZE];
 
 void start_kernel(void);
+void pre_suspend(void);
+void post_suspend(int canceled);
 void do_exit(void) __attribute__((noreturn));
 void arch_do_exit(void);
 void stop_kernel(void);
diff --git a/include/x86/os.h b/include/x86/os.h
index d155914..a73b63e 100644
--- a/include/x86/os.h
+++ b/include/x86/os.h
@@ -71,10 +71,10 @@ void trap_fini(void);
 void xen_callback_vector(void);
 #endif
 
+void arch_pre_suspend(void);
+void arch_post_suspend(int canceled);
 void arch_fini(void);
 
-
-
 #ifdef CONFIG_PARAVIRT
 
 /* 
diff --git a/kernel.c b/kernel.c
index 0d84a9b..90c865a 100644
--- a/kernel.c
+++ b/kernel.c
@@ -155,6 +155,16 @@ void start_kernel(void)
 run_idle_thread();
 }
 
+void pre_suspend(void)
+{
+
+}
+
+void post_suspend(int canceled)
+{
+
+}
+
 void stop_kernel(void)
 {
 /* TODO: fs import */
-- 
2.3.2 (Apple Git-55)


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v2 15/16] Save/Restore Support: Add suspend/restore support for netfront

2018-02-13 Thread Bruno Alvisio
Performed an additional cleanup to make the file more syntactically consistent.

Signed-off-by: Bruno Alvisio 
Reviewed-by: Samuel Thibault 
---
 include/netfront.h |   8 +-
 kernel.c   |   8 ++
 netfront.c | 309 ++---
 3 files changed, 236 insertions(+), 89 deletions(-)

diff --git a/include/netfront.h b/include/netfront.h
index 2b95da9..1164d50 100644
--- a/include/netfront.h
+++ b/include/netfront.h
@@ -3,9 +3,15 @@
 #include 
 #endif
 struct netfront_dev;
-struct netfront_dev *init_netfront(char *nodename, void (*netif_rx)(unsigned 
char *data, int len), unsigned char rawmac[6], char **ip);
+struct netfront_dev *init_netfront(char *nodename,
+   void (*netif_rx)(unsigned char *data,
+int len, void* arg),
+   unsigned char rawmac[6],
+   char **ip);
 void netfront_xmit(struct netfront_dev *dev, unsigned char* data,int len);
 void shutdown_netfront(struct netfront_dev *dev);
+void suspend_netfront(void);
+void resume_netfront(void);
 #ifdef HAVE_LIBC
 int netfront_tap_open(char *nodename);
 ssize_t netfront_receive(struct netfront_dev *dev, unsigned char *data, size_t 
len);
diff --git a/kernel.c b/kernel.c
index 1393d15..301273d 100644
--- a/kernel.c
+++ b/kernel.c
@@ -119,6 +119,10 @@ void start_kernel(void* par)
 
 void pre_suspend(void)
 {
+#ifdef CONFIG_NETFRONT
+suspend_netfront();
+#endif
+
 #ifdef CONFIG_XENBUS
 suspend_xenbus();
 #endif
@@ -147,6 +151,10 @@ void post_suspend(int canceled)
 #ifdef CONFIG_XENBUS
 resume_xenbus(canceled);
 #endif
+
+#ifdef CONFIG_NETFRONT
+resume_netfront();
+#endif
 }
 
 void stop_kernel(void)
diff --git a/netfront.c b/netfront.c
index b8fac62..50b3a57 100644
--- a/netfront.c
+++ b/netfront.c
@@ -63,10 +63,30 @@ struct netfront_dev {
 size_t rlen;
 #endif
 
-void (*netif_rx)(unsigned char* data, int len);
+void (*netif_rx)(unsigned char* data, int len, void* arg);
+void *netif_rx_arg;
 };
 
+struct netfront_dev_list {
+struct netfront_dev *dev;
+unsigned char rawmac[6];
+char *ip;
+
+int refcount;
+
+struct netfront_dev_list *next;
+};
+
+static struct netfront_dev_list *dev_list = NULL;
+
 void init_rx_buffers(struct netfront_dev *dev);
+static struct netfront_dev *_init_netfront(struct netfront_dev *dev,
+   unsigned char rawmac[6], char **ip);
+static void _shutdown_netfront(struct netfront_dev *dev);
+void netfront_set_rx_handler(struct netfront_dev *dev,
+ void (*thenetif_rx)(unsigned char *data, int len,
+ void *arg),
+ void *arg);
 
 static inline void add_id_to_freelist(unsigned int id,unsigned short* freelist)
 {
@@ -81,7 +101,7 @@ static inline unsigned short get_id_from_freelist(unsigned 
short* freelist)
 return id;
 }
 
-__attribute__((weak)) void netif_rx(unsigned char* data,int len)
+__attribute__((weak)) void netif_rx(unsigned char* data, int len, void *arg)
 {
 printk("%d bytes incoming at %p\n",len,data);
 }
@@ -120,21 +140,20 @@ moretodo:
 page = (unsigned char*)buf->page;
 gnttab_end_access(buf->gref);
 
-if (rx->status > NETIF_RSP_NULL)
-{
+if (rx->status > NETIF_RSP_NULL) {
 #ifdef HAVE_LIBC
-   if (dev->netif_rx == NETIF_SELECT_RX) {
-   int len = rx->status;
-   ASSERT(current == main_thread);
-   if (len > dev->len)
-   len = dev->len;
-   memcpy(dev->data, page+rx->offset, len);
-   dev->rlen = len;
-   /* No need to receive the rest for now */
-   dobreak = 1;
-   } else
+if (dev->netif_rx == NETIF_SELECT_RX) {
+int len = rx->status;
+ASSERT(current == main_thread);
+if (len > dev->len)
+len = dev->len;
+memcpy(dev->data, page+rx->offset, len);
+dev->rlen = len;
+/* No need to receive the rest for now */
+dobreak = 1;
+} else
 #endif
-   dev->netif_rx(page+rx->offset,rx->status);
+   dev->netif_rx(page+rx->offset, rx->status, 
dev->netif_rx_arg);
 }
 }
 dev->rx.rsp_cons=cons;
@@ -144,17 +163,16 @@ moretodo:
 
 req_prod = dev->rx.req_prod_pvt;
 
-for(i=0; irx, req_prod + i);
 struct net_buffer* buf = >rx_buffers[id];
 void* page = buf->page;
 
 /* We are sure to have free gnttab entries since they got released 
above */
-

[Xen-devel] [PATCH v2 13/16] Save/Restore Support: Add suspend/restore support for Grant Tables.

2018-02-13 Thread Bruno Alvisio
Signed-off-by: Bruno Alvisio 
---
Changed since v1:
- Moved suspend/resume _gnttab to arch specific files
---
 arch/x86/mm.c| 34 ++
 gnttab.c | 10 ++
 include/gnttab.h |  4 
 kernel.c |  4 
 4 files changed, 52 insertions(+)

diff --git a/arch/x86/mm.c b/arch/x86/mm.c
index 1b163ac..2597c5b 100644
--- a/arch/x86/mm.c
+++ b/arch/x86/mm.c
@@ -917,6 +917,40 @@ grant_entry_v1_t *arch_init_gnttab(int nr_grant_frames)
 return map_frames(frames, nr_grant_frames);
 }
 
+void arch_suspend_gnttab(grant_entry_v1_t *gnttab_table, int nr_grant_frames)
+{
+#ifdef CONFIG_PARAVIRT
+int i;
+
+for (i = 0; i < nr_grant_frames; i++) {
+HYPERVISOR_update_va_mapping((unsigned long)(((char *)gnttab_table) + 
PAGE_SIZE*i),
+(pte_t){0x0<

[Xen-devel] [PATCH v2 14/16] Save/Restore Support: Add suspend/restore support for xenbus

2018-02-13 Thread Bruno Alvisio
Currently the watch path is not saved in the watch struct when it is registered.
During xenbus resume the path is needed so that the watches can be registered
again. Thus, a 'path' field is added to struct watch so that watches can be
re-registered during xenbus resume.
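
A minimal sketch of the idea, assuming a simplified watch list: each entry keeps its own copy of the path next to the token, so the list can be replayed after resume. The helper names (`remember_watch`, `replay_watches`, `register_fn`, `count_registration`) are hypothetical and do not exist in Mini-OS; the callback stands in for whatever actually sends the watch request to xenstore.

```c
#include <stdlib.h>
#include <string.h>

struct watch {
    char *token;
    char *path;          /* kept so the watch can be registered again */
    struct watch *next;
};

static struct watch *watches;

static char *copy_string(const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);
    if (copy)
        memcpy(copy, s, len);
    return copy;
}

/* Remember a watch; returns 0 on success, -1 on allocation failure. */
static int remember_watch(const char *path, const char *token)
{
    struct watch *w = malloc(sizeof(*w));
    if (!w)
        return -1;
    w->path = copy_string(path);
    w->token = copy_string(token);
    if (!w->path || !w->token) {
        free(w->path);
        free(w->token);
        free(w);
        return -1;
    }
    w->next = watches;
    watches = w;
    return 0;
}

/* Hypothetical callback type standing in for the real watch request. */
typedef int (*register_fn)(const char *path, const char *token, void *ctx);

/* Replay every remembered watch; returns how many were registered. */
static int replay_watches(register_fn reg, void *ctx)
{
    int done = 0;
    for (struct watch *w = watches; w; w = w->next)
        if (reg(w->path, w->token, ctx) == 0)
            done++;
    return done;
}

/* Example callback that only counts how often it was called. */
static int count_registration(const char *path, const char *token, void *ctx)
{
    (void)path;
    (void)token;
    ++*(int *)ctx;
    return 0;
}
```

Without the stored path, the replay loop would have only the token to work with, which is not enough to issue the watch request again.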

Signed-off-by: Bruno Alvisio 
Reviewed-by: Samuel Thibault 
---
 include/xenbus.h |   2 ++
 kernel.c |   8 +
 xenbus/xenbus.c  | 106 +++
 3 files changed, 85 insertions(+), 31 deletions(-)

diff --git a/include/xenbus.h b/include/xenbus.h
index b2d5072..3871f35 100644
--- a/include/xenbus.h
+++ b/include/xenbus.h
@@ -120,6 +120,8 @@ domid_t xenbus_get_self_id(void);
 #ifdef CONFIG_XENBUS
 /* Reset the XenBus system. */
 void fini_xenbus(void);
+void suspend_xenbus(void);
+void resume_xenbus(int canceled);
 #else
 static inline void fini_xenbus(void)
 {
diff --git a/kernel.c b/kernel.c
index 933cbcd..1393d15 100644
--- a/kernel.c
+++ b/kernel.c
@@ -119,6 +119,10 @@ void start_kernel(void* par)
 
 void pre_suspend(void)
 {
+#ifdef CONFIG_XENBUS
+suspend_xenbus();
+#endif
+
 local_irq_disable();
 
 suspend_gnttab();
@@ -139,6 +143,10 @@ void post_suspend(int canceled)
 resume_gnttab();
 
 local_irq_enable();
+
+#ifdef CONFIG_XENBUS
+resume_xenbus(canceled);
+#endif
 }
 
 void stop_kernel(void)
diff --git a/xenbus/xenbus.c b/xenbus/xenbus.c
index c2d2bd1..d72dc3a 100644
--- a/xenbus/xenbus.c
+++ b/xenbus/xenbus.c
@@ -50,6 +50,7 @@ DECLARE_WAIT_QUEUE_HEAD(xenbus_watch_queue);
 xenbus_event_queue xenbus_events;
 static struct watch {
 char *token;
+char *path;
 xenbus_event_queue *events;
 struct watch *next;
 } *watches;
@@ -63,6 +64,8 @@ struct xenbus_req_info
 #define NR_REQS 32
 static struct xenbus_req_info req_info[NR_REQS];
 
+static char *errmsg(struct xsd_sockmsg *rep);
+
 uint32_t xenbus_evtchn;
 
 #ifdef CONFIG_PARAVIRT
@@ -231,45 +234,39 @@ static void xenbus_thread_func(void *ign)
 struct xsd_sockmsg msg;
 unsigned prod = xenstore_buf->rsp_prod;
 
-for (;;) 
-{
+for (;;) {
 wait_event(xb_waitq, prod != xenstore_buf->rsp_prod);
-while (1) 
-{
+while (1) {
 prod = xenstore_buf->rsp_prod;
 DEBUG("Rsp_cons %d, rsp_prod %d.\n", xenstore_buf->rsp_cons,
-xenstore_buf->rsp_prod);
+  xenstore_buf->rsp_prod);
 if (xenstore_buf->rsp_prod - xenstore_buf->rsp_cons < sizeof(msg))
 break;
 rmb();
-memcpy_from_ring(xenstore_buf->rsp,
-&msg,
-MASK_XENSTORE_IDX(xenstore_buf->rsp_cons),
-sizeof(msg));
-DEBUG("Msg len %d, %d avail, id %d.\n",
-msg.len + sizeof(msg),
-xenstore_buf->rsp_prod - xenstore_buf->rsp_cons,
-msg.req_id);
+memcpy_from_ring(xenstore_buf->rsp, &msg,
+ MASK_XENSTORE_IDX(xenstore_buf->rsp_cons),
+ sizeof(msg));
+DEBUG("Msg len %d, %d avail, id %d.\n", msg.len + sizeof(msg),
+  xenstore_buf->rsp_prod - xenstore_buf->rsp_cons, msg.req_id);
+
 if (xenstore_buf->rsp_prod - xenstore_buf->rsp_cons <
-sizeof(msg) + msg.len)
+sizeof(msg) + msg.len)
 break;
 
 DEBUG("Message is good.\n");
 
-if(msg.type == XS_WATCH_EVENT)
-{
-   struct xenbus_event *event = malloc(sizeof(*event) + msg.len);
+if (msg.type == XS_WATCH_EVENT) {
+struct xenbus_event *event = malloc(sizeof(*event) + msg.len);
 xenbus_event_queue *events = NULL;
-   char *data = (char*)event + sizeof(*event);
+char *data = (char*)event + sizeof(*event);
 struct watch *watch;
 
-memcpy_from_ring(xenstore_buf->rsp,
-   data,
+memcpy_from_ring(xenstore_buf->rsp, data,
 MASK_XENSTORE_IDX(xenstore_buf->rsp_cons + sizeof(msg)),
 msg.len);
 
-   event->path = data;
-   event->token = event->path + strlen(event->path) + 1;
+event->path = data;
+event->token = event->path + strlen(event->path) + 1;
 
 mb();
 xenstore_buf->rsp_cons += msg.len + sizeof(msg);
@@ -288,15 +285,11 @@ static void xenbus_thread_func(void *ign)
 printk("unexpected watch token %s\n", event->token);
 free(event);
 }
-}
-
-else
-{
+} else {
 req_info[msg.req_id].reply = malloc(sizeof(msg) + msg.len);
-memcpy_from_ring(xenstore_buf->rsp,
-

[Xen-devel] [PATCH v2 09/16] Save/Restore Support: Disable/enable IRQs during suspend/restore

2018-02-13 Thread Bruno Alvisio
Signed-off-by: Bruno Alvisio 
Reviewed-by: Samuel Thibault 
---
 kernel.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel.c b/kernel.c
index 1cd40e8..782eb79 100644
--- a/kernel.c
+++ b/kernel.c
@@ -119,12 +119,12 @@ void start_kernel(void* par)
 
 void pre_suspend(void)
 {
-
+local_irq_disable();
 }
 
 void post_suspend(int canceled)
 {
-
+local_irq_enable();
 }
 
 void stop_kernel(void)
-- 
2.3.2 (Apple Git-55)



[Xen-devel] [PATCH v2 16/16] Save/Restore Support: Implement code for arch suspend/resume

2018-02-13 Thread Bruno Alvisio
Before suspending the domain, the shared_info_page is unmapped and, for PV
guests, the pagetables should be canonicalized. After resume, the
shared_info_page should be mapped again.

Signed-off-by: Bruno Alvisio 
Reviewed-by: Samuel Thibault 
---
Changed since v1:
  * Fixed comment
---
 arch/x86/setup.c | 51 +++
 1 file changed, 51 insertions(+)

diff --git a/arch/x86/setup.c b/arch/x86/setup.c
index b6e0541..b5ed1c8 100644
--- a/arch/x86/setup.c
+++ b/arch/x86/setup.c
@@ -32,6 +32,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #ifdef CONFIG_PARAVIRT
 /*
@@ -42,6 +43,11 @@ union start_info_union start_info_union;
 #endif
 
 /*
+ * This pointer holds a reference to the copy of the start_info struct.
+ */
+static start_info_t *start_info_ptr;
+
+/*
  * Shared page for communicating with the hypervisor.
  * Events flags go here, for example.
  */
@@ -212,18 +218,63 @@ arch_init(void *par)
 #ifdef CONFIG_PARAVIRT
    memcpy(&start_info, par, sizeof(start_info));
 #endif
+   start_info_ptr = (start_info_t *)par;
 
start_kernel((start_info_t *)par);
 }
 
 void arch_pre_suspend(void)
 {
+#ifdef CONFIG_PARAVIRT
+    /* Replace xenstore and console mfns with the corresponding pfns */
+    start_info_ptr->store_mfn =
+        virt_to_pfn(mfn_to_virt(start_info_ptr->store_mfn));
+    start_info_ptr->console.domU.mfn =
+        virt_to_pfn(mfn_to_virt(start_info_ptr->console.domU.mfn));
+#else
+    uint64_t store_v;
+    uint64_t console_v;
+
+    if( hvm_get_parameter(HVM_PARAM_STORE_PFN, &store_v) )
+        BUG();
+    start_info_ptr->store_mfn = store_v;
+
+    if( hvm_get_parameter(HVM_PARAM_CONSOLE_PFN, &console_v) )
+        BUG();
+    start_info_ptr->console.domU.mfn = console_v;
+#endif
+    unmap_shared_info();
 
+    arch_mm_pre_suspend();
 }
 
 void arch_post_suspend(int canceled)
 {
+#if CONFIG_PARAVIRT
+    if (canceled) {
+        start_info_ptr->store_mfn = pfn_to_mfn(start_info_ptr->store_mfn);
+        start_info_ptr->console.domU.mfn = pfn_to_mfn(start_info_ptr->console.domU.mfn);
+    } else {
+        memcpy(&start_info, start_info_ptr, sizeof(start_info_t));
+    }
+#else
+    uint64_t store_v;
+    uint64_t console_v;
+
+    if (hvm_get_parameter(HVM_PARAM_STORE_PFN, &store_v))
+        BUG();
+    start_info_ptr->store_mfn = pfn_to_mfn(store_v);
 
+    if (hvm_get_parameter(HVM_PARAM_CONSOLE_PFN, &console_v))
+        BUG();
+    start_info_ptr->console.domU.mfn = pfn_to_mfn(console_v);
+#endif
+
+    HYPERVISOR_shared_info = map_shared_info((void*) start_info_ptr->shared_info);
+#ifndef CONFIG_PARAVIRT
+    xen_callback_vector();
+#endif
+    arch_mm_post_suspend(canceled);
 }
 
 void
-- 
2.3.2 (Apple Git-55)



[Xen-devel] [PATCH v2 12/16] Save/Restore Support: Add support for suspend/restore events.

2018-02-13 Thread Bruno Alvisio
Signed-off-by: Bruno Alvisio 
Reviewed-by: Samuel Thibault 
---
 events.c | 5 +
 include/events.h | 1 +
 kernel.c | 2 ++
 3 files changed, 8 insertions(+)

diff --git a/events.c b/events.c
index e8ef8aa..342aead 100644
--- a/events.c
+++ b/events.c
@@ -183,6 +183,11 @@ void fini_events(void)
 arch_fini_events();
 }
 
+void suspend_events(void)
+{
+unbind_all_ports();
+}
+
 void default_handler(evtchn_port_t port, struct pt_regs *regs, void *ignore)
 {
 printk("[Port %d] - event received\n", port);
diff --git a/include/events.h b/include/events.h
index 89b5997..705ad93 100644
--- a/include/events.h
+++ b/include/events.h
@@ -55,5 +55,6 @@ static inline int notify_remote_via_evtchn(evtchn_port_t port)
 }
 
 void fini_events(void);
+void suspend_events(void);
 
 #endif /* _EVENTS_H_ */
diff --git a/kernel.c b/kernel.c
index 2fb69bf..d078e0a 100644
--- a/kernel.c
+++ b/kernel.c
@@ -124,6 +124,8 @@ void pre_suspend(void)
 fini_time();
 
 suspend_console();
+
+suspend_events();
 }
 
 void post_suspend(int canceled)
-- 
2.3.2 (Apple Git-55)



[Xen-devel] [PATCH v2 00/16] Save/Restore Support for mini-OS PVH

2018-02-13 Thread Bruno Alvisio
Hi all,

I am sending the second revision for supporting save/restore in Mini-OS PVH.
The branch can be found at:

https://github.com/balvisio/mini-os/tree/feature/mini-os-suspend-support-submission-2

Feedback would be greatly appreciated.

Cheers,

Bruno

Signed-off-by: Bruno Alvisio 



[Xen-devel] [PATCH v2 06/16] Save/Restore Support: Moved shutdown thread to shutdown.c

2018-02-13 Thread Bruno Alvisio
The shutdown thread present in kernel.c was removed and now the thread in
shutdown.c is created instead.
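
For orientation, the core decision made by the shutdown thread can be sketched as a pure function: the string read from "control/shutdown" is mapped to a shutdown reason. The function name `shutdown_reason_from_string` and the -1 return for the empty string are assumptions made for this sketch; the numeric reason codes are the ones defined in Xen's public sched.h, repeated here only so the example compiles on its own.

```c
#include <string.h>

/* Reason codes as defined in Xen's public sched.h. */
enum {
    SHUTDOWN_poweroff = 0,
    SHUTDOWN_reboot   = 1,
    SHUTDOWN_suspend  = 2,
    SHUTDOWN_crash    = 3
};

/*
 * Map the value of the "control/shutdown" node to a reason code.
 * Returns -1 for the empty string, which the thread treats as a
 * spurious watch event and ignores. Unknown requests map to crash.
 */
static int shutdown_reason_from_string(const char *request)
{
    if (request[0] == '\0')
        return -1;
    if (!strcmp(request, "poweroff"))
        return SHUTDOWN_poweroff;
    if (!strcmp(request, "reboot"))
        return SHUTDOWN_reboot;
    if (!strcmp(request, "suspend"))
        return SHUTDOWN_suspend;
    return SHUTDOWN_crash;
}
```

Keeping the mapping separate from the xenbus calls makes it easy to check in isolation; the thread in shutdown.c performs the same comparisons inline.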

Signed-off-by: Bruno Alvisio 
Reviewed-by: Samuel Thibault 
---
 arch/x86/setup.c |  2 +-
 include/kernel.h |  2 +-
 kernel.c | 50 ++
 3 files changed, 8 insertions(+), 46 deletions(-)

diff --git a/arch/x86/setup.c b/arch/x86/setup.c
index 3dd86f9..31fa2c6 100644
--- a/arch/x86/setup.c
+++ b/arch/x86/setup.c
@@ -201,7 +201,7 @@ arch_init(void *par)
    memcpy(&start_info, par, sizeof(start_info));
 #endif
 
-   start_kernel();
+   start_kernel((start_info_t *)par);
 }
 
 void arch_pre_suspend(void)
diff --git a/include/kernel.h b/include/kernel.h
index 161d757..742abf5 100644
--- a/include/kernel.h
+++ b/include/kernel.h
@@ -4,7 +4,7 @@
 #define MAX_CMDLINE_SIZE 1024
 extern char cmdline[MAX_CMDLINE_SIZE];
 
-void start_kernel(void);
+void start_kernel(void* par);
 void pre_suspend(void);
 void post_suspend(int canceled);
 void do_exit(void) __attribute__((noreturn));
diff --git a/kernel.c b/kernel.c
index 90c865a..1cd40e8 100644
--- a/kernel.c
+++ b/kernel.c
@@ -42,6 +42,9 @@
 #include 
 #include 
 #include 
+#ifdef CONFIG_XENBUS
+#include 
+#endif
 #include 
 #include 
 #include 
@@ -66,48 +69,6 @@ void setup_xen_features(void)
 }
 }
 
-#ifdef CONFIG_XENBUS
-/* This should be overridden by the application we are linked against. */
-__attribute__((weak)) void app_shutdown(unsigned reason)
-{
-struct sched_shutdown sched_shutdown = { .reason = reason };
-printk("Shutdown requested: %d\n", reason);
-HYPERVISOR_sched_op(SCHEDOP_shutdown, &sched_shutdown);
-}
-
-static void shutdown_thread(void *p)
-{
-const char *path = "control/shutdown";
-const char *token = path;
-xenbus_event_queue events = NULL;
-char *shutdown = NULL, *err;
-unsigned int shutdown_reason;
-xenbus_watch_path_token(XBT_NIL, path, token, &events);
-while ((err = xenbus_read(XBT_NIL, path, &shutdown)) != NULL || !strcmp(shutdown, ""))
-{
-free(err);
-free(shutdown);
-shutdown = NULL;
-xenbus_wait_for_watch(&events);
-}
-err = xenbus_unwatch_path_token(XBT_NIL, path, token);
-free(err);
-err = xenbus_write(XBT_NIL, path, "");
-free(err);
-printk("Shutting down (%s)\n", shutdown);
-
-if (!strcmp(shutdown, "poweroff"))
-shutdown_reason = SHUTDOWN_poweroff;
-else if (!strcmp(shutdown, "reboot"))
-shutdown_reason = SHUTDOWN_reboot;
-else
-/* Unknown */
-shutdown_reason = SHUTDOWN_crash;
-app_shutdown(shutdown_reason);
-free(shutdown);
-}
-#endif
-
 
 /* This should be overridden by the application we are linked against. */
 __attribute__((weak)) int app_main(void *p)
@@ -116,7 +77,7 @@ __attribute__((weak)) int app_main(void *p)
 return 0;
 }
 
-void start_kernel(void)
+void start_kernel(void* par)
 {
 /* Set up events. */
 init_events();
@@ -145,7 +106,8 @@ void start_kernel(void)
 init_xenbus();
 
 #ifdef CONFIG_XENBUS
-create_thread("shutdown", shutdown_thread, NULL);
+/* Init shutdown thread */
+init_shutdown((start_info_t *)par);
 #endif
 
 /* Call (possibly overridden) app_main() */
-- 
2.3.2 (Apple Git-55)



[Xen-devel] [PATCH v2 02/16] Save/Restore Support: Refactor trap_init() and setup vector callbacks

2018-02-13 Thread Bruno Alvisio
Currently the setup of the IDT and the request to set the HVM vector callback
are both performed in the trap_init function.

As part of the post-suspend operation, the HVM vector callback needs to be set
up again while the IDT does not. Thus, the trap_init function is split into two
separate functions: trap_init (sets up the IDT) and xen_callback_vector (sets
the HVM vector callback). During the post-suspend operations the
xen_callback_vector function will be invoked.
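
The value that xen_callback_vector passes for HVM_PARAM_CALLBACK_IRQ, `(2ULL << 56) | TRAP_xen_callback`, packs a delivery type into the top byte and the vector number into the low byte. The sketch below only illustrates that bit layout; the macro and function names are hypothetical, and the vector value used in the usage note is an example rather than a statement about what TRAP_xen_callback is defined to be.

```c
#include <stdint.h>

/* Hypothetical names for the fields of the callback parameter. */
#define CALLBACK_TYPE_SHIFT   56
#define CALLBACK_TYPE_VECTOR  UINT64_C(2)   /* "deliver as a vector" */

/* Build the parameter value for a given interrupt vector. */
static uint64_t callback_param_for_vector(uint8_t vector)
{
    return (CALLBACK_TYPE_VECTOR << CALLBACK_TYPE_SHIFT) | vector;
}

/* Extract the delivery type from the top byte. */
static unsigned callback_param_type(uint64_t param)
{
    return (unsigned)(param >> CALLBACK_TYPE_SHIFT);
}

/* Extract the vector number from the low byte. */
static unsigned callback_param_vector(uint64_t param)
{
    return (unsigned)(param & 0xff);
}
```

For example, `callback_param_for_vector(32)` yields 0x0200000000000020. Because this value lives in hypervisor state rather than in the guest's IDT, it has to be requested again after a restore, which is why the call is split out of trap_init.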

Signed-off-by: Bruno Alvisio 
Reviewed-by: Samuel Thibault 
---
 arch/x86/traps.c | 17 +++--
 include/x86/os.h |  3 +++
 2 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/arch/x86/traps.c b/arch/x86/traps.c
index aa17da3..a7388a5 100644
--- a/arch/x86/traps.c
+++ b/arch/x86/traps.c
@@ -389,6 +389,16 @@ static void setup_gate(unsigned int entry, void *addr, 
unsigned int dpl)
 #endif
 }
 
+void xen_callback_vector(void)
+{
+if (hvm_set_parameter(HVM_PARAM_CALLBACK_IRQ,
+ (2ULL << 56) | TRAP_xen_callback))
+{
+xprintk("Request for Xen HVM callback vector failed\n");
+do_exit();
+}
+}
+
 void trap_init(void)
 {
 setup_gate(TRAP_divide_error, &divide_error, 0);
@@ -415,12 +425,7 @@ void trap_init(void)
 gdt[GDTE_TSS] = (typeof(*gdt))INIT_GDTE((unsigned long), 0x67, 0x89);
 asm volatile ("ltr %w0" :: "rm" (GDTE_TSS * 8));
 
-if ( hvm_set_parameter(HVM_PARAM_CALLBACK_IRQ,
-   (2ULL << 56) | TRAP_xen_callback) )
-{
-xprintk("Request for Xen HVM callback vector failed\n");
-do_exit();
-}
+xen_callback_vector();
 }
 
 void trap_fini(void)
diff --git a/include/x86/os.h b/include/x86/os.h
index fbc2eeb..d155914 100644
--- a/include/x86/os.h
+++ b/include/x86/os.h
@@ -67,6 +67,9 @@ extern shared_info_t *HYPERVISOR_shared_info;
 
 void trap_init(void);
 void trap_fini(void);
+#ifndef CONFIG_PARAVIRT
+void xen_callback_vector(void);
+#endif
 
 void arch_fini(void);
 
-- 
2.3.2 (Apple Git-55)



[Xen-devel] [PATCH v2 11/16] Save/Restore Support: Add suspend/restore support for console

2018-02-13 Thread Bruno Alvisio
Signed-off-by: Bruno Alvisio 
Reviewed-by: Samuel Thibault 
---
 console/console.c  | 15 -
 console/xenbus.c   |  3 +-
 console/xencons_ring.c | 83 +++---
 include/console.h  |  6 +++-
 kernel.c   |  4 +++
 lib/sys.c  |  2 +-
 6 files changed, 77 insertions(+), 36 deletions(-)

diff --git a/console/console.c b/console/console.c
index 2e04552..9814506 100644
--- a/console/console.c
+++ b/console/console.c
@@ -52,6 +52,7 @@
 
 /* If console not initialised the printk will be sent to xen serial line 
NOTE: you need to enable verbose in xen/Rules.mk for it to work. */
+static struct consfront_dev* xen_console = NULL;
 static int console_initialised = 0;
 
 __attribute__((weak)) void console_input(char * buf, unsigned len)
@@ -162,8 +163,20 @@ void xprintk(const char *fmt, ...)
 void init_console(void)
 {   
 printk("Initialising console ... ");
-xencons_ring_init();
+xen_console = xencons_ring_init();
 console_initialised = 1;
 /* This is also required to notify the daemon */
 printk("done.\n");
 }
+
+void suspend_console(void)
+{
+console_initialised = 0;
+xencons_ring_fini(xen_console);
+}
+
+void resume_console(void)
+{
+xencons_ring_resume(xen_console);
+console_initialised = 1;
+}
\ No newline at end of file
diff --git a/console/xenbus.c b/console/xenbus.c
index 1c9a590..654b469 100644
--- a/console/xenbus.c
+++ b/console/xenbus.c
@@ -188,8 +188,7 @@ error:
 return NULL;
 }
 
-void fini_console(struct consfront_dev *dev)
+void fini_consfront(struct consfront_dev *dev)
 {
 if (dev) free_consfront(dev);
 }
-
diff --git a/console/xencons_ring.c b/console/xencons_ring.c
index dd64a41..b6db74e 100644
--- a/console/xencons_ring.c
+++ b/console/xencons_ring.c
@@ -19,6 +19,8 @@ DECLARE_WAIT_QUEUE_HEAD(console_queue);
 static struct xencons_interface *console_ring;
 uint32_t console_evtchn;
 
+static struct consfront_dev* resume_xen_console(struct consfront_dev* dev);
+
 #ifdef CONFIG_PARAVIRT
 void get_console(void *p)
 {
@@ -32,10 +34,12 @@ void get_console(void *p)
 {
 uint64_t v = -1;
 
-hvm_get_parameter(HVM_PARAM_CONSOLE_EVTCHN, &v);
+if (hvm_get_parameter(HVM_PARAM_CONSOLE_EVTCHN, &v))
+BUG();
 console_evtchn = v;
 
-hvm_get_parameter(HVM_PARAM_CONSOLE_PFN, &v);
+if (hvm_get_parameter(HVM_PARAM_CONSOLE_PFN, &v))
+BUG();
 console_ring = (struct xencons_interface *)map_frame_virt(v);
 }
 #endif
@@ -89,9 +93,7 @@ int xencons_ring_send(struct consfront_dev *dev, const char 
*data, unsigned len)
 notify_daemon(dev);
 
 return sent;
-}  
-
-
+}
 
 void console_handle_input(evtchn_port_t port, struct pt_regs *regs, void *data)
 {
@@ -177,41 +179,60 @@ int xencons_ring_recv(struct consfront_dev *dev, char 
*data, unsigned len)
 
 struct consfront_dev *xencons_ring_init(void)
 {
-   int err;
-   struct consfront_dev *dev;
+struct consfront_dev *dev;
 
-   if (!console_evtchn)
-   return 0;
+if (!console_evtchn)
+return 0;
 
-   dev = malloc(sizeof(struct consfront_dev));
-   memset(dev, 0, sizeof(struct consfront_dev));
-   dev->nodename = "device/console";
-   dev->dom = 0;
-   dev->backend = 0;
-   dev->ring_ref = 0;
+dev = malloc(sizeof(struct consfront_dev));
+memset(dev, 0, sizeof(struct consfront_dev));
+dev->nodename = "device/console";
+dev->dom = 0;
+dev->backend = 0;
+dev->ring_ref = 0;
 
 #ifdef HAVE_LIBC
-   dev->fd = -1;
+dev->fd = -1;
 #endif
-   dev->evtchn = console_evtchn;
-   dev->ring = xencons_interface();
-
-   err = bind_evtchn(dev->evtchn, console_handle_input, dev);
-   if (err <= 0) {
-   printk("XEN console request chn bind failed %i\n", err);
-free(dev);
-   return NULL;
-   }
-unmask_evtchn(dev->evtchn);
 
-   /* In case we have in-flight data after save/restore... */
-   notify_daemon(dev);
+return resume_xen_console(dev);
+}
+
+static struct consfront_dev* resume_xen_console(struct consfront_dev* dev)
+{
+int err;
 
-   return dev;
+dev->evtchn = console_evtchn;
+dev->ring = xencons_interface();
+
+err = bind_evtchn(dev->evtchn, console_handle_input, dev);
+if (err <= 0) {
+printk("XEN console request chn bind failed %i\n", err);
+free(dev);
+return NULL;
+}
+unmask_evtchn(dev->evtchn);
+
+/* In case we have in-flight data after save/restore... */
+notify_daemon(dev);
+
+return dev;
 }
 
-void xencons_resume(void)
+void xencons_ring_fini(struct consfront_dev* dev)
 {
-   (void)xencons_ring_init();
+if (dev)
+mask_evtchn(dev->evtchn);
 }
 
+void xencons_ring_resume(struct consfront_dev* dev)
+{
+if (dev) {
+#if CONFIG_PARAVIRT
+get_console(&start_info);
+#else
+

[Xen-devel] [PATCH v2 07/16] Save/Restore Support: Add unmap_shared_info

2018-02-13 Thread Bruno Alvisio
This function is necessary as part of the pre-suspend operation.

Signed-off-by: Bruno Alvisio 
Reviewed-by: Samuel Thibault 
---
Changed since v1:
 * Changed HYPERVISOR_shared_info for shared_info
---
 arch/x86/setup.c | 12 
 hypervisor.c | 12 
 include/hypervisor.h |  1 +
 3 files changed, 25 insertions(+)

diff --git a/arch/x86/setup.c b/arch/x86/setup.c
index 31fa2c6..b6e0541 100644
--- a/arch/x86/setup.c
+++ b/arch/x86/setup.c
@@ -93,6 +93,18 @@ shared_info_t *map_shared_info(void *p)
 return (shared_info_t *)shared_info;
 }
 
+void unmap_shared_info(void)
+{
+int rc;
+
+if ( (rc = HYPERVISOR_update_va_mapping((unsigned long)shared_info,
+__pte((virt_to_mfn(shared_info)<

[Xen-devel] [PATCH v2 08/16] Save/Restore Support: Add arch_mm_pre|post_suspend

2018-02-13 Thread Bruno Alvisio
For PV guests the pagetables reference the real MFNs rather than PFNs, so when
the guest is resumed into a different area of a host's memory, these will need
to be rewritten. Thus for PV guests the MFNs need to be replaced with PFNs
before suspend, a step known as canonicalization.

PVH guests are auto-translated so no memory operation is needed.
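
Since the patch leaves canonicalization as a TODO, the following is only a toy illustration of the concept, not Mini-OS code. It assumes a simplified entry format (frame number above bit 12, flags in the low 12 bits, bit 0 meaning "present") and small lookup arrays standing in for the machine-to-physical and physical-to-machine tables. All names are hypothetical, and real x86 entries carry additional high bits that this sketch ignores.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12
#define TOY_FLAGS_MASK ((UINT64_C(1) << TOY_PAGE_SHIFT) - 1)
#define TOY_PRESENT    UINT64_C(1)

/* Replace the frame number of one entry using a lookup table.
 * Non-present entries and out-of-range frames are left untouched. */
static uint64_t toy_swap_frame(uint64_t entry, const uint64_t *table,
                               size_t table_len)
{
    uint64_t frame = entry >> TOY_PAGE_SHIFT;

    if (!(entry & TOY_PRESENT) || frame >= table_len)
        return entry;
    return (table[frame] << TOY_PAGE_SHIFT) | (entry & TOY_FLAGS_MASK);
}

/* Before suspend: machine frame numbers become pseudo-physical ones. */
static void toy_canonicalize(uint64_t *entries, size_t count,
                             const uint64_t *m2p, size_t m2p_len)
{
    for (size_t i = 0; i < count; i++)
        entries[i] = toy_swap_frame(entries[i], m2p, m2p_len);
}

/* After resume: pseudo-physical frame numbers become machine ones,
 * using the mapping that is valid on the new host. */
static void toy_uncanonicalize(uint64_t *entries, size_t count,
                               const uint64_t *p2m, size_t p2m_len)
{
    for (size_t i = 0; i < count; i++)
        entries[i] = toy_swap_frame(entries[i], p2m, p2m_len);
}
```

The flags survive both passes unchanged, and only the frame number is translated. After a real restore the physical-to-machine mapping generally differs from the one in use before suspend, which is the reason the entries cannot simply be left alone.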

Signed-off-by: Bruno Alvisio 
Reviewed-by: Samuel Thibault 
---
 arch/x86/mm.c | 14 ++
 include/x86/arch_mm.h |  3 +++
 2 files changed, 17 insertions(+)

diff --git a/arch/x86/mm.c b/arch/x86/mm.c
index 05ad029..1b163ac 100644
--- a/arch/x86/mm.c
+++ b/arch/x86/mm.c
@@ -848,6 +848,20 @@ void arch_init_p2m(unsigned long max_pfn)
 
 arch_remap_p2m(max_pfn);
 }
+
+void arch_mm_pre_suspend(void)
+{
+//TODO: Canonicalize pagetables
+}
+
+void arch_mm_post_suspend(int canceled)
+{
+//TODO: Locate pagetables and 'uncanonicalize' them
+}
+#else
+void arch_mm_pre_suspend(void){ }
+
+void arch_mm_post_suspend(int canceled){ }
 #endif
 
 void arch_init_mm(unsigned long* start_pfn_p, unsigned long* max_pfn_p)
diff --git a/include/x86/arch_mm.h b/include/x86/arch_mm.h
index ab8a53e..cbbeb21 100644
--- a/include/x86/arch_mm.h
+++ b/include/x86/arch_mm.h
@@ -279,6 +279,9 @@ pgentry_t *need_pgt(unsigned long addr);
 void arch_mm_preinit(void *p);
 unsigned long alloc_virt_kernel(unsigned n_pages);
 
+void arch_mm_pre_suspend(void);
+void arch_mm_post_suspend(int canceled);
+
 #ifndef CONFIG_PARAVIRT
 void arch_print_memmap(void);
 #endif
-- 
2.3.2 (Apple Git-55)



[Xen-devel] [PATCH v2 07/17] Save/Restore Support: Add unmap_shared_info

2018-02-13 Thread Bruno Alvisio
This function is necessary as part of the pre-suspend operation.

Signed-off-by: Bruno Alvisio 
Reviewed-by: Samuel Thibault 
---
Changed since v1:
 * Changed HYPERVISOR_shared_info for shared_info
---
 arch/x86/setup.c | 12 
 hypervisor.c | 12 
 include/hypervisor.h |  1 +
 3 files changed, 25 insertions(+)

diff --git a/arch/x86/setup.c b/arch/x86/setup.c
index 31fa2c6..b6e0541 100644
--- a/arch/x86/setup.c
+++ b/arch/x86/setup.c
@@ -93,6 +93,18 @@ shared_info_t *map_shared_info(void *p)
 return (shared_info_t *)shared_info;
 }
 
+void unmap_shared_info(void)
+{
+int rc;
+
+if ( (rc = HYPERVISOR_update_va_mapping((unsigned long)shared_info,
+__pte((virt_to_mfn(shared_info)<

[Xen-devel] [PATCH v2 04/16] Save/Restore Support: Add xenbus_release_wait_for_watch

2018-02-13 Thread Bruno Alvisio
xenbus_release_wait_for_watch generates a fake event to make
xenbus_wait_for_watch return. This is necessary to wake up waiting threads.

release_xenbus_id additionally checks whether the number of live requests is 0,
so that it can wake up the xenbus suspend thread waiting for that condition.
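
The fake-event mechanism can be sketched with a simplified, single-threaded event queue. The types and helpers below are hypothetical stand-ins, not the Mini-OS definitions: there is no real wait queue here, and unlike the patch this sketch zero-initialises the dummy event so that a consumer can recognise it by its NULL path.

```c
#include <stdlib.h>

/* Simplified stand-in for a xenbus event. */
struct toy_event {
    const char *path;
    const char *token;
    struct toy_event *next;
};

typedef struct toy_event *toy_event_queue;

/* Push a dummy event so that a waiter blocked on an empty queue has
 * something to return for. Returns 0 on success, -1 on allocation
 * failure. A real implementation would also wake the wait queue. */
static int toy_release_waiters(toy_event_queue *queue)
{
    struct toy_event *event = calloc(1, sizeof(*event));

    if (!event)
        return -1;
    event->next = *queue;
    *queue = event;
    return 0;
}

/* Take the first event off the queue, or NULL if it is empty. */
static struct toy_event *toy_pop_event(toy_event_queue *queue)
{
    struct toy_event *event = *queue;

    if (event)
        *queue = event->next;
    return event;
}

/* A dummy event carries no path, so it can be told apart from a
 * genuine watch notification. */
static int toy_is_fake_event(const struct toy_event *event)
{
    return event->path == NULL;
}
```

A thread woken this way would typically check a flag (such as end_shutdown_thread in shutdown.c) before acting on the event, since the event itself carries no information.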

Signed-off-by: Bruno Alvisio 
Reviewed-by: Samuel Thibault 
---
Changed since v1:
  * Added doc for change in release_xenbus_id
---
 include/xenbus.h |  1 +
 xenbus/xenbus.c  | 10 +-
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/include/xenbus.h b/include/xenbus.h
index 12391b9..b2d5072 100644
--- a/include/xenbus.h
+++ b/include/xenbus.h
@@ -42,6 +42,7 @@ char *xenbus_unwatch_path_token(xenbus_transaction_t xbt, 
const char *path, cons
 extern struct wait_queue_head xenbus_watch_queue;
 void xenbus_wait_for_watch(xenbus_event_queue *queue);
 char **xenbus_wait_for_watch_return(xenbus_event_queue *queue);
+void xenbus_release_wait_for_watch(xenbus_event_queue *queue);
 char* xenbus_wait_for_value(const char *path, const char *value, 
xenbus_event_queue *queue);
 char *xenbus_wait_for_state_change(const char* path, XenbusState *state, 
xenbus_event_queue *queue);
 char *xenbus_switch_state(xenbus_transaction_t xbt, const char* path, 
XenbusState state);
diff --git a/xenbus/xenbus.c b/xenbus/xenbus.c
index 636786c..c2d2bd1 100644
--- a/xenbus/xenbus.c
+++ b/xenbus/xenbus.c
@@ -129,6 +129,14 @@ void xenbus_wait_for_watch(xenbus_event_queue *queue)
 printk("unexpected path returned by watch\n");
 }
 
+void xenbus_release_wait_for_watch(xenbus_event_queue *queue)
+{
+struct xenbus_event *event = malloc(sizeof(*event));
+event->next = *queue;
+*queue = event;
+wake_up(&xenbus_watch_queue);
+}
+
 char* xenbus_wait_for_value(const char* path, const char* value, 
xenbus_event_queue *queue)
 {
 if (!queue)
@@ -318,7 +326,7 @@ static void release_xenbus_id(int id)
 req_info[id].in_use = 0;
 nr_live_reqs--;
 req_info[id].in_use = 0;
-if (nr_live_reqs == NR_REQS - 1)
+if (nr_live_reqs == 0 || nr_live_reqs == NR_REQS - 1)
 wake_up(_wq);
 spin_unlock(_lock);
 }
-- 
2.3.2 (Apple Git-55)



[Xen-devel] [PATCH v2 11/17] Save/Restore Support: Add suspend/restore support for console

2018-02-13 Thread Bruno Alvisio
Signed-off-by: Bruno Alvisio 
Reviewed-by: Samuel Thibault 
---
 console/console.c  | 15 -
 console/xenbus.c   |  3 +-
 console/xencons_ring.c | 83 +++---
 include/console.h  |  6 +++-
 kernel.c   |  4 +++
 lib/sys.c  |  2 +-
 6 files changed, 77 insertions(+), 36 deletions(-)

diff --git a/console/console.c b/console/console.c
index 2e04552..9814506 100644
--- a/console/console.c
+++ b/console/console.c
@@ -52,6 +52,7 @@
 
 /* If console not initialised the printk will be sent to xen serial line 
NOTE: you need to enable verbose in xen/Rules.mk for it to work. */
+static struct consfront_dev* xen_console = NULL;
 static int console_initialised = 0;
 
 __attribute__((weak)) void console_input(char * buf, unsigned len)
@@ -162,8 +163,20 @@ void xprintk(const char *fmt, ...)
 void init_console(void)
 {   
 printk("Initialising console ... ");
-xencons_ring_init();
+xen_console = xencons_ring_init();
 console_initialised = 1;
 /* This is also required to notify the daemon */
 printk("done.\n");
 }
+
+void suspend_console(void)
+{
+console_initialised = 0;
+xencons_ring_fini(xen_console);
+}
+
+void resume_console(void)
+{
+xencons_ring_resume(xen_console);
+console_initialised = 1;
+}
\ No newline at end of file
diff --git a/console/xenbus.c b/console/xenbus.c
index 1c9a590..654b469 100644
--- a/console/xenbus.c
+++ b/console/xenbus.c
@@ -188,8 +188,7 @@ error:
 return NULL;
 }
 
-void fini_console(struct consfront_dev *dev)
+void fini_consfront(struct consfront_dev *dev)
 {
 if (dev) free_consfront(dev);
 }
-
diff --git a/console/xencons_ring.c b/console/xencons_ring.c
index dd64a41..b6db74e 100644
--- a/console/xencons_ring.c
+++ b/console/xencons_ring.c
@@ -19,6 +19,8 @@ DECLARE_WAIT_QUEUE_HEAD(console_queue);
 static struct xencons_interface *console_ring;
 uint32_t console_evtchn;
 
+static struct consfront_dev* resume_xen_console(struct consfront_dev* dev);
+
 #ifdef CONFIG_PARAVIRT
 void get_console(void *p)
 {
@@ -32,10 +34,12 @@ void get_console(void *p)
 {
 uint64_t v = -1;
 
-hvm_get_parameter(HVM_PARAM_CONSOLE_EVTCHN, &v);
+if (hvm_get_parameter(HVM_PARAM_CONSOLE_EVTCHN, &v))
+BUG();
 console_evtchn = v;
 
-hvm_get_parameter(HVM_PARAM_CONSOLE_PFN, &v);
+if (hvm_get_parameter(HVM_PARAM_CONSOLE_PFN, &v))
+BUG();
 console_ring = (struct xencons_interface *)map_frame_virt(v);
 }
 #endif
@@ -89,9 +93,7 @@ int xencons_ring_send(struct consfront_dev *dev, const char 
*data, unsigned len)
 notify_daemon(dev);
 
 return sent;
-}  
-
-
+}
 
 void console_handle_input(evtchn_port_t port, struct pt_regs *regs, void *data)
 {
@@ -177,41 +179,60 @@ int xencons_ring_recv(struct consfront_dev *dev, char 
*data, unsigned len)
 
 struct consfront_dev *xencons_ring_init(void)
 {
-   int err;
-   struct consfront_dev *dev;
+struct consfront_dev *dev;
 
-   if (!console_evtchn)
-   return 0;
+if (!console_evtchn)
+return 0;
 
-   dev = malloc(sizeof(struct consfront_dev));
-   memset(dev, 0, sizeof(struct consfront_dev));
-   dev->nodename = "device/console";
-   dev->dom = 0;
-   dev->backend = 0;
-   dev->ring_ref = 0;
+dev = malloc(sizeof(struct consfront_dev));
+memset(dev, 0, sizeof(struct consfront_dev));
+dev->nodename = "device/console";
+dev->dom = 0;
+dev->backend = 0;
+dev->ring_ref = 0;
 
 #ifdef HAVE_LIBC
-   dev->fd = -1;
+dev->fd = -1;
 #endif
-   dev->evtchn = console_evtchn;
-   dev->ring = xencons_interface();
-
-   err = bind_evtchn(dev->evtchn, console_handle_input, dev);
-   if (err <= 0) {
-   printk("XEN console request chn bind failed %i\n", err);
-free(dev);
-   return NULL;
-   }
-unmask_evtchn(dev->evtchn);
 
-   /* In case we have in-flight data after save/restore... */
-   notify_daemon(dev);
+return resume_xen_console(dev);
+}
+
+static struct consfront_dev* resume_xen_console(struct consfront_dev* dev)
+{
+int err;
 
-   return dev;
+dev->evtchn = console_evtchn;
+dev->ring = xencons_interface();
+
+err = bind_evtchn(dev->evtchn, console_handle_input, dev);
+if (err <= 0) {
+printk("XEN console request chn bind failed %i\n", err);
+free(dev);
+return NULL;
+}
+unmask_evtchn(dev->evtchn);
+
+/* In case we have in-flight data after save/restore... */
+notify_daemon(dev);
+
+return dev;
 }
 
-void xencons_resume(void)
+void xencons_ring_fini(struct consfront_dev* dev)
 {
-   (void)xencons_ring_init();
+if (dev)
+mask_evtchn(dev->evtchn);
 }
 
+void xencons_ring_resume(struct consfront_dev* dev)
+{
+if (dev) {
+#if CONFIG_PARAVIRT
+get_console(&start_info);
+#else
+

[Xen-devel] [PATCH v2 08/17] Save/Restore Support: Add arch_mm_pre|post_suspend

2018-02-13 Thread Bruno Alvisio
For PV guests the pagetables reference the real MFNs rather than PFNs, so when
the guest is resumed into a different area of a host's memory, these will need
to be rewritten. Thus for PV guests the MFNs need to be replaced with PFNs
before suspend, a step known as canonicalization.

PVH guests are auto-translated so no memory operation is needed.
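
[Illustration only, not part of the patch.] The patch leaves the actual
page table rewrite as a TODO. A minimal sketch of the idea, assuming a flat
lookup table indexed by frame number and a 64-bit entry with the frame number
in bits 12..51; PTE_FRAME_MASK, pte_translate() and the two wrappers are
hypothetical names, not Mini-OS APIs:

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT      12
/* Assumed layout: frame number in bits 12..51, flag bits elsewhere. */
#define PTE_FRAME_MASK  0x000ffffffffff000ULL

/* Swap the frame number of one entry via a lookup table, keeping its flags. */
static uint64_t pte_translate(uint64_t pte, const uint64_t *table)
{
    uint64_t frame = (pte & PTE_FRAME_MASK) >> PAGE_SHIFT;
    uint64_t flags = pte & ~PTE_FRAME_MASK;

    return (table[frame] << PAGE_SHIFT) | flags;
}

/* Before suspend: MFN -> PFN (canonicalize). */
static uint64_t pte_canonicalize(uint64_t pte, const uint64_t *m2p)
{
    return pte_translate(pte, m2p);
}

/* After resume: PFN -> MFN on the new host (uncanonicalize). */
static uint64_t pte_uncanonicalize(uint64_t pte, const uint64_t *p2m)
{
    return pte_translate(pte, p2m);
}
```

The flag bits survive both directions unchanged; only the frame number is
swapped, which is why the same helper can serve both the pre-suspend and the
post-suspend pass in this sketch.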

Signed-off-by: Bruno Alvisio 
Reviewed-by: Samuel Thibault 
---
 arch/x86/mm.c | 14 ++
 include/x86/arch_mm.h |  3 +++
 2 files changed, 17 insertions(+)

diff --git a/arch/x86/mm.c b/arch/x86/mm.c
index 05ad029..1b163ac 100644
--- a/arch/x86/mm.c
+++ b/arch/x86/mm.c
@@ -848,6 +848,20 @@ void arch_init_p2m(unsigned long max_pfn)
 
 arch_remap_p2m(max_pfn);
 }
+
+void arch_mm_pre_suspend(void)
+{
+//TODO: Canonicalize pagetables
+}
+
+void arch_mm_post_suspend(int canceled)
+{
+//TODO: Locate pagetables and 'uncanonicalize' them
+}
+#else
+void arch_mm_pre_suspend(void){ }
+
+void arch_mm_post_suspend(int canceled){ }
 #endif
 
 void arch_init_mm(unsigned long* start_pfn_p, unsigned long* max_pfn_p)
diff --git a/include/x86/arch_mm.h b/include/x86/arch_mm.h
index ab8a53e..cbbeb21 100644
--- a/include/x86/arch_mm.h
+++ b/include/x86/arch_mm.h
@@ -279,6 +279,9 @@ pgentry_t *need_pgt(unsigned long addr);
 void arch_mm_preinit(void *p);
 unsigned long alloc_virt_kernel(unsigned n_pages);
 
+void arch_mm_pre_suspend(void);
+void arch_mm_post_suspend(int canceled);
+
 #ifndef CONFIG_PARAVIRT
 void arch_print_memmap(void);
 #endif
-- 
2.3.2 (Apple Git-55)



[Xen-devel] [PATCH v2 04/17] Save/Restore Support: Add xenbus_release_wait_for_watch

2018-02-13 Thread Bruno Alvisio
xenbus_release_wait_for_watch generates a fake event to make
xenbus_wait_for_watch return. This is necessary to wake up waiting threads.

release_xenbus_id additionally checks if the number of requests == 0 to wake
up the 'waiting' suspend xenbus thread.
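
[Illustration only, not part of the patch.] A minimal sketch of the
fake-event technique, using simplified stand-in types rather than the real
Mini-OS definitions. The is_fake flag, the allocation check and all function
names are assumptions added for the example; the patch itself pushes an
uninitialized event and does not check malloc():

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-ins for the Mini-OS types; not the real definitions. */
struct event {
    struct event *next;
    int is_fake;            /* hypothetical flag, for illustration only */
};
typedef struct event *event_queue;

/* The condition a waiter would sleep on: is anything queued at all? */
static int queue_has_event(const event_queue *queue)
{
    return *queue != NULL;
}

/* Push a dummy event so that a sleeping waiter's condition becomes true. */
static int release_waiter(event_queue *queue)
{
    struct event *ev = calloc(1, sizeof(*ev));

    if (!ev)
        return -1;
    ev->is_fake = 1;
    ev->next = *queue;
    *queue = ev;
    return 0;               /* a real implementation would wake_up() here */
}

/* Pop the head event; the caller owns and frees the result. */
static struct event *pop_event(event_queue *queue)
{
    struct event *ev = *queue;

    if (ev)
        *queue = ev->next;
    return ev;
}
```

Marking the event as fake lets the woken thread tell a release request apart
from a genuine watch event; whether the real code needs such a marker depends
on how the waiter inspects the event it dequeues.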

Signed-off-by: Bruno Alvisio 
Reviewed-by: Samuel Thibault 
---
Changed since v1:
  * Added doc for change in release_xenbus_id
---
 include/xenbus.h |  1 +
 xenbus/xenbus.c  | 10 +-
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/include/xenbus.h b/include/xenbus.h
index 12391b9..b2d5072 100644
--- a/include/xenbus.h
+++ b/include/xenbus.h
@@ -42,6 +42,7 @@ char *xenbus_unwatch_path_token(xenbus_transaction_t xbt, 
const char *path, cons
 extern struct wait_queue_head xenbus_watch_queue;
 void xenbus_wait_for_watch(xenbus_event_queue *queue);
 char **xenbus_wait_for_watch_return(xenbus_event_queue *queue);
+void xenbus_release_wait_for_watch(xenbus_event_queue *queue);
 char* xenbus_wait_for_value(const char *path, const char *value, 
xenbus_event_queue *queue);
 char *xenbus_wait_for_state_change(const char* path, XenbusState *state, 
xenbus_event_queue *queue);
 char *xenbus_switch_state(xenbus_transaction_t xbt, const char* path, 
XenbusState state);
diff --git a/xenbus/xenbus.c b/xenbus/xenbus.c
index 636786c..c2d2bd1 100644
--- a/xenbus/xenbus.c
+++ b/xenbus/xenbus.c
@@ -129,6 +129,14 @@ void xenbus_wait_for_watch(xenbus_event_queue *queue)
 printk("unexpected path returned by watch\n");
 }
 
+void xenbus_release_wait_for_watch(xenbus_event_queue *queue)
+{
+struct xenbus_event *event = malloc(sizeof(*event));
+event->next = *queue;
+*queue = event;
+wake_up(&xenbus_watch_queue);
+}
+
 char* xenbus_wait_for_value(const char* path, const char* value, 
xenbus_event_queue *queue)
 {
 if (!queue)
@@ -318,7 +326,7 @@ static void release_xenbus_id(int id)
 req_info[id].in_use = 0;
 nr_live_reqs--;
 req_info[id].in_use = 0;
-if (nr_live_reqs == NR_REQS - 1)
+if (nr_live_reqs == 0 || nr_live_reqs == NR_REQS - 1)
 wake_up(&req_wq);
 spin_unlock(&req_lock);
 }
-- 
2.3.2 (Apple Git-55)



[Xen-devel] [xen-unstable-smoke test] 119124: regressions - FAIL

2018-02-13 Thread osstest service owner
flight 119124 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/119124/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64                               6 xen-build                  fail  REGR. vs. 119098

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-debianhvm-i386  1 build-check(1)             blocked  n/a
 test-amd64-amd64-libvirt                  1 build-check(1)             blocked  n/a
 build-amd64-libvirt                       1 build-check(1)             blocked  n/a
 test-arm64-arm64-xl-xsm                  13 migrate-support-check      fail  never pass
 test-arm64-arm64-xl-xsm                  14 saverestore-support-check  fail  never pass
 test-armhf-armhf-xl                      13 migrate-support-check      fail  never pass
 test-armhf-armhf-xl                      14 saverestore-support-check  fail  never pass

version targeted for testing:
 xen  e139d34a1c4b7775d5855458a325e0e4176bdf7e
baseline version:
 xen  3f491d6873be9caa77f02ad8d98f174f0152b819

Last test of basis   119098  2018-02-13 17:01:30 Z    0 days
Testing same since   119108  2018-02-13 20:01:33 Z    0 days    3 attempts


People who touched revisions under test:
  Jan Beulich 

jobs:
 build-arm64-xsm  pass
 build-amd64  fail
 build-armhf  pass
 build-amd64-libvirt  blocked 
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-i386 blocked 
 test-amd64-amd64-libvirt blocked 



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.


commit e139d34a1c4b7775d5855458a325e0e4176bdf7e
Author: Jan Beulich 
Date:   Tue Feb 13 18:19:33 2018 +0100

firmware/shim: correctly handle errors during Xen tree setup

"set -e" on a separate Makefile line is meaningless. Glue together all
the lines that this is supposed to cover.

Signed-off-by: Jan Beulich 
Reviewed-by: Roger Pau Monné 
Reviewed-by: Wei Liu 
(qemu changes not included)


[Xen-devel] [PATCH v2 00/16] Save/Restore Support for mini-OS PVH

2018-02-13 Thread Bruno Alvisio
Hi all,

I am sending the second revision for supporting save/restore in Mini-OS PVH.
The branch can be found at:

https://github.com/balvisio/mini-os/tree/feature/mini-os-suspend-support-submission-2

Feedback would be greatly appreciated.

Cheers,

Bruno

Signed-off-by: Bruno Alvisio 


[Xen-devel] [PATCH v2 10/17] Save/Restore Support: Add suspend/resume support for timers

2018-02-13 Thread Bruno Alvisio
Signed-off-by: Bruno Alvisio 
---
Changed since v1:
   * Removed resume/suspend_time() and used init/fini_time() instead
---
 arch/x86/time.c | 1 -
 kernel.c| 4 
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/x86/time.c b/arch/x86/time.c
index 3658142..8077c80 100644
--- a/arch/x86/time.c
+++ b/arch/x86/time.c
@@ -233,7 +233,6 @@ static void timer_handler(evtchn_port_t ev, struct pt_regs 
*regs, void *ign)
 static evtchn_port_t port;
 void init_time(void)
 {
-printk("Initialising timer interface\n");
 port = bind_virq(VIRQ_TIMER, &timer_handler, NULL);
 unmask_evtchn(port);
 }
diff --git a/kernel.c b/kernel.c
index 782eb79..3564af3 100644
--- a/kernel.c
+++ b/kernel.c
@@ -120,10 +120,14 @@ void start_kernel(void* par)
 void pre_suspend(void)
 {
 local_irq_disable();
+
+fini_time();
 }
 
 void post_suspend(int canceled)
 {
+init_time();
+
 local_irq_enable();
 }
 
-- 
2.3.2 (Apple Git-55)



[Xen-devel] [PATCH v2 06/17] Save/Restore Support: Moved shutdown thread to shutdown.c

2018-02-13 Thread Bruno Alvisio
The shutdown thread present in kernel.c was removed and now the thread in
shutdown.c is created instead.

Signed-off-by: Bruno Alvisio 
Reviewed-by: Samuel Thibault 
---
 arch/x86/setup.c |  2 +-
 include/kernel.h |  2 +-
 kernel.c | 50 ++
 3 files changed, 8 insertions(+), 46 deletions(-)

diff --git a/arch/x86/setup.c b/arch/x86/setup.c
index 3dd86f9..31fa2c6 100644
--- a/arch/x86/setup.c
+++ b/arch/x86/setup.c
@@ -201,7 +201,7 @@ arch_init(void *par)
memcpy(&start_info, par, sizeof(start_info));
 #endif
 
-   start_kernel();
+   start_kernel((start_info_t *)par);
 }
 
 void arch_pre_suspend(void)
diff --git a/include/kernel.h b/include/kernel.h
index 161d757..742abf5 100644
--- a/include/kernel.h
+++ b/include/kernel.h
@@ -4,7 +4,7 @@
 #define MAX_CMDLINE_SIZE 1024
 extern char cmdline[MAX_CMDLINE_SIZE];
 
-void start_kernel(void);
+void start_kernel(void* par);
 void pre_suspend(void);
 void post_suspend(int canceled);
 void do_exit(void) __attribute__((noreturn));
diff --git a/kernel.c b/kernel.c
index 90c865a..1cd40e8 100644
--- a/kernel.c
+++ b/kernel.c
@@ -42,6 +42,9 @@
 #include 
 #include 
 #include 
+#ifdef CONFIG_XENBUS
+#include 
+#endif
 #include 
 #include 
 #include 
@@ -66,48 +69,6 @@ void setup_xen_features(void)
 }
 }
 
-#ifdef CONFIG_XENBUS
-/* This should be overridden by the application we are linked against. */
-__attribute__((weak)) void app_shutdown(unsigned reason)
-{
-struct sched_shutdown sched_shutdown = { .reason = reason };
-printk("Shutdown requested: %d\n", reason);
-HYPERVISOR_sched_op(SCHEDOP_shutdown, &sched_shutdown);
-}
-
-static void shutdown_thread(void *p)
-{
-const char *path = "control/shutdown";
-const char *token = path;
-xenbus_event_queue events = NULL;
-char *shutdown = NULL, *err;
-unsigned int shutdown_reason;
-xenbus_watch_path_token(XBT_NIL, path, token, &events);
-while ((err = xenbus_read(XBT_NIL, path, &shutdown)) != NULL || !strcmp(shutdown, ""))
-{
-free(err);
-free(shutdown);
-shutdown = NULL;
-xenbus_wait_for_watch(&events);
-}
-err = xenbus_unwatch_path_token(XBT_NIL, path, token);
-free(err);
-err = xenbus_write(XBT_NIL, path, "");
-free(err);
-printk("Shutting down (%s)\n", shutdown);
-
-if (!strcmp(shutdown, "poweroff"))
-shutdown_reason = SHUTDOWN_poweroff;
-else if (!strcmp(shutdown, "reboot"))
-shutdown_reason = SHUTDOWN_reboot;
-else
-/* Unknown */
-shutdown_reason = SHUTDOWN_crash;
-app_shutdown(shutdown_reason);
-free(shutdown);
-}
-#endif
-
 
 /* This should be overridden by the application we are linked against. */
 __attribute__((weak)) int app_main(void *p)
@@ -116,7 +77,7 @@ __attribute__((weak)) int app_main(void *p)
 return 0;
 }
 
-void start_kernel(void)
+void start_kernel(void* par)
 {
 /* Set up events. */
 init_events();
@@ -145,7 +106,8 @@ void start_kernel(void)
 init_xenbus();
 
 #ifdef CONFIG_XENBUS
-create_thread("shutdown", shutdown_thread, NULL);
+/* Init shutdown thread */
+init_shutdown((start_info_t *)par);
 #endif
 
 /* Call (possibly overridden) app_main() */
-- 
2.3.2 (Apple Git-55)



[Xen-devel] [PATCH v2 05/17] Save/Restore Support: Add kernel shutdown logic to shutdown.c

2018-02-13 Thread Bruno Alvisio
Created shutdown.c for the shutdown thread and all the shutdown related
functions.

Signed-off-by: Bruno Alvisio 
---
Changed since v1:
   * Updated license to a BSD 3-clause. This license was taken
from the updated original file. (Repo: sysml/mini-os)
---
 Makefile   |   1 +
 include/shutdown.h |  11 
 shutdown.c | 188 +
 3 files changed, 200 insertions(+)
 create mode 100644 include/shutdown.h
 create mode 100644 shutdown.c

diff --git a/Makefile b/Makefile
index 88315c4..6a05de6 100644
--- a/Makefile
+++ b/Makefile
@@ -53,6 +53,7 @@ src-y += mm.c
 src-$(CONFIG_NETFRONT) += netfront.c
 src-$(CONFIG_PCIFRONT) += pcifront.c
 src-y += sched.c
+src-y += shutdown.c
 src-$(CONFIG_TEST) += test.c
 src-$(CONFIG_BALLOON) += balloon.c
 
diff --git a/include/shutdown.h b/include/shutdown.h
new file mode 100644
index 000..a5ec019
--- /dev/null
+++ b/include/shutdown.h
@@ -0,0 +1,11 @@
+#ifndef _SHUTDOWN_H_
+#define _SHUTDOWN_H_
+
+#include 
+
+void init_shutdown(start_info_t *si);
+
+void kernel_shutdown(int reason) __attribute__((noreturn));
+void kernel_suspend(void);
+
+#endif
diff --git a/shutdown.c b/shutdown.c
new file mode 100644
index 000..aba146e
--- /dev/null
+++ b/shutdown.c
@@ -0,0 +1,188 @@
+/*
+ *  MiniOS
+ *
+ *   file: fromdevice.cc
+ *
+ * Authors: Joao Martins 
+ *
+ *
+ * Copyright (c) 2014, NEC Europe Ltd., NEC Corporation. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright
+ *notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *notice, this list of conditions and the following disclaimer in the
+ *documentation and/or other materials provided with the distribution.
+ * 3. Neither the name of the copyright holder nor the names of its
+ *contributors may be used to endorse or promote products derived from
+ *this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+ * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+ * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ * THIS HEADER MAY NOT BE EXTRACTED OR MODIFIED IN ANY WAY.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+
+static start_info_t *start_info_ptr;
+
+static const char *path = "control/shutdown";
+static const char *token = "control/shutdown";
+static xenbus_event_queue events = NULL;
+static int end_shutdown_thread = 0;
+
+#ifdef CONFIG_XENBUS
+/* This should be overridden by the application we are linked against. */
+__attribute__((weak)) void app_shutdown(unsigned reason)
+{
+printk("Shutdown requested: %d\n", reason);
+if (reason == SHUTDOWN_suspend) {
+kernel_suspend();
+} else {
+struct sched_shutdown sched_shutdown = { .reason = reason };
+HYPERVISOR_sched_op(SCHEDOP_shutdown, &sched_shutdown);
+}
+}
+
+static void shutdown_thread(void *p)
+{
+char *shutdown, *err;
+unsigned int shutdown_reason;
+
+xenbus_watch_path_token(XBT_NIL, path, token, &events);
+
+for ( ;; ) {
+xenbus_wait_for_watch(&events);
+if ((err = xenbus_read(XBT_NIL, path, &shutdown))) {
+free(err);
+do_exit();
+}
+
+if (end_shutdown_thread)
+break;
+
+if (!strcmp(shutdown, "")) {
+/* Avoid spurious event on xenbus */
+/* FIXME: investigate the reason of the spurious event */
+free(shutdown);
+continue;
+} else if (!strcmp(shutdown, "poweroff")) {
+shutdown_reason = SHUTDOWN_poweroff;
+} else if (!strcmp(shutdown, "reboot")) {
+shutdown_reason = SHUTDOWN_reboot;
+} else if (!strcmp(shutdown, "suspend")) {
+shutdown_reason = SHUTDOWN_suspend;
+} else {
+shutdown_reason = SHUTDOWN_crash;
+}
+free(shutdown);
+
+/* Acknowledge shutdown request */
+if ((err = xenbus_write(XBT_NIL, path, ""))) {
+free(err);
+

[Xen-devel] [PATCH v2 03/17] Save/Restore Support: Declare kernel and arch pre/post suspend functions

2018-02-13 Thread Bruno Alvisio
For mini-OS to support suspend and restore, the kernel will have to suspend
different modules such as xenbus, console, irq, etc. During save/restore the
kernel and arch pre_suspend and post_suspend functions will be invoked to
suspend/resume each of the modules.
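
[Illustration only, not part of the patch.] A minimal sketch of how the four
hooks might be sequenced around a suspend request. The ordering shown (kernel
hooks outermost, arch hooks innermost) is an assumption, as are call_log,
log_call(), fake_suspend_hypercall() and suspend_sequence(); the hook bodies
here only record that they ran:

```c
#include <assert.h>
#include <string.h>

/* Records the order in which the hooks run, for illustration only. */
static char call_log[64];

static void log_call(const char *name)
{
    strncat(call_log, name, sizeof(call_log) - strlen(call_log) - 1);
}

static void pre_suspend(void)               { log_call("K-pre "); }
static void arch_pre_suspend(void)          { log_call("A-pre "); }
static void arch_post_suspend(int canceled) { (void)canceled; log_call("A-post "); }
static void post_suspend(int canceled)      { (void)canceled; log_call("K-post "); }

/* Hypothetical stand-in for the suspend hypercall; returns "canceled". */
static int fake_suspend_hypercall(void)
{
    log_call("suspend ");
    return 0;
}

/* Assumed ordering: kernel first on the way down, last on the way up. */
static int suspend_sequence(void)
{
    int canceled;

    pre_suspend();
    arch_pre_suspend();
    canceled = fake_suspend_hypercall();
    arch_post_suspend(canceled);
    post_suspend(canceled);
    return canceled;
}
```

Passing the canceled flag to both post hooks lets each module decide whether
it is resuming on the same host (a canceled suspend) or must re-establish
state such as event channels after a real restore.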

Signed-off-by: Bruno Alvisio 
Reviewed-by: Samuel Thibault 
---
 arch/x86/setup.c | 10 ++
 include/kernel.h |  2 ++
 include/x86/os.h |  4 ++--
 kernel.c | 10 ++
 4 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/arch/x86/setup.c b/arch/x86/setup.c
index 5278227..3dd86f9 100644
--- a/arch/x86/setup.c
+++ b/arch/x86/setup.c
@@ -204,6 +204,16 @@ arch_init(void *par)
start_kernel();
 }
 
+void arch_pre_suspend(void)
+{
+
+}
+
+void arch_post_suspend(int canceled)
+{
+
+}
+
 void
 arch_fini(void)
 {
diff --git a/include/kernel.h b/include/kernel.h
index d37ddda..161d757 100644
--- a/include/kernel.h
+++ b/include/kernel.h
@@ -5,6 +5,8 @@
 extern char cmdline[MAX_CMDLINE_SIZE];
 
 void start_kernel(void);
+void pre_suspend(void);
+void post_suspend(int canceled);
 void do_exit(void) __attribute__((noreturn));
 void arch_do_exit(void);
 void stop_kernel(void);
diff --git a/include/x86/os.h b/include/x86/os.h
index d155914..a73b63e 100644
--- a/include/x86/os.h
+++ b/include/x86/os.h
@@ -71,10 +71,10 @@ void trap_fini(void);
 void xen_callback_vector(void);
 #endif
 
+void arch_pre_suspend(void);
+void arch_post_suspend(int canceled);
 void arch_fini(void);
 
-
-
 #ifdef CONFIG_PARAVIRT
 
 /* 
diff --git a/kernel.c b/kernel.c
index 0d84a9b..90c865a 100644
--- a/kernel.c
+++ b/kernel.c
@@ -155,6 +155,16 @@ void start_kernel(void)
 run_idle_thread();
 }
 
+void pre_suspend(void)
+{
+
+}
+
+void post_suspend(int canceled)
+{
+
+}
+
 void stop_kernel(void)
 {
 /* TODO: fs import */
-- 
2.3.2 (Apple Git-55)



[Xen-devel] [PATCH v2 09/17] Save/Restore Support: Disable/enable IRQs during suspend/restore

2018-02-13 Thread Bruno Alvisio
Signed-off-by: Bruno Alvisio 
Reviewed-by: Samuel Thibault 
---
 kernel.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel.c b/kernel.c
index 1cd40e8..782eb79 100644
--- a/kernel.c
+++ b/kernel.c
@@ -119,12 +119,12 @@ void start_kernel(void* par)
 
 void pre_suspend(void)
 {
-
+local_irq_disable();
 }
 
 void post_suspend(int canceled)
 {
-
+local_irq_enable();
 }
 
 void stop_kernel(void)
-- 
2.3.2 (Apple Git-55)



[Xen-devel] [PATCH v2 02/17] Save/Restore Support: Refactor trap_init() and setup vector callbacks

2018-02-13 Thread Bruno Alvisio
Currently the setup of the IDT and the request to set the HVM vector callbacks
are both performed in the trap_init function.

As part of the post-suspend operation, the HVM vector callback needs to be set up
again while the IDT does not. Thus, the trap_init function is split into two
separate functions: trap_init (sets up IDT) and xen_callback_vector (sets the
HVM vector callback). During the post-suspend operations the xen_callback_vector
function will be invoked.
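
[Illustration only, not part of the patch.] The value passed for
HVM_PARAM_CALLBACK_IRQ in the diff below, (2ULL << 56) | TRAP_xen_callback,
packs a delivery type into the top byte and the vector number into the low
byte; type 2 requests delivery through an IDT vector. A minimal sketch of that
encoding, with hypothetical macro and helper names:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; the patch open-codes these constants. */
#define CALLBACK_VIA_TYPE_SHIFT  56
#define CALLBACK_VIA_TYPE_VECTOR 2ULL

/* Build the parameter value that asks for delivery on a given IDT vector. */
static uint64_t callback_via_vector(uint8_t vector)
{
    return (CALLBACK_VIA_TYPE_VECTOR << CALLBACK_VIA_TYPE_SHIFT) | vector;
}

/* Extract the delivery type (bits 63:56) from a parameter value. */
static unsigned int callback_via_type(uint64_t value)
{
    return (unsigned int)(value >> CALLBACK_VIA_TYPE_SHIFT);
}

/* Extract the vector number (bits 7:0) from a parameter value. */
static uint8_t callback_via_get_vector(uint64_t value)
{
    return (uint8_t)(value & 0xff);
}
```

Because the parameter lives in the hypervisor's per-domain state rather than
in guest memory, it has to be requested again after a restore, whereas the IDT
is part of the saved guest image.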

Signed-off-by: Bruno Alvisio 
Reviewed-by: Samuel Thibault 
---
 arch/x86/traps.c | 17 +++--
 include/x86/os.h |  3 +++
 2 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/arch/x86/traps.c b/arch/x86/traps.c
index aa17da3..a7388a5 100644
--- a/arch/x86/traps.c
+++ b/arch/x86/traps.c
@@ -389,6 +389,16 @@ static void setup_gate(unsigned int entry, void *addr, 
unsigned int dpl)
 #endif
 }
 
+void xen_callback_vector(void)
+{
+if (hvm_set_parameter(HVM_PARAM_CALLBACK_IRQ,
+ (2ULL << 56) | TRAP_xen_callback))
+{
+xprintk("Request for Xen HVM callback vector failed\n");
+do_exit();
+}
+}
+
 void trap_init(void)
 {
 setup_gate(TRAP_divide_error, &divide_error, 0);
@@ -415,12 +425,7 @@ void trap_init(void)
 gdt[GDTE_TSS] = (typeof(*gdt))INIT_GDTE((unsigned long)&tss, 0x67, 0x89);
 asm volatile ("ltr %w0" :: "rm" (GDTE_TSS * 8));
 
-if ( hvm_set_parameter(HVM_PARAM_CALLBACK_IRQ,
-   (2ULL << 56) | TRAP_xen_callback) )
-{
-xprintk("Request for Xen HVM callback vector failed\n");
-do_exit();
-}
+xen_callback_vector();
 }
 
 void trap_fini(void)
diff --git a/include/x86/os.h b/include/x86/os.h
index fbc2eeb..d155914 100644
--- a/include/x86/os.h
+++ b/include/x86/os.h
@@ -67,6 +67,9 @@ extern shared_info_t *HYPERVISOR_shared_info;
 
 void trap_init(void);
 void trap_fini(void);
+#ifndef CONFIG_PARAVIRT
+void xen_callback_vector(void);
+#endif
 
 void arch_fini(void);
 
-- 
2.3.2 (Apple Git-55)



[Xen-devel] [RFC PATCH v2 8/9] hyper_dmabuf: event-polling mechanism for detecting a new hyper_DMABUF

2018-02-13 Thread Dongwon Kim
New polling-based method for an importing VM to learn about a new
hyper_DMABUF exported to it.

For this, userspace can now poll the device node to check whether
there is a new event, which is created whenever a new hyper_DMABUF
becomes available in the importing VM (that is, has just been exported).

A poll function call was added to the device driver interface for this
new functionality. Event generation was also implemented in
all other relevant parts of the driver.

This "event-polling" mechanism is an optional feature and can be enabled
by setting a Kernel config option, "HYPER_DMABUF_EVENT_GEN".
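
[Illustration only, not part of the patch.] A minimal sketch of the userspace
side, assuming the caller has already opened the driver's device node (its
path is not given here) and simply wants to know whether an event can be read.
wait_for_event() is a hypothetical helper built on the standard poll(2) call:

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/*
 * Wait up to timeout_ms for fd to become readable.
 * Returns 1 if an event can be read, 0 on timeout, -1 on error.
 */
static int wait_for_event(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int ret = poll(&pfd, 1, timeout_ms);

    if (ret < 0)
        return -1;
    if (ret == 0)
        return 0;
    /* Anything other than readable data (e.g. POLLERR) is treated as error. */
    return (pfd.revents & POLLIN) ? 1 : -1;
}
```

A caller would follow a return value of 1 with read() on the same descriptor
to retrieve the queued event, matching the POLLIN | POLLRDNORM mask the driver
reports when its event list is non-empty.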

Signed-off-by: Dongwon Kim 
Signed-off-by: Mateusz Polrola 
---
 drivers/dma-buf/hyper_dmabuf/Kconfig  |  20 +++
 drivers/dma-buf/hyper_dmabuf/Makefile |   1 +
 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_drv.c   | 146 ++
 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_drv.h   |  11 ++
 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_event.c | 122 ++
 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_event.h |  38 ++
 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_list.c  |   1 +
 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_msg.c   |  11 ++
 include/uapi/linux/hyper_dmabuf.h |  11 ++
 9 files changed, 361 insertions(+)
 create mode 100644 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_event.c
 create mode 100644 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_event.h

diff --git a/drivers/dma-buf/hyper_dmabuf/Kconfig 
b/drivers/dma-buf/hyper_dmabuf/Kconfig
index 68f3d6ce2c1f..92510731af25 100644
--- a/drivers/dma-buf/hyper_dmabuf/Kconfig
+++ b/drivers/dma-buf/hyper_dmabuf/Kconfig
@@ -20,6 +20,16 @@ config HYPER_DMABUF_SYSFS
 
  The location of sysfs is under ""
 
+config HYPER_DMABUF_EVENT_GEN
+bool "Enable event-generation and polling operation"
+default n
+depends on HYPER_DMABUF
+help
+  With this config enabled, hyper_dmabuf driver on the importer side
+  generates events and queue those up in the event list whenever a new
+  shared DMA-BUF is available. Events in the list can be retrieved by
+  read operation.
+
 config HYPER_DMABUF_XEN
 bool "Configure hyper_dmabuf for XEN hypervisor"
 default y
@@ -27,4 +37,14 @@ config HYPER_DMABUF_XEN
 help
   Enabling Hyper_DMABUF Backend for XEN hypervisor
 
+config HYPER_DMABUF_XEN_AUTO_RX_CH_ADD
+bool "Enable automatic rx-ch add with 10 secs interval"
+default y
+depends on HYPER_DMABUF && HYPER_DMABUF_XEN
+help
+  If enabled, driver reads a node in xenstore every 10 seconds
+  to check whether there is any tx comm ch configured by another
+  domain then initialize matched rx comm ch automatically for any
+  existing tx comm chs.
+
 endmenu
diff --git a/drivers/dma-buf/hyper_dmabuf/Makefile 
b/drivers/dma-buf/hyper_dmabuf/Makefile
index 578a669a0d3e..f573dd5c4054 100644
--- a/drivers/dma-buf/hyper_dmabuf/Makefile
+++ b/drivers/dma-buf/hyper_dmabuf/Makefile
@@ -11,6 +11,7 @@ ifneq ($(KERNELRELEASE),)
 hyper_dmabuf_id.o \
 hyper_dmabuf_remote_sync.o \
 hyper_dmabuf_query.o \
+hyper_dmabuf_event.o \
 
 ifeq ($(CONFIG_HYPER_DMABUF_XEN), y)
$(TARGET_MODULE)-objs += backends/xen/hyper_dmabuf_xen_comm.o \
diff --git a/drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_drv.c 
b/drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_drv.c
index 3320f9dcc769..087f091ccae9 100644
--- a/drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_drv.c
+++ b/drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_drv.c
@@ -41,6 +41,7 @@
 #include "hyper_dmabuf_ioctl.h"
 #include "hyper_dmabuf_list.h"
 #include "hyper_dmabuf_id.h"
+#include "hyper_dmabuf_event.h"
 
 #ifdef CONFIG_HYPER_DMABUF_XEN
 #include "backends/xen/hyper_dmabuf_xen_drv.h"
@@ -91,10 +92,138 @@ static int hyper_dmabuf_release(struct inode *inode, 
struct file *filp)
return 0;
 }
 
+#ifdef CONFIG_HYPER_DMABUF_EVENT_GEN
+
+static unsigned int hyper_dmabuf_event_poll(struct file *filp,
+struct poll_table_struct *wait)
+{
+   poll_wait(filp, &hy_drv_priv->event_wait, wait);
+
+   if (!list_empty(&hy_drv_priv->event_list))
+   return POLLIN | POLLRDNORM;
+
+   return 0;
+}
+
+static ssize_t hyper_dmabuf_event_read(struct file *filp, char __user *buffer,
+   size_t count, loff_t *offset)
+{
+   int ret;
+
+   /* only root can read events */
+   if (!capable(CAP_DAC_OVERRIDE)) {
+   dev_err(hy_drv_priv->dev,
+   "Only root can read events\n");
+   return -EPERM;
+   }
+
+   /* make sure user buffer can be written */
+   if (!access_ok(VERIFY_WRITE, buffer, count)) {
+   dev_err(hy_drv_priv->dev,
+   "User buffer can't be 

[Xen-devel] [RFC PATCH v2 1/9] hyper_dmabuf: initial upload of hyper_dmabuf drv core framework

2018-02-13 Thread Dongwon Kim
Upload of the initial version of the core framework of the hyper_DMABUF
driver, which enables DMA_BUF exchange between two different VMs on a
virtualized platform based on a hypervisor such as XEN.

Hyper_DMABUF drv's primary role is to import a DMA_BUF from the originator
and then re-export it to another Linux VM so that it can be mapped and
accessed there.

This driver has two layers. One is the so-called "core framework", which
contains the driver interface and the core functions handling export/import
of new hyper_DMABUF and its maintenance. This part of the driver is
independent of the hypervisor, so it can work as is with any hypervisor.

The other layer is called "Hypervisor Backend". This layer represents
the interface between the "core framework" and the actual hypervisor,
handling memory sharing and communication. Unlike the "core framework",
every hypervisor needs its own backend interface designed using its native
mechanism for memory sharing and inter-VM communication.

This patch contains the first part, "core framework", which consists of
7 source files and 11 header files. Brief descriptions of these source
files follow:

hyper_dmabuf_drv.c

- Linux driver interface and initialization/cleaning-up routines

hyper_dmabuf_ioctl.c

- IOCTL calls for export/import of DMA-BUF and for creation and
  destruction of comm channels.

hyper_dmabuf_sgl_proc.c

- Provides methods for managing DMA-BUF for exporting and importing. For
  exporting, these cover extraction of pages, sharing pages via procedures
  in the "Backend", and notifying the importing VM. For importing, all
  operations related to the reconstruction of DMA-BUF (with shared
  pages) on the importer's side are defined.

hyper_dmabuf_ops.c

- Standard DMA-BUF operations for hyper_DMABUF reconstructed on
  importer's side.

hyper_dmabuf_list.c

- Lists for storing exported and imported hyper_DMABUF to keep track of
  remote usage of hyper_DMABUF currently being shared.

hyper_dmabuf_msg.c

- Defines messages exchanged between VMs (exporter and importer) and
  function calls for sending and parsing (when received) those.

hyper_dmabuf_id.c

- Contains methods to generate and manage "hyper_DMABUF id" for each
  hyper_DMABUF being exported. It is a global handle for a hyper_DMABUF,
  which another VM needs to know to import it.

hyper_dmabuf_struct.h

- Contains data structures of importer or exporter hyper_DMABUF

include/uapi/linux/hyper_dmabuf.h

- Contains definition of data types and structures referenced by user
  application to interact with driver
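
As an illustration of the "hyper_DMABUF id" handled by hyper_dmabuf_id.c,
here is a minimal user-space sketch. It assumes the layout described in the
reference guide of patch 2/9 (exporter's domain id in the top byte of "id",
a count number in the low three bytes, plus three random ints). The helper
names hid_pack, hid_domid, hid_count and hid_equal are hypothetical and are
not taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* 16-byte handle as described in the reference guide. */
typedef struct {
	int id;         /* top byte: exporter's VM id, low 3 bytes: count */
	int rng_key[3]; /* 12 bytes of random data */
} hyper_dmabuf_id_t;

/* Assumed field widths: one byte of domain id, three bytes of count. */
#define HID_DOMID_MASK 0xFFu
#define HID_COUNT_MASK 0xFFFFFFu

/* Hypothetical helper: combine a domain id and a count into "id". */
static inline int hid_pack(unsigned int domid, unsigned int count)
{
	uint32_t v;

	assert(count <= HID_COUNT_MASK); /* count must fit in three bytes */
	v = ((domid & HID_DOMID_MASK) << 24) | (count & HID_COUNT_MASK);
	return (int)v;
}

/* Hypothetical helper: extract the exporter's domain id. */
static inline unsigned int hid_domid(int id)
{
	return ((uint32_t)id >> 24) & HID_DOMID_MASK;
}

/* Hypothetical helper: extract the count number. */
static inline unsigned int hid_count(int id)
{
	return (uint32_t)id & HID_COUNT_MASK;
}

/* Two handles match only if the id and all three random keys match. */
static inline int hid_equal(const hyper_dmabuf_id_t *a,
			    const hyper_dmabuf_id_t *b)
{
	return a->id == b->id &&
	       a->rng_key[0] == b->rng_key[0] &&
	       a->rng_key[1] == b->rng_key[1] &&
	       a->rng_key[2] == b->rng_key[2];
}
```

Comparing the random keys as well as "id" reflects the stated intent that a
reused count number must not be enough to guess a valid handle.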

Signed-off-by: Dongwon Kim 
Signed-off-by: Mateusz Polrola 
---
 drivers/dma-buf/Kconfig|   2 +
 drivers/dma-buf/Makefile   |   1 +
 drivers/dma-buf/hyper_dmabuf/Kconfig   |  23 +
 drivers/dma-buf/hyper_dmabuf/Makefile  |  34 ++
 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_drv.c| 254 
 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_drv.h| 111 
 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_id.c | 135 +
 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_id.h |  53 ++
 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_ioctl.c  | 672 +
 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_ioctl.h  |  52 ++
 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_list.c   | 294 +
 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_list.h   |  73 +++
 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_msg.c| 320 ++
 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_msg.h|  87 +++
 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_ops.c| 264 
 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_ops.h|  34 ++
 .../dma-buf/hyper_dmabuf/hyper_dmabuf_sgl_proc.c   | 256 
 .../dma-buf/hyper_dmabuf/hyper_dmabuf_sgl_proc.h   |  43 ++
 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_struct.h | 131 
 include/uapi/linux/hyper_dmabuf.h  |  87 +++
 20 files changed, 2926 insertions(+)
 create mode 100644 drivers/dma-buf/hyper_dmabuf/Kconfig
 create mode 100644 drivers/dma-buf/hyper_dmabuf/Makefile
 create mode 100644 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_drv.c
 create mode 100644 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_drv.h
 create mode 100644 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_id.c
 create mode 100644 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_id.h
 create mode 100644 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_ioctl.c
 create mode 100644 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_ioctl.h
 create mode 100644 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_list.c
 create mode 100644 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_list.h
 create mode 100644 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_msg.c
 create mode 100644 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_msg.h
 create mode 100644 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_ops.c
 create mode 100644 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_ops.h
 create mode 100644 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_sgl_proc.c
 create mode 100644 

[Xen-devel] [RFC PATCH v2 2/9] hyper_dmabuf: architecture specification and reference guide

2018-02-13 Thread Dongwon Kim
Reference document for hyper_DMABUF driver

Documentation/hyper-dmabuf-sharing.txt

Signed-off-by: Dongwon Kim 
---
 Documentation/hyper-dmabuf-sharing.txt | 734 +
 1 file changed, 734 insertions(+)
 create mode 100644 Documentation/hyper-dmabuf-sharing.txt

diff --git a/Documentation/hyper-dmabuf-sharing.txt 
b/Documentation/hyper-dmabuf-sharing.txt
new file mode 100644
index ..928e411931e3
--- /dev/null
+++ b/Documentation/hyper-dmabuf-sharing.txt
@@ -0,0 +1,734 @@
+Linux Hyper DMABUF Driver
+
+--
+Section 1. Overview
+--
+
+Hyper_DMABUF driver is a Linux device driver running on multiple Virtual
+Machines (VMs), which expands DMA-BUF sharing capability to the VM environment
+where multiple different OS instances need to share the same physical data without
+data-copy across VMs.
+
+To share a DMA_BUF across VMs, an instance of the Hyper_DMABUF drv on the
+exporting VM (so called, “exporter”) imports a local DMA_BUF from the original
+producer of the buffer, then re-exports it with a unique ID, hyper_dmabuf_id,
+for the buffer to the importing VM (so called, “importer”).
+
+Another instance of the Hyper_DMABUF driver on importer registers
+a hyper_dmabuf_id together with reference information for the shared physical
+pages associated with the DMA_BUF to its database when the export happens.
+
+The actual mapping of the DMA_BUF on the importer’s side is done by
+the Hyper_DMABUF driver when user space issues the IOCTL command to access
+the shared DMA_BUF. The Hyper_DMABUF driver works as both an importing and
+exporting driver as is, that is, no special configuration is required.
+Consequently, only a single module per VM is needed to enable cross-VM DMA_BUF
+exchange.
+
+--
+Section 2. Architecture
+--
+
+1. Hyper_DMABUF ID
+
+hyper_dmabuf_id is a global handle for shared DMA BUFs, which is compatible
+across VMs. It is a key used by the importer to retrieve information about
+shared Kernel pages behind the DMA_BUF structure from the IMPORT list. When
+a DMA_BUF is exported to another domain, its hyper_dmabuf_id and META data
+are also kept in the EXPORT list by the exporter for further synchronization
+of control over the DMA_BUF.
+
+hyper_dmabuf_id is “targeted”, meaning it is valid only in exporting (owner of
+the buffer) and importing VMs, where the corresponding hyper_dmabuf_id is
+stored in their database (EXPORT and IMPORT lists).
+
+A user-space application specifies the targeted VM id in the user parameter
+when it calls the IOCTL command to export shared DMA_BUF to another VM.
+
+hyper_dmabuf_id_t is a data type for hyper_dmabuf_id. It is defined as 16-byte
+data structure, and it contains id and rng_key[3] as elements for
+the structure.
+
+typedef struct {
+int id;
+int rng_key[3]; /* 12bytes long random number */
+} hyper_dmabuf_id_t;
+
+The first element in the hyper_dmabuf_id structure, int id is combined data of
+a count number generated by the driver running on the exporter and
+the exporter’s ID. The VM’s ID is a one-byte value located at the field’s
+MSB in int id. The remaining three bytes in int id are reserved for a count
+number.
+
+However, there is a limit related to this count number, which is 1000.
+Therefore, only a little more than a byte starting from the LSB is actually used
+for storing this count number.
+
+#define HYPER_DMABUF_ID_CREATE(domid, id) \
+((((domid) & 0xFF) << 24) | ((id) & 0xFFFFFF))
+
+This limit on the count number directly determines the maximum number of DMA BUFs
+that can be shared simultaneously by one VM. The second element of
+hyper_dmabuf_id, that is int rng_key[3], is an array of three integers. These
+numbers are generated by Linux’s native random number generation mechanism.
+This field is added to enhance the security of the Hyper DMABUF driver by
+maximizing the entropy of hyper_dmabuf_id (that is, preventing it from being
+guessed by a security attacker).
+
+Once DMA_BUF is no longer shared, the hyper_dmabuf_id associated with
+the DMA_BUF is released, but the count number in hyper_dmabuf_id is saved in
+the ID list for reuse. However, random keys stored in int rng_key[3] are not
+reused. Instead, those keys are always filled with freshly generated random
+keys for security.
+
+2. IOCTLs
+
+a. IOCTL_HYPER_DMABUF_TX_CH_SETUP
+
+This type of IOCTL is used for initialization of a one-directional transmit
+communication channel with a remote domain.
+
+The user space argument for this type of IOCTL is defined as:
+
+struct ioctl_hyper_dmabuf_tx_ch_setup {
+/* IN parameters */
+/* Remote domain id */
+int remote_domain;
+};
+
+b. 

[Xen-devel] [RFC PATCH v2 5/9] hyper_dmabuf: default backend for XEN hypervisor

2018-02-13 Thread Dongwon Kim
From: "Mateusz Polrola" 

This is the default backend for the XEN hypervisor. It contains the actual
implementation of the individual methods of "struct hyper_dmabuf_bknd_ops",
which is defined as:

struct hyper_dmabuf_bknd_ops {
/* backend initialization routine (optional) */
int (*init)(void);

/* backend cleanup routine (optional) */
int (*cleanup)(void);

/* retrieving id of current virtual machine */
int (*get_vm_id)(void);

/* get pages shared via hypervisor-specific method */
int (*share_pages)(struct page **, int, int, void **);

/* make shared pages unshared via hypervisor specific method */
int (*unshare_pages)(void **, int);

/* map remotely shared pages on importer's side via
 * hypervisor-specific method
 */
struct page ** (*map_shared_pages)(unsigned long, int, int, void **);

/* unmap and free shared pages on importer's side via
 * hypervisor-specific method
 */
int (*unmap_shared_pages)(void **, int);

/* initialize communication environment */
int (*init_comm_env)(void);

void (*destroy_comm)(void);

/* upstream ch setup (receiving and responding) */
int (*init_rx_ch)(int);

/* downstream ch setup (transmitting and parsing responses) */
int (*init_tx_ch)(int);

int (*send_req)(int, struct hyper_dmabuf_req *, int);
};

The first two methods are for extra initialization or cleanup that may be
required for the current Hypervisor (optional). The third method
(.get_vm_id) provides a way to get the current VM's id, which is later used
to identify the source VM of a shared hyper_DMABUF.

All other methods are related to either memory sharing or inter-VM
communication, which are the minimum requirements for the hyper_DMABUF
driver. (A brief description of the role of each method is embedded as a
comment in the definition of the structure above and in the header file.)

The actual implementation of each of these methods specific to XEN is under
backends/xen/. They are mapped as follows:

struct hyper_dmabuf_bknd_ops xen_bknd_ops = {
.init = NULL, /* not needed for xen */
.cleanup = NULL, /* not needed for xen */
.get_vm_id = xen_be_get_domid,
.share_pages = xen_be_share_pages,
.unshare_pages = xen_be_unshare_pages,
.map_shared_pages = (void *)xen_be_map_shared_pages,
.unmap_shared_pages = xen_be_unmap_shared_pages,
.init_comm_env = xen_be_init_comm_env,
.destroy_comm = xen_be_destroy_comm,
.init_rx_ch = xen_be_init_rx_rbuf,
.init_tx_ch = xen_be_init_tx_rbuf,
.send_req = xen_be_send_req,
};
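
To illustrate how a core layer could consume such an ops table when the
optional hooks are left NULL (as they are for XEN above), here is a minimal
user-space sketch. The reduced struct, fake_get_domid() and core_setup() are
hypothetical stand-ins and do not reflect the driver's actual probing code.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced, hypothetical copy of the ops table: three members only. */
struct bknd_ops_sketch {
	int (*init)(void);      /* optional, may be NULL */
	int (*cleanup)(void);   /* optional, may be NULL */
	int (*get_vm_id)(void); /* required */
};

/* Call an optional hook; a missing hook counts as success. */
static int call_optional(int (*hook)(void))
{
	return hook ? hook() : 0;
}

/* Stand-in for xen_be_get_domid(); returns a fixed id for illustration. */
static int fake_get_domid(void)
{
	return 1;
}

static const struct bknd_ops_sketch fake_xen_ops = {
	.init = NULL,    /* not needed in this sketch */
	.cleanup = NULL, /* not needed in this sketch */
	.get_vm_id = fake_get_domid,
};

/* Returns the VM id on success, or a negative value if init fails. */
static int core_setup(const struct bknd_ops_sketch *ops)
{
	int ret;

	assert(ops && ops->get_vm_id); /* the required method must exist */

	ret = call_optional(ops->init);
	if (ret < 0)
		return ret;

	return ops->get_vm_id();
}
```

Treating a NULL hook as success keeps a backend that needs no extra setup
from having to provide empty stub functions.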

A section for the Hypervisor Backend has been added to
"Documentation/hyper-dmabuf-sharing.txt" accordingly.

Signed-off-by: Dongwon Kim 
Signed-off-by: Mateusz Polrola 
---
 drivers/dma-buf/hyper_dmabuf/Kconfig   |   7 +
 drivers/dma-buf/hyper_dmabuf/Makefile  |   7 +
 .../backends/xen/hyper_dmabuf_xen_comm.c   | 941 +
 .../backends/xen/hyper_dmabuf_xen_comm.h   |  78 ++
 .../backends/xen/hyper_dmabuf_xen_comm_list.c  | 158 
 .../backends/xen/hyper_dmabuf_xen_comm_list.h  |  67 ++
 .../backends/xen/hyper_dmabuf_xen_drv.c|  46 +
 .../backends/xen/hyper_dmabuf_xen_drv.h|  53 ++
 .../backends/xen/hyper_dmabuf_xen_shm.c| 525 
 .../backends/xen/hyper_dmabuf_xen_shm.h|  46 +
 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_drv.c|  10 +
 11 files changed, 1938 insertions(+)
 create mode 100644 
drivers/dma-buf/hyper_dmabuf/backends/xen/hyper_dmabuf_xen_comm.c
 create mode 100644 
drivers/dma-buf/hyper_dmabuf/backends/xen/hyper_dmabuf_xen_comm.h
 create mode 100644 
drivers/dma-buf/hyper_dmabuf/backends/xen/hyper_dmabuf_xen_comm_list.c
 create mode 100644 
drivers/dma-buf/hyper_dmabuf/backends/xen/hyper_dmabuf_xen_comm_list.h
 create mode 100644 
drivers/dma-buf/hyper_dmabuf/backends/xen/hyper_dmabuf_xen_drv.c
 create mode 100644 
drivers/dma-buf/hyper_dmabuf/backends/xen/hyper_dmabuf_xen_drv.h
 create mode 100644 
drivers/dma-buf/hyper_dmabuf/backends/xen/hyper_dmabuf_xen_shm.c
 create mode 100644 
drivers/dma-buf/hyper_dmabuf/backends/xen/hyper_dmabuf_xen_shm.h

diff --git a/drivers/dma-buf/hyper_dmabuf/Kconfig 
b/drivers/dma-buf/hyper_dmabuf/Kconfig
index 5ebf516d65eb..68f3d6ce2c1f 100644
--- a/drivers/dma-buf/hyper_dmabuf/Kconfig
+++ b/drivers/dma-buf/hyper_dmabuf/Kconfig
@@ -20,4 +20,11 @@ config HYPER_DMABUF_SYSFS
 
  The location of sysfs is under ""
 
+config HYPER_DMABUF_XEN
+bool "Configure hyper_dmabuf for XEN hypervisor"
+default y
+depends on HYPER_DMABUF && XEN && XENFS
+help
+  Enabling Hyper_DMABUF Backend for XEN hypervisor
+
 endmenu
diff --git 

[Xen-devel] [RFC PATCH v2 9/9] hyper_dmabuf: threaded interrupt in Xen-backend

2018-02-13 Thread Dongwon Kim
Use a threaded interrupt instead of a regular one, because most of the ISR
is time-critical and may sleep.

Signed-off-by: Dongwon Kim 
---
 .../hyper_dmabuf/backends/xen/hyper_dmabuf_xen_comm.c | 19 +++
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/drivers/dma-buf/hyper_dmabuf/backends/xen/hyper_dmabuf_xen_comm.c 
b/drivers/dma-buf/hyper_dmabuf/backends/xen/hyper_dmabuf_xen_comm.c
index 30bc4b6304ac..65af5ddfb2d7 100644
--- a/drivers/dma-buf/hyper_dmabuf/backends/xen/hyper_dmabuf_xen_comm.c
+++ b/drivers/dma-buf/hyper_dmabuf/backends/xen/hyper_dmabuf_xen_comm.c
@@ -332,11 +332,14 @@ int xen_be_init_tx_rbuf(int domid)
}
 
/* setting up interrupt */
-   ret = bind_evtchn_to_irqhandler(alloc_unbound.port,
-   front_ring_isr, 0,
-   NULL, (void *) ring_info);
+   ring_info->irq = bind_evtchn_to_irq(alloc_unbound.port);
 
-   if (ret < 0) {
+   ret = request_threaded_irq(ring_info->irq,
+  NULL,
+  front_ring_isr,
+  IRQF_ONESHOT, NULL, ring_info);
+
+   if (ret != 0) {
dev_err(hy_drv_priv->dev,
"Failed to setup event channel\n");
close.port = alloc_unbound.port;
@@ -348,7 +351,6 @@ int xen_be_init_tx_rbuf(int domid)
}
 
ring_info->rdomain = domid;
-   ring_info->irq = ret;
ring_info->port = alloc_unbound.port;
 
mutex_init(&ring_info->lock);
@@ -535,9 +537,10 @@ int xen_be_init_rx_rbuf(int domid)
if (!xen_comm_find_tx_ring(domid))
ret = xen_be_init_tx_rbuf(domid);
 
-   ret = request_irq(ring_info->irq,
- back_ring_isr, 0,
- NULL, (void *)ring_info);
+   ret = request_threaded_irq(ring_info->irq,
+  NULL,
+  back_ring_isr, IRQF_ONESHOT,
+  NULL, (void *)ring_info);
 
return ret;
 
-- 
2.16.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [RFC PATCH v2 7/9] hyper_dmabuf: query ioctl for retreiving various hyper_DMABUF info

2018-02-13 Thread Dongwon Kim
Add a new ioctl, "IOCTL_HYPER_DMABUF_QUERY", for userspace to
retrieve various information about a hyper_DMABUF currently being shared
across VMs.

Supported query items are as follows:

enum hyper_dmabuf_query {
HYPER_DMABUF_QUERY_TYPE = 0x10,
HYPER_DMABUF_QUERY_EXPORTER,
HYPER_DMABUF_QUERY_IMPORTER,
HYPER_DMABUF_QUERY_SIZE,
HYPER_DMABUF_QUERY_BUSY,
HYPER_DMABUF_QUERY_UNEXPORTED,
HYPER_DMABUF_QUERY_DELAYED_UNEXPORTED,
HYPER_DMABUF_QUERY_PRIV_INFO_SIZE,
HYPER_DMABUF_QUERY_PRIV_INFO,
};

A query IOCTL call returns the following for each query item above:

HYPER_DMABUF_QUERY_TYPE - type - EXPORTED/IMPORTED of hyper_DMABUF from
current VM's perspective.

HYPER_DMABUF_QUERY_EXPORTER - ID of exporting VM

HYPER_DMABUF_QUERY_IMPORTER - ID of importing VM

HYPER_DMABUF_QUERY_SIZE - size of shared buffer in bytes

HYPER_DMABUF_QUERY_BUSY - true if hyper_DMABUF is being actively used
(e.g. attached and mapped by end-consumer)

HYPER_DMABUF_QUERY_UNEXPORTED - true if hyper_DMABUF has been unexported
on exporting VM's side.

HYPER_DMABUF_QUERY_DELAYED_UNEXPORTED - true if hyper_DMABUF is scheduled
to be unexported (still valid but will be unexported soon)

HYPER_DMABUF_QUERY_PRIV_INFO_SIZE - size of private information (given by
user application on exporter's side) attached to hyper_DMABUF

HYPER_DMABUF_QUERY_PRIV_INFO - private information attached to hyper_DMABUF
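
The mapping from query item to returned value can be sketched in user space
as below. The enum is copied from this message; struct buf_sketch, the
SKETCH_TYPE_* codes and query_sketch() are hypothetical and only model a
subset of the items. The real handlers live in hyper_dmabuf_query.c and may
differ.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Query items as listed in the commit message. */
enum hyper_dmabuf_query {
	HYPER_DMABUF_QUERY_TYPE = 0x10,
	HYPER_DMABUF_QUERY_EXPORTER,
	HYPER_DMABUF_QUERY_IMPORTER,
	HYPER_DMABUF_QUERY_SIZE,
	HYPER_DMABUF_QUERY_BUSY,
	HYPER_DMABUF_QUERY_UNEXPORTED,
	HYPER_DMABUF_QUERY_DELAYED_UNEXPORTED,
	HYPER_DMABUF_QUERY_PRIV_INFO_SIZE,
	HYPER_DMABUF_QUERY_PRIV_INFO,
};

/* Hypothetical type codes; the real values are not shown here. */
enum { SKETCH_TYPE_EXPORTED = 0, SKETCH_TYPE_IMPORTED = 1 };

/* Hypothetical bookkeeping record for one shared buffer. */
struct buf_sketch {
	bool exported_here;  /* true if this VM is the exporter */
	int exporter_domid;
	int importer_domid;
	unsigned long size;  /* in bytes */
	int active_users;    /* attached/mapped consumers */
	bool unexported;
};

/* Fill *info for a supported item; return 0, or -EINVAL otherwise. */
static int query_sketch(const struct buf_sketch *buf, int item,
			unsigned long *info)
{
	assert(buf && info);

	switch (item) {
	case HYPER_DMABUF_QUERY_TYPE:
		*info = buf->exported_here ? SKETCH_TYPE_EXPORTED
					   : SKETCH_TYPE_IMPORTED;
		break;
	case HYPER_DMABUF_QUERY_EXPORTER:
		*info = (unsigned long)buf->exporter_domid;
		break;
	case HYPER_DMABUF_QUERY_IMPORTER:
		*info = (unsigned long)buf->importer_domid;
		break;
	case HYPER_DMABUF_QUERY_SIZE:
		*info = buf->size;
		break;
	case HYPER_DMABUF_QUERY_BUSY:
		*info = buf->active_users > 0;
		break;
	case HYPER_DMABUF_QUERY_UNEXPORTED:
		*info = buf->unexported;
		break;
	default:
		/* delayed-unexport and private-info items are not modelled */
		return -EINVAL;
	}
	return 0;
}
```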

Signed-off-by: Dongwon Kim 
Signed-off-by: Mateusz Polrola 
---
 drivers/dma-buf/hyper_dmabuf/Makefile |   1 +
 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_ioctl.c |  49 +-
 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_query.c | 174 ++
 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_query.h |  36 +
 include/uapi/linux/hyper_dmabuf.h |  32 
 5 files changed, 291 insertions(+), 1 deletion(-)
 create mode 100644 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_query.c
 create mode 100644 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_query.h

diff --git a/drivers/dma-buf/hyper_dmabuf/Makefile 
b/drivers/dma-buf/hyper_dmabuf/Makefile
index 702696f29215..578a669a0d3e 100644
--- a/drivers/dma-buf/hyper_dmabuf/Makefile
+++ b/drivers/dma-buf/hyper_dmabuf/Makefile
@@ -10,6 +10,7 @@ ifneq ($(KERNELRELEASE),)
 hyper_dmabuf_msg.o \
 hyper_dmabuf_id.o \
 hyper_dmabuf_remote_sync.o \
+hyper_dmabuf_query.o \
 
 ifeq ($(CONFIG_HYPER_DMABUF_XEN), y)
$(TARGET_MODULE)-objs += backends/xen/hyper_dmabuf_xen_comm.o \
diff --git a/drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_ioctl.c 
b/drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_ioctl.c
index 168ccf98f710..e90e59cd0568 100644
--- a/drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_ioctl.c
+++ b/drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_ioctl.c
@@ -41,6 +41,7 @@
 #include "hyper_dmabuf_msg.h"
 #include "hyper_dmabuf_sgl_proc.h"
 #include "hyper_dmabuf_ops.h"
+#include "hyper_dmabuf_query.h"
 
 static int hyper_dmabuf_tx_ch_setup_ioctl(struct file *filp, void *data)
 {
@@ -543,7 +544,6 @@ static int hyper_dmabuf_export_fd_ioctl(struct file *filp, 
void *data)
hyper_dmabuf_create_req(req,
HYPER_DMABUF_EXPORT_FD_FAILED,
[0]);
-
bknd_ops->send_req(HYPER_DMABUF_DOM_ID(imported->hid),
   req, false);
kfree(req);
@@ -682,6 +682,51 @@ int hyper_dmabuf_unexport_ioctl(struct file *filp, void 
*data)
return 0;
 }
 
+static int hyper_dmabuf_query_ioctl(struct file *filp, void *data)
+{
+   struct ioctl_hyper_dmabuf_query *query_attr =
+   (struct ioctl_hyper_dmabuf_query *)data;
+   struct exported_sgt_info *exported = NULL;
+   struct imported_sgt_info *imported = NULL;
+   int ret = 0;
+
+   if (HYPER_DMABUF_DOM_ID(query_attr->hid) == hy_drv_priv->domid) {
+   /* query for exported dmabuf */
+   exported = hyper_dmabuf_find_exported(query_attr->hid);
+   if (exported) {
+   ret = hyper_dmabuf_query_exported(exported,
+ query_attr->item,
+ &query_attr->info);
+   } else {
+   dev_err(hy_drv_priv->dev,
+   "hid {id:%d key:%d %d %d} not in exp list\n",
+   query_attr->hid.id,
+   query_attr->hid.rng_key[0],
+   query_attr->hid.rng_key[1],
+   query_attr->hid.rng_key[2]);
+   return -ENOENT;
+   }
+   } else {
+   /* query for imported dmabuf */
+

[Xen-devel] [RFC PATCH v2 6/9] hyper_dmabuf: hyper_DMABUF synchronization across VM

2018-02-13 Thread Dongwon Kim
All hyper_DMABUF operations (hyper_dmabuf_ops.c) now send a message to the
exporting VM for synchronization between the two VMs. For this, every
mapping done by the importer makes the exporter perform a shadow mapping of
the original DMA-BUF. All subsequent DMA-BUF operations (attach, detach,
map/unmap and so on) are then mimicked on this shadowed DMA-BUF for tracking
and synchronization purposes (e.g. incrementing/decrementing a reference
count to check the status).
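
A rough user-space sketch of this message flow follows. It assumes the
operand layout given in the comments of the diff below (op0~3 carry the
hyper_dmabuf_id, op4 carries map(=1)/unmap(=2)/attach(=3)/detach(=4)). The
names sync_pack, shadow_sketch and shadow_apply are hypothetical, and the
counters merely stand in for the exporter-side shadow state.

```c
#include <assert.h>

typedef struct {
	int id;
	int rng_key[3];
} hyper_dmabuf_id_t;

/* Operation codes as listed in the comment for HYPER_DMABUF_OPS_TO_SOURCE. */
enum sync_op_sketch {
	SYNC_MAP = 1,
	SYNC_UNMAP = 2,
	SYNC_ATTACH = 3,
	SYNC_DETACH = 4,
};

/* Importer side: fill the five operands of a sync request. */
static void sync_pack(int op[5], const hyper_dmabuf_id_t *hid, int what)
{
	assert(op && hid);

	op[0] = hid->id;
	op[1] = hid->rng_key[0];
	op[2] = hid->rng_key[1];
	op[3] = hid->rng_key[2];
	op[4] = what;
}

/* Hypothetical exporter-side shadow state: plain reference counts. */
struct shadow_sketch {
	int attach_cnt;
	int map_cnt;
};

/* Exporter side: mimic the remote operation; -1 on an unbalanced call. */
static int shadow_apply(struct shadow_sketch *s, int what)
{
	switch (what) {
	case SYNC_ATTACH:
		s->attach_cnt++;
		return 0;
	case SYNC_DETACH:
		if (s->attach_cnt == 0)
			return -1;
		s->attach_cnt--;
		return 0;
	case SYNC_MAP:
		s->map_cnt++;
		return 0;
	case SYNC_UNMAP:
		if (s->map_cnt == 0)
			return -1;
		s->map_cnt--;
		return 0;
	default:
		return -1;
	}
}
```

With counts like these, the exporter can tell whether any remote consumer
still holds the buffer before it releases the original DMA-BUF.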

Signed-off-by: Dongwon Kim 
Signed-off-by: Mateusz Polrola 
---
 drivers/dma-buf/hyper_dmabuf/Makefile  |   1 +
 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_msg.c|  53 +++-
 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_msg.h|   2 +
 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_ops.c| 157 +-
 .../hyper_dmabuf/hyper_dmabuf_remote_sync.c| 324 +
 .../hyper_dmabuf/hyper_dmabuf_remote_sync.h|  32 ++
 6 files changed, 565 insertions(+), 4 deletions(-)
 create mode 100644 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_remote_sync.c
 create mode 100644 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_remote_sync.h

diff --git a/drivers/dma-buf/hyper_dmabuf/Makefile 
b/drivers/dma-buf/hyper_dmabuf/Makefile
index b9ab4eeca6f2..702696f29215 100644
--- a/drivers/dma-buf/hyper_dmabuf/Makefile
+++ b/drivers/dma-buf/hyper_dmabuf/Makefile
@@ -9,6 +9,7 @@ ifneq ($(KERNELRELEASE),)
 hyper_dmabuf_ops.o \
 hyper_dmabuf_msg.o \
 hyper_dmabuf_id.o \
+hyper_dmabuf_remote_sync.o \
 
 ifeq ($(CONFIG_HYPER_DMABUF_XEN), y)
$(TARGET_MODULE)-objs += backends/xen/hyper_dmabuf_xen_comm.o \
diff --git a/drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_msg.c 
b/drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_msg.c
index 7176fa8fb139..1592d5cfaa52 100644
--- a/drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_msg.c
+++ b/drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_msg.c
@@ -34,6 +34,7 @@
 #include 
 #include "hyper_dmabuf_drv.h"
 #include "hyper_dmabuf_msg.h"
+#include "hyper_dmabuf_remote_sync.h"
 #include "hyper_dmabuf_list.h"
 
 struct cmd_process {
@@ -92,6 +93,25 @@ void hyper_dmabuf_create_req(struct hyper_dmabuf_req *req,
req->op[i] = op[i];
break;
 
+   case HYPER_DMABUF_OPS_TO_REMOTE:
+   /* notifying dmabuf map/unmap to importer (probably not needed)
+* for dmabuf synchronization
+*/
+   break;
+
+   case HYPER_DMABUF_OPS_TO_SOURCE:
+   /* notifying dmabuf map/unmap to exporter, map will make
+* the driver to do shadow mapping or unmapping for
+* synchronization with original exporter (e.g. i915)
+*
+* command : DMABUF_OPS_TO_SOURCE.
+* op0~3 : hyper_dmabuf_id
+* op4 : map(=1)/unmap(=2)/attach(=3)/detach(=4)
+*/
+   for (i = 0; i < 5; i++)
+   req->op[i] = op[i];
+   break;
+
default:
/* no command found */
return;
@@ -201,6 +221,12 @@ static void cmd_process_work(struct work_struct *work)
 
break;
 
+   case HYPER_DMABUF_OPS_TO_REMOTE:
+   /* notifying dmabuf map/unmap to importer
+* (probably not needed) for dmabuf synchronization
+*/
+   break;
+
default:
/* shouldn't get here */
break;
@@ -217,6 +243,7 @@ int hyper_dmabuf_msg_parse(int domid, struct 
hyper_dmabuf_req *req)
struct imported_sgt_info *imported;
struct exported_sgt_info *exported;
hyper_dmabuf_id_t hid;
+   int ret;
 
if (!req) {
dev_err(hy_drv_priv->dev, "request is NULL\n");
@@ -229,7 +256,7 @@ int hyper_dmabuf_msg_parse(int domid, struct 
hyper_dmabuf_req *req)
hid.rng_key[2] = req->op[3];
 
if ((req->cmd < HYPER_DMABUF_EXPORT) ||
-   (req->cmd > HYPER_DMABUF_NOTIFY_UNEXPORT)) {
+   (req->cmd > HYPER_DMABUF_OPS_TO_SOURCE)) {
dev_err(hy_drv_priv->dev, "invalid command\n");
return -EINVAL;
}
@@ -271,6 +298,30 @@ int hyper_dmabuf_msg_parse(int domid, struct 
hyper_dmabuf_req *req)
return req->cmd;
}
 
+   /* dma buf remote synchronization */
+   if (req->cmd == HYPER_DMABUF_OPS_TO_SOURCE) {
+   /* notifying dmabuf map/unmap to exporter, map will
+* make the driver to do shadow mapping
+* or unmapping for synchronization with original
+* exporter (e.g. i915)
+*
+* command : DMABUF_OPS_TO_SOURCE.
+* op0~3 : hyper_dmabuf_id
+* op1 : enum hyper_dmabuf_ops {}
+*/
+   dev_dbg(hy_drv_priv->dev,
+  

[Xen-devel] [RFC PATCH v2 0/9] hyper_dmabuf: Hyper_DMABUF driver

2018-02-13 Thread Dongwon Kim
This patch series contains the implementation of a new device driver,
the hyper_DMABUF driver, which provides a way to extend Linux DMA-BUF sharing
across different VM instances on a Multi-OS platform enabled by
a Hypervisor (e.g. XEN).

This version 2 series is basically a refactored version of the old series
starting with "[RFC PATCH 01/60] hyper_dmabuf: initial working version of
hyper_dmabuf drv".

Implementation details of this driver are described in the reference guide
added by the second patch, "[RFC PATCH v2 2/9] hyper_dmabuf: architecture
specification and reference guide".

Attaching 'Overview' section here as a quick summary.

--
Section 1. Overview
--

Hyper_DMABUF driver is a Linux device driver running on multiple Virtual
Machines (VMs), which expands DMA-BUF sharing capability to the VM environment
where multiple different OS instances need to share the same physical data without
data-copy across VMs.

To share a DMA_BUF across VMs, an instance of the Hyper_DMABUF drv on the
exporting VM (so called, “exporter”) imports a local DMA_BUF from the original
producer of the buffer, then re-exports it with a unique ID, hyper_dmabuf_id,
for the buffer to the importing VM (so called, “importer”).

Another instance of the Hyper_DMABUF driver on importer registers
a hyper_dmabuf_id together with reference information for the shared physical
pages associated with the DMA_BUF to its database when the export happens.

The actual mapping of the DMA_BUF on the importer’s side is done by
the Hyper_DMABUF driver when user space issues the IOCTL command to access
the shared DMA_BUF. The Hyper_DMABUF driver works as both an importing and
exporting driver as is, that is, no special configuration is required.
Consequently, only a single module per VM is needed to enable cross-VM DMA_BUF
exchange.

--

There is a git repository at github.com in which all patches of this series
are integrated into a Linux kernel tree based on the commit:

commit ae64f9bd1d3621b5e60d7363bc20afb46aede215
Author: Linus Torvalds 
Date:   Sun Dec 3 11:01:47 2017 -0500

Linux 4.15-rc2

https://github.com/downor/linux_hyper_dmabuf.git hyper_dmabuf_integration_v4

Dongwon Kim, Mateusz Polrola (9):
  hyper_dmabuf: initial upload of hyper_dmabuf drv core framework
  hyper_dmabuf: architecture specification and reference guide
  MAINTAINERS: adding Hyper_DMABUF driver section in MAINTAINERS
  hyper_dmabuf: user private data attached to hyper_DMABUF
  hyper_dmabuf: hyper_DMABUF synchronization across VM
  hyper_dmabuf: query ioctl for retreiving various hyper_DMABUF info
  hyper_dmabuf: event-polling mechanism for detecting a new hyper_DMABUF
  hyper_dmabuf: threaded interrupt in Xen-backend
  hyper_dmabuf: default backend for XEN hypervisor

 Documentation/hyper-dmabuf-sharing.txt | 734 
 MAINTAINERS|  11 +
 drivers/dma-buf/Kconfig|   2 +
 drivers/dma-buf/Makefile   |   1 +
 drivers/dma-buf/hyper_dmabuf/Kconfig   |  50 ++
 drivers/dma-buf/hyper_dmabuf/Makefile  |  44 +
 .../backends/xen/hyper_dmabuf_xen_comm.c   | 944 +
 .../backends/xen/hyper_dmabuf_xen_comm.h   |  78 ++
 .../backends/xen/hyper_dmabuf_xen_comm_list.c  | 158 
 .../backends/xen/hyper_dmabuf_xen_comm_list.h  |  67 ++
 .../backends/xen/hyper_dmabuf_xen_drv.c|  46 +
 .../backends/xen/hyper_dmabuf_xen_drv.h|  53 ++
 .../backends/xen/hyper_dmabuf_xen_shm.c| 525 
 .../backends/xen/hyper_dmabuf_xen_shm.h|  46 +
 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_drv.c| 410 +
 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_drv.h| 122 +++
 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_event.c  | 122 +++
 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_event.h  |  38 +
 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_id.c | 135 +++
 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_id.h |  53 ++
 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_ioctl.c  | 794 +
 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_ioctl.h  |  52 ++
 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_list.c   | 295 +++
 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_list.h   |  73 ++
 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_msg.c| 416 +
 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_msg.h|  89 ++
 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_ops.c| 415 +
 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_ops.h|  34 +
 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_query.c  | 174 
 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_query.h  |  36 +
 

[Xen-devel] [RFC PATCH v2 3/9] MAINTAINERS: adding Hyper_DMABUF driver section in MAINTAINERS

2018-02-13 Thread Dongwon Kim
Hyper_DMABUF DRIVER
M:  Dongwon Kim 
M:  Mateusz Polrola 
L:  linux-ker...@vger.kernel.org
L:  xen-devel@lists.xenproject.org
S:  Maintained
F:  drivers/dma-buf/hyper_dmabuf*
F:  include/uapi/linux/hyper_dmabuf.h
F:  Documentation/hyper-dmabuf-sharing.txt
T:  https://github.com/downor/linux_hyper_dmabuf/

Signed-off-by: Dongwon Kim 
---
 MAINTAINERS | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index d4fdcb12616c..155f7f839201 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6468,6 +6468,17 @@ S:   Maintained
 F: mm/memory-failure.c
 F: mm/hwpoison-inject.c
 
+Hyper_DMABUF DRIVER
+M: Dongwon Kim 
+M: Mateusz Polrola 
+L: linux-ker...@vger.kernel.org
+L: xen-devel@lists.xenproject.org
+S: Maintained
+F: drivers/dma-buf/hyper_dmabuf*
+F: include/uapi/linux/hyper_dmabuf.h
+F: Documentation/hyper-dmabuf-sharing.txt
+T: https://github.com/downor/linux_hyper_dmabuf/
+
 Hyper-V CORE AND DRIVERS
 M: "K. Y. Srinivasan" 
 M: Haiyang Zhang 
-- 
2.16.1



[Xen-devel] Hangs after /etc/init.d/xencommons start

2018-02-13 Thread ls00722
Hi all:
 I am using arm64 (HiKey) to test Xen.
After struggling for two weeks, I was able to build and install everything 
on the target; Xen and the dom0 kernel boot up OK. 
However, when I try to use “xl list”, it hangs. I realized I have to start 
the xencommons service first, but that hangs as well after I type in 
/etc/init.d/xencommons start. 
   It’s the command “xen-init-dom0” called by the script that hangs.
   Too bad it doesn’t give me any error message.
I am using the most up-to-date version of Xen from GitHub. I know it’s 
unstable; can anybody tell me how to look further into the problem? Or is 
this the right mailing list to ask for help? This seems more like a user-space 
tool problem.

Thanks
Lei



[Xen-devel] [xen-unstable-smoke test] 119114: regressions - FAIL

2018-02-13 Thread osstest service owner
flight 119114 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/119114/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64                   6 xen-build                fail REGR. vs. 119098

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt                  1 build-check(1)             blocked  n/a
 test-amd64-amd64-xl-qemuu-debianhvm-i386  1 build-check(1)             blocked  n/a
 build-amd64-libvirt                       1 build-check(1)             blocked  n/a
 test-arm64-arm64-xl-xsm                  13 migrate-support-check      fail   never pass
 test-arm64-arm64-xl-xsm                  14 saverestore-support-check  fail   never pass
 test-armhf-armhf-xl                      13 migrate-support-check      fail   never pass
 test-armhf-armhf-xl                      14 saverestore-support-check  fail   never pass

version targeted for testing:
 xen  e139d34a1c4b7775d5855458a325e0e4176bdf7e
baseline version:
 xen  3f491d6873be9caa77f02ad8d98f174f0152b819

Last test of basis   119098  2018-02-13 17:01:30 Z    0 days
Testing same since   119108  2018-02-13 20:01:33 Z    0 days    2 attempts


People who touched revisions under test:
  Jan Beulich 

jobs:
 build-arm64-xsm  pass
 build-amd64  fail
 build-armhf  pass
 build-amd64-libvirt  blocked 
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-i386 blocked 
 test-amd64-amd64-libvirt blocked 



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.


commit e139d34a1c4b7775d5855458a325e0e4176bdf7e
Author: Jan Beulich 
Date:   Tue Feb 13 18:19:33 2018 +0100

firmware/shim: correctly handle errors during Xen tree setup

"set -e" on a separate Makefile line is meaningless. Glue together all
the lines that this is supposed to cover.

Signed-off-by: Jan Beulich 
Reviewed-by: Roger Pau Monné 
Reviewed-by: Wei Liu 
(qemu changes not included)


[Xen-devel] [xen-unstable-smoke test] 119108: regressions - FAIL

2018-02-13 Thread osstest service owner
flight 119108 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/119108/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64                   6 xen-build                fail REGR. vs. 119098

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-debianhvm-i386  1 build-check(1)             blocked  n/a
 test-amd64-amd64-libvirt                  1 build-check(1)             blocked  n/a
 build-amd64-libvirt                       1 build-check(1)             blocked  n/a
 test-arm64-arm64-xl-xsm                  13 migrate-support-check      fail   never pass
 test-arm64-arm64-xl-xsm                  14 saverestore-support-check  fail   never pass
 test-armhf-armhf-xl                      13 migrate-support-check      fail   never pass
 test-armhf-armhf-xl                      14 saverestore-support-check  fail   never pass

version targeted for testing:
 xen  e139d34a1c4b7775d5855458a325e0e4176bdf7e
baseline version:
 xen  3f491d6873be9caa77f02ad8d98f174f0152b819

Last test of basis   119098  2018-02-13 17:01:30 Z    0 days
Testing same since   119108  2018-02-13 20:01:33 Z    0 days    1 attempts


People who touched revisions under test:
  Jan Beulich 

jobs:
 build-arm64-xsm  pass
 build-amd64  fail
 build-armhf  pass
 build-amd64-libvirt  blocked 
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-i386 blocked 
 test-amd64-amd64-libvirt blocked 



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.


commit e139d34a1c4b7775d5855458a325e0e4176bdf7e
Author: Jan Beulich 
Date:   Tue Feb 13 18:19:33 2018 +0100

firmware/shim: correctly handle errors during Xen tree setup

"set -e" on a separate Makefile line is meaningless. Glue together all
the lines that this is supposed to cover.

Signed-off-by: Jan Beulich 
Reviewed-by: Roger Pau Monné 
Reviewed-by: Wei Liu 
(qemu changes not included)


[Xen-devel] [linux-linus test] 119064: regressions - FAIL

2018-02-13 Thread osstest service owner
flight 119064 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/119064/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-libvirt  7 xen-boot  fail REGR. vs. 118324
 test-amd64-i386-xl-xsm  7 xen-boot  fail REGR. vs. 118324
 test-amd64-i386-xl-qemuu-ovmf-amd64  7 xen-boot  fail REGR. vs. 118324
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm  7 xen-boot  fail REGR. vs. 118324
 test-amd64-i386-xl-qemuu-win10-i386  7 xen-boot  fail REGR. vs. 118324
 test-amd64-i386-freebsd10-amd64  7 xen-boot  fail REGR. vs. 118324
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm  7 xen-boot  fail REGR. vs. 118324
 test-amd64-i386-xl-qemut-win7-amd64  7 xen-boot  fail REGR. vs. 118324
 test-amd64-i386-qemuu-rhel6hvm-amd  7 xen-boot  fail REGR. vs. 118324
 test-amd64-i386-xl-raw  7 xen-boot  fail REGR. vs. 118324
 test-amd64-i386-xl-qemut-debianhvm-amd64  7 xen-boot  fail REGR. vs. 118324
 test-amd64-i386-qemut-rhel6hvm-amd  7 xen-boot  fail REGR. vs. 118324
 test-amd64-amd64-xl-pvhv2-intel  12 guest-start  fail REGR. vs. 118324
 test-amd64-amd64-xl-pvhv2-amd  12 guest-start  fail REGR. vs. 118324
 test-amd64-i386-examine  8 reboot  fail REGR. vs. 118324
 test-amd64-i386-xl-qemut-ws16-amd64  7 xen-boot  fail REGR. vs. 118324
 test-amd64-i386-qemuu-rhel6hvm-intel  7 xen-boot  fail REGR. vs. 118324
 test-amd64-i386-pair  10 xen-boot/src_host  fail REGR. vs. 118324
 test-amd64-i386-libvirt-pair  10 xen-boot/src_host  fail REGR. vs. 118324
 test-amd64-i386-libvirt-pair  11 xen-boot/dst_host  fail REGR. vs. 118324
 test-amd64-i386-pair  11 xen-boot/dst_host  fail REGR. vs. 118324
 test-amd64-i386-qemut-rhel6hvm-intel  7 xen-boot  fail REGR. vs. 118324
 test-amd64-i386-xl-qemut-win10-i386  7 xen-boot  fail REGR. vs. 118324
 test-amd64-i386-rumprun-i386  7 xen-boot  fail REGR. vs. 118324
 test-amd64-i386-xl-qemuu-win7-amd64  7 xen-boot  fail REGR. vs. 118324
 test-amd64-i386-freebsd10-i386  7 xen-boot  fail REGR. vs. 118324
 test-amd64-i386-xl-qemuu-debianhvm-amd64  7 xen-boot  fail REGR. vs. 118324
 test-amd64-i386-libvirt-xsm  7 xen-boot  fail REGR. vs. 118324
 test-amd64-i386-xl  7 xen-boot  fail REGR. vs. 118324
 test-amd64-i386-xl-qemuu-ws16-amd64  7 xen-boot  fail REGR. vs. 118324
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm  7 xen-boot  fail REGR. vs. 118324
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm  7 xen-boot  fail REGR. vs. 118324

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt-xsm  14 saverestore-support-check  fail  like 118324
 test-armhf-armhf-libvirt  14 saverestore-support-check  fail  like 118324
 test-amd64-amd64-xl-qemut-win7-amd64  17 guest-stop  fail  like 118324
 test-amd64-amd64-xl-qemuu-win7-amd64  17 guest-stop  fail  like 118324
 test-amd64-amd64-xl-qemut-ws16-amd64  17 guest-stop  fail  like 118324
 test-amd64-amd64-xl-qemuu-ws16-amd64  17 guest-stop  fail  like 118324
 test-armhf-armhf-libvirt-raw  13 saverestore-support-check  fail  like 118324
 test-amd64-amd64-libvirt-xsm  13 migrate-support-check  fail  never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-check  fail  never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-check  fail  never pass
 test-arm64-arm64-xl  13 migrate-support-check  fail  never pass
 test-arm64-arm64-xl  14 saverestore-support-check  fail  never pass
 test-amd64-amd64-libvirt  13 migrate-support-check  fail  never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm  11 migrate-support-check  fail  never pass
 test-amd64-amd64-qemuu-nested-amd  17 debian-hvm-install/l1/l2  fail  never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-check  fail  never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-check  fail  never pass
 test-amd64-amd64-libvirt-vhd  12 migrate-support-check  fail  never pass
 test-armhf-armhf-xl  13 migrate-support-check  fail  never pass
 test-armhf-armhf-xl  14 saverestore-support-check  fail  never pass
 test-armhf-armhf-xl-cubietruck  13 migrate-support-check  fail  never pass
 test-armhf-armhf-xl-cubietruck  14 saverestore-support-check  fail  never pass
 test-armhf-armhf-libvirt-xsm  13 migrate-support-check  fail  never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-check  fail  never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-check  fail  never pass
 test-armhf-armhf-xl-rtds  13 migrate-support-check  fail  never pass
 test-armhf-armhf-xl-rtds  14 saverestore-support-check  fail  never pass
 

Re: [Xen-devel] [PATCH] xen/arm: Park CPUs with a MIDR different from the boot CPU.

2018-02-13 Thread Stefano Stabellini
On Tue, 13 Feb 2018, Julien Grall wrote:
> On 02/09/2018 07:12 PM, Julien Grall wrote:
> > Hi,
> > 
> > On 02/09/2018 07:10 PM, Stefano Stabellini wrote:
> > > On Fri, 9 Feb 2018, Julien Grall wrote:
> > > > On 02/09/2018 07:02 PM, Stefano Stabellini wrote:
> > > > > On Fri, 9 Feb 2018, Julien Grall wrote:
> > > > > > Hi,
> > > > > > 
> > > > > > On 02/08/2018 11:49 PM, Stefano Stabellini wrote:
> > > > > > > On Thu, 1 Feb 2018, Julien Grall wrote:
> > > > > > > > > On 1 February 2018 at 19:37, Stefano Stabellini wrote:
> > > > > > > > > On Tue, 30 Jan 2018, Julien Grall wrote:
> > > > > > > > > > > Xen does not properly support big.LITTLE platform. All vCPUs of a guest
> > > > > > > > > > > will always have the MIDR of the boot CPU (see arch_domain_create).
> > > > > > > > > > > At best the guest may see unreliable performance (vCPU switching between
> > > > > > > > > > > big and LITTLE), at worst the guest will become unreliable or insecure.
> > > > > > > > > > > 
> > > > > > > > > > > This is becoming more apparent with branch predictor hardening in Linux
> > > > > > > > > > > because they target a specific kind of CPUs and may not work on other
> > > > > > > > > > > CPUs.
> > > > > > > > > > > 
> > > > > > > > > > > For the time being, park any CPUs with a MIDR different from the boot
> > > > > > > > > > > CPU. This will be revisited in the future once Xen gains understanding
> > > > > > > > > > > of big.LITTLE.
> > > > > > > > > > 
> > > > > > > > > > [1]
> > > > > > > > > > https://lists.xenproject.org/archives/html/xen-devel/2016-12/msg00826.html
> > > > > > > > > >  
> > > > > > > > > > 
> > > > > > > > > > Signed-off-by: Julien Grall 
> > > > > > > > > > 
> > > > > > > > > > ---
> > > > > > > > > > 
> > > > > > > > > > > We probably want to backport this as part of XSA-254. Using big.LITTLE
> > > > > > > > > > > on Xen has never been supported but we didn't make it clear. This is
> > > > > > > > > > > becoming more apparent with code targeting specific CPUs.
> > > > > > > > > > ---
> > > > > > > > > >     xen/arch/arm/smpboot.c | 15 +++
> > > > > > > > > >     1 file changed, 15 insertions(+)
> > > > > > > > > > 
> > > > > > > > > > diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
> > > > > > > > > > index 1255185a9c..2c2815f9ee 100644
> > > > > > > > > > --- a/xen/arch/arm/smpboot.c
> > > > > > > > > > +++ b/xen/arch/arm/smpboot.c
> > > > > > > > > > @@ -292,6 +292,21 @@ void start_secondary(unsigned long
> > > > > > > > > > boot_phys_offset,
> > > > > > > > > > 
> > > > > > > > > >     init_traps();
> > > > > > > > > > 
> > > > > > > > > > +    /*
> > > > > > > > > > + * Currently Xen assumes the platform has only one kind
> > > > > > > > > > of
> > > > > > > > > > CPUs.
> > > > > > > > > > + * This assumption does not hold on big.LITTLE platform
> > > > > > > > > > and
> > > > > > > > > > may
> > > > > > > > > > + * result to unstability. Better to park them for now.
> > > > > > > > > > + *
> > > > > > > > > > + * TODO: Add big.LITTLE support.
> > > > > > > > > > + */
> > > > > > > > > > +    if ( current_cpu_data.midr.bits !=
> > > > > > > > > > boot_cpu_data.midr.bits )
> > > > > > > > > > +    {
> > > > > > > > > > +    printk(XENLOG_ERR "CPU%u MIDR (0x%x) does not match
> > > > > > > > > > boot
> > > > > > > > > > CPU
> > > > > > > > > > MIDR (0x%x).\n",
> > > > > > > > > > +   smp_processor_id(),
> > > > > > > > > > current_cpu_data.midr.bits,
> > > > > > > > > > +   boot_cpu_data.midr.bits);
> > > > > > > > > > +    stop_cpu();
> > > > > > > > > > +    }
> > > > > > > > > 
> > > > > > > > > > I understand that this patch is the right thing to do from a correctness
> > > > > > > > > > perspective, especially in regards to the SP2 mitigation.
> > > > > > > > > > 
> > > > > > > > > > At the same time I would also like to give the option for people that
> > > > > > > > > > want to use big.LITTLE with cpupools / cpu pinning to do so if they
> > > > > > > > > > really want to, but I am not sure what to suggest.
> > > > > > > > > > 
> > > > > > > > > > Could we introduce a command line to proceed anyway? But then the system
> > > > > > > > > > would be susceptible to SP2 in the cpus different from the boot cpu.
> > > > > > > > > > Could we make the SP2 mitigation work on big.LITTLE or is it too much
> > > > > > > > > > trouble? Do you have any other ideas or thoughts about this?
> > > > > > > 

[Xen-devel] [PATCH v6 2/9] x86/mm: move disallow masks to pv/mm.h

2018-02-13 Thread Wei Liu
This is in preparation for adding an extra parameter to
get_page_from_l1e for disallow mask.

No functional change.

Signed-off-by: Wei Liu 
---
 xen/arch/x86/mm.c| 19 +--
 xen/arch/x86/pv/mm.h | 19 +++
 2 files changed, 20 insertions(+), 18 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 97ec467002..bfa0a6436c 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -153,24 +153,7 @@ bool __read_mostly machine_to_phys_mapping_valid;
 
 struct rangeset *__read_mostly mmio_ro_ranges;
 
-static uint32_t base_disallow_mask;
-/* Global bit is allowed to be set on L1 PTEs. Intended for user mappings. */
-#define L1_DISALLOW_MASK ((base_disallow_mask | _PAGE_GNTTAB) & ~_PAGE_GLOBAL)
-
-#define L2_DISALLOW_MASK base_disallow_mask
-
-#define l3_disallow_mask(d) (!is_pv_32bit_domain(d) ? \
- base_disallow_mask : 0xF198U)
-
-#define L4_DISALLOW_MASK (base_disallow_mask)
-
-#define l1_disallow_mask(d) \
-((d != dom_io) &&   \
- (rangeset_is_empty((d)->iomem_caps) && \
-  rangeset_is_empty((d)->arch.ioport_caps) &&   \
-  !has_arch_pdevs(d) && \
-  is_pv_domain(d)) ?\
- L1_DISALLOW_MASK : (L1_DISALLOW_MASK & ~PAGE_CACHE_ATTRS))
+uint32_t base_disallow_mask;
 
 static s8 __read_mostly opt_mmio_relax;
 
diff --git a/xen/arch/x86/pv/mm.h b/xen/arch/x86/pv/mm.h
index 976209ba4c..84ca71bd08 100644
--- a/xen/arch/x86/pv/mm.h
+++ b/xen/arch/x86/pv/mm.h
@@ -1,6 +1,25 @@
 #ifndef __PV_MM_H__
 #define __PV_MM_H__
 
+extern uint32_t base_disallow_mask;
+/* Global bit is allowed to be set on L1 PTEs. Intended for user mappings. */
+#define L1_DISALLOW_MASK ((base_disallow_mask | _PAGE_GNTTAB) & ~_PAGE_GLOBAL)
+
+#define L2_DISALLOW_MASK base_disallow_mask
+
+#define l3_disallow_mask(d) (!is_pv_32bit_domain(d) ? \
+ base_disallow_mask : 0xF198U)
+
+#define L4_DISALLOW_MASK (base_disallow_mask)
+
+#define l1_disallow_mask(d) \
+((d != dom_io) &&   \
+ (rangeset_is_empty((d)->iomem_caps) && \
+  rangeset_is_empty((d)->arch.ioport_caps) &&   \
+  !has_arch_pdevs(d) && \
+  is_pv_domain(d)) ?\
+ L1_DISALLOW_MASK : (L1_DISALLOW_MASK & ~PAGE_CACHE_ATTRS))
+
 l1_pgentry_t *map_guest_l1e(unsigned long linear, mfn_t *gl1mfn);
 
 int new_guest_cr3(mfn_t mfn);
-- 
2.11.0
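
As a reading aid for the macros being moved, here is a minimal standalone
sketch of how the L1 mask is composed. The SK_* names and the helper are
hypothetical; the PWT/PCD/PAT/GLOBAL values are the architectural x86 PTE
bits, while SK_PAGE_GNTTAB is only a stand-in (assumed here to be one of the
software-available bits) for Xen's _PAGE_GNTTAB. The boolean parameter
collapses the domain checks done by l1_disallow_mask(d) into one flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Architectural x86 PTE bits (names are local to this sketch). */
#define SK_PAGE_PWT     0x008u
#define SK_PAGE_PCD     0x010u
#define SK_PAGE_PAT     0x080u
#define SK_PAGE_GLOBAL  0x100u
/* Assumption: stand-in value for Xen's software-defined grant-table bit. */
#define SK_PAGE_GNTTAB  0x800u

#define SK_PAGE_CACHE_ATTRS (SK_PAGE_PWT | SK_PAGE_PCD | SK_PAGE_PAT)

/*
 * Compose the L1 disallow mask from a base mask:
 *  - the grant-table bit is always disallowed,
 *  - the global bit is always allowed on L1 entries,
 *  - cache attributes are allowed only when the caller says the domain
 *    may use them (e.g. it has I/O resources assigned).
 */
static uint32_t sk_l1_disallow_mask(uint32_t base, bool may_set_cache_attrs)
{
    uint32_t mask = (base | SK_PAGE_GNTTAB) & ~SK_PAGE_GLOBAL;

    if ( may_set_cache_attrs )
        mask &= ~SK_PAGE_CACHE_ATTRS;

    return mask;
}
```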



[Xen-devel] [PATCH v6 3/9] x86/mm: add disallow_mask parameter to get_page_from_l1e

2018-02-13 Thread Wei Liu
This will make moving pv mm code easier. To retain same behaviour the
base mask is copied to shadow code.

No functional change.

Signed-off-by: Wei Liu 
---
 xen/arch/x86/mm.c   | 13 +++--
 xen/arch/x86/mm/shadow/multi.c  | 15 ---
 xen/arch/x86/pv/ro-page-fault.c |  2 +-
 xen/include/asm-x86/mm.h|  3 ++-
 4 files changed, 22 insertions(+), 11 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index bfa0a6436c..53212bcce3 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -831,7 +831,8 @@ static int print_mmio_emul_range(unsigned long s, unsigned long e, void *arg)
  */
 int
 get_page_from_l1e(
-l1_pgentry_t l1e, struct domain *l1e_owner, struct domain *pg_owner)
+l1_pgentry_t l1e, struct domain *l1e_owner, struct domain *pg_owner,
+uint32_t disallow_mask)
 {
 unsigned long mfn = l1e_get_pfn(l1e);
 struct page_info *page = mfn_to_page(_mfn(mfn));
@@ -843,10 +844,9 @@ get_page_from_l1e(
 if ( !(l1f & _PAGE_PRESENT) )
 return 0;
 
-if ( unlikely(l1f & l1_disallow_mask(l1e_owner)) )
+if ( unlikely(l1f & disallow_mask) )
 {
-gdprintk(XENLOG_WARNING, "Bad L1 flags %x\n",
- l1f & l1_disallow_mask(l1e_owner));
+gdprintk(XENLOG_WARNING, "Bad L1 flags %x\n", l1f & disallow_mask);
 return -EINVAL;
 }
 
@@ -1318,7 +1318,7 @@ static int alloc_l1_table(struct page_info *page)
 
 for ( i = 0; i < L1_PAGETABLE_ENTRIES; i++ )
 {
-switch ( ret = get_page_from_l1e(pl1e[i], d, d) )
+switch ( ret = get_page_from_l1e(pl1e[i], d, d, l1_disallow_mask(d)) )
 {
 default:
 goto fail;
@@ -1957,7 +1957,8 @@ static int mod_l1_entry(l1_pgentry_t *pl1e, l1_pgentry_t nl1e,
 return rc ? 0 : -EBUSY;
 }
 
-switch ( rc = get_page_from_l1e(nl1e, pt_dom, pg_dom) )
+switch ( rc = get_page_from_l1e(nl1e, pt_dom, pg_dom,
+l1_disallow_mask(pt_dom)) )
 {
 default:
 if ( page )
diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index a6372e3a02..02c2198c9b 100644
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -858,13 +858,21 @@ shadow_get_page_from_l1e(shadow_l1e_t sl1e, struct domain *d, p2m_type_t type)
 int res;
 mfn_t mfn;
 struct domain *owner;
+/* The disallow mask is taken from arch/x86/mm.c for HVM guest */
+uint32_t disallow_mask =
+~(_PAGE_PRESENT | _PAGE_RW | _PAGE_USER | _PAGE_ACCESSED |
+  _PAGE_DIRTY | _PAGE_AVAIL | _PAGE_AVAIL_HIGH | _PAGE_NX);
 
+disallow_mask = (disallow_mask | _PAGE_GNTTAB) & ~_PAGE_GLOBAL;
+disallow_mask &= ~PAGE_CACHE_ATTRS;
+
+ASSERT(is_hvm_domain(d));
 ASSERT(!sh_l1e_is_magic(sl1e));
 
 if ( !shadow_mode_refcounts(d) )
 return 1;
 
-res = get_page_from_l1e(sl1e, d, d);
+res = get_page_from_l1e(sl1e, d, d, disallow_mask);
 
 // If a privileged domain is attempting to install a map of a page it does
 // not own, we let it succeed anyway.
@@ -877,7 +885,7 @@ shadow_get_page_from_l1e(shadow_l1e_t sl1e, struct domain *d, p2m_type_t type)
 {
 res = xsm_priv_mapping(XSM_TARGET, d, owner);
 if ( !res ) {
-res = get_page_from_l1e(sl1e, d, owner);
+res = get_page_from_l1e(sl1e, d, owner, disallow_mask);
 SHADOW_PRINTK("privileged domain %d installs map of mfn %"PRI_mfn" "
"which is owned by d%d: %s\n",
d->domain_id, mfn_x(mfn), owner->domain_id,
@@ -896,7 +904,8 @@ shadow_get_page_from_l1e(shadow_l1e_t sl1e, struct domain *d, p2m_type_t type)
we can just grab a reference directly. */
 mfn = shadow_l1e_get_mfn(sl1e);
 if ( mfn_valid(mfn) )
-res = get_page_from_l1e(sl1e, d, page_get_owner(mfn_to_page(mfn)));
+res = get_page_from_l1e(sl1e, d, page_get_owner(mfn_to_page(mfn)),
+disallow_mask);
 }
 
 if ( unlikely(res < 0) )
diff --git a/xen/arch/x86/pv/ro-page-fault.c b/xen/arch/x86/pv/ro-page-fault.c
index 7e0e7e8dfc..04b4e455f5 100644
--- a/xen/arch/x86/pv/ro-page-fault.c
+++ b/xen/arch/x86/pv/ro-page-fault.c
@@ -127,7 +127,7 @@ static int ptwr_emulated_update(unsigned long addr, paddr_t old, paddr_t val,
 
 /* Check the new PTE. */
 nl1e = l1e_from_intpte(val);
-switch ( ret = get_page_from_l1e(nl1e, d, d) )
+switch ( ret = get_page_from_l1e(nl1e, d, d, l1_disallow_mask(d)) )
 {
 default:
 if ( is_pv_32bit_domain(d) && (bytes == 4) && (unaligned_addr & 4) &&
diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
index 741c98575e..dca1831382 100644
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -376,7 +376,8 @@ int  put_page_type_preemptible(struct page_info *page);
 int  
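
The shape of the change can be illustrated in isolation: the flag check no
longer derives the mask from the owning domain, it simply tests against
whatever mask the caller passes in. The sketch below is a simplification
under stated assumptions; sk_check_l1_flags() is a hypothetical name and it
only models the two early exits visible in the hunk above (not-present
entries are accepted without inspection, disallowed bits yield -EINVAL).

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define SK_PAGE_PRESENT 0x001u  /* architectural x86 "present" bit */

/*
 * Hypothetical reduction of the flag validation: the caller chooses the
 * policy by supplying disallow_mask, so PV and shadow code can each pass
 * their own mask without this function knowing about domains at all.
 */
static int sk_check_l1_flags(uint32_t l1f, uint32_t disallow_mask)
{
    if ( !(l1f & SK_PAGE_PRESENT) )
        return 0;           /* nothing to validate */

    if ( l1f & disallow_mask )
        return -EINVAL;     /* at least one forbidden bit is set */

    return 0;
}
```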

[Xen-devel] [PATCH v6 1/9] x86/mm: add pv prefix to {alloc, free}_page_type

2018-02-13 Thread Wei Liu
The two functions are only used by PV code paths because:

1. To allocate a PGT_l*_page_table type page, a DomU must explicitly
   request such types via PV MMU hypercall.
2. PV Dom0 builder explicitly asks for PGT_l*_page_table type pages,
   but it is obviously PV only.
3. p2m_alloc_ptp explicitly sets PGT_l1_page_table, but the allocation
   and deallocation of such pages don't go through the two functions
   touched in this patch.
4. shadow_enable explicitly sets PGT_l2_page_table, but the allocation
   and deallocation of such pages don't go through the two functions
   touched in this patch.

Also move the declarations to pv/mm.h. The code will be moved later.
Take the chance to change preemptible to bool.

Signed-off-by: Wei Liu 
---
 xen/arch/x86/domain.c   |  2 +-
 xen/arch/x86/mm.c   | 14 +++---
 xen/include/asm-x86/mm.h|  3 ---
 xen/include/asm-x86/pv/mm.h | 11 +++
 4 files changed, 19 insertions(+), 11 deletions(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index f93327b0a2..bc80e4f90e 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -1905,7 +1905,7 @@ static int relinquish_memory(
 if ( likely(y == x) )
 {
 /* No need for atomic update of type_info here: noone else updates it. */
-switch ( ret = free_page_type(page, x, 1) )
+switch ( ret = pv_free_page_type(page, x, true) )
 {
 case 0:
 break;
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 35f204369b..97ec467002 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -2301,8 +2301,8 @@ static void get_page_light(struct page_info *page)
 while ( unlikely(y != x) );
 }
 
-static int alloc_page_type(struct page_info *page, unsigned long type,
-   int preemptible)
+int pv_alloc_page_type(struct page_info *page, unsigned long type,
+   bool preemptible)
 {
 struct domain *owner = page_get_owner(page);
 int rc;
@@ -2331,7 +2331,7 @@ static int alloc_page_type(struct page_info *page, unsigned long type,
 rc = alloc_segdesc_page(page);
 break;
 default:
-printk("Bad type in alloc_page_type %lx t=%" PRtype_info " c=%lx\n",
+printk("Bad type in %s %lx t=%" PRtype_info " c=%lx\n", __func__,
type, page->u.inuse.type_info,
page->count_info);
 rc = -EINVAL;
@@ -2375,8 +2375,8 @@ static int alloc_page_type(struct page_info *page, unsigned long type,
 }
 
 
-int free_page_type(struct page_info *page, unsigned long type,
-   int preemptible)
+int pv_free_page_type(struct page_info *page, unsigned long type,
+  bool preemptible)
 {
 struct domain *owner = page_get_owner(page);
 unsigned long gmfn;
@@ -2433,7 +2433,7 @@ int free_page_type(struct page_info *page, unsigned long type,
 static int _put_final_page_type(struct page_info *page, unsigned long type,
 bool preemptible, struct page_info *ptpg)
 {
-int rc = free_page_type(page, type, preemptible);
+int rc = pv_free_page_type(page, type, preemptible);
 
 /* No need for atomic update of type_info here: noone else updates it. */
 if ( rc == 0 )
@@ -2695,7 +2695,7 @@ static int _get_page_type(struct page_info *page, unsigned long type,
 page->partial_pte = 0;
 }
 page->linear_pt_count = 0;
-rc = alloc_page_type(page, type, preemptible);
+rc = pv_alloc_page_type(page, type, preemptible);
 }
 
 if ( (x & PGT_partial) && !(nx & PGT_partial) )
diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
index 3013c266fe..741c98575e 100644
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -338,9 +338,6 @@ static inline void *__page_to_virt(const struct page_info *pg)
 (PAGE_SIZE / (sizeof(*pg) & -sizeof(*pg))));
 }
 
-int free_page_type(struct page_info *page, unsigned long type,
-   int preemptible);
-
 void init_xen_pae_l2_slots(l2_pgentry_t *l2t, const struct domain *d);
 void init_xen_l4_slots(l4_pgentry_t *l4t, mfn_t l4mfn,
const struct domain *d, mfn_t sl4mfn, bool ro_mpt);
diff --git a/xen/include/asm-x86/pv/mm.h b/xen/include/asm-x86/pv/mm.h
index 246b99014c..abf798b541 100644
--- a/xen/include/asm-x86/pv/mm.h
+++ b/xen/include/asm-x86/pv/mm.h
@@ -31,6 +31,10 @@ void pv_destroy_gdt(struct vcpu *v);
 bool pv_map_ldt_shadow_page(unsigned int off);
 bool pv_destroy_ldt(struct vcpu *v);
 
+int pv_alloc_page_type(struct page_info *page, unsigned long type,
+   bool preemptible);
+int pv_free_page_type(struct page_info *page, unsigned long type,
+  bool preemptible);
 #else
 
 #include 
@@ -52,6 +56,13 @@ static inline bool pv_map_ldt_shadow_page(unsigned int off) { return false; }
 static inline 
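
One detail of this patch worth illustrating is the switch from a hard-coded
function name in the diagnostic to __func__: after a rename the message
stays correct without anyone having to touch the format string. A minimal
sketch, with a hypothetical function that formats into a buffer instead of
calling printk:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for the error path: the function name in the
 * message comes from __func__, so renaming the function (as the patch
 * does with the pv_ prefix) cannot leave a stale name in the log text.
 */
static int sk_pv_alloc_page_type(unsigned long type, char *buf, size_t len)
{
    return snprintf(buf, len, "Bad type in %s %lx", __func__, type);
}
```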

[Xen-devel] [PATCH v6 6/9] x86/mm: export set_tlbflush_timestamp

2018-02-13 Thread Wei Liu
The function will skip stamping the page when the page is used as page
table in shadow mode. Since it is called both in PV code and common
code we need to export it.

Signed-off-by: Wei Liu 
---
I tried to move it to a header to keep it static inline but couldn't
find a place that works.

This function depends on asm/flushtlb.h and asm/shadow.h;
asm/flushtlb.h depends on xen/mm.h; xen/mm.h depends on asm/mm.h.

The best location would be asm/mm.h, but that creates a circular
dependency.

Moving it to flushtlb.h (and include shadow.h there) breaks
compilation of other files that include flushtlb.h.
---
 xen/arch/x86/mm.c| 2 +-
 xen/include/asm-x86/mm.h | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index e0dfa58f95..db6b703c56 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -521,7 +521,7 @@ void update_cr3(struct vcpu *v)
 make_cr3(v, cr3_mfn);
 }
 
-static inline void set_tlbflush_timestamp(struct page_info *page)
+void set_tlbflush_timestamp(struct page_info *page)
 {
 /*
  * Record TLB information for flush later. We do not stamp page tables
diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
index dca1831382..f6399f531b 100644
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -628,4 +628,6 @@ static inline bool arch_mfn_in_directmap(unsigned long mfn)
 return mfn <= (virt_to_mfn(eva - 1) + 1);
 }
 
+void set_tlbflush_timestamp(struct page_info *page);
+
 #endif /* __ASM_X86_MM_H__ */
-- 
2.11.0



[Xen-devel] [PATCH v6 5/9] x86/mm: factor out pv_dec_linear_pt

2018-02-13 Thread Wei Liu
Linear page table is a PV only feature. The functions used to handle
that will be moved.

Create a function for decreasing linear page table count. It is called
unconditionally from common code so the stub is empty.

No functional change.

Signed-off-by: Wei Liu 
---
 xen/arch/x86/mm.c   | 25 ++---
 xen/include/asm-x86/pv/mm.h |  5 +
 2 files changed, 19 insertions(+), 11 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 64950354f4..e0dfa58f95 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -2413,6 +2413,18 @@ int pv_free_page_type(struct page_info *page, unsigned long type,
 return rc;
 }
 
+void pv_dec_linear_pt(struct page_info *ptpg, struct page_info *page,
+  unsigned long type)
+{
+if ( ptpg && PGT_type_equal(type, ptpg->u.inuse.type_info) )
+{
+ASSERT(is_pv_domain(page_get_owner(page)));
+ASSERT(is_pv_domain(page_get_owner(ptpg)));
+
+dec_linear_uses(page);
+dec_linear_entries(ptpg);
+}
+}
 
 int pv_put_final_page_type(struct page_info *page, unsigned long type,
bool preemptible, struct page_info *ptpg)
@@ -2422,11 +2434,7 @@ int pv_put_final_page_type(struct page_info *page, unsigned long type,
 /* No need for atomic update of type_info here: noone else updates it. */
 if ( rc == 0 )
 {
-if ( ptpg && PGT_type_equal(type, ptpg->u.inuse.type_info) )
-{
-dec_linear_uses(page);
-dec_linear_entries(ptpg);
-}
+pv_dec_linear_pt(ptpg, page, type);
 ASSERT(!page->linear_pt_count || page_get_owner(page)->is_dying);
 set_tlbflush_timestamp(page);
 smp_wmb();
@@ -2450,7 +2458,6 @@ int pv_put_final_page_type(struct page_info *page, unsigned long type,
 return rc;
 }
 
-
 static int _put_page_type(struct page_info *page, bool preemptible,
   struct page_info *ptpg)
 {
@@ -2533,11 +2540,7 @@ static int _put_page_type(struct page_info *page, bool preemptible,
 return -EINTR;
 }
 
-if ( ptpg && PGT_type_equal(x, ptpg->u.inuse.type_info) )
-{
-dec_linear_uses(page);
-dec_linear_entries(ptpg);
-}
+pv_dec_linear_pt(ptpg, page, x);
 
 return 0;
 }
diff --git a/xen/include/asm-x86/pv/mm.h b/xen/include/asm-x86/pv/mm.h
index ff9089ec19..08167aa2fd 100644
--- a/xen/include/asm-x86/pv/mm.h
+++ b/xen/include/asm-x86/pv/mm.h
@@ -37,6 +37,8 @@ int pv_free_page_type(struct page_info *page, unsigned long type,
   bool preemptible);
 int pv_put_final_page_type(struct page_info *page, unsigned long type,
bool preemptible, struct page_info *ptpg);
+void pv_dec_linear_pt(struct page_info *ptpg, struct page_info *page,
+  unsigned long type);
 #else
 
 #include 
@@ -70,6 +72,9 @@ static inline int pv_put_final_page_type(struct page_info *page,
  struct page_info *ptpg)
 { ASSERT_UNREACHABLE(); return -EINVAL; }
 
+static inline void pv_dec_linear_pt(struct page_info *ptpg, struct page_info *page,
+unsigned long type) {}
+
 #endif
 
 #endif /* __X86_PV_MM_H__ */
-- 
2.11.0
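
The pattern used here (a real implementation when PV support is built in,
an empty inline stub otherwise, so common code can call it unconditionally)
can be sketched on its own. Everything below is hypothetical: the struct,
the SK_CONFIG_PV switch and the two separate counters are simplifications
for illustration and do not reflect how Xen actually stores its linear page
table accounting.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a page descriptor (not Xen's struct page_info). */
struct sk_page {
    unsigned long type;
    unsigned int linear_uses;     /* times this page is used linearly */
    unsigned int linear_entries;  /* linear entries held by this table */
};

#define SK_CONFIG_PV 1  /* flip to 0 to build the empty stub instead */

#if SK_CONFIG_PV
/* Drop one linear reference if the parent table has the same type. */
static void sk_dec_linear_pt(struct sk_page *ptpg, struct sk_page *page,
                             unsigned long type)
{
    if ( ptpg && ptpg->type == type )
    {
        assert(page->linear_uses > 0 && ptpg->linear_entries > 0);
        page->linear_uses--;
        ptpg->linear_entries--;
    }
}
#else
/* Without PV there is no linear page table accounting to undo. */
static inline void sk_dec_linear_pt(struct sk_page *ptpg,
                                    struct sk_page *page,
                                    unsigned long type)
{
    (void)ptpg; (void)page; (void)type;
}
#endif
```

Callers pass a NULL parent when the page was not referenced from another
page table, in which case nothing is decremented.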



[Xen-devel] [PATCH v6 7/9] x86/mm: provide put_page_type_ptpg{, _preemptible}

2018-02-13 Thread Wei Liu
And replace open-coded _put_page_type where the parent table parameter
is not null.

This is in preparation for code movement in which various
put_page_from_lNe will be moved to pv/mm.c.

Signed-off-by: Wei Liu 
---
 xen/arch/x86/mm.c| 28 ++--
 xen/include/asm-x86/mm.h |  2 ++
 2 files changed, 20 insertions(+), 10 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index db6b703c56..e004350e83 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -1137,9 +1137,6 @@ get_page_from_l4e(
 return rc;
 }
 
-static int _put_page_type(struct page_info *page, bool preemptible,
-  struct page_info *ptpg);
-
 void put_page_from_l1e(l1_pgentry_t l1e, struct domain *l1e_owner)
 {
 unsigned long pfn = l1e_get_pfn(l1e);
@@ -1223,7 +1220,7 @@ static int put_page_from_l2e(l2_pgentry_t l2e, unsigned long pfn)
 else
 {
 struct page_info *pg = l2e_get_page(l2e);
-int rc = _put_page_type(pg, false, mfn_to_page(_mfn(pfn)));
+int rc = put_page_type_ptpg(pg, mfn_to_page(_mfn(pfn)));
 
 ASSERT(!rc);
 put_page(pg);
@@ -1259,7 +1256,7 @@ static int put_page_from_l3e(l3_pgentry_t l3e, unsigned long pfn,
 if ( unlikely(partial > 0) )
 {
 ASSERT(!defer);
-return _put_page_type(pg, true, mfn_to_page(_mfn(pfn)));
+return put_page_type_ptpg_preemptible(pg, mfn_to_page(_mfn(pfn)));
 }
 
 if ( defer )
@@ -1269,7 +1266,7 @@ static int put_page_from_l3e(l3_pgentry_t l3e, unsigned long pfn,
 return 0;
 }
 
-rc = _put_page_type(pg, true, mfn_to_page(_mfn(pfn)));
+rc = put_page_type_ptpg_preemptible(pg, mfn_to_page(_mfn(pfn)));
 if ( likely(!rc) )
 put_page(pg);
 
@@ -1289,7 +1286,7 @@ static int put_page_from_l4e(l4_pgentry_t l4e, unsigned long pfn,
 if ( unlikely(partial > 0) )
 {
 ASSERT(!defer);
-return _put_page_type(pg, true, mfn_to_page(_mfn(pfn)));
+return put_page_type_ptpg_preemptible(pg, mfn_to_page(_mfn(pfn)));
 }
 
 if ( defer )
@@ -1299,7 +1296,7 @@ static int put_page_from_l4e(l4_pgentry_t l4e, unsigned long pfn,
 return 0;
 }
 
-rc = _put_page_type(pg, true, mfn_to_page(_mfn(pfn)));
+rc = put_page_type_ptpg_preemptible(pg, mfn_to_page(_mfn(pfn)));
 if ( likely(!rc) )
 put_page(pg);
 }
@@ -2722,6 +2719,17 @@ int put_page_type_preemptible(struct page_info *page)
 return _put_page_type(page, true, NULL);
 }
 
+int put_page_type_ptpg_preemptible(struct page_info *page,
+   struct page_info *ptpg)
+{
+return _put_page_type(page, true, ptpg);
+}
+
+int put_page_type_ptpg(struct page_info *page, struct page_info *ptpg)
+{
+return _put_page_type(page, false, ptpg);
+}
+
 int get_page_type_preemptible(struct page_info *page, unsigned long type)
 {
 ASSERT(!current->arch.old_guest_table);
@@ -2736,8 +2744,8 @@ int put_old_guest_table(struct vcpu *v)
 if ( !v->arch.old_guest_table )
 return 0;
 
-switch ( rc = _put_page_type(v->arch.old_guest_table, true,
- v->arch.old_guest_ptpg) )
+switch ( rc = put_page_type_ptpg_preemptible(v->arch.old_guest_table,
+ v->arch.old_guest_ptpg) )
 {
 case -EINTR:
 case -ERESTART:
diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
index f6399f531b..9f30b37d29 100644
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -371,8 +371,10 @@ int page_lock(struct page_info *page);
 void page_unlock(struct page_info *page);
 
 void put_page_type(struct page_info *page);
+int  put_page_type_ptpg(struct page_info *page, struct page_info *ptpg);
 int  get_page_type(struct page_info *page, unsigned long type);
 int  put_page_type_preemptible(struct page_info *page);
+int  put_page_type_ptpg_preemptible(struct page_info *page, struct page_info *ptpg);
 int  get_page_type_preemptible(struct page_info *page, unsigned long type);
 int  put_old_guest_table(struct vcpu *);
 int  get_page_from_l1e(
-- 
2.11.0
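
The two new entry points are thin wrappers that fix the boolean argument of
the internal helper, which keeps that helper private to mm.c while letting
code in other files name the behaviour it wants. A standalone sketch of the
idea, with hypothetical sk_* names and a dummy helper that merely records
its arguments so the wrappers can be observed:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sk_page { int id; };  /* placeholder, not Xen's struct page_info */

/* Arguments seen by the most recent call of the internal helper. */
static bool sk_last_preemptible;
static const struct sk_page *sk_last_ptpg;

/* Stand-in for the file-local worker that takes the full parameter set. */
static int sk_put_page_type_impl(struct sk_page *page, bool preemptible,
                                 struct sk_page *ptpg)
{
    (void)page;
    sk_last_preemptible = preemptible;
    sk_last_ptpg = ptpg;
    return 0;
}

/* Non-preemptible variant with a parent page table. */
static int sk_put_page_type_ptpg(struct sk_page *page, struct sk_page *ptpg)
{
    return sk_put_page_type_impl(page, false, ptpg);
}

/* Preemptible variant with a parent page table. */
static int sk_put_page_type_ptpg_preemptible(struct sk_page *page,
                                             struct sk_page *ptpg)
{
    return sk_put_page_type_impl(page, true, ptpg);
}
```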



[Xen-devel] [PATCH v6 9/9] x86/mm: remove now unused inclusion of pv/mm.h

2018-02-13 Thread Wei Liu
Signed-off-by: Wei Liu 
---
 xen/arch/x86/mm.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 0b5fd199a4..97d2ea17fb 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -128,8 +128,6 @@
 #include 
 #include 
 
-#include "pv/mm.h"
-
 /* Override macros from asm/page.h to make them work with mfn_t */
 #undef mfn_to_page
 #define mfn_to_page(mfn) __mfn_to_page(mfn_x(mfn))
-- 
2.11.0



[Xen-devel] [PATCH v6 0/9] x86: refactor mm.c

2018-02-13 Thread Wei Liu
Hello

This series can be found at:
   https://xenbits.xen.org/git-http/people/liuw/xen.git wip.split-mm-v6.1

Unfortunately there isn't any resemblance to v5 because a lot of things
have changed since Sept last year. And the opinions gathered at the time
would make this version more or less a complete rewrite anyway.

Even after moving more than 2000 lines of code, there is still room for
improvement. But that requires further rewriting of some of the common code
(not limited to x86), so that's a task for another day.

Wei.

Wei Liu (9):
  x86/mm: add pv prefix to {alloc,free}_page_type
  x86/mm: move disallow masks to pv/mm.h
  x86/mm: add disallow_mask parameter to get_page_from_l1e
  x86/mm: add pv prefix to _put_final_page_type
  x86/mm: factor out pv_dec_linear_pt
  x86/mm: export set_tlbflush_timestamp
  x86/mm: provide put_page_type_ptpg{,_preemptible}
  x86/mm: move PV code to pv/mm.c
  x86/mm: remove now unused inclusion of pv/mm.h

 xen/arch/x86/domain.c   |2 +-
 xen/arch/x86/mm.c   | 2883 +++
 xen/arch/x86/mm/shadow/multi.c  |   15 +-
 xen/arch/x86/pv/mm.c| 2452 +
 xen/arch/x86/pv/mm.h|   19 +
 xen/arch/x86/pv/ro-page-fault.c |2 +-
 xen/include/asm-x86/mm.h|   10 +-
 xen/include/asm-x86/pv/mm.h |   23 +
 8 files changed, 2733 insertions(+), 2673 deletions(-)

-- 
2.11.0



[Xen-devel] [PATCH] x86/xpti: Hide almost all of .text and all .data/.rodata/.bss mappings

2018-02-13 Thread Andrew Cooper
The current XPTI implementation isolates the directmap (and therefore a lot of
guest data), but a large quantity of CPU0's state (including its stack)
remains visible.

Furthermore, an attacker able to read .text is in a vastly superior position
to normal when it comes to fingerprinting Xen for known vulnerabilities, or
scanning for ROP/Spectre gadgets.

Collect together the entrypoints in .text.entry (currently 3x4k frames, but
can almost certainly be slimmed down), and create a common mapping which is
inserted into each per-cpu shadow.  The stubs are also inserted into this
mapping by pointing at the in-use L2.  This allows stubs allocated later (SMP
boot, or CPU hotplug) to work without further changes to the common mappings.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Wei Liu 
CC: Juergen Gross 

RFC, because I don't think the stubs handling is particularly sensible.

We allocate 4k of virtual address space per CPU, but squash loads of CPUs
together onto a single MFN.  The stubs ought to be isolated as well (as they
leak the virtual addresses of each stack), which can be done by allocating an
MFN per CPU (and simplifies cpu_smpboot_alloc() somewhat).  At this point, we
can't use a common set of mappings, and will have to clone the single stub and
.text.entry into each PCPU's copy of the pagetables.
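
To make the stub layout arithmetic concrete, here is a minimal standalone sketch of the address calculation and the compile-time size check used in the patch. The values of NR_CPUS and XEN_VIRT_END are placeholders chosen for illustration (XEN_VIRT_END is only assumed to be 2MiB-aligned), and BUILD_BUG_ON is a simplified stand-in rather than Xen's own definition.

```c
#include <stdint.h>

#define PAGE_SIZE           4096ULL   /* 4k per stub page, as described above */
#define L2_PAGETABLE_SHIFT  21        /* one L2 entry spans 2MiB on x86-64 */
#define NR_CPUS             256       /* placeholder build-time limit */
#define XEN_VIRT_END        0xffff82d0c0000000ULL /* placeholder, 2MiB-aligned */

/* Simplified stand-in: a negative array size breaks the build if cond holds. */
#define BUILD_BUG_ON(cond) ((void)sizeof(char[1 - 2 * !!(cond)]))

/* Virtual address of the stub page for a given CPU, counting down from the top. */
static uint64_t stub_va(unsigned int cpu)
{
    /* All stub pages together must not exceed the span of one L2 entry. */
    BUILD_BUG_ON(NR_CPUS * PAGE_SIZE > (1ULL << L2_PAGETABLE_SHIFT));

    return XEN_VIRT_END - (cpu + 1) * PAGE_SIZE;
}

/* Non-zero if the first and last stub pages fall in the same 2MiB region. */
static int stubs_share_2m_region(void)
{
    return (stub_va(0) >> L2_PAGETABLE_SHIFT) ==
           (stub_va(NR_CPUS - 1) >> L2_PAGETABLE_SHIFT);
}
```

With these placeholder numbers, up to 512 stub pages would fit, which is why stubs allocated later can reuse the same common mapping.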

Also, my plan to cause .text.entry to straddle a 512TB boundary (and therefore
avoid any further pagetable allocations) has come a little unstuck because of
CONFIG_BIGMEM.  I'm still working out whether there is a sensible way to
rearrange the virtual layout for this plan to work.
---
 xen/arch/x86/smpboot.c | 37 -
 xen/arch/x86/x86_64/compat/entry.S |  2 ++
 xen/arch/x86/x86_64/entry.S|  4 +++-
 xen/arch/x86/xen.lds.S |  7 +++
 4 files changed, 44 insertions(+), 6 deletions(-)

diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index 2ebef03..2519141 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -622,6 +622,9 @@ unsigned long alloc_stub_page(unsigned int cpu, unsigned long *mfn)
 unmap_domain_page(memset(__map_domain_page(pg), 0xcc, PAGE_SIZE));
 }
 
+/* Confirm that all stubs fit in a single L2 pagetable. */
+BUILD_BUG_ON(NR_CPUS * PAGE_SIZE > (1u << L2_PAGETABLE_SHIFT));
+
 stub_va = XEN_VIRT_END - (cpu + 1) * PAGE_SIZE;
 if ( map_pages_to_xen(stub_va, mfn_x(page_to_mfn(pg)), 1,
   PAGE_HYPERVISOR_RX | MAP_SMALL_PAGES) )
@@ -651,9 +654,6 @@ static int clone_mapping(const void *ptr, root_pgentry_t *rpt)
 l2_pgentry_t *pl2e;
 l1_pgentry_t *pl1e;
 
-if ( linear < DIRECTMAP_VIRT_START )
-return 0;
-
 flags = l3e_get_flags(*pl3e);
 ASSERT(flags & _PAGE_PRESENT);
 if ( flags & _PAGE_PSE )
@@ -744,6 +744,9 @@ static __read_mostly int8_t opt_xpti = -1;
 boolean_param("xpti", opt_xpti);
 DEFINE_PER_CPU(root_pgentry_t *, root_pgt);
 
+static root_pgentry_t common_pgt;
+extern char _stextentry[], _etextentry[];
+
 static int setup_cpu_root_pgt(unsigned int cpu)
 {
 root_pgentry_t *rpt;
@@ -764,8 +767,32 @@ static int setup_cpu_root_pgt(unsigned int cpu)
 idle_pg_table[root_table_offset(RO_MPT_VIRT_START)];
 /* SH_LINEAR_PT inserted together with guest mappings. */
 /* PERDOMAIN inserted during context switch. */
-rpt[root_table_offset(XEN_VIRT_START)] =
-idle_pg_table[root_table_offset(XEN_VIRT_START)];
+
+/* One-time setup of common_pgt, which maps .text.entry and the stubs. */
+if ( unlikely(!root_get_intpte(common_pgt)) )
+{
+unsigned long stubs_linear = XEN_VIRT_END - 1;
+l3_pgentry_t *stubs_main, *stubs_shadow;
+char *ptr;
+
+for ( rc = 0, ptr = _stextentry;
+  !rc && ptr < _etextentry; ptr += PAGE_SIZE )
+rc = clone_mapping(ptr, rpt);
+
+if ( rc )
+return rc;
+
+stubs_main = l4e_to_l3e(idle_pg_table[l4_table_offset(stubs_linear)]);
+stubs_shadow = l4e_to_l3e(rpt[l4_table_offset(stubs_linear)]);
+
+/* Splice into the regular L2 mapping the stubs. */
+stubs_shadow[l3_table_offset(stubs_linear)] =
+stubs_main[l3_table_offset(stubs_linear)];
+
+common_pgt = rpt[root_table_offset(XEN_VIRT_START)];
+}
+
+rpt[root_table_offset(XEN_VIRT_START)] = common_pgt;
 
 /* Install direct map page table entries for stack, IDT, and TSS. */
 for ( off = rc = 0; !rc && off < STACK_SIZE; off += PAGE_SIZE )
diff --git a/xen/arch/x86/x86_64/compat/entry.S b/xen/arch/x86/x86_64/compat/entry.S
index 707c746..b001e79 100644
--- a/xen/arch/x86/x86_64/compat/entry.S
+++ b/xen/arch/x86/x86_64/compat/entry.S
@@ -13,6 +13,8 @@
 #include 
 #include 
 
+.section .text.entry, "ax", @progbits
+
 ENTRY(entry_int82)
 ASM_CLAC
 pushq $0
diff --git 

Re: [Xen-devel] [PATCH] xen/arm: Park CPUs with a MIDR different from the boot CPU.

2018-02-13 Thread Julien Grall



On 02/09/2018 07:12 PM, Julien Grall wrote:

Hi,

On 02/09/2018 07:10 PM, Stefano Stabellini wrote:

On Fri, 9 Feb 2018, Julien Grall wrote:

On 02/09/2018 07:02 PM, Stefano Stabellini wrote:

On Fri, 9 Feb 2018, Julien Grall wrote:

Hi,

On 02/08/2018 11:49 PM, Stefano Stabellini wrote:

On Thu, 1 Feb 2018, Julien Grall wrote:

On 1 February 2018 at 19:37, Stefano Stabellini

wrote:

On Tue, 30 Jan 2018, Julien Grall wrote:

Xen does not properly support big.LITTLE platforms. All vCPUs of a guest
will always have the MIDR of the boot CPU (see arch_domain_create).
At best the guest may see unreliable performance (vCPU switching between
big and LITTLE), at worst the guest will become unreliable or insecure.

This is becoming more apparent with branch predictor hardening in Linux
because they target a specific kind of CPUs and may not work on other
CPUs.

For the time being, park any CPUs with a MIDR different from the boot
CPU. This will be revisited in the future once Xen gains understanding
of big.LITTLE.

[1]
https://lists.xenproject.org/archives/html/xen-devel/2016-12/msg00826.html 



Signed-off-by: Julien Grall 

---

We probably want to backport this as part of XSA-254. Using big.LITTLE
on Xen has never been supported but we didn't make it clear. This is
becoming more apparent with code targeting specific CPUs.
---
    xen/arch/arm/smpboot.c | 15 +++
    1 file changed, 15 insertions(+)

diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
index 1255185a9c..2c2815f9ee 100644
--- a/xen/arch/arm/smpboot.c
+++ b/xen/arch/arm/smpboot.c
@@ -292,6 +292,21 @@ void start_secondary(unsigned long boot_phys_offset,

     init_traps();

+    /*
+     * Currently Xen assumes the platform has only one kind of CPUs.
+     * This assumption does not hold on big.LITTLE platform and may
+     * result to unstability. Better to park them for now.
+     *
+     * TODO: Add big.LITTLE support.
+     */
+    if ( current_cpu_data.midr.bits != boot_cpu_data.midr.bits )
+    {
+        printk(XENLOG_ERR "CPU%u MIDR (0x%x) does not match boot CPU MIDR (0x%x).\n",
+               smp_processor_id(), current_cpu_data.midr.bits,
+               boot_cpu_data.midr.bits);
+        stop_cpu();
+    }


I understand that this patch is the right thing to do from a correctness
perspective, especially in regards to the SP2 mitigation.

At the same time I would also like to give the option for people that
want to use big.LITTLE with cpupools / cpu pinning to do so if they
really want to, but I am not sure what to suggest.

Could we introduce a command line option to proceed anyway? But then the
system would be susceptible to SP2 in the cpus different from the boot cpu.

Could we make the SP2 mitigation work on big.LITTLE or is it too much
trouble? Do you have any other ideas or thoughts about this?


This patch is here to prevent spreading instability/insecurity or giving
the feeling we do support big.LITTLE.

Even outside of SP2, there is a possibility of instability because CPU
errata would not be applied correctly in the guest or because Xen is not
able to know that non-boot CPUs may have a different cacheline size...

I want to end this idea that Xen may support big.LITTLE.

The first thing to modify is the vpidr (virtual MIDR), at the moment we
always use the boot MIDR. What would you choose now? The MIDR of the CPU
where the hypercall happens?

There is no shortcut for big.LITTLE. The right thing is to implement what
has been discussed in the design document written by Dario. But that's a
new feature and would require some work to do it properly.

A command line option might be a good idea, but I would be more of the
opinion to delay that and see who is screaming about it.

My hunch is not many people will scream because today they tend to
disable one set of CPUs in the DT directly.


As discussed, are you going to resend with a command line option such as
biglittle=unsafe or something like that?


I would prefer to avoid the term big.LITTLE in the command line option
because it might be possible to have a platform with more than two kinds
of CPUs. How about "smp=unsafe"?


I am fine with not using big.LITTLE but smp=unsafe is a bit confusing.
What do you think of: "heterogeneous=unsafe"? It is a bit of a mouthful
but it should be clearer.


Heterogeneous does not tell you what you are trying to do. I think it
needs to be qualified with the smp (or something similar).

How about mp_unsafe_heterogeneous=yes/no.


It's getting longer and longer, but OK :-)  At least it's descriptive.


It will be easier to spot in the logs :). I will resend the patch next
week with the command line added.


After looking at the design doc from Dario, I was wondering if we should
name the option: hmp_unsafe (or amp_unsafe). This would make the name
slightly shorter.
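
For readers following the naming discussion, the behaviour being proposed can be sketched roughly as below. This is only an illustration of the idea, not the code that was eventually sent: the helper name and the way the flag is passed in are hypothetical, and the option name hmp_unsafe is just one of the candidates floated above.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: decide whether a secondary CPU should be parked.
 * hmp_unsafe stands for the opt-in command line switch discussed above;
 * how it would actually be parsed is not shown here.
 */
static bool should_park_cpu(uint32_t cpu_midr, uint32_t boot_midr,
                            bool hmp_unsafe)
{
    /* Same kind of CPU as the boot CPU: always safe to bring up. */
    if ( cpu_midr == boot_midr )
        return false;

    /* Different MIDR: park unless the admin explicitly accepted the risk. */
    return !hmp_unsafe;
}
```

The default (flag off) keeps the behaviour of the patch, so only users who explicitly opt in would run with mismatched MIDRs.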


The description of the command line would be:

"Say yes at your own risk if you want to enable heterogeneous 

[Xen-devel] [xen-unstable-smoke test] 119098: tolerable all pass - PUSHED

2018-02-13 Thread osstest service owner
flight 119098 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/119098/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt     13 migrate-support-check        fail   never pass
 test-arm64-arm64-xl-xsm      13 migrate-support-check        fail   never pass
 test-arm64-arm64-xl-xsm      14 saverestore-support-check    fail   never pass
 test-armhf-armhf-xl          13 migrate-support-check        fail   never pass
 test-armhf-armhf-xl          14 saverestore-support-check    fail   never pass

version targeted for testing:
 xen  3f491d6873be9caa77f02ad8d98f174f0152b819
baseline version:
 xen  1a42ffa3476ab433da9dc27c6d36f051b70592ed

Last test of basis   119079  2018-02-13 13:26:08 Z    0 days
Testing same since   119098  2018-02-13 17:01:30 Z    0 days    1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Daniel De Graaf 
  George Dunlap 
  Jan Beulich 
  Roger Pau Monné 
  Sameer Goel 
  Tim Deegan 
  Wei Liu 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-i386 pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   1a42ffa347..3f491d6873  3f491d6873be9caa77f02ad8d98f174f0152b819 -> smoke


Re: [Xen-devel] [PATCH v2 0/7] x86: Meltdown band-aid overhead reduction

2018-02-13 Thread Rich Persaud
On Feb 7, 2018, at 11:05, Jan Beulich  wrote:
> 
> 1: slightly reduce Meltdown band-aid overhead
> 2: remove CR reads from exit-to-guest path
> 3: introduce altinstruction_nop assembler macro
> 4: NOP out most XPTI entry/exit code when it's not in use
> 5: avoid double CR3 reload when switching to guest user mode
> 6: disable XPTI when RDCL_NO
> 7: x86: log XPTI enabled status

Since work on XPTI is ongoing, will these improvements to XPTI-stage-1 be 
published via http://xenbits.xen.org/xsa/xsa254/README.pti?

Rich

Re: [Xen-devel] [PATCH v3 2/3] x86/svm: add EFER SVME support for VGIF/VLOAD

2018-02-13 Thread Woods, Brian
Pardon any weird formatting, I'm replying on my phone. 

Because they are two different things.  One is an assert to make sure nothing 
wrong is happening with the EFER.SVME bit, and the other changes what features 
are enabled.  

IIRC, most asserts are in their own ifs and not in an if statement with
something else.  I guess I should have put the assert higher in the function
though, but that's a small detail.

Yes, you can squeeze both into one if statement, but it being cleaner and
easier to read (at least more logical) is better than getting rid of one if,
in my opinion.  Plus, assuming asserts are disabled for release, I'd assume
the extra if would get optimized out by gcc anyway.
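
For comparison, the two shapes under discussion can be put side by side in a small standalone sketch. The names here are stand-ins (nested_update() plays the role of svm_nested_features_on_efer_update(), and the two booleans replace the nestedhvm_enabled() and EFER_SVME checks); the standard assert() is used in place of Xen's ASSERT().

```c
#include <assert.h>
#include <stdbool.h>

static int updates; /* counts calls to the stand-in update function */

static void nested_update(void)
{
    updates++;
}

/* Shape used in the patch: two independent ifs. */
static void update_two_ifs(bool nested, bool svme)
{
    if ( !nested )
        assert(!svme);

    if ( nested )
        nested_update();
}

/* Shape suggested in review: a single if/else. */
static void update_if_else(bool nested, bool svme)
{
    if ( nested )
        nested_update();
    else
        assert(!svme);
}
```

Both functions behave identically for every input; the difference is purely about readability, and with assertions compiled out the first `if` in update_two_ifs() has an empty body.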

Brian


On February 13, 2018 03:31:40 Jan Beulich  wrote:

 On 08.02.18 at 18:01,  wrote:
>> --- a/xen/arch/x86/hvm/svm/svm.c
>> +++ b/xen/arch/x86/hvm/svm/svm.c
>> @@ -611,6 +611,12 @@ static void svm_update_guest_efer(struct vcpu *v)
>>  if ( lma )
>>  new_efer |= EFER_LME;
>>  vmcb_set_efer(vmcb, new_efer);
>> +
>> +if ( !nestedhvm_enabled(v->domain) )
>> +ASSERT(!(v->arch.hvm_vcpu.guest_efer & EFER_SVME));
>> +
>> +if ( nestedhvm_enabled(v->domain) )
>> +svm_nested_features_on_efer_update(v);
>>  }
>
> Why not
>
> if ( nestedhvm_enabled(v->domain) )
> svm_nested_features_on_efer_update(v);
> else
> ASSERT(!(v->arch.hvm_vcpu.guest_efer & EFER_SVME));
>
> ?
>
> Jan
>
>
>


Re: [Xen-devel] [RFC PATCH 27/49] ARM: new VGIC: Add MMIO handling framework

2018-02-13 Thread Andre Przywara
Hi,

On 13/02/18 16:52, Julien Grall wrote:
> Hi Andre,
> 
> On 09/02/18 14:39, Andre Przywara wrote:
>> Add an MMIO handling framework to the VGIC emulation:
>> Each register is described by its offset, size (or number of bits per
>> IRQ, if applicable) and the read/write handler functions. We provide
>> initialization macros to describe each GIC register later easily.
>>
>> Separate dispatch functions for read and write accesses are connected
>> to Xen's MMIO handling framework and binary-search for the responsible
>> register handler based on the offset address within the region.
>>
>> The register handler prototypes are courtesy of Christoffer Dall.
>>
>> This is based on Linux commit 4493b1c4866a, written by Marc Zyngier.
>>
>> Signed-off-by: Andre Przywara 
>> ---
>>   xen/arch/arm/vgic/vgic-mmio.c | 192
>> ++
>>   xen/arch/arm/vgic/vgic-mmio.h | 145 +++
>>   xen/arch/arm/vgic/vgic.h  |   4 +
>>   3 files changed, 341 insertions(+)
>>   create mode 100644 xen/arch/arm/vgic/vgic-mmio.c
>>   create mode 100644 xen/arch/arm/vgic/vgic-mmio.h
>>
>> diff --git a/xen/arch/arm/vgic/vgic-mmio.c
>> b/xen/arch/arm/vgic/vgic-mmio.c
>> new file mode 100644
>> index 00..3c70945466
>> --- /dev/null
>> +++ b/xen/arch/arm/vgic/vgic-mmio.c
>> @@ -0,0 +1,192 @@
>> +/*
>> + * VGIC MMIO handling functions
>> + * Imported from Linux ("new" KVM VGIC) and heavily adapted to Xen.
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + */
>> +
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +
>> +#include "vgic.h"
>> +#include "vgic-mmio.h"
>> +
>> +unsigned long vgic_mmio_read_raz(struct vcpu *vcpu,
>> + paddr_t addr, unsigned int len)
> 
> Indentation.
> 
>> +{
>> +    return 0;
>> +}
>> +
>> +unsigned long vgic_mmio_read_rao(struct vcpu *vcpu,
>> + paddr_t addr, unsigned int len)
> 
> Indentation.
> 
>> +{
>> +    return -1UL;
>> +}
>> +
>> +void vgic_mmio_write_wi(struct vcpu *vcpu, paddr_t addr,
>> +    unsigned int len, unsigned long val)
> 
> Indentation.
> 
>> +{
>> +    /* Ignore */
>> +}
>> +
>> +static int match_region(const void *key, const void *elt)
>> +{
>> +    const unsigned int offset = (unsigned long)key;
>> +    const struct vgic_register_region *region = elt;
>> +
>> +    if ( offset < region->reg_offset )
>> +    return -1;
>> +
>> +    if ( offset >= region->reg_offset + region->len )
>> +    return 1;
>> +
>> +    return 0;
>> +}
>> +
>> +const struct vgic_register_region *
>> +vgic_find_mmio_region(const struct vgic_register_region *regions,
> 
> Any reason to export this?

Good catch, this is needed in KVM to do the user space access, where we
re-use these functions to call into the MMIO handlers.
So I can make them static and then lose the prototype down below as well.

> 
>> +  int nr_regions, unsigned int offset)
> 
> Indentation.
> 
>> +{
>> +    return bsearch((void *)(uintptr_t)offset, regions, nr_regions,
>> +   sizeof(regions[0]), match_region);
>> +}
>> +
>> +static bool check_region(const struct domain *d,
>> + const struct vgic_register_region *region,
>> + paddr_t addr, int len)
> 
> Indentation.
> 
>> +{
>> +    int flags, nr_irqs = d->arch.vgic.nr_spis + VGIC_NR_PRIVATE_IRQS;
>> +
>> +    switch (len)
> 
> switch ( ... )
> 
>> +    {
>> +    case sizeof(u8):
> 
> s/u8/uint8_t/ here and below.
> 
>> +    flags = VGIC_ACCESS_8bit;
>> +    break;
>> +    case sizeof(u32):
>> +    flags = VGIC_ACCESS_32bit;
>> +    break;
>> +    case sizeof(u64):
>> +    flags = VGIC_ACCESS_64bit;
>> +    break;
>> +    default:
>> +    return false;
>> +    }
>> +
>> +    if ( (region->access_flags & flags) && IS_ALIGNED(addr, len) )
>> +    {
>> +    if ( !region->bits_per_irq )
>> +    return true;
>> +
>> +    /* Do we access a non-allocated IRQ? */
>> +    return VGIC_ADDR_TO_INTID(addr, region->bits_per_irq) < nr_irqs;
>> +    }
>> +
>> +    return false;
>> +}
>> +
>> +const struct vgic_register_region *
>> +vgic_get_mmio_region(struct vcpu *vcpu, struct vgic_io_device *iodev,
> 
> 
> Any reason to export this?
> 
>> + paddr_t addr, int len)
> 
> Indentation and unsigned int please.
> 
>> +{
>> +    const struct vgic_register_region *region;
>> +
>> +    region = vgic_find_mmio_region(iodev->regions, iodev->nr_regions,
>> +   addr - iodev->base_addr);
>> +    if ( !region || 

Re: [Xen-devel] [PATCH] x86/srat: fix end calculation in nodes_cover_memory()

2018-02-13 Thread Andrew Cooper
On 13/02/18 17:11, Jan Beulich wrote:
> Along the lines of commit 7226486767 ("x86/srat: fix the end pfn check
> in valid_numa_range()") nodes_cover_memory() also doesn't consistently
> use "end": It's set to an inclusive value initially, but then compared
> to the exclusive "end" field of struct node and also possibly set to
> nodes[j].start, making it exclusive too. Change the initialization to
> make the variable consistently exclusive.
>
> Signed-off-by: Jan Beulich 

Acked-by: Andrew Cooper 


[Xen-devel] [PATCH] x86/srat: fix end calculation in nodes_cover_memory()

2018-02-13 Thread Jan Beulich
Along the lines of commit 7226486767 ("x86/srat: fix the end pfn check
in valid_numa_range()") nodes_cover_memory() also doesn't consistently
use "end": It's set to an inclusive value initially, but then compared
to the exclusive "end" field of struct node and also possibly set to
nodes[j].start, making it exclusive too. Change the initialization to
make the variable consistently exclusive.

Signed-off-by: Jan Beulich 

--- a/xen/arch/x86/srat.c
+++ b/xen/arch/x86/srat.c
@@ -368,7 +368,7 @@ static int __init nodes_cover_memory(voi
}
 
start = e820.map[i].addr;
-   end = e820.map[i].addr + e820.map[i].size - 1;
+   end = e820.map[i].addr + e820.map[i].size;
 
do {
found = 0;
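
The effect of keeping "end" exclusive throughout can be illustrated with a simplified, standalone model of the coverage loop. This is a sketch loosely modelled on the shape of nodes_cover_memory(), not the actual Xen function: the struct and function names are made up for the example, and a plain array replaces the node mask.

```c
#include <stdbool.h>
#include <stdint.h>

/* A memory range [start, end): end is exclusive, like struct node's field. */
struct range {
    uint64_t start, end;
};

/*
 * Returns true if [start, end) is fully covered by the given node ranges.
 * Both start and end are trimmed against node boundaries, so every
 * comparison assumes the same (exclusive) meaning of "end".
 */
static bool range_covered(const struct range *nodes, int nr,
                          uint64_t start, uint64_t end)
{
    bool found;

    do {
        found = false;
        for ( int j = 0; j < nr; j++ )
        {
            if ( start < nodes[j].end && end > nodes[j].start )
            {
                if ( start >= nodes[j].start )
                {
                    start = nodes[j].end;   /* exclusive end becomes new start */
                    found = true;
                }
                if ( end <= nodes[j].end )
                {
                    end = nodes[j].start;   /* node start becomes exclusive end */
                    found = true;
                }
            }
        }
    } while ( found && start < end );

    return start >= end;
}
```

A caller would pass `addr` and `addr + size` for an E820 entry, matching the initialization after the change.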





Re: [Xen-devel] [PATCH v3 0/7] LLVM coverage support for Xen

2018-02-13 Thread Jan Beulich
>>> On 13.02.18 at 17:28,  wrote:
> On Tue, Feb 13, 2018 at 09:16:19AM -0700, Jan Beulich wrote:
>> >>> On 13.02.18 at 16:53,  wrote:
>> > On Wed, Jan 24, 2018 at 10:01:18AM +, Roger Pau Monne wrote:
>> >> Hello,
>> >> 
>> >> The following patch series enables LLVM coverage support for the Xen
>> >> hypervisor. A sample coverage report obtained after booting a PVHv2 Dom0
>> >> can be found at:
>> >> 
>> >> http://xenbits.xen.org/people/royger/xen_profile/ 
>> >> 
>> >> I know the time is not the most appropriate given all the security work
>> >> going on, but it seems like the series is quite close, and I would like
>> >> to avoid it bitrotting.
>> > 
>> > Patches 5, 6 and 7 have already been reviewed/acked by the relevant
>> > maintainers, is there anything preventing them from going in?
>> 
>> I didn't keep them in my inbox when it became clear that patch 4
>> needs another version, which is why I did apply only that single
>> patch.
> 
> Oh, since patch 4 was the only one that had comments I only sent that.
> Would you like me to send or push to a git branch the remaining ones?

Normally I think it would have been best if you resent them
without asking, rather than just sending a singleton 4/7 patch.
This time, however, I've managed to fish them out of the mail
client's waste basket, and I've just committed and pushed them.

Jan



Re: [Xen-devel] [RFC PATCH 27/49] ARM: new VGIC: Add MMIO handling framework

2018-02-13 Thread Julien Grall

Hi Andre,

On 09/02/18 14:39, Andre Przywara wrote:

Add an MMIO handling framework to the VGIC emulation:
Each register is described by its offset, size (or number of bits per
IRQ, if applicable) and the read/write handler functions. We provide
initialization macros to describe each GIC register later easily.

Separate dispatch functions for read and write accesses are connected
to Xen's MMIO handling framework and binary-search for the responsible
register handler based on the offset address within the region.
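
As a rough illustration of the lookup described above, the sketch below binary-searches a sorted table of register regions by offset using the standard C bsearch(). The struct and function names are simplified stand-ins, the offsets in any table are up to the caller, and unlike the patch (which casts the offset itself to a pointer) this version passes a pointer to the key.

```c
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-in for a register region: [reg_offset, reg_offset + len). */
struct reg_region {
    unsigned int reg_offset;
    unsigned int len;
};

/* bsearch() comparator: key is a pointer to the offset being looked up. */
static int match_region(const void *key, const void *elt)
{
    unsigned int offset = *(const unsigned int *)key;
    const struct reg_region *region = elt;

    if ( offset < region->reg_offset )
        return -1;

    if ( offset >= region->reg_offset + region->len )
        return 1;

    return 0;
}

/* Returns the region containing offset, or NULL if no region matches. */
static const struct reg_region *
find_region(const struct reg_region *regions, size_t nr, unsigned int offset)
{
    return bsearch(&offset, regions, nr, sizeof(regions[0]), match_region);
}
```

The table must be sorted by reg_offset and the regions must not overlap, otherwise the binary search is not well defined.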

The register handler prototypes are courtesy of Christoffer Dall.

This is based on Linux commit 4493b1c4866a, written by Marc Zyngier.

Signed-off-by: Andre Przywara 
---
  xen/arch/arm/vgic/vgic-mmio.c | 192 ++
  xen/arch/arm/vgic/vgic-mmio.h | 145 +++
  xen/arch/arm/vgic/vgic.h  |   4 +
  3 files changed, 341 insertions(+)
  create mode 100644 xen/arch/arm/vgic/vgic-mmio.c
  create mode 100644 xen/arch/arm/vgic/vgic-mmio.h

diff --git a/xen/arch/arm/vgic/vgic-mmio.c b/xen/arch/arm/vgic/vgic-mmio.c
new file mode 100644
index 00..3c70945466
--- /dev/null
+++ b/xen/arch/arm/vgic/vgic-mmio.c
@@ -0,0 +1,192 @@
+/*
+ * VGIC MMIO handling functions
+ * Imported from Linux ("new" KVM VGIC) and heavily adapted to Xen.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "vgic.h"
+#include "vgic-mmio.h"
+
+unsigned long vgic_mmio_read_raz(struct vcpu *vcpu,
+ paddr_t addr, unsigned int len)


Indentation.


+{
+return 0;
+}
+
+unsigned long vgic_mmio_read_rao(struct vcpu *vcpu,
+ paddr_t addr, unsigned int len)


Indentation.


+{
+return -1UL;
+}
+
+void vgic_mmio_write_wi(struct vcpu *vcpu, paddr_t addr,
+unsigned int len, unsigned long val)


Indentation.


+{
+/* Ignore */
+}
+
+static int match_region(const void *key, const void *elt)
+{
+const unsigned int offset = (unsigned long)key;
+const struct vgic_register_region *region = elt;
+
+if ( offset < region->reg_offset )
+return -1;
+
+if ( offset >= region->reg_offset + region->len )
+return 1;
+
+return 0;
+}
+
+const struct vgic_register_region *
+vgic_find_mmio_region(const struct vgic_register_region *regions,


Any reason to export this?


+  int nr_regions, unsigned int offset)


Indentation.


+{
+return bsearch((void *)(uintptr_t)offset, regions, nr_regions,
+   sizeof(regions[0]), match_region);
+}
+
+static bool check_region(const struct domain *d,
+ const struct vgic_register_region *region,
+ paddr_t addr, int len)


Indentation.


+{
+int flags, nr_irqs = d->arch.vgic.nr_spis + VGIC_NR_PRIVATE_IRQS;
+
+switch (len)


switch ( ... )


+{
+case sizeof(u8):


s/u8/uint8_t/ here and below.


+flags = VGIC_ACCESS_8bit;
+break;
+case sizeof(u32):
+flags = VGIC_ACCESS_32bit;
+break;
+case sizeof(u64):
+flags = VGIC_ACCESS_64bit;
+break;
+default:
+return false;
+}
+
+if ( (region->access_flags & flags) && IS_ALIGNED(addr, len) )
+{
+if ( !region->bits_per_irq )
+return true;
+
+/* Do we access a non-allocated IRQ? */
+return VGIC_ADDR_TO_INTID(addr, region->bits_per_irq) < nr_irqs;
+}
+
+return false;
+}
+
+const struct vgic_register_region *
+vgic_get_mmio_region(struct vcpu *vcpu, struct vgic_io_device *iodev,



Any reason to export this?


+ paddr_t addr, int len)


Indentation and unsigned int please.


+{
+const struct vgic_register_region *region;
+
+region = vgic_find_mmio_region(iodev->regions, iodev->nr_regions,
+   addr - iodev->base_addr);
+if ( !region || !check_region(vcpu->domain, region, addr, len) )
+return NULL;
+
+return region;
+}
+
+static int dispatch_mmio_read(struct vcpu *vcpu, mmio_info_t *info,
+  register_t *r, void *priv)


Indentation.


+{
+struct vgic_io_device *iodev = priv;
+const struct vgic_register_region *region;
+unsigned long data = 0;
+paddr_t addr = info->gpa;
+int len = 1U << info->dabt.size;
+
+region = vgic_get_mmio_region(vcpu, iodev, addr, len);
+if ( !region )
+{
+memset(r, 0, len);
+return 0;
+}
+
+switch (iodev->iodev_type)
+{
+case IODEV_CPUIF:
+data = region->read(vcpu, addr, len);
+break;
+

Re: [Xen-devel] [PATCH v3 0/7] LLVM coverage support for Xen

2018-02-13 Thread Roger Pau Monné
On Tue, Feb 13, 2018 at 09:16:19AM -0700, Jan Beulich wrote:
> >>> On 13.02.18 at 16:53,  wrote:
> > On Wed, Jan 24, 2018 at 10:01:18AM +, Roger Pau Monne wrote:
> >> Hello,
> >> 
> >> The following patch series enables LLVM coverage support for the Xen
> >> hypervisor. A sample coverage report obtained after booting a PVHv2 Dom0
> >> can be found at:
> >> 
> >> http://xenbits.xen.org/people/royger/xen_profile/ 
> >> 
> >> I know the time is not the most appropriate given all the security work
> >> going on, but it seems like the series is quite close, and I would like
> >> to avoid it bitrotting.
> > 
> > Patches 5, 6 and 7 have already been reviewed/acked by the relevant
> > maintainers, is there anything preventing them from going in?
> 
> I didn't keep them in my inbox when it became clear that patch 4
> needs another version, which is why I did apply only that single
> patch.

Oh, since patch 4 was the only one that had comments I only sent that.
Would you like me to send or push to a git branch the remaining ones?

Thanks, Roger.


Re: [Xen-devel] [PATCH 3/3] pvh/dom0: whitelist PVH Dom0 ACPI tables

2018-02-13 Thread Andrew Cooper
On 13/02/18 15:48, Roger Pau Monné wrote:
> On Tue, Feb 13, 2018 at 08:22:33AM -0700, Jan Beulich wrote:
> On 13.02.18 at 16:11,  wrote:
>>> On Tue, Feb 13, 2018 at 06:41:14AM -0700, Jan Beulich wrote:
>>> On 13.02.18 at 12:27,  wrote:
> On Tue, Feb 13, 2018 at 04:04:17AM -0700, Jan Beulich wrote:
> On 13.02.18 at 10:59,  wrote:
>>> On Tue, Feb 13, 2018 at 02:29:08AM -0700, Jan Beulich wrote:
>>> On 08.02.18 at 13:25,  wrote:
> Signed-off-by: Roger Pau Monné 
 A change like this should not come without description, providing a
 reason for the change. Otherwise how will someone wanting to
 understand the change in a couple of years actually be able to
 make any sense of it. This is in particular because I continue to be
 not fully convinced that white listing is appropriate in the Dom0
 case (and for the record I'm similarly unconvinced that black listing
 is the best choice, yet obviously we need to pick one of the two).
>>> I'm sorry, I thought we agreed at the summit to convert this to
>>> whitelisting because it was likely better to simply not expose unknown
>>> ACPI tables to guests.
>> "to guests" != "to Dom0".
>>
>>> I guess the commit message could be something like:
>>>
>>> "The following list of whitelisted APIC tables are either known to work
>>> or don't require any resources to be mapped in either the IO or the
>>> memory space.
>> Even if the white listing vs black listing question wasn't still
>> undecided, I think we should revert the patch in favor of one
>> with a description. The one above might be fine with "ACPI" in
>> place of "APIC" as far as tables actively white listed are
>> concerned, but then it still remains open why certain tables
>> haven't been included. I'm in particular worried about various
>> APEI related tables, but invisibility of e.g. an IBFT could also
>> lead to boot problems.
> Regarding APEI I think ERST, EINJ and HEST could be passed through,
> BERT however requires that the BOOT Error Region is mapped into Dom0
> p2m.
>
> Since PVH Dom0 creation still ends up in a panic, I see no problem in
> adding those in follow up patches.
>
> IBFT also looks safe to pass through.
 But you realize I've named only the few that came to mind
 immediately?
>>> Sure, what I have in this patch is just the minimal set (plus a few
>>> others that seem completely fine) needed in order to boot on my two
>>> test boxes.
>>>
>>> I know we will certainly have to expand this, but I see no issue in
>>> adding them as we go, the more that this is all still unused.
>> Unused - sure. But how will we learn which ones we need to add?
> I already have a kind of drafted list, of which ones could be added,
> which ones need some handlers in order to make sure relevant areas are
> mapped and finally a list of tables that will never be exposed to
> Dom0.
>
> This is based on the tables currently known to Xen from the actblX.h
> headers, there are probably more in the wild, even ones not documented
> in http://www.uefi.org/acpi at all.
>
> I can cleanup and send that list.
>
>> Surely waiting for problem reports from the field is not an
>> acceptable model.

ACPI tables are no different to CPUID values, or MSRs, etc.  Such
reports from the field would be missing features, not bugs.  (TBF, I
wouldn't even put IBFT in to begin with, because I'm not convinced the
IOMMU logic is good enough to work in the common case.  We still don't
account for ACS/ARI errata in the PLX bridges, and iommu=dom0-strict
mode breaks horribly on all hardware I've ever tried using it on.)

Dom0 should not be treated specially.  It just has a bit more hardware
and permissions by default.  The current PV interactions, and especially
the workarounds we've had to maintain in the hypervisor, demonstrate
precisely why a whitelist approach is better than a blacklist.

This is all completely brand new work, so we've got the opportunity to
do things correctly from the beginning.  I'd much rather it takes longer
to do properly, than inheriting the same mistakes and problems that PV
dom0 has.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [RFC PATCH 26/49] ARM: new VGIC: Implement vgic_vcpu_pending_irq

2018-02-13 Thread Julien Grall



On 13/02/18 16:35, Julien Grall wrote:

Hi,

On 09/02/18 14:39, Andre Przywara wrote:

Tell Xen whether a particular VCPU has an IRQ that needs handling
in the guest. This is used to decide whether a VCPU is runnable.


I forgot to mention one thing. This is not the main usage of this
function in Xen. That function will mostly be used to check whether we
need to preempt a hypercall to run an interrupt.


Please update the commit message accordingly.



This is based on Linux commit 90eee56c5f90, written by Eric Auger.

Signed-off-by: Andre Przywara 
---
  xen/arch/arm/vgic/vgic.c | 32 
  1 file changed, 32 insertions(+)

diff --git a/xen/arch/arm/vgic/vgic.c b/xen/arch/arm/vgic/vgic.c
index f4f2a04a60..9e7fb1edcb 100644
--- a/xen/arch/arm/vgic/vgic.c
+++ b/xen/arch/arm/vgic/vgic.c
@@ -646,6 +646,38 @@ void gic_inject(void)
  vgic_restore_state(current);
  }
+static int vgic_vcpu_pending_irq(struct vcpu *vcpu)
+{
+    struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
+    struct vgic_irq *irq;
+    bool pending = false;
+    unsigned long flags;
+
+    if ( !vcpu->domain->arch.vgic.enabled )
+        return false;
+
+    spin_lock_irqsave(&vgic_cpu->ap_list_lock, flags);
+
+    list_for_each_entry(irq, &vgic_cpu->ap_list_head, ap_list)
+    {
+        spin_lock(&irq->irq_lock);
+        pending = irq_is_pending(irq) && irq->enabled;
+        spin_unlock(&irq->irq_lock);
+
+        if ( pending )
+            break;
+    }
+
+    spin_unlock_irqrestore(&vgic_cpu->ap_list_lock, flags);
+
+    return pending;
+}
+
+int gic_events_need_delivery(void)


You probably want to rename that function or just expose 
vgic_vcpu_pending_irq().


Cheers,



--
Julien Grall

Re: [Xen-devel] [RFC PATCH 26/49] ARM: new VGIC: Implement vgic_vcpu_pending_irq

2018-02-13 Thread Julien Grall

Hi,

On 09/02/18 14:39, Andre Przywara wrote:

Tell Xen whether a particular VCPU has an IRQ that needs handling
in the guest. This is used to decide whether a VCPU is runnable.

This is based on Linux commit 90eee56c5f90, written by Eric Auger.

Signed-off-by: Andre Przywara 
---
  xen/arch/arm/vgic/vgic.c | 32 
  1 file changed, 32 insertions(+)

diff --git a/xen/arch/arm/vgic/vgic.c b/xen/arch/arm/vgic/vgic.c
index f4f2a04a60..9e7fb1edcb 100644
--- a/xen/arch/arm/vgic/vgic.c
+++ b/xen/arch/arm/vgic/vgic.c
@@ -646,6 +646,38 @@ void gic_inject(void)
  vgic_restore_state(current);
  }
  
+static int vgic_vcpu_pending_irq(struct vcpu *vcpu)
+{
+    struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
+    struct vgic_irq *irq;
+    bool pending = false;
+    unsigned long flags;
+
+    if ( !vcpu->domain->arch.vgic.enabled )
+        return false;
+
+    spin_lock_irqsave(&vgic_cpu->ap_list_lock, flags);
+
+    list_for_each_entry(irq, &vgic_cpu->ap_list_head, ap_list)
+    {
+        spin_lock(&irq->irq_lock);
+        pending = irq_is_pending(irq) && irq->enabled;
+        spin_unlock(&irq->irq_lock);
+
+        if ( pending )
+            break;
+    }
+
+    spin_unlock_irqrestore(&vgic_cpu->ap_list_lock, flags);
+
+    return pending;
+}
+
+int gic_events_need_delivery(void)


You probably want to rename that function or just expose 
vgic_vcpu_pending_irq().


Cheers,

--
Julien Grall

Re: [Xen-devel] [RFC PATCH 23/49] ARM: new VGIC: Add IRQ sorting

2018-02-13 Thread Christoffer Dall
On Tue, Feb 13, 2018 at 3:56 PM, Andre Przywara
 wrote:
> Hi,
>
> Christoffer, Eric, Marc,
> a question about locking order between multiple IRQs below. Could you
> have a brief look, please?
>
> On 13/02/18 12:30, Julien Grall wrote:
>> Hi Andre,
>>
>> On 09/02/18 14:39, Andre Przywara wrote:
>>> Adds the sorting function to cover the case where you have more IRQs
>>> to consider than you have LRs. We consider their priorities.
>>> This pulls in Linux' list_sort.c , which is a merge sort implementation
>>> for linked lists.
>>>
>>> This is based on Linux commit 8e4447457965, written by Christoffer Dall.
>>>
>>> Signed-off-by: Andre Przywara 
>>> ---
>>>   xen/arch/arm/vgic/vgic.c|  59 +++
>>>   xen/common/list_sort.c  | 170
>>> 
>>>   xen/include/xen/list_sort.h |  11 +++
>>
>> You need to CC "THE REST" maintainers for this code. It would also make
>> sense to have a separate patch for adding list_sort.c
>
> Yeah, will do.
>
>>>   3 files changed, 240 insertions(+)
>>>   create mode 100644 xen/common/list_sort.c
>>>   create mode 100644 xen/include/xen/list_sort.h
>>>
>>> diff --git a/xen/arch/arm/vgic/vgic.c b/xen/arch/arm/vgic/vgic.c
>>> index f517df6d00..a4efd1fd03 100644
>>> --- a/xen/arch/arm/vgic/vgic.c
>>> +++ b/xen/arch/arm/vgic/vgic.c
>>> @@ -16,6 +16,7 @@
>>>*/
>>> #include 
>>> +#include 
>>>   #include 
>>> #include 
>>> @@ -163,6 +164,64 @@ static struct vcpu *vgic_target_oracle(struct
>>> vgic_irq *irq)
>>>   return NULL;
>>>   }
>>>   +/*
>>> + * The order of items in the ap_lists defines how we'll pack things
>>> in LRs as
>>> + * well, the first items in the list being the first things populated
>>> in the
>>> + * LRs.
>>> + *
>>> + * A hard rule is that active interrupts can never be pushed out of
>>> the LRs
>>> + * (and therefore take priority) since we cannot reliably trap on
>>> deactivation
>>> + * of IRQs and therefore they have to be present in the LRs.
>>> + *
>>> + * Otherwise things should be sorted by the priority field and the GIC
>>> + * hardware support will take care of preemption of priority groups etc.
>>> + *
>>> + * Return negative if "a" sorts before "b", 0 to preserve order, and
>>> positive
>>> + * to sort "b" before "a".
>>
>> Finally a good explanation of the return value of a sort function :). I
>> always get confused what the return is supposed to be.
>>
>>> + */
>>> +static int vgic_irq_cmp(void *priv, struct list_head *a, struct
>>> list_head *b)
>>> +{
>>> +    struct vgic_irq *irqa = container_of(a, struct vgic_irq, ap_list);
>>> +    struct vgic_irq *irqb = container_of(b, struct vgic_irq, ap_list);
>>> +    bool penda, pendb;
>>> +    int ret;
>>> +
>>> +    spin_lock(&irqa->irq_lock);
>>> +    spin_lock(&irqb->irq_lock);
>>
>> I guess the locking order does not matter here because this is the only
>> place where two IRQs lock have to be taken?
>
> Mmh, good question. I guess indeed in practice this will not be a problem:
> - As you mentioned this should be the only(?) place where we take
> multiple IRQ locks, but that sounds fragile.
> - A certain IRQ should only be on one VCPU list at a given point in
> time. So there would be no race with two instances of this compare
> function trying to lock the same IRQ.
>
> But that sounds a bit dodgy to rely on. It should be relatively straight
> forward to fix this with a simple comparison, shouldn't it?
> CC:ing Christoffer, Marc and Eric here to see if we should add this (in
> KVM as well).
>

The only concern about holding two locks at the same time is the risk
of another thread attempting to hold a number of locks at the same
time in a different order, leading to a deadlock (either directly or
via a circular dependency).

As you point out, the only place where we take two irq locks at the
same time is in vgic_irq_cmp().  Now, the concern can be reduced to
calling this function more than once in parallel, operating on the
same set of struct irqs.

An IRQ can only be on a single AP list at any time, and we call
vgic_irq_cmp() from exactly one place in the KVM code, which holds the
ap_list_lock, and our locking order defines that the ap_list_lock must
be taken before irq locks.  This means that vgic_irq_cmp() can only
execute in parallel on different AP lists and therefore not operate on
the same set of struct irqs.

There is no need to change anything in the implementation.

Thanks,
-Christoffer
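
As an illustration only (this is not code from the patch), the comparator
contract quoted in the patch comment above, negative if "a" sorts before "b",
0 to preserve order, positive to sort "b" before "a", can be sketched in
standalone C. struct demo_irq and demo_irq_cmp() are hypothetical stand-ins
for struct vgic_irq and vgic_irq_cmp(); the pending step is an assumption
suggested by the penda/pendb variables, and all locking is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for struct vgic_irq. */
struct demo_irq {
    bool active;
    bool pending;           /* assumption: means "enabled and pending" */
    unsigned char priority; /* lower value = higher priority, as on the GIC */
};

/*
 * Negative: "a" sorts before "b"; 0: keep the current order;
 * positive: "b" sorts before "a".
 */
static int demo_irq_cmp(const struct demo_irq *a, const struct demo_irq *b)
{
    /* Active interrupts must stay in the LRs, so they sort first. */
    if ( a->active || b->active )
        return (int)b->active - (int)a->active;

    /* Assumed step: an IRQ that is not pending sorts after one that is. */
    if ( !a->pending || !b->pending )
        return (int)b->pending - (int)a->pending;

    /* Both pending: the lower priority value goes first. */
    return (int)a->priority - (int)b->priority;
}
```

With this shape a stable sort such as list_sort() keeps equal entries in their
existing order, which is what the "0 to preserve order" case relies on.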

Re: [Xen-devel] [PATCH v3 0/7] LLVM coverage support for Xen

2018-02-13 Thread Jan Beulich
>>> On 13.02.18 at 16:53,  wrote:
> On Wed, Jan 24, 2018 at 10:01:18AM +, Roger Pau Monne wrote:
>> Hello,
>> 
>> The following patch series enables LLVM coverage support for the Xen
>> hypervisor. A sample coverage report obtained after booting a PVHv2 Dom0
>> can be found at:
>> 
>> http://xenbits.xen.org/people/royger/xen_profile/ 
>> 
>> I know the time is not the most appropriate given all the security work
>> going on, but it seems like the series is quite close, and I would like
>> to avoid it bitrotting.
> 
> Patches 5, 6 and 7 have already been reviewed/acked by the relevant
> maintainers, is there anything preventing them from going in?

I didn't keep them in my inbox when it became clear that patch 4
needs another version, which is why I did apply only that single
patch.

Jan



Re: [Xen-devel] [PATCH v3 0/7] LLVM coverage support for Xen

2018-02-13 Thread Roger Pau Monné
On Wed, Jan 24, 2018 at 10:01:18AM +, Roger Pau Monne wrote:
> Hello,
> 
> The following patch series enables LLVM coverage support for the Xen
> hypervisor. A sample coverage report obtained after booting a PVHv2 Dom0
> can be found at:
> 
> http://xenbits.xen.org/people/royger/xen_profile/
> 
> I know the time is not the most appropriate given all the security work
> going on, but it seems like the series is quite close, and I would like
> to avoid it bitrotting.

Patches 5, 6 and 7 have already been reviewed/acked by the relevant
maintainers, is there anything preventing them from going in?

Thanks, Roger.

Re: [Xen-devel] [PATCH 3/3] pvh/dom0: whitelist PVH Dom0 ACPI tables

2018-02-13 Thread Roger Pau Monné
On Tue, Feb 13, 2018 at 08:22:33AM -0700, Jan Beulich wrote:
> >>> On 13.02.18 at 16:11,  wrote:
> > On Tue, Feb 13, 2018 at 06:41:14AM -0700, Jan Beulich wrote:
> >> >>> On 13.02.18 at 12:27,  wrote:
> >> > On Tue, Feb 13, 2018 at 04:04:17AM -0700, Jan Beulich wrote:
> >> >> >>> On 13.02.18 at 10:59,  wrote:
> >> >> > On Tue, Feb 13, 2018 at 02:29:08AM -0700, Jan Beulich wrote:
> >> >> >> >>> On 08.02.18 at 13:25,  wrote:
> >> >> >> > Signed-off-by: Roger Pau Monné 
> >> >> >> 
> >> >> >> A change like this should not come without description, providing a
> >> >> >> reason for the change. Otherwise how will someone wanting to
> >> >> >> understand the change in a couple of years actually be able to
> >> >> >> make any sense of it. This is in particular because I continue to be
> >> >> >> not fully convinced that white listing is appropriate in the Dom0
> >> >> >> case (and for the record I'm similarly unconvinced that black listing
> >> >> >> is the best choice, yet obviously we need to pick one of the two).
> >> >> > 
> >> >> > I'm sorry, I thought we agreed at the summit to convert this to
> >> >> > whitelisting because it was likely better to simply not expose unknown
> >> >> > ACPI tables to guests.
> >> >> 
> >> >> "to guests" != "to Dom0".
> >> >> 
> >> >> > I guess the commit message could be something like:
> >> >> > 
> >> >> > "The following list of whitelisted APIC tables are either known to work
> >> >> > or don't require any resources to be mapped in either the IO or the
> >> >> > memory space.
> >> >> 
> >> >> Even if the white listing vs black listing question wasn't still
> >> >> undecided, I think we should revert the patch in favor of one
> >> >> with a description. The one above might be fine with "ACPI" in
> >> >> place of "APIC" as far as tables actively white listed are
> >> >> concerned, but then it still remains open why certain tables
> >> >> haven't been included. I'm in particular worried about various
> >> >> APEI related tables, but invisibility of e.g. an IBFT could also
> >> >> lead to boot problems.
> >> > 
> >> > Regarding APEI I think ERST, EINJ and HEST could be passed through,
> >> > BERT however requires that the BOOT Error Region is mapped into Dom0
> >> > p2m.
> >> > 
> >> > Since PVH Dom0 creation still ends up in a panic, I see no problem in
> >> > adding those in follow up patches.
> >> > 
> >> > IBFT also looks safe to pass through.
> >> 
> >> But you realize I've named only the few that came to mind
> >> immediately?
> > 
> > Sure, what I have in this patch is just the minimal set (plus a few
> > others that seem completely fine) needed in order to boot on my two
> > test boxes.
> > 
> > I know we will certainly have to expand this, but I see no issue in
> > adding them as we go, the more that this is all still unused.
> 
> Unused - sure. But how will we learn which ones we need to add?

I already have a kind of drafted list, of which ones could be added,
which ones need some handlers in order to make sure relevant areas are
mapped and finally a list of tables that will never be exposed to
Dom0.

This is based on the tables currently known to Xen from the actblX.h
headers, there are probably more in the wild, even ones not documented
in http://www.uefi.org/acpi at all.

I can cleanup and send that list.

> Surely waiting for problem reports from the field is not an
> acceptable model.

That's last resort of course, but given that PVH Dom0 doesn't exist
yet I don't see it as bad to receive reports of missing tables (or
things not working as expected) during the experimental version of
it.

Roger.
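
As an illustration only (not code from the patch), the three-way split
described above, tables that could be added, tables that need a handler so
the relevant areas get mapped, and tables never exposed to Dom0, can be
sketched as a default-deny lookup on the 4-character table signature. The
enum, the rule table and acpi_classify() are hypothetical names, and the
entries only mirror the tentative opinions voiced in this thread, not the
actual whitelist.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* How a table is treated when building Dom0's view (illustrative only). */
enum acpi_policy {
    ACPI_HIDE,          /* never exposed; also the default for unknown tables */
    ACPI_PASS_THROUGH,  /* considered safe to expose as-is */
    ACPI_NEEDS_HANDLER, /* exposable once the referenced region is mapped */
};

struct acpi_rule {
    const char sig[5];  /* 4-character ACPI signature plus NUL */
    enum acpi_policy policy;
};

/* Example entries only, taken from the tables named in this thread. */
static const struct acpi_rule rules[] = {
    { "ERST", ACPI_PASS_THROUGH },
    { "EINJ", ACPI_PASS_THROUGH },
    { "HEST", ACPI_PASS_THROUGH },
    { "BERT", ACPI_NEEDS_HANDLER }, /* Boot Error Region must be in the p2m */
};

/* Whitelist semantics: anything not listed is hidden. */
static enum acpi_policy acpi_classify(const char *sig)
{
    size_t i;

    for ( i = 0; i < sizeof(rules) / sizeof(rules[0]); i++ )
        if ( strncmp(sig, rules[i].sig, 4) == 0 )
            return rules[i].policy;

    return ACPI_HIDE;
}
```

A blacklist would differ only in the fall-through value: unknown signatures
would be passed through instead of hidden, which is the behaviour being
questioned in this thread.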

Re: [Xen-devel] [RFC XEN PATCH v4 00/41] Add vNVDIMM support to HVM domains

2018-02-13 Thread Roger Pau Monné
On Tue, Feb 13, 2018 at 06:40:20AM -0700, Jan Beulich wrote:
> >>> On 13.02.18 at 12:13,  wrote:
> > On Tue, Feb 13, 2018 at 04:05:45AM -0700, Jan Beulich wrote:
> >> >>> On 13.02.18 at 11:29,  wrote:
> >> > On Tue, Feb 13, 2018 at 03:06:24AM -0700, Jan Beulich wrote:
> >> >> >>> On 12.02.18 at 11:05,  wrote:
> >> >> > If you map the NVDIMM as MMIO to Dom0 you don't need the M2P entries
> >> >> > IIRC, and if it's mapped using 1GB pages it shouldn't use that much
> >> >> > memory for the page tables (ie: you could just use normal RAM for the
> >> >> > page tables that map the NVDIMM IMO). Of course that only applies to
> >> >> > PVH/HVM.
> >> >> 
> >> >> But in order to use (part of) it in a RAM-like manner we need struct
> >> >> page_info for it.
> >> > 
> >> > I guess the main use of this would be to grant NVDIMM pages? And
> >> > without a page_info that's not possible.
> >> 
> >> Why grant? Simply giving such a page as RAM to a guest would
> >> already be a problem without struct page_info (as then we can't
> >> track the page owner, nor can we refcount the page).
> > 
> > My point was to avoid doing that, and always assign the pages as
> > MMIO, which IIRC doesn't require a struct page_info.
> 
> MMIO pages can't be used for things like page tables, because of
> the refcounting that's needed. The page being like RAM, however,
> implies that the guest needs to be able to use it as anything a RAM
> page can be used for.

OK, I'm quite unsure about what people actually use NVDIMM for, I
thought it was mostly used as some kind of storage, but if it's
actually used as plain RAM then yes, we likely need struct page_info
for them, which is a PITA.

My worries are that if you boot bare metal Linux and use NVDIMM, and
then reboot into Xen you won't be able to access the NVDIMM data
anymore AFAICT because Xen will have taken over it, and already used
part of it to store its own page tables, which is problematic IMO.

Roger.

Re: [Xen-devel] [RFC PATCH 24/49] ARM: new VGIC: Add IRQ sync/flush framework

2018-02-13 Thread Andre Przywara
Hi,

On 13/02/18 12:41, Julien Grall wrote:
> Hi Andre,
> 
> On 09/02/18 14:39, Andre Przywara wrote:
>> Implement the framework for syncing IRQs between our emulation and the
>> list registers, which represent the guest's view of IRQs.
>> This is done in kvm_vgic_flush_hwstate and kvm_vgic_sync_hwstate, which
> 
> You probably want to update the names here.

Sure.

>> gets called on guest entry and exit.
>> The code talking to the actual GICv2/v3 hardware is added in the
>> following patches.
>>
>> This is based on Linux commit 0919e84c0fc1, written by Marc Zyngier.
>>
>> Signed-off-by: Andre Przywara 
>> ---
>>   xen/arch/arm/vgic/vgic.c | 246
>> +++
>>   xen/arch/arm/vgic/vgic.h |   2 +
>>   2 files changed, 248 insertions(+)
>>
>> diff --git a/xen/arch/arm/vgic/vgic.c b/xen/arch/arm/vgic/vgic.c
>> index a4efd1fd03..a1f77130d4 100644
>> --- a/xen/arch/arm/vgic/vgic.c
>> +++ b/xen/arch/arm/vgic/vgic.c
>> @@ -380,6 +380,252 @@ int vgic_inject_irq(struct domain *d, struct
>> vcpu *vcpu, unsigned int intid,
>>   return 0;
>>   }
>>   +/**
>> + * vgic_prune_ap_list - Remove non-relevant interrupts from the list
>> + *
>> + * @vcpu: The VCPU pointer
>> + *
>> + * Go over the list of "interesting" interrupts, and prune those that we
>> + * won't have to consider in the near future.
>> + */
>> +static void vgic_prune_ap_list(struct vcpu *vcpu)
>> +{
>> +    struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>> +    struct vgic_irq *irq, *tmp;
>> +    unsigned long flags;
>> +
>> +retry:
>> +    spin_lock_irqsave(&vgic_cpu->ap_list_lock, flags);
>> +
>> +    list_for_each_entry_safe( irq, tmp, &vgic_cpu->ap_list_head,
>> ap_list )
> 
> See my comment on patch #22, this is where I am worried about going
> through the list every time we enter to the hypervisor from the guest.

I am not sure we can avoid this here, as this function is crucial to the
VGIC state machine.
We might later look into if we can avoid iterating through the whole
list or if we can shortcut some interrupts somehow, but I really would
be careful tinkering with this function too much.

>> +    {
>> +        struct vcpu *target_vcpu, *vcpuA, *vcpuB;
>> +
>> +        spin_lock(&irq->irq_lock);
>> +
>> +        BUG_ON(vcpu != irq->vcpu);
>> +
>> +        target_vcpu = vgic_target_oracle(irq);
>> +
>> +        if ( !target_vcpu )
>> +        {
>> +            /*
>> +             * We don't need to process this interrupt any
>> +             * further, move it off the list.
>> +             */
>> +            list_del(&irq->ap_list);
>> +            irq->vcpu = NULL;
>> +            spin_unlock(&irq->irq_lock);
>> +
>> +            /*
>> +             * This vgic_put_irq call matches the
>> +             * vgic_get_irq_kref in vgic_queue_irq_unlock,
>> +             * where we added the LPI to the ap_list. As
>> +             * we remove the irq from the list, we
>> +             * also drop the refcount.
>> +             */
>> +            vgic_put_irq(vcpu->domain, irq);
>> +            continue;
>> +        }
>> +
>> +        if ( target_vcpu == vcpu )
>> +        {
>> +            /* We're on the right CPU */
>> +            spin_unlock(&irq->irq_lock);
>> +            continue;
>> +        }
>> +
>> +        /* This interrupt looks like it has to be migrated. */
>> +
>> +        spin_unlock(&irq->irq_lock);
>> +        spin_unlock_irqrestore(&vgic_cpu->ap_list_lock, flags);
>> +
>> +        /*
>> +         * Ensure locking order by always locking the smallest
>> +         * ID first.
>> +         */
>> +        if ( vcpu->vcpu_id < target_vcpu->vcpu_id )
>> +        {
>> +            vcpuA = vcpu;
>> +            vcpuB = target_vcpu;
>> +        }
>> +        else
>> +        {
>> +            vcpuA = target_vcpu;
>> +            vcpuB = vcpu;
>> +        }
>> +
>> +        spin_lock_irqsave(&vcpuA->arch.vgic_cpu.ap_list_lock, flags);
>> +        spin_lock(&vcpuB->arch.vgic_cpu.ap_list_lock);
>> +        spin_lock(&irq->irq_lock);
>> +
>> +        /*
>> +         * If the affinity has been preserved, move the
>> +         * interrupt around. Otherwise, it means things have
>> +         * changed while the interrupt was unlocked, and we
>> +         * need to replay this.
>> +         *
>> +         * In all cases, we cannot trust the list not to have
>> +         * changed, so we restart from the beginning.
>> +         */
>> +        if ( target_vcpu == vgic_target_oracle(irq) )
>> +        {
>> +            struct vgic_cpu *new_cpu = &target_vcpu->arch.vgic_cpu;
>> +
>> +            list_del(&irq->ap_list);
>> +            irq->vcpu = target_vcpu;
>> +            list_add_tail(&irq->ap_list, &new_cpu->ap_list_head);
>> +        }
>> +
>> +        spin_unlock(&irq->irq_lock);
>> +        spin_unlock(&vcpuB->arch.vgic_cpu.ap_list_lock);
>> +        spin_unlock_irqrestore(&vcpuA->arch.vgic_cpu.ap_list_lock,
>> flags);
>> +        goto retry;
>> +    }
>> +
>> +    spin_unlock_irqrestore(&vgic_cpu->ap_list_lock, flags);
>> +}
>> +
>> +static inline void 
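
As an illustration only (not code from the patch), the "always lock the
smallest ID first" rule from the quoted function can be sketched with POSIX
mutexes. struct demo_vcpu, demo_first(), demo_lock_pair() and
demo_unlock_pair() are hypothetical names; the sketch assumes the two VCPUs
are distinct and ignores interrupt masking, which the real code handles with
spin_lock_irqsave().

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for a VCPU and its per-VCPU list lock. */
struct demo_vcpu {
    int id;
    pthread_mutex_t list_lock;
};

/* The VCPU whose lock must be taken first: the one with the smaller ID. */
static struct demo_vcpu *demo_first(struct demo_vcpu *x, struct demo_vcpu *y)
{
    return x->id < y->id ? x : y;
}

/* Lock both list locks in ID order, whatever order the caller passes. */
static void demo_lock_pair(struct demo_vcpu *x, struct demo_vcpu *y)
{
    struct demo_vcpu *first = demo_first(x, y);
    struct demo_vcpu *second = first == x ? y : x;

    pthread_mutex_lock(&first->list_lock);
    pthread_mutex_lock(&second->list_lock);
}

/* Release in the reverse order of acquisition. */
static void demo_unlock_pair(struct demo_vcpu *x, struct demo_vcpu *y)
{
    struct demo_vcpu *first = demo_first(x, y);
    struct demo_vcpu *second = first == x ? y : x;

    pthread_mutex_unlock(&second->list_lock);
    pthread_mutex_unlock(&first->list_lock);
}
```

Because every caller derives the same order from the IDs, two threads
migrating interrupts in opposite directions between the same pair of VCPUs
cannot each hold one lock while waiting for the other.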

[Xen-devel] [xen-unstable-smoke test] 119079: tolerable all pass - PUSHED

2018-02-13 Thread osstest service owner
flight 119079 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/119079/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt     13 migrate-support-check        fail   never pass
 test-arm64-arm64-xl-xsm      13 migrate-support-check        fail   never pass
 test-arm64-arm64-xl-xsm      14 saverestore-support-check    fail   never pass
 test-armhf-armhf-xl          13 migrate-support-check        fail   never pass
 test-armhf-armhf-xl          14 saverestore-support-check    fail   never pass

version targeted for testing:
 xen  1a42ffa3476ab433da9dc27c6d36f051b70592ed
baseline version:
 xen  8b1a5268daf0ff1ddca49d2e683e5bfabf6b9988

Last test of basis   118995  2018-02-12 12:02:43 Z    1 days
Testing same since   119079  2018-02-13 13:26:08 Z    0 days    1 attempts


People who touched revisions under test:
  Juergen Gross 
  Paul Semel 
  Simon Gaiser 
  Wei Liu 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-i386 pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   8b1a5268da..1a42ffa347  1a42ffa3476ab433da9dc27c6d36f051b70592ed -> smoke

Re: [Xen-devel] [PATCH v4 2/7] xen: xsm: flask: introduce XENMAPSPACE_gmfn_share for memory sharing

2018-02-13 Thread Jan Beulich
>>> On 13.02.18 at 16:15,  wrote:
> I've updated the comments according to your previous suggestions,
> do they look good to you?

The one in the public header is way too verbose. I specifically don't
see why you would need to spell out XSM privilege requirements
there. Please make new comments match existing ones in style and
verbosity if at all possible, while still conveying all necessary /
relevant information.

Jan


Re: [Xen-devel] [PATCH 3/3] pvh/dom0: whitelist PVH Dom0 ACPI tables

2018-02-13 Thread Jan Beulich
>>> On 13.02.18 at 16:11,  wrote:
> On Tue, Feb 13, 2018 at 06:41:14AM -0700, Jan Beulich wrote:
>> >>> On 13.02.18 at 12:27,  wrote:
>> > On Tue, Feb 13, 2018 at 04:04:17AM -0700, Jan Beulich wrote:
>> >> >>> On 13.02.18 at 10:59,  wrote:
>> >> > On Tue, Feb 13, 2018 at 02:29:08AM -0700, Jan Beulich wrote:
>> >> >> >>> On 08.02.18 at 13:25,  wrote:
>> >> >> > Signed-off-by: Roger Pau Monné 
>> >> >> 
>> >> >> A change like this should not come without description, providing a
>> >> >> reason for the change. Otherwise how will someone wanting to
>> >> >> understand the change in a couple of years actually be able to
>> >> >> make any sense of it. This is in particular because I continue to be
>> >> >> not fully convinced that white listing is appropriate in the Dom0
>> >> >> case (and for the record I'm similarly unconvinced that black listing
>> >> >> is the best choice, yet obviously we need to pick one of the two).
>> >> > 
>> >> > I'm sorry, I thought we agreed at the summit to convert this to
>> >> > whitelisting because it was likely better to simply not expose unknown
>> >> > ACPI tables to guests.
>> >> 
>> >> "to guests" != "to Dom0".
>> >> 
>> >> > I guess the commit message could be something like:
>> >> > 
>> >> > "The following list of whitelisted APIC tables are either known to work
>> >> > or don't require any resources to be mapped in either the IO or the
>> >> > memory space.
>> >> 
>> >> Even if the white listing vs black listing question wasn't still
>> >> undecided, I think we should revert the patch in favor of one
>> >> with a description. The one above might be fine with "ACPI" in
>> >> place of "APIC" as far as tables actively white listed are
>> >> concerned, but then it still remains open why certain tables
>> >> haven't been included. I'm in particular worried about various
>> >> APEI related tables, but invisibility of e.g. an IBFT could also
>> >> lead to boot problems.
>> > 
>> > Regarding APEI I think ERST, EINJ and HEST could be passed through,
>> > BERT however requires that the BOOT Error Region is mapped into Dom0
>> > p2m.
>> > 
>> > Since PVH Dom0 creation still ends up in a panic, I see no problem in
>> > adding those in follow up patches.
>> > 
>> > IBFT also looks safe to pass through.
>> 
>> But you realize I've named only the few that came to mind
>> immediately?
> 
> Sure, what I have in this patch is just the minimal set (plus a few
> others that seem completely fine) needed in order to boot on my two
> test boxes.
> 
> I know we will certainly have to expand this, but I see no issue in
> adding them as we go, the more that this is all still unused.

Unused - sure. But how will we learn which ones we need to add?
Surely waiting for problem reports from the field is not an
acceptable model.

Jan

Re: [Xen-devel] [PATCH 3/3] pvh/dom0: whitelist PVH Dom0 ACPI tables

2018-02-13 Thread Roger Pau Monné
On Tue, Feb 13, 2018 at 06:41:14AM -0700, Jan Beulich wrote:
> >>> On 13.02.18 at 12:27,  wrote:
> > On Tue, Feb 13, 2018 at 04:04:17AM -0700, Jan Beulich wrote:
> >> >>> On 13.02.18 at 10:59,  wrote:
> >> > On Tue, Feb 13, 2018 at 02:29:08AM -0700, Jan Beulich wrote:
> >> >> >>> On 08.02.18 at 13:25,  wrote:
> >> >> > Signed-off-by: Roger Pau Monné 
> >> >> 
> >> >> A change like this should not come without description, providing a
> >> >> reason for the change. Otherwise how will someone wanting to
> >> >> understand the change in a couple of years actually be able to
> >> >> make any sense of it. This is in particular because I continue to be
> >> >> not fully convinced that white listing is appropriate in the Dom0
> >> >> case (and for the record I'm similarly unconvinced that black listing
> >> >> is the best choice, yet obviously we need to pick one of the two).
> >> > 
> >> > I'm sorry, I thought we agreed at the summit to convert this to
> >> > whitelisting because it was likely better to simply not expose unknown
> >> > ACPI tables to guests.
> >> 
> >> "to guests" != "to Dom0".
> >> 
> >> > I guess the commit message could be something like:
> >> > 
> >> > "The following list of whitelisted APIC tables are either known to work
> >> > or don't require any resources to be mapped in either the IO or the
> >> > memory space.
> >> 
> >> Even if the white listing vs black listing question wasn't still
> >> undecided, I think we should revert the patch in favor of one
> >> with a description. The one above might be fine with "ACPI" in
> >> place of "APIC" as far as tables actively white listed are
> >> concerned, but then it still remains open why certain tables
> >> haven't been included. I'm in particular worried about various
> >> APEI related tables, but invisibility of e.g. an IBFT could also
> >> lead to boot problems.
> > 
> > Regarding APEI I think ERST, EINJ and HEST could be passed through,
> > BERT however requires that the BOOT Error Region is mapped into Dom0
> > p2m.
> > 
> > Since PVH Dom0 creation still ends up in a panic, I see no problem in
> > adding those in follow up patches.
> > 
> > IBFT also looks safe to pass through.
> 
> But you realize I've named only the few that came to mind
> immediately?

Sure, what I have in this patch is just the minimal set (plus a few
others that seem completely fine) needed in order to boot on my two
test boxes.

I know we will certainly have to expand this, but I see no issue in
adding them as we go, the more that this is all still unused.

Thanks, Roger.

Re: [Xen-devel] [PATCH v4 2/7] xen: xsm: flask: introduce XENMAPSPACE_gmfn_share for memory sharing

2018-02-13 Thread Zhongze Liu
Hi Jan,

I've updated the comments according to your previous suggestions,
do they look good to you?

2018-02-01 18:23 GMT+08:00 Jan Beulich :
>>>> On 30.01.18 at 18:50,  wrote:
>> --- a/xen/arch/x86/mm.c
>> +++ b/xen/arch/x86/mm.c
>> @@ -4126,6 +4126,10 @@ int xenmem_add_to_physmap_one(
>>  }
>>  case XENMAPSPACE_gmfn_foreign:
>>  return p2m_add_foreign(d, idx, gfn_x(gpfn), 
>> extra.foreign_domid);
>> +case XENMAPSPACE_gmfn_share:
>> +gdprintk(XENLOG_WARNING,
>> + "XENMAPSPACE_gmfn_share is currently not supported on 
>> x86\n");
>> +break;
>
> Please don't - a hypervisor log message isn't really useful here. It
> should be the tool stack to disallow respective options on x86
> until that's implemented.
>
>> --- a/xen/include/public/memory.h
>> +++ b/xen/include/public/memory.h
>> @@ -227,6 +227,10 @@ DEFINE_XEN_GUEST_HANDLE(xen_machphys_mapping_t);
>>Stage-2 using the Normal Memory
>>Inner/Outer Write-Back Cacheable
>>memory attribute. */

/*
 * GMFN from another dom, but with different privilege requirements.
 *
 * Suppose that (c), the current domain, is trying to map pages from (t) into
 * (d). gmfn_share doesn't require that (d) has privilege over (t). This
 * enables some usecases such as dom0 trying to share memory pages between
 * two unprivileged guests, which will otherwise be impossible using
 * gmfn_foreign. This is XENMEM_add_to_physmap_batch only, and currently ARM
 * only.
 *
 * The exact XSM privilege requirements are as follows:
 *
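 * ================  ============  ===================  ===================
 *                   (c) over (d)  (c) over (t)         (d) over (t)
 * ================  ============  ===================  ===================
 * _foreign (dummy)  XSM_TARGET    NO                   XSM_TARGET
 * _share (dummy)    XSM_TARGET    XSM_TARGET           NO
 * _foreign (flask)  MMU__PHYSMAP  NO                   MMU__MAP_READ/WRITE
 * _share (flask)    MMU__PHYSMAP  MMU__MAP_READ/WRITE  MMU__SHARE
 * ================  ============  ===================  ===================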
 */
>> +#define XENMAPSPACE_gmfn_share   6 /* Same as *_gmfn_foreign, but this is
>> +  for a privileged dom to share pages
>> +  between two doms. */
>> +
>
> The comment doesn't make clear why XENMAPSPACE_gmfn_foreign
> then can't be used. In particular it is left open how _both_ domains
> would be specified.
>
> Also XENMAPSPACE_gmfn_foreign is restricted to
> XENMEM_add_to_physmap_batch (a comment says so) - how about
> this new one? According to the actual code changes you do, there's
> no meaningful difference, in which case the restriction should be
> named here as well.
>
>> --- a/xen/include/xsm/dummy.h
>> +++ b/xen/include/xsm/dummy.h
>> @@ -521,6 +521,12 @@ static XSM_INLINE int 
>> xsm_map_gmfn_foreign(XSM_DEFAULT_ARG struct domain *d, str
>>  return xsm_default_action(action, d, t);
>>  }
>>
>> +static XSM_INLINE int xsm_map_gmfn_share(XSM_DEFAULT_ARG struct domain *d, 
>> struct domain *t)
>
> Line length.
>
>> +{

  /*
   * This action also requires that @current targets @d, but it has already been
   * checked somewhere higher in the call stack.
   *
   * Be aware that this is not an exact default equivalence of its flask variant
   * which also checks if @d and @t "are allowed to share memory pages", for we
   * don't have a proper default equivalence of such a check.
   */
>> +XSM_ASSERT_ACTION(XSM_TARGET);
>> +return xsm_default_action(action, current->domain, t);
>
> How does this represent a proper default equivalent of ...
>
> --- a/xen/xsm/flask/hooks.c
> +++ b/xen/xsm/flask/hooks.c
> @@ -1196,6 +1196,12 @@ static int flask_map_gmfn_foreign(struct domain *d, 
> struct domain *t)
>  return domain_has_perm(d, t, SECCLASS_MMU, MMU__MAP_READ | 
> MMU__MAP_WRITE);
>  }
>
> +static int flask_map_gmfn_share(struct domain *d, struct domain *t)
> +{
/*
 * This action also requires that @current has MMU__MAP_READ/WRITE over @d,
 * but that has already been checked somewhere higher in the call stack (for
 * example, by flask_add_to_physmap()).
 */
> +return current_has_perm(t, SECCLASS_MMU, MMU__MAP_READ | MMU__MAP_WRITE) 
> ?:
> +domain_has_perm(d, t, SECCLASS_MMU, MMU__SHARE_MEM);
>
> ... this?


Cheers,

Zhongze Liu.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [RFC PATCH 15/49] ARM: GIC: Allow tweaking the active state of an IRQ

2018-02-13 Thread Andre Przywara
Hi,

On 13/02/18 12:02, Julien Grall wrote:
> On 12/02/18 17:53, Andre Przywara wrote:
>> Hi,
> 
> Hi Andre,
> 
>> On 12/02/18 13:55, Julien Grall wrote:
>>> Hi Andre,
>>>
>>> On 09/02/18 14:39, Andre Przywara wrote:
 When playing around with hardware mapped, level triggered virtual IRQs,
 there is the need to explicitly set the active state of an interrupt at
 some point in time.
 To prepare the GIC for that, we introduce a set_active_state() function
 to let the VGIC manipulate the state of an associated hardware IRQ.

 Signed-off-by: Andre Przywara 
 ---
    xen/arch/arm/gic-v2.c |  9 +
    xen/arch/arm/gic-v3.c | 16 
    xen/arch/arm/gic.c    |  5 +
    xen/include/asm-arm/gic.h |  5 +
    4 files changed, 35 insertions(+)

 diff --git a/xen/arch/arm/gic-v2.c b/xen/arch/arm/gic-v2.c
 index 2e35892881..5339f69fbc 100644
 --- a/xen/arch/arm/gic-v2.c
 +++ b/xen/arch/arm/gic-v2.c
 @@ -235,6 +235,14 @@ static unsigned int gicv2_read_irq(void)
    return (readl_gicc(GICC_IAR) & GICC_IA_IRQ);
    }
    +static void gicv2_set_active_state(int irq, bool active)
>>>
>>> I would much prefer to have an irq_desc in parameter. This is matching
>>> the other interface
>>
>> ... and that's why I had it just like this in my first version. However
>> this proved to be nasty because I now need to get this irq_desc pointer
>> first, as the caller doesn't have it already. Since all we have and need
>> is the actual hardware IRQ number, I found it more straight-forward to
>> just use that number directly instead of going via the pointer and back
>> (h/w intid => irq_desc => irq).
>>
>>> and you could update the flags such as
>>> _IRQ_INPROGRESS which you don't do at the moment.
>>
>> Mmh, interesting point. I guess I should also clear this bit in the new
>> VGIC. At least once I wrapped my head around what this flag is
>> *actually* for (in conjunction with _IRQ_GUEST).
>> Anyway I guess this bit would still be set in our case.
> 
> For IRQ routed to the guest, the flag is used to know whether you need
> to EOI the interrupt on domain destruction.

Yeah, I found that. In general I am a bit suspicious of replicating and
tracking the hardware IRQ state in software.

> In general, I would like to keep desc->status in sync for the guest IRQ.
> This is useful for debugging and potentially some ratelimit on interrupt
> (I am thinking for ITS).
> 
>>
>>> Also, who is preventing two CPUs to clear the active bit at the same
>>> time?
>>
>> A certain hardware IRQ is assigned to one virtual IRQ on one VCPU at one
>> time only. Besides, GICD_ICACTIVERn has wired NAND semantics, so that's
>> naturally race free (as it was designed to be).
>> Unless I miss something here (happy to be pointed to an example where it
>> causes problems).
> 
> You could potentially have a race between ICACTIVER and ISACTIVER.

I don't see why this would be a problem:
Either you activate the IRQ or you deactivate it. The
wired-OR/wired-NAND semantics makes sure this never gets inconsistent on
the hardware side. If you issue two conflicting requests at the same
time, that's a benign race, which you either don't care about or handle
via locking in the code which triggers these requests.

Besides, we only do one direction in the code at the moment anyway.
And this should be *clearing* the active state, and not setting it,
which is a bug I discovered yesterday.
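
To illustrate the point about write-1-to-set and write-1-to-clear registers, here is a minimal software model. It is only a sketch, not the GIC driver code: the struct and function names are hypothetical, and only the semantics of ISACTIVER/ICACTIVER are taken from the discussion above. Because a write only affects the bits that are set in the written value, a caller never needs a read-modify-write sequence on the shared state.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one 32-bit bank of interrupt active bits. */
struct active_bank {
    uint32_t active;
};

/* Models a write to ISACTIVERn: each 1 bit sets the matching state bit. */
static inline void write_isactiver(struct active_bank *bank, uint32_t val)
{
    bank->active |= val;
}

/* Models a write to ICACTIVERn: each 1 bit clears the matching state bit. */
static inline void write_icactiver(struct active_bank *bank, uint32_t val)
{
    bank->active &= ~val;
}

/* Bit mask of an interrupt within its bank, as in 1U << (irq % 32). */
static inline uint32_t irq_mask(unsigned int irq)
{
    return 1U << (irq % 32);
}
```

Writing 0 bits is a no-op in both directions, so two CPUs touching different interrupts in the same bank cannot disturb each other. Two conflicting requests for the same interrupt still race, which is the benign case mentioned above.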

> is very similar to the enable/disable part. This matters a lot when
> updating desc->status.

Which is one reason why I am suspicious of this whole state replication.
But the desc lock should take care of this in general, no?

 +}
 +
    static void gicv2_set_irq_type(struct irq_desc *desc, unsigned int
 type)
    {
    uint32_t cfg, actual, edgebit;
 @@ -1241,6 +1249,7 @@ const static struct gic_hw_operations
 gicv2_ops = {
    .eoi_irq = gicv2_eoi_irq,
    .deactivate_irq  = gicv2_dir_irq,
    .read_irq    = gicv2_read_irq,
 +    .set_active_state    = gicv2_set_active_state,
    .set_irq_type    = gicv2_set_irq_type,
    .set_irq_priority    = gicv2_set_irq_priority,
    .send_SGI    = gicv2_send_SGI,
 diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
 index 08d4703687..595eaef43a 100644
 --- a/xen/arch/arm/gic-v3.c
 +++ b/xen/arch/arm/gic-v3.c
 @@ -475,6 +475,21 @@ static unsigned int gicv3_read_irq(void)
    return irq;
    }
    +static void gicv3_set_active_state(int irq, bool active)
 +{
 +    void __iomem *base;
 +
 +    if ( irq >= NR_GIC_LOCAL_IRQS)
 +    base = GICD + (irq / 32) * 4;
 +    else
 +    base = GICD_RDIST_SGI_BASE;
 +
 +    if ( active )
 +    writel(1U << (irq % 32), base + GICD_ISACTIVER);
 +    

Re: [Xen-devel] [RFC PATCH 24/49] ARM: new VGIC: Add IRQ sync/flush framework

2018-02-13 Thread Julien Grall



On 13/02/18 14:56, Andre Przywara wrote:

Hi,

On 13/02/18 14:31, Julien Grall wrote:

Hi Andre,

On 09/02/18 14:39, Andre Przywara wrote:

+/* Requires the VCPU's ap_list_lock to be held. */
+static void vgic_flush_lr_state(struct vcpu *vcpu)
+{
+    struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
+    struct vgic_irq *irq;
+    int count = 0;
+
+    ASSERT(spin_is_locked(&vgic_cpu->ap_list_lock));
+
+    if ( compute_ap_list_depth(vcpu) > gic_get_nr_lrs() )
+        vgic_sort_ap_list(vcpu);
+
+    list_for_each_entry( irq, &vgic_cpu->ap_list_head, ap_list )
+    {
+        spin_lock(&irq->irq_lock);
+
+        if ( unlikely(vgic_target_oracle(irq) != vcpu) )
+            goto next;
+
+        /*
+         * If we get an SGI with multiple sources, try to get
+         * them in all at once.
+         */
+        do
+        {
+            vgic_populate_lr(vcpu, irq, count++);
+        } while ( irq->source && count < gic_get_nr_lrs() );
+
+next:
+        spin_unlock(&irq->irq_lock);
+
+        if ( count == gic_get_nr_lrs() )
+        {
+            if ( !list_is_last(&irq->ap_list, &vgic_cpu->ap_list_head) )
+                vgic_set_underflow(vcpu);
+            break;
+        }
+    }
+
+    vcpu->arch.vgic_cpu.used_lrs = count;
+
+    /* Nuke remaining LRs */
+    for ( ; count < gic_get_nr_lrs(); count++)
+        vgic_clear_lr(vcpu, count);


Why do you need to nuke the LRs here, don't you always zero them when
clearing it?


We nuke our internal LR copies in here.
It might be interesting to see if we can get rid of those in Xen,
because we can always write to the LRs directly. But this is an
optimization I am not too keen on addressing too early, because this
deviates from the KVM VGIC architecture.


Oh, I thought you were writing back in the hardware when clearing. My 
mistake.


Cheers,

--
Julien Grall


Re: [Xen-devel] [RFC PATCH 23/49] ARM: new VGIC: Add IRQ sorting

2018-02-13 Thread Julien Grall



On 13/02/18 14:56, Andre Przywara wrote:

diff --git a/xen/common/list_sort.c b/xen/common/list_sort.c
new file mode 100644
index 00..9c5cc58e43
--- /dev/null
+++ b/xen/common/list_sort.c
@@ -0,0 +1,170 @@
+/*
+ * list_sort.c: merge sort implementation for linked lists
+ * Copied from the Linux kernel (lib/list_sort.c)
+ * (without specific copyright notice there)


I can see you moved from Linux to Xen coding style. Are there any other
changes made?


Just the list of include files, but I didn't touch any actual code.
Will mention this in the commit message for this separate patch.


Can you keep the list coding style in that case please?

Thank you.

--
Julien Grall


Re: [Xen-devel] [RFC PATCH 24/49] ARM: new VGIC: Add IRQ sync/flush framework

2018-02-13 Thread Andre Przywara
Hi,

On 13/02/18 14:31, Julien Grall wrote:
> Hi Andre,
> 
> On 09/02/18 14:39, Andre Przywara wrote:
>> +/* Requires the VCPU's ap_list_lock to be held. */
>> +static void vgic_flush_lr_state(struct vcpu *vcpu)
>> +{
>> +    struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>> +    struct vgic_irq *irq;
>> +    int count = 0;
>> +
>> +    ASSERT(spin_is_locked(&vgic_cpu->ap_list_lock));
>> +
>> +    if ( compute_ap_list_depth(vcpu) > gic_get_nr_lrs() )
>> +        vgic_sort_ap_list(vcpu);
>> +
>> +    list_for_each_entry( irq, &vgic_cpu->ap_list_head, ap_list )
>> +    {
>> +        spin_lock(&irq->irq_lock);
>> +
>> +        if ( unlikely(vgic_target_oracle(irq) != vcpu) )
>> +            goto next;
>> +
>> +        /*
>> +         * If we get an SGI with multiple sources, try to get
>> +         * them in all at once.
>> +         */
>> +        do
>> +        {
>> +            vgic_populate_lr(vcpu, irq, count++);
>> +        } while ( irq->source && count < gic_get_nr_lrs() );
>> +
>> +next:
>> +        spin_unlock(&irq->irq_lock);
>> +
>> +        if ( count == gic_get_nr_lrs() )
>> +        {
>> +            if ( !list_is_last(&irq->ap_list, &vgic_cpu->ap_list_head) )
>> +                vgic_set_underflow(vcpu);
>> +            break;
>> +        }
>> +    }
>> +
>> +    vcpu->arch.vgic_cpu.used_lrs = count;
>> +
>> +    /* Nuke remaining LRs */
>> +    for ( ; count < gic_get_nr_lrs(); count++)
>> +        vgic_clear_lr(vcpu, count);
> 
> Why do you need to nuke the LRs here, don't you always zero them when
> clearing it?

We nuke our internal LR copies in here.
It might be interesting to see if we can get rid of those in Xen,
because we can always write to the LRs directly. But this is an
optimization I am not too keen on addressing too early, because this
deviates from the KVM VGIC architecture.
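
A rough sketch of what "nuking our internal LR copies" means, under the assumption that the list registers are mirrored in a per-VCPU array and only written to hardware later. The type, the fixed NR_LRS and the function name are hypothetical and not taken from the patch, which asks gic_get_nr_lrs() for the real number.

```c
#include <assert.h>
#include <stdint.h>

#define NR_LRS 4  /* hypothetical fixed count for this sketch */

/* Hypothetical per-VCPU state: software copies of the list registers. */
struct lr_shadow {
    uint32_t lr[NR_LRS];
    unsigned int used_lrs;
};

/*
 * Copy up to NR_LRS pending entries into the shadow LRs, then zero the
 * slots that were not used, so that values left over from a previous
 * flush do not survive. Returns how many entries did not fit.
 */
static inline unsigned int flush_shadow(struct lr_shadow *s,
                                        const uint32_t *pending,
                                        unsigned int nr_pending)
{
    unsigned int count = 0;

    assert(s);

    while ( count < nr_pending && count < NR_LRS )
    {
        s->lr[count] = pending[count];
        count++;
    }

    s->used_lrs = count;

    /* "Nuke remaining LRs": only the in-memory copies are touched here. */
    for ( unsigned int i = count; i < NR_LRS; i++ )
        s->lr[i] = 0;

    return nr_pending - count;
}
```

If the hardware registers were written directly, the clearing loop would become a hardware write per unused LR, which is the optimization that is deferred above.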

Cheers,
Andre.


Re: [Xen-devel] [RFC PATCH 23/49] ARM: new VGIC: Add IRQ sorting

2018-02-13 Thread Andre Przywara
Hi,

Christoffer, Eric, Marc,
a question about locking order between multiple IRQs below. Could you
have a brief look, please?

On 13/02/18 12:30, Julien Grall wrote:
> Hi Andre,
> 
> On 09/02/18 14:39, Andre Przywara wrote:
>> Adds the sorting function to cover the case where you have more IRQs
>> to consider than you have LRs. We consider their priorities.
>> This pulls in Linux' list_sort.c , which is a merge sort implementation
>> for linked lists.
>>
>> This is based on Linux commit 8e4447457965, written by Christoffer Dall.
>>
>> Signed-off-by: Andre Przywara 
>> ---
>>   xen/arch/arm/vgic/vgic.c    |  59 +++
>>   xen/common/list_sort.c  | 170
>> 
>>   xen/include/xen/list_sort.h |  11 +++
> 
> You need to CC "THE REST" maintainers for this code. It would also make
> sense to have a separate patch for adding list_sort.c

Yeah, will do.

>>   3 files changed, 240 insertions(+)
>>   create mode 100644 xen/common/list_sort.c
>>   create mode 100644 xen/include/xen/list_sort.h
>>
>> diff --git a/xen/arch/arm/vgic/vgic.c b/xen/arch/arm/vgic/vgic.c
>> index f517df6d00..a4efd1fd03 100644
>> --- a/xen/arch/arm/vgic/vgic.c
>> +++ b/xen/arch/arm/vgic/vgic.c
>> @@ -16,6 +16,7 @@
>>    */
>>     #include 
>> +#include 
>>   #include 
>>     #include 
>> @@ -163,6 +164,64 @@ static struct vcpu *vgic_target_oracle(struct
>> vgic_irq *irq)
>>   return NULL;
>>   }
>>   +/*
>> + * The order of items in the ap_lists defines how we'll pack things
>> in LRs as
>> + * well, the first items in the list being the first things populated
>> in the
>> + * LRs.
>> + *
>> + * A hard rule is that active interrupts can never be pushed out of
>> the LRs
>> + * (and therefore take priority) since we cannot reliably trap on
>> deactivation
>> + * of IRQs and therefore they have to be present in the LRs.
>> + *
>> + * Otherwise things should be sorted by the priority field and the GIC
>> + * hardware support will take care of preemption of priority groups etc.
>> + *
>> + * Return negative if "a" sorts before "b", 0 to preserve order, and
>> positive
>> + * to sort "b" before "a".
> 
> Finally a good explanation of the return value of a sort function :). I
> always get confused what the return is supposed to be.
> 
>> + */
>> +static int vgic_irq_cmp(void *priv, struct list_head *a, struct
>> list_head *b)
>> +{
>> +    struct vgic_irq *irqa = container_of(a, struct vgic_irq, ap_list);
>> +    struct vgic_irq *irqb = container_of(b, struct vgic_irq, ap_list);
>> +    bool penda, pendb;
>> +    int ret;
>> +
>> +    spin_lock(&irqa->irq_lock);
>> +    spin_lock(&irqb->irq_lock);
> 
> I guess the locking order does not matter here because this is the only
> place where two IRQs lock have to be taken?

Mmh, good question. I guess indeed in practice this will not be a problem:
- As you mentioned this should be the only(?) place where we take
multiple IRQ locks, but that sounds fragile.
- A certain IRQ should only be on one VCPU list at a given point in
time. So there would be no race with two instances of this compare
function trying to lock the same IRQ.

But that sounds a bit dodgy to rely on. It should be relatively
straightforward to fix this with a simple comparison, shouldn't it?
CC:ing Christoffer, Marc and Eric here to see if we should add this (in
KVM as well).
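
One way the "simple comparison" could look is to always take the lock of the lower-addressed IRQ first. This is only a sketch of that idea under stated assumptions: the toy spinlock built on C11 atomic_flag stands in for Xen's spinlock, and demo_irq, lock_pair() and unlock_pair() are hypothetical names, not part of the patch or of KVM.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy spinlock standing in for the real one (assumption, not Xen API). */
typedef struct {
    atomic_flag held;
} demo_lock_t;

static inline bool demo_trylock(demo_lock_t *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

static inline void demo_lock(demo_lock_t *l)
{
    while ( !demo_trylock(l) )
        ; /* spin until the holder releases the lock */
}

static inline void demo_unlock(demo_lock_t *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Hypothetical stand-in for struct vgic_irq; only the lock matters here. */
struct demo_irq {
    demo_lock_t lock;
    int priority;
};

static inline void demo_irq_init(struct demo_irq *irq, int priority)
{
    atomic_flag_clear(&irq->lock.held);
    irq->priority = priority;
}

/*
 * Take both locks in one global order (lowest address first), so two
 * callers locking the same pair cannot deadlock against each other.
 * Returns the IRQ whose lock was taken first.
 */
static inline struct demo_irq *lock_pair(struct demo_irq *a,
                                         struct demo_irq *b)
{
    struct demo_irq *first = ((uintptr_t)a < (uintptr_t)b) ? a : b;
    struct demo_irq *second = (first == a) ? b : a;

    assert(a != b);
    demo_lock(&first->lock);
    demo_lock(&second->lock);

    return first;
}

/* The unlock order does not matter for deadlock freedom. */
static inline void unlock_pair(struct demo_irq *a, struct demo_irq *b)
{
    demo_unlock(&a->lock);
    demo_unlock(&b->lock);
}
```

With this, lock_pair(x, y) and lock_pair(y, x) acquire the locks in the same order, so the comparator no longer depends on each IRQ being on only one VCPU list.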

> Also, this will be done with irq disabled right? In that case, may I ask
> for an ASSERT(!local_irq_is_enabled())? Or maybe in vgic_sort_ap_list.

OK.

>> +
>> +    if ( irqa->active || irqb->active )
>> +    {
>> +        ret = (int)irqb->active - (int)irqa->active;
>> +        goto out;
>> +    }
>> +
>> +    penda = irqa->enabled && irq_is_pending(irqa);
>> +    pendb = irqb->enabled && irq_is_pending(irqb);
>> +
>> +    if ( !penda || !pendb )
>> +    {
>> +        ret = (int)pendb - (int)penda;
>> +        goto out;
>> +    }
>> +
>> +    /* Both pending and enabled, sort by priority */
>> +    ret = irqa->priority - irqb->priority;
>> +out:
>> +    spin_unlock(&irqb->irq_lock);
>> +    spin_unlock(&irqa->irq_lock);
>> +    return ret;
>> +}
>> +
>> +/* Must be called with the ap_list_lock held */
>> +static void vgic_sort_ap_list(struct vcpu *vcpu)
>> +{
>> +    struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>> +
>> +    ASSERT(spin_is_locked(&vgic_cpu->ap_list_lock));
>> +
>> +    list_sort(NULL, &vgic_cpu->ap_list_head, vgic_irq_cmp);
>> +}
>> +
>>   /*
>>    * Only valid injection if changing level for level-triggered IRQs
>> or for a
>>    * rising edge.
>> diff --git a/xen/common/list_sort.c b/xen/common/list_sort.c
>> new file mode 100644
>> index 00..9c5cc58e43
>> --- /dev/null
>> +++ b/xen/common/list_sort.c
>> @@ -0,0 +1,170 @@
>> +/*
>> + * list_sort.c: merge sort implementation for linked lists
>> + * Copied from the Linux kernel (lib/list_sort.c)
>> + * (without specific copyright notice there)
> 
> I can see you moved from Linux to Xen coding style. Are there any other
> changes made?


[Xen-devel] [seabios test] 119060: regressions - FAIL

2018-02-13 Thread osstest service owner
flight 119060 seabios real [real]
http://logs.test-lab.xenproject.org/osstest/logs/119060/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop   fail REGR. vs. 115539

Tests which did not succeed, but are not blocking:
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop  fail like 115539
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop  fail like 115539
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop  fail like 115539
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check  fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check  fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-install  fail never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install  fail never pass

version targeted for testing:
 seabios  4a6dbcea3e412fe12effa2f812f50dd7eae90955
baseline version:
 seabios  0ca6d6277dfafc671a5b3718cbeb5c78e2a888ea

Last test of basis   115539  2017-11-03 20:48:58 Z  101 days
Failing since        115733  2017-11-10 17:19:59 Z   94 days  119 attempts
Testing same since   118668  2018-02-08 04:50:43 Z    5 days    7 attempts


People who touched revisions under test:
  Kevin O'Connor 
  Marcel Apfelbaum 
  Michael S. Tsirkin 
  Nikolay Nikolov 
  Paul Menzel 
  Stefan Berger 

jobs:
 build-amd64-xsm                                      pass
 build-i386-xsm                                       pass
 build-amd64                                          pass
 build-i386                                           pass
 build-amd64-libvirt                                  pass
 build-i386-libvirt                                   pass
 build-amd64-pvops                                    pass
 build-i386-pvops                                     pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm   pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm    pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm        pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm         pass
 test-amd64-amd64-qemuu-nested-amd                    fail
 test-amd64-i386-qemuu-rhel6hvm-amd                   pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64            pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64             pass
 test-amd64-amd64-xl-qemuu-win7-amd64                 fail
 test-amd64-i386-xl-qemuu-win7-amd64                  fail
 test-amd64-amd64-xl-qemuu-ws16-amd64                 fail
 test-amd64-i386-xl-qemuu-ws16-amd64                  fail
 test-amd64-amd64-xl-qemuu-win10-i386                 fail
 test-amd64-i386-xl-qemuu-win10-i386                  fail
 test-amd64-amd64-qemuu-nested-intel                  pass
 test-amd64-i386-qemuu-rhel6hvm-intel                 pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.


commit 4a6dbcea3e412fe12effa2f812f50dd7eae90955
Author: Nikolay Nikolov 
Date:   Sun Feb 4 17:27:01 2018 +0200

floppy: Use timer_check() in floppy_wait_irq()

Use timer_check() instead of using floppy_motor_counter in BDA for the
timeout check in floppy_wait_irq().

The problem with using floppy_motor_counter was that, after it reaches
0, it immediately stops the floppy motors, which is not what is
supposed to happen on real hardware. Instead, after a timeout (like in
the end of every floppy operation, regardless of the result - success,
timeout or error), the floppy motors must be kept spinning for
additional 2 seconds (the FLOPPY_MOTOR_TICKS). So, now the
floppy_motor_counter is initialized to 255 (the max value) in the
beginning of the floppy operation. For IRQ timeouts, a different
timeout is used, 

Re: [Xen-devel] [PATCH 1/7] x86/alt: Drop unused alternative infrastructure

2018-02-13 Thread Andrew Cooper
On 13/02/18 14:22, Jan Beulich wrote:
 On 12.02.18 at 12:23,  wrote:
>> --- a/xen/include/asm-x86/alternative.h
>> +++ b/xen/include/asm-x86/alternative.h
>> @@ -65,11 +65,6 @@ extern void alternative_instructions(void);
>>  ALTERNATIVE(oldinstr, newinstr1, feature1)\
>>  ALTERNATIVE_N(newinstr2, feature2, 2)
>>  
>> -#define ALTERNATIVE_3(oldinstr, newinstr1, feature1, newinstr2, feature2, \
>> -  newinstr3, feature3)\
>> -ALTERNATIVE_2(oldinstr, newinstr1, feature1, newinstr2, feature2) \
>> -ALTERNATIVE_N(newinstr3, feature3, 3)
>> -
>>  /*
>>   * Alternative instructions for different CPU types or capabilities.
>>   *
> While this one is fine, ...
>
>> @@ -118,26 +113,6 @@ extern void alternative_instructions(void);
>> newinstr2, feature2) \
>>   : output : input)
>>  
>> -/*
>> - * This is similar to alternative_io. But it has three features and
>> - * respective instructions.
>> - *
>> - * If CPU has feature3, newinstr3 is used.
>> - * Otherwise, if CPU has feature2, newinstr2 is used.
>> - * Otherwise, if CPU has feature1, newinstr1 is used.
>> - * Otherwise, oldinstr is used.
>> - */
>> -#define alternative_io_3(oldinstr, newinstr1, feature1, newinstr2,  \
>> - feature2, newinstr3, feature3, output, \
>> - input...)  \
>> -asm volatile(ALTERNATIVE_3(oldinstr, newinstr1, feature1,   \
>> -   newinstr2, feature2, newinstr3,  \
>> -   feature3)\
>> - : output : input)
>> -
>> -/* Use this macro(s) if you need more than one output parameter. */
>> -#define ASM_OUTPUT2(a...) a
> ... I'm having patches to post which use both of these, so I'd
> very much prefer them to not go away. It is simply a lack of time
> which resulted in me not having posted that series already.

In which case I'll need to review that patch before commenting on this
one (i.e. whether it actually needs to be an alternative_3, or whether
there is a shorter way to do it).

The problem is that the gas_max() expression expands its parameters 5
times, and ISTR a report on LKML saying that older versions of GAS
couldn't cope with the eventual expansion of the 3-replacement version.
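
For readers unfamiliar with the idiom, the following is a C analogue of a branch-free maximum of the kind gas_max() computes. It is a sketch under assumptions: DEMO_MAX and demo_max are hypothetical names, and the exact assembler expression in the tree may differ. In GAS a true comparison evaluates to -1 rather than 1, so the assembler form needs an extra negation that the C form below does not.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Branch-free maximum. In C, (a < b) is 0 or 1, so -(uint32_t)(a < b) is
 * either 0 or an all-ones mask, and a ^ ((a ^ b) & mask) yields a or b.
 * Note that "a" is expanded three times and "b" twice, five in total.
 */
#define DEMO_MAX(a, b) ((a) ^ (((a) ^ (b)) & -(uint32_t)((a) < (b))))

static inline uint32_t demo_max(uint32_t a, uint32_t b)
{
    return DEMO_MAX(a, b);
}
```

Since every use multiplies the text of its arguments, nesting such a macro to cover three replacements makes the final expression grow quickly, which is the expansion problem referred to above.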

~Andrew


Re: [Xen-devel] [PATCH 3/7] x86/alt: Clean up the assembly used to generate alternatives

2018-02-13 Thread Jan Beulich
 >>> On 12.02.18 at 12:23,  wrote:
> --- a/xen/include/asm-x86/alternative-asm.h
> +++ b/xen/include/asm-x86/alternative-asm.h
> @@ -9,60 +9,67 @@
>   * enough information for the alternatives patching code to patch an
>   * instruction. See apply_alternatives().
>   */
> -.macro altinstruction_entry orig alt feature orig_len alt_len
> +.macro altinstruction_entry orig repl feature orig_len repl_len
>  .long \orig - .
> -.long \alt - .
> +.long \repl - .
>  .word \feature
>  .byte \orig_len
> -.byte \alt_len
> +.byte \repl_len
>  .endm
>  
> +#define orig_len   (.L\@_orig_e   - .L\@_orig_s)
> +#define repl_len(nr)   (.L\@_repl_e\()nr  - .L\@_repl_s\()nr)
> +#define decl_repl(insn, nr) .L\@_repl_s\()nr: insn; .L\@_repl_e\()nr:

Wouldn't it work equally well but look slightly less odd if you used
\(nr) instead of \()nr?

> --- a/xen/include/asm-x86/alternative.h
> +++ b/xen/include/asm-x86/alternative.h
> @@ -26,44 +26,50 @@ extern void apply_alternatives(const struct alt_instr 
> *start,
> const struct alt_instr *end);
>  extern void alternative_instructions(void);
>  
> -#define OLDINSTR(oldinstr)  "661:\n\t" oldinstr "\n662:\n"
> +#define OLDINSTR(oldinstr)  ".L%=_orig_s:\n\t" oldinstr 
> "\n.L%=_orig_e:\n"

Isn't this too similar a naming scheme to what the assembler side
uses? I.e. is it entirely certain that no C file will ever (indirectly)
include alternative-asm.h, potentially resulting in a label name
clash then?

Here please also don't forget that you're competing with the
compiler for the .L name space, so some better disambiguation
may be advisable (e.g. starting the names with .LXEN).

> -#define b_replacement(number)   "663"#number
> -#define e_replacement(number)   "664"#number
> +#define repl_s(num) ".L%=_repl_s"#num
> +#define repl_e(num) ".L%=_repl_e"#num

Since you don't (and can't) #undef them, how about alt_repl_s()
and alt_repl_e()?

Jan



Re: [Xen-devel] [RFC PATCH 24/49] ARM: new VGIC: Add IRQ sync/flush framework

2018-02-13 Thread Julien Grall

Hi Andre,

On 09/02/18 14:39, Andre Przywara wrote:

+/* Requires the VCPU's ap_list_lock to be held. */
+static void vgic_flush_lr_state(struct vcpu *vcpu)
+{
+    struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
+    struct vgic_irq *irq;
+    int count = 0;
+
+    ASSERT(spin_is_locked(&vgic_cpu->ap_list_lock));
+
+    if ( compute_ap_list_depth(vcpu) > gic_get_nr_lrs() )
+        vgic_sort_ap_list(vcpu);
+
+    list_for_each_entry( irq, &vgic_cpu->ap_list_head, ap_list )
+    {
+        spin_lock(&irq->irq_lock);
+
+        if ( unlikely(vgic_target_oracle(irq) != vcpu) )
+            goto next;
+
+        /*
+         * If we get an SGI with multiple sources, try to get
+         * them in all at once.
+         */
+        do
+        {
+            vgic_populate_lr(vcpu, irq, count++);
+        } while ( irq->source && count < gic_get_nr_lrs() );
+
+next:
+        spin_unlock(&irq->irq_lock);
+
+        if ( count == gic_get_nr_lrs() )
+        {
+            if ( !list_is_last(&irq->ap_list, &vgic_cpu->ap_list_head) )
+                vgic_set_underflow(vcpu);
+            break;
+        }
+    }
+
+    vcpu->arch.vgic_cpu.used_lrs = count;
+
+    /* Nuke remaining LRs */
+    for ( ; count < gic_get_nr_lrs(); count++)
+        vgic_clear_lr(vcpu, count);


Why do you need to nuke the LRs here, don't you always zero them when 
clearing it?


Cheers,

--
Julien Grall


Re: [Xen-devel] [RFC PATCH 25/49] ARM: new VGIC: Add GICv2 world switch backend

2018-02-13 Thread Julien Grall

Hi,

On 09/02/18 14:39, Andre Przywara wrote:

Processing maintenance interrupts and accessing the list registers
are dependent on the host's GIC version.
Introduce vgic-v2.c to contain GICv2 specific functions.
Implement the GICv2 specific code for syncing the emulation state
into the VGIC registers.
This also adds the hook to let Xen setup the host GIC addresses.

This is based on Linux commit 140b086dd197, written by Marc Zyngier.

Signed-off-by: Andre Przywara 
---
  xen/arch/arm/vgic/vgic-v2.c | 261 
  xen/arch/arm/vgic/vgic.c|  20 
  xen/arch/arm/vgic/vgic.h|   8 ++
  3 files changed, 289 insertions(+)
  create mode 100644 xen/arch/arm/vgic/vgic-v2.c

diff --git a/xen/arch/arm/vgic/vgic-v2.c b/xen/arch/arm/vgic/vgic-v2.c
new file mode 100644
index 00..10fc467ffa
--- /dev/null
+++ b/xen/arch/arm/vgic/vgic-v2.c
@@ -0,0 +1,261 @@
+/*
+ * Copyright (C) 2015, 2016 ARM Ltd.
+ * Imported from Linux ("new" KVM VGIC) and heavily adapted to Xen.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "vgic.h"
+
+#define GICH_ELRSR0                 0x30
+#define GICH_ELRSR1                 0x34
+#define GICH_LR0                    0x100
+
+#define GICH_LR_VIRTUALID           (0x3ff << 0)
+#define GICH_LR_PHYSID_CPUID_SHIFT  (10)
+#define GICH_LR_PHYSID_CPUID        (0x3ff << GICH_LR_PHYSID_CPUID_SHIFT)
+#define GICH_LR_PRIORITY_SHIFT      23
+#define GICH_LR_STATE               (3 << 28)
+#define GICH_LR_PENDING_BIT         (1 << 28)
+#define GICH_LR_ACTIVE_BIT          (1 << 29)
+#define GICH_LR_EOI                 (1 << 19)
+#define GICH_LR_HW                  (1 << 31)


Can we define them either in gic.h or in a new header gic-v2.h?


+
+static struct {
+bool enabled;
+paddr_t dbase;  /* Distributor interface address */
+paddr_t cbase;  /* CPU interface address & size */
+paddr_t csize;
+paddr_t vbase;  /* Virtual CPU interface address */
+void __iomem *hbase;/* Hypervisor control interface */
+
+/* Offset to add to get an 8kB contiguous region if GIC is aliased */
+uint32_t aliased_offset;
+} gic_v2_hw_data;
+
+void vgic_v2_setup_hw(paddr_t dbase, paddr_t cbase, paddr_t csize,
+  paddr_t vbase, void __iomem *hbase,
+  uint32_t aliased_offset)
+{
+gic_v2_hw_data.enabled = true;
+gic_v2_hw_data.dbase = dbase;
+gic_v2_hw_data.cbase = cbase;
+gic_v2_hw_data.csize = csize;
+gic_v2_hw_data.vbase = vbase;
+gic_v2_hw_data.hbase = hbase;
+gic_v2_hw_data.aliased_offset = aliased_offset;
+}
+
+void vgic_v2_set_underflow(struct vcpu *vcpu)
+{
+gic_hw_ops->update_hcr_status(GICH_HCR_UIE, 1);
+}
+
+/*
+ * transfer the content of the LRs back into the corresponding ap_list:
+ * - active bit is transferred as is
+ * - pending bit is
+ *   - transferred as is in case of edge sensitive IRQs
+ *   - set to the line-level (resample time) for level sensitive IRQs
+ */
+void vgic_v2_fold_lr_state(struct vcpu *vcpu)


I am wondering how much we could share this code with vgic_v3_fold_lr_state.


+{
+struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
+struct vgic_v2_cpu_if *cpuif = &vgic_cpu->vgic_v2;
+int lr;


unsigned please.


+unsigned long flags;
+
+cpuif->vgic_hcr &= ~GICH_HCR_UIE;
+
+for ( lr = 0; lr < vgic_cpu->used_lrs; lr++ )
+{
+u32 val = cpuif->vgic_lr[lr];
+u32 intid = val & GICH_LR_VIRTUALID;
+struct vgic_irq *irq;
+
+irq = vgic_get_irq(vcpu->domain, vcpu, intid);
+
+spin_lock_irqsave(&irq->irq_lock, flags);
+
+/* Always preserve the active bit */
+irq->active = !!(val & GICH_LR_ACTIVE_BIT);
+
+/* Edge is the only case where we preserve the pending bit */
+if ( irq->config == VGIC_CONFIG_EDGE && (val & GICH_LR_PENDING_BIT) )
+{
+irq->pending_latch = true;
+
+if ( vgic_irq_is_sgi(intid) )
+{
+u32 cpuid = val & GICH_LR_PHYSID_CPUID;
+
+cpuid >>= GICH_LR_PHYSID_CPUID_SHIFT;
+irq->source |= (1 << cpuid);
+}
+}
+


May I ask to keep the big comments from KVM around? It looks quite 
useful to have it.
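
As a reading aid for the loop above, here is a small sketch of how a GICv2 list register value breaks down into the fields the code tests. The masks are the ones from the defines in the quoted patch, written with unsigned constants. The struct lr_fields and decode_lr() are hypothetical helpers for illustration and do not exist in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Field layout as in the quoted patch, with unsigned constants. */
#define GICH_LR_VIRTUALID           (0x3ffU << 0)
#define GICH_LR_PHYSID_CPUID_SHIFT  10
#define GICH_LR_PHYSID_CPUID        (0x3ffU << GICH_LR_PHYSID_CPUID_SHIFT)
#define GICH_LR_PENDING_BIT         (1U << 28)
#define GICH_LR_ACTIVE_BIT          (1U << 29)

/* Hypothetical container for the decoded fields. */
struct lr_fields {
    uint32_t intid;   /* virtual interrupt ID, bits [9:0] */
    uint32_t cpuid;   /* value of the PHYSID_CPUID field, bits [19:10] */
    bool pending;
    bool active;
};

static inline struct lr_fields decode_lr(uint32_t val)
{
    struct lr_fields f;

    f.intid   = val & GICH_LR_VIRTUALID;
    f.cpuid   = (val & GICH_LR_PHYSID_CPUID) >> GICH_LR_PHYSID_CPUID_SHIFT;
    f.pending = !!(val & GICH_LR_PENDING_BIT);
    f.active  = !!(val & GICH_LR_ACTIVE_BIT);

    return f;
}
```

For an SGI, the fold function then records the source with irq->source |= (1 << cpuid), as in the quoted code.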



+if ( irq->hw && irq->config == VGIC_CONFIG_LEVEL &&


You 

Re: [Xen-devel] [PATCH v3 00/17] Alternative Meltdown mitigation

2018-02-13 Thread Juergen Gross
On 13/02/18 15:16, Jan Beulich wrote:
 On 13.02.18 at 12:36,  wrote:
>> On 12/02/18 18:54, Dario Faggioli wrote:
>>> On Fri, 2018-02-09 at 15:01 +0100, Juergen Gross wrote:
 This series is available via github:

 https://github.com/jgross1/xen.git xpti

 Dario wants to do some performance tests for this series to compare
 performance with Jan's series with all optimizations posted.

>>> And some of this is indeed ready.
>>>
>>> So, this is again on my testbox, with 16 pCPUs and 12GB of RAM, and I
>>> used a guest with 16 vCPUs and 10GB of RAM.
>>>
>>> I benchmarked Jan's patch *plus* all the optimizations and overhead
>>> mitigation patches he posted on xen-devel (the ones that are already in
>>> staging, and also the ones that are not yet there). That's "XPTI-Light" 
>>> in the table and in the graphs. Booting this with 'xpti=false' is
>>> considered the baseline, while booting with 'xpti=true' is the actual
>>> thing we want to measure. :-)
>>>
>>> Then I ran the same benchmarks on Juergen's branch above, enabled at
>>> boot. That's "XPYI" in the table and graphs (yes, I know, sorry for the
>>> typo!).
>>>
>>> http://openbenchmarking.org/result/1802125-DARI-180211144 
>>>
>> http://openbenchmarking.org/result/1802125-DARI-180211144_hgv=XPTI-Light+xpti%3Dfalse_nor=y_hgv=XPTI-Light+xpti%3Dfalse
>>
>> ...
>>
>>> Or, actually, that's not it! :-O In fact, right while I was writing
>>> this report, it came out on IRC that something can be done, on
>>> Juergen's XPTI series, to mitigate the performance impact a bit.
>>>
>>> Juergen sent me a patch already, and I'm re-running the benchmarks with
>>> that applied. I'll let you know how the results end up looking.
>>
>> It turned out the results are basically no different. So the general
>> problem with context switches is still there (which I expected, BTW).
>>
>> So I guess the really bad results with benchmarks triggering a lot of
>> vcpu scheduling show that my approach isn't going to fly, as the most
>> probable cause of the slow context switches is the newly introduced
>> serializing instructions (LTR, WRMSRs), which can't be avoided when we
>> want to use per-vcpu stacks.
>>
>> OTOH the results of the other benchmarks showing some advantage over
>> Jan's solution indicate there is indeed an aspect which can be improved.
>>
>> Instead of preferring one approach over the other I have thought about
>> a way to use the best parts of each solution in a combined variant. In
>> case nobody feels strongly about pursuing my current approach further I'd
>> like to suggest the following scheme:
>>
>> - Whenever an L4 page table of the guest is in use on only one physical
>>   cpu, use the L4 shadow cache of my series in order to avoid having to
>>   copy the L4 contents each time the hypervisor is left.
>>
>> - As soon as an L4 page table is being activated on a second cpu, fall
>>   back to using the per-cpu page table on that cpu (the cpu already using
>>   the L4 page table can continue doing so).
> 
> Would the first of these CPUs continue to run on the shadow L4 in
> that case? If so, would there be no synchronization issues? If not,
> how do you envision "telling" it to move to the per-CPU L4 (which,
> afaict, includes knowing which vCPU / pCPU that is)?

My idea is to let that CPU keep running on the shadow L4. This L4 is
already configured for the CPU it is being used on, so we just have to
avoid activating it on a second CPU.

I don't see synchronization issues as all guest L4 modifications would
be mirrored in the shadow, as done in my series already.
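
To make the proposed fallback rule concrete, here is a minimal sketch of the decision alone. All names are hypothetical and come from neither series. Locking is left out, so a real implementation would need the ownership update to be atomic.

```c
#define NO_CPU (-1)

/* Hypothetical per-guest-L4 bookkeeping. */
struct guest_l4 {
    int shadow_cpu;   /* pCPU currently running on the shadow L4, or NO_CPU */
};

enum l4_mode { USE_SHADOW_L4, USE_PERCPU_L4 };

/* Decide which page table a pCPU uses when it activates this guest L4. */
enum l4_mode activate_l4(struct guest_l4 *l4, int cpu)
{
    if (l4->shadow_cpu == NO_CPU || l4->shadow_cpu == cpu) {
        /* First (or same) user: the shadow L4 is set up for this cpu only. */
        l4->shadow_cpu = cpu;
        return USE_SHADOW_L4;
    }

    /* Shadow already in use elsewhere: fall back to the per-cpu L4 copy. */
    return USE_PERCPU_L4;
}

/* Release the shadow L4 when its current user stops running on it. */
void deactivate_l4(struct guest_l4 *l4, int cpu)
{
    if (l4->shadow_cpu == cpu)
        l4->shadow_cpu = NO_CPU;
}
```

The CPU that already owns the shadow keeps it, so nothing has to be "told" to move; only the second CPU takes the slower copy path.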

>> - Before activation of a L4 shadow page table it is modified to map the
>>   per-cpu data needed in guest mode for the local cpu only.
> 
> I had been considering to do this in XPTI light for other purposes
> too (for example it might be possible to short circuit the guest
> system call path to get away without multiple page table switches).
> We really first need to settle on how much we feel is safe to expose
> while the guest is running. So far I've been under the impression
> that people actually think we should further reduce exposed pieces
> of code/data, rather than widen the "window".

I would like to have some prepared L3 page tables for each cpu meant to
be hooked into the correct shadow L4 slots. The shadow L4 should map as
few hypervisor parts as possible (again like in my current series).

>> - Use INVPCID instead of %cr4 PGE toggling to speed up purging global
>>   TLB entries (depending on the availability of the feature, of course).
> 
> That's something we should do independent of what XPTI model
> we'd like to retain long term.

Right. That was just for completeness.

>> - Use the PCID feature for being able to avoid purging TLB entries which
>>   might be needed later (depending on hardware again).
> 
> Which first of all raises the question: Does PCID (other than the U
> bit) prevent use of TLB entries in the wrong context? IOW is the
> PCID 

Re: [Xen-devel] [PATCH 2/7] x86/alt: Clean up struct alt_instr and its users

2018-02-13 Thread Jan Beulich
 >>> On 12.02.18 at 12:23,  wrote:
> * Rename some fields for consistency and clarity, and use standard types.
>  * Don't opencode the use of ALT_{ORIG,REPL}_PTR().
> 
> No functional change.
> 
> Signed-off-by: Andrew Cooper 

Reviewed-by: Jan Beulich 



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 1/7] x86/alt: Drop unused alternative infrastructure

2018-02-13 Thread Jan Beulich
>>> On 12.02.18 at 12:23,  wrote:
> --- a/xen/include/asm-x86/alternative.h
> +++ b/xen/include/asm-x86/alternative.h
> @@ -65,11 +65,6 @@ extern void alternative_instructions(void);
>   ALTERNATIVE(oldinstr, newinstr1, feature1)\
>   ALTERNATIVE_N(newinstr2, feature2, 2)
>  
> -#define ALTERNATIVE_3(oldinstr, newinstr1, feature1, newinstr2, feature2, \
> -   newinstr3, feature3)\
> - ALTERNATIVE_2(oldinstr, newinstr1, feature1, newinstr2, feature2) \
> - ALTERNATIVE_N(newinstr3, feature3, 3)
> -
>  /*
>   * Alternative instructions for different CPU types or capabilities.
>   *

While this one is fine, ...

> @@ -118,26 +113,6 @@ extern void alternative_instructions(void);
>  newinstr2, feature2) \
>: output : input)
>  
> -/*
> - * This is similar to alternative_io. But it has three features and
> - * respective instructions.
> - *
> - * If CPU has feature3, newinstr3 is used.
> - * Otherwise, if CPU has feature2, newinstr2 is used.
> - * Otherwise, if CPU has feature1, newinstr1 is used.
> - * Otherwise, oldinstr is used.
> - */
> -#define alternative_io_3(oldinstr, newinstr1, feature1, newinstr2,   \
> -  feature2, newinstr3, feature3, output, \
> -  input...)  \
> - asm volatile(ALTERNATIVE_3(oldinstr, newinstr1, feature1,   \
> -newinstr2, feature2, newinstr3,  \
> -feature3)\
> -  : output : input)
> -
> -/* Use this macro(s) if you need more than one output parameter. */
> -#define ASM_OUTPUT2(a...) a

... I'm having patches to post which use both of these, so I'd
very much prefer them to not go away. It is simply a lack of time
which resulted in me not having posted that series already.

Jan
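
As a reminder of what the three-way variant documents, the selection order from the removed comment can be modelled in plain C as below. This is only a model of the documented precedence (later features win), with hypothetical names; it is not the patching code and not the macro itself.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one replacement and whether its feature is set. */
struct alt_choice {
    const char *insn;
    bool feature_present;
};

/*
 * Return the instruction that ends up in place: oldinstr unless a feature
 * is present, with later entries overriding earlier ones (feature3 beats
 * feature2, which beats feature1).
 */
const char *pick_insn(const char *oldinstr,
                      const struct alt_choice *alts, size_t n)
{
    const char *chosen = oldinstr;

    for (size_t i = 0; i < n; i++)
        if (alts[i].feature_present)
            chosen = alts[i].insn;

    return chosen;
}
```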



Re: [Xen-devel] [PATCH v3 00/17] Alternative Meltdown mitigation

2018-02-13 Thread Jan Beulich
>>> On 13.02.18 at 12:36,  wrote:
> On 12/02/18 18:54, Dario Faggioli wrote:
>> On Fri, 2018-02-09 at 15:01 +0100, Juergen Gross wrote:
>>> This series is available via github:
>>>
>>> https://github.com/jgross1/xen.git xpti
>>>
>>> Dario wants to do some performance tests for this series to compare
>>> performance with Jan's series with all optimizations posted.
>>>
>> And some of this is indeed ready.
>> 
>> So, this is again on my testbox, with 16 pCPUs and 12GB of RAM, and I
>> used a guest with 16 vCPUs and 10GB of RAM.
>> 
>> I benchmarked Jan's patch *plus* all the optimizations and overhead
>> mitigation patches he posted on xen-devel (the ones that are already in
>> staging, and also the ones that are not yet there). That's "XPTI-Light" 
>> in the table and in the graphs. Booting this with 'xpti=false' is
>> considered the baseline, while booting with 'xpti=true' is the actual
>> thing we want to measure. :-)
>> 
>> Then I ran the same benchmarks on Juergen's branch above, enabled at
>> boot. That's "XPYI" in the table and graphs (yes, I know, sorry for the
>> typo!).
>> 
>> http://openbenchmarking.org/result/1802125-DARI-180211144 
>> 
> http://openbenchmarking.org/result/1802125-DARI-180211144_hgv=XPTI-Light+xpti%3Dfalse_nor=y_hgv=XPTI-Light+xpti%3Dfalse
> 
> ...
> 
>> Or, actually, that's not it! :-O In fact, right while I was writing
>> this report, it came out on IRC that something can be done, on
>> Juergen's XPTI series, to mitigate the performance impact a bit.
>> 
>> Juergen sent me a patch already, and I'm re-running the benchmarks with
>> that applied. I'll let you know how the results end up looking.
>
> It turned out the results are basically no different. So the general
> problem with context switches is still there (which I expected, BTW).
>
> So I guess the really bad results with benchmarks triggering a lot of
> vcpu scheduling show that my approach isn't going to fly, as the most
> probable cause of the slow context switches is the newly introduced
> serializing instructions (LTR, WRMSRs), which can't be avoided when we
> want to use per-vcpu stacks.
>
> OTOH the results of the other benchmarks showing some advantage over
> Jan's solution indicate there is indeed an aspect which can be improved.
>
> Instead of preferring one approach over the other I have thought about
> a way to use the best parts of each solution in a combined variant. In
> case nobody feels strongly about pursuing my current approach further I'd
> like to suggest the following scheme:
>
> - Whenever an L4 page table of the guest is in use on only one physical
>   cpu, use the L4 shadow cache of my series in order to avoid having to
>   copy the L4 contents each time the hypervisor is left.
>
> - As soon as an L4 page table is being activated on a second cpu, fall
>   back to using the per-cpu page table on that cpu (the cpu already using
>   the L4 page table can continue doing so).

Would the first of these CPUs continue to run on the shadow L4 in
that case? If so, would there be no synchronization issues? If not,
how do you envision "telling" it to move to the per-CPU L4 (which,
afaict, includes knowing which vCPU / pCPU that is)?

> - Before activation of a L4 shadow page table it is modified to map the
>   per-cpu data needed in guest mode for the local cpu only.

I had been considering to do this in XPTI light for other purposes
too (for example it might be possible to short circuit the guest
system call path to get away without multiple page table switches).
We really first need to settle on how much we feel is safe to expose
while the guest is running. So far I've been under the impression
that people actually think we should further reduce exposed pieces
of code/data, rather than widen the "window".

> - Use INVPCID instead of %cr4 PGE toggling to speed up purging global
>   TLB entries (depending on the availability of the feature, of course).

That's something we should do independent of what XPTI model
we'd like to retain long term.

> - Use the PCID feature for being able to avoid purging TLB entries which
>   might be needed later (depending on hardware again).

Which first of all raises the question: Does PCID (other than the U
bit) prevent use of TLB entries in the wrong context? IOW is the
PCID check done early (during TLB lookup) rather than late (during
insn retirement)?

Jan
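
For reference, the mechanics behind the PCID bullet can be sketched as follows: with CR4.PCIDE set, the low 12 bits of CR3 carry the PCID, and setting bit 63 of the value written to CR3 asks the CPU not to invalidate the TLB entries tagged with that PCID. The helper name and parameters are made up for illustration, and the sketch says nothing about the open question of when the hardware performs the PCID check.

```c
#include <stdint.h>

#define CR3_PCID_MASK  UINT64_C(0xfff)        /* CR3[11:0] when CR4.PCIDE=1 */
#define CR3_NOFLUSH    (UINT64_C(1) << 63)    /* keep entries of this PCID */

/*
 * Build the value to load into CR3 for a given L4 table and PCID.
 * keep_tlb != 0 requests that cached translations for the PCID survive.
 */
uint64_t make_cr3(uint64_t l4_phys, uint16_t pcid, int keep_tlb)
{
    uint64_t cr3 = (l4_phys & ~CR3_PCID_MASK) | (pcid & CR3_PCID_MASK);

    if (keep_tlb)
        cr3 |= CR3_NOFLUSH;

    return cr3;
}
```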


Re: [Xen-devel] [PATCH 3/3] pvh/dom0: whitelist PVH Dom0 ACPI tables

2018-02-13 Thread Jan Beulich
>>> On 13.02.18 at 12:27,  wrote:
> On Tue, Feb 13, 2018 at 04:04:17AM -0700, Jan Beulich wrote:
>> >>> On 13.02.18 at 10:59,  wrote:
>> > On Tue, Feb 13, 2018 at 02:29:08AM -0700, Jan Beulich wrote:
>> >> >>> On 08.02.18 at 13:25,  wrote:
>> >> > Signed-off-by: Roger Pau Monné 
>> >> 
>> >> A change like this should not come without description, providing a
>> >> reason for the change. Otherwise how will someone wanting to
>> >> understand the change in a couple of years actually be able to
>> >> make any sense of it. This is in particular because I continue to be
>> >> not fully convinced that white listing is appropriate in the Dom0
>> >> case (and for the record I'm similarly unconvinced that black listing
>> >> is the best choice, yet obviously we need to pick one of the two).
>> > 
>> > I'm sorry, I thought we agreed at the summit to convert this to
>> > whitelisting because it was likely better to simply not expose unknown
>> > ACPI tables to guests.
>> 
>> "to guests" != "to Dom0".
>> 
>> > I guess the commit message could be something like:
>> > 
>> > "The following list of whitelisted APIC tables are either known to work
>> > or don't require any resources to be mapped in either the IO or the
>> > memory space.
>> 
>> Even if the white listing vs black listing question wasn't still
>> undecided, I think we should revert the patch in favor of one
>> with a description. The one above might be fine with "ACPI" in
>> place of "APIC" as far as tables actively white listed are
>> concerned, but then it still remains open why certain tables
>> haven't been included. I'm in particular worried about various
>> APEI related tables, but invisibility of e.g. an IBFT could also
>> lead to boot problems.
> 
> Regarding APEI I think ERST, EINJ and HEST could be passed through,
> BERT however requires that the BOOT Error Region is mapped into Dom0
> p2m.
> 
> Since PVH Dom0 creation still ends up in a panic, I see no problem in
> adding those in follow up patches.
> 
> IBFT also looks safe to pass through.

But you realize I've named only the few that came to mind
immediately?

Jan
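
To illustrate what is being argued about, a signature whitelist boils down to a check like the one below. The list here is purely illustrative: it contains only the three APEI tables suggested above as candidates for passing through, and is not the list from the patch. The function name is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Illustrative list only, not the whitelist from the patch under review. */
static const char allowed_sigs[][5] = { "ERST", "EINJ", "HEST" };

/* ACPI table signatures are four bytes; sig must point to at least that many. */
bool acpi_sig_allowed(const char *sig)
{
    for (size_t i = 0; i < sizeof(allowed_sigs) / sizeof(allowed_sigs[0]); i++)
        if (memcmp(sig, allowed_sigs[i], 4) == 0)
            return true;

    /* Anything not explicitly listed stays hidden from Dom0. */
    return false;
}
```

The concern raised in the thread is exactly the default branch: every table nobody thought of ends up invisible.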


Re: [Xen-devel] [RFC XEN PATCH v4 00/41] Add vNVDIMM support to HVM domains

2018-02-13 Thread Jan Beulich
>>> On 13.02.18 at 12:13,  wrote:
> On Tue, Feb 13, 2018 at 04:05:45AM -0700, Jan Beulich wrote:
>> >>> On 13.02.18 at 11:29,  wrote:
>> > On Tue, Feb 13, 2018 at 03:06:24AM -0700, Jan Beulich wrote:
>> >> >>> On 12.02.18 at 11:05,  wrote:
>> >> > If you map the NVDIMM as MMIO to Dom0 you don't need the M2P entries
>> >> > IIRC, and if it's mapped using 1GB pages it shouldn't use that much
>> >> > memory for the page tables (ie: you could just use normal RAM for the
>> >> > page tables that map the NVDIMM IMO). Of course that only applies to
>> >> > PVH/HVM.
>> >> 
>> >> But in order to use (part of) it in a RAM-like manner we need struct
>> >> page_info for it.
>> > 
>> > I guess the main use of this would be to grant NVDIMM pages? And
>> > without a page_info that's not possible.
>> 
>> Why grant? Simply giving such a page as RAM to a guest would
>> already be a problem without struct page_info (as then we can't
>> track the page owner, nor can we refcount the page).
> 
> My point was to avoid doing that, and always assign the pages as
> MMIO, which IIRC doesn't require a struct page_info.

MMIO pages can't be used for things like page tables, because of
the refcounting that's needed. The page being like RAM, however,
implies that the guest needs to be able to use it as anything a RAM
page can be used for.

Jan
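
A minimal sketch of why per-page metadata matters for RAM-like use: without an owner and a reference count there is nothing to check or pin when a guest wants to use the page, for example as a page table. The struct and helpers are hypothetical simplifications, not Xen's struct page_info or its get/put functions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, minimal stand-in for per-page bookkeeping. */
struct page_meta {
    int owner_domid;          /* which domain the page belongs to */
    unsigned int refcount;    /* number of outstanding uses */
};

/* Take a reference on behalf of domid; fails if the page cannot be tracked. */
bool get_page_ref(struct page_meta *pg, int domid)
{
    if (pg == NULL || pg->owner_domid != domid)
        return false;         /* no metadata (MMIO-like) or wrong owner */

    pg->refcount++;
    return true;
}

/* Drop a reference previously taken with get_page_ref(). */
void put_page_ref(struct page_meta *pg)
{
    if (pg->refcount > 0)
        pg->refcount--;
}
```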



[Xen-devel] [qemu-mainline test] 119036: tolerable FAIL - PUSHED

2018-02-13 Thread osstest service owner
flight 119036 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/119036/

Failures :-/ but no regressions.

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl-rtds     16 guest-start/debian.repeat fail  REGR. vs. 118942

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-check fail  like 118942
 test-armhf-armhf-libvirt     14 saverestore-support-check fail  like 118942
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop fail  like 118942
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail  like 118942
 test-armhf-armhf-libvirt-raw 13 saverestore-support-check fail  like 118942
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop fail  like 118942
 test-amd64-amd64-xl-pvhv2-intel 12 guest-start fail  never pass
 test-amd64-amd64-xl-pvhv2-amd 12 guest-start fail  never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-check     fail  never pass
 test-amd64-i386-libvirt      13 migrate-support-check     fail  never pass
 test-amd64-amd64-libvirt     13 migrate-support-check     fail  never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-check     fail  never pass
 test-arm64-arm64-xl          13 migrate-support-check     fail  never pass
 test-arm64-arm64-xl-xsm      13 migrate-support-check     fail  never pass
 test-arm64-arm64-xl          14 saverestore-support-check fail  never pass
 test-arm64-arm64-xl-xsm      14 saverestore-support-check fail  never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-check     fail  never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-check     fail  never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-check fail  never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-check fail  never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail  never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail  never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2 fail  never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-check     fail  never pass
 test-armhf-armhf-xl-xsm      13 migrate-support-check     fail  never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-check    fail  never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-check   fail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-check fail  never pass
 test-armhf-armhf-xl-xsm      14 saverestore-support-check fail  never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-check fail  never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-check     fail  never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-check fail  never pass
 test-armhf-armhf-libvirt     13 migrate-support-check     fail  never pass
 test-armhf-armhf-xl-rtds     13 migrate-support-check     fail  never pass
 test-armhf-armhf-xl-rtds     14 saverestore-support-check fail  never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-check     fail  never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-check     fail  never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-check fail  never pass
 test-armhf-armhf-xl-vhd      12 migrate-support-check     fail  never pass
 test-armhf-armhf-xl-vhd      13 saverestore-support-check fail  never pass
 test-armhf-armhf-xl          13 migrate-support-check     fail  never pass
 test-armhf-armhf-xl          14 saverestore-support-check fail  never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop fail  never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-check     fail  never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-install fail  never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail  never pass

version targeted for testing:
 qemuu                7d848450b6e2a3e14a776b4c93704710e7f3d233
baseline version:
 qemuu                c7b02d7d032d6022060e4b393827c963c93ce63f

Last test of basis   118942  2018-02-11 21:18:32 Z    1 days
Failing since        119000  2018-02-12 13:17:55 Z    0 days    2 attempts
Testing same since   119036  2018-02-13 00:48:14 Z    0 days    1 attempts


People who touched revisions under test:
  Alexey Kardashevskiy 
  Andreas Gustafsson 
  Cole Robinson 
  Daniel Henrique Barboza 
  Daniel P. Berrangé 
  David Gibson 
  Eric Blake 
  Greg Kurz 
  Laurent Vivier 
  Mark Cave-Ayland 
  Michael Tokarev 
  Peter Maydell 

Re: [Xen-devel] [PATCH] libxl: add libxl__is_driver_domain function

2018-02-13 Thread Oleksandr Grytsov
On Tue, Feb 13, 2018 at 2:06 PM, Wei Liu  wrote:

> On Tue, Feb 06, 2018 at 03:08:45PM +0200, Oleksandr Grytsov wrote:
> > On Tue, Feb 6, 2018 at 2:36 PM, Wei Liu  wrote:
> >
> > > On Thu, Dec 14, 2017 at 04:14:12PM +0200, Oleksandr Grytsov wrote:
> > > > From: Oleksandr Grytsov 
> > > >
> > > > We have following arm-based setup:
> > > >
> > > > - Dom0 with xen and xen tools;
> > > > - Dom1 with device backends (but it is not the driver domain);
> > >
> > > What is your definition of a "driver domain"? What does it do in this
> > > case?
> > >
> > > I seem to have seen people use this term in different contexts to mean
> > > slightly different things. I need to figure out what you actually mean
> > > first.
> > >
> > >
> > I see in the libxl/xl sources that closing PV devices is done differently
> > depending on whether the backends are in Dom0 or in another domain. The
> > latter is called a driver domain in the sources, so I don't have a clear
> > understanding of what it means. In our setup backends are in Dom1 and xl
> > is in Dom0, and I see that xl doesn't close PV devices on domain reboot
> > or shutdown.
>
> Do you run xl devd in your backend domain?
>
> Wei.
>

No, I don't.

-- 
Best Regards,
Oleksandr Grytsov.
