[Xen-devel] [ovmf test] 122158: all pass - PUSHED

2018-04-10 Thread osstest service owner
flight 122158 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/122158/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf 13d909f89a3cee1c1f6b851a4cda7bd1a44e90ae
baseline version:
 ovmf 64797018df0cf5c1f11523bb575355aba918b940

Last test of basis   122135  2018-04-09 12:51:49 Z    1 days
Testing same since   122158  2018-04-10 09:24:15 Z    0 days    1 attempts


People who touched revisions under test:
  Carsey, Jaben 
  Feng, YunhuaX 
  Jaben Carsey 
  Yonghong Zhu 
  Yunhua Feng 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/osstest/ovmf.git
   64797018df..13d909f89a  13d909f89a3cee1c1f6b851a4cda7bd1a44e90ae -> xen-tested-master

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH for-4.11 0/7] SUPPORT.md: Format as part of html docs

2018-04-10 Thread Juergen Gross
On 10/04/18 19:22, Ian Jackson wrote:
> The SUPPORT.md document (introduced in 4.10) does not appear here
>   http://xenbits.xen.org/docs/
> In this series I fix this.
> 
> This is a prerequisite for my work to generate a matrix representing
> the cross-version feature support status, because that cross-version
> matrix wants to contain hyperlinks into the appropriate bits of (html)
> SUPPORT.md.
> 
> This series should be backported to 4.10.  If and when a SUPPORT.md is
> provided for earlier releases, it should be backported to those too.
> 
> There are three patches fixing minor syntax trouble in SUPPORT.md, and
> four build system changes.  I hope the release ack will be a formality
> :-).
> 
> Ian.
> 

For the series:

Release-acked-by: Juergen Gross 


Juergen


[Xen-devel] [libvirt test] 122154: regressions - trouble: blocked/broken/pass

2018-04-10 Thread osstest service owner
flight 122154 libvirt real [real]
http://logs.test-lab.xenproject.org/osstest/logs/122154/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-armhf  broken
 build-armhf   5 host-build-prep  fail REGR. vs. 122005

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt-raw  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-xsm  1 build-check(1)   blocked  n/a
 build-armhf-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt 13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt 14 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-arm64-arm64-libvirt-qcow2 12 migrate-support-checkfail never pass
 test-arm64-arm64-libvirt-qcow2 13 saverestore-support-checkfail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass

version targeted for testing:
 libvirt  3f204e4de401d8263f9d790cd55f1579d959f94f
baseline version:
 libvirt  4300a56378cb4401ac2b66be5da985e94a4ca90c

Last test of basis   122005  2018-04-07 03:34:15 Z    3 days
Testing same since   122154  2018-04-10 04:23:02 Z    0 days    1 attempts


People who touched revisions under test:
  Andrea Bolognani 
  Daniel P. Berrangé 
  Erik Skultety 
  Jim Fehlig 
  John Ferlan 

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-arm64  pass
 build-armhf  broken  
 build-i386   pass
 build-amd64-libvirt  pass
 build-arm64-libvirt  pass
 build-armhf-libvirt  blocked 
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-arm64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm   pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsmpass
 test-amd64-amd64-libvirt-xsm pass
 test-arm64-arm64-libvirt-xsm pass
 test-armhf-armhf-libvirt-xsm blocked 
 test-amd64-i386-libvirt-xsm  pass
 test-amd64-amd64-libvirt pass
 test-arm64-arm64-libvirt pass
 test-armhf-armhf-libvirt blocked 
 test-amd64-i386-libvirt  pass
 test-amd64-amd64-libvirt-pairpass
 test-amd64-i386-libvirt-pair pass
 test-arm64-arm64-libvirt-qcow2   pass
 test-armhf-armhf-libvirt-raw blocked 
 test-amd64-amd64-libvirt-vhd pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master

Re: [Xen-devel] [PATCH for-4.11] x86/VT-x: Fix determination of EFER.LMA in vmcs_dump_vcpu()

2018-04-10 Thread Tian, Kevin
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: Tuesday, April 10, 2018 4:44 PM
> 
> >>> On 09.04.18 at 19:56,  wrote:
> > --- a/xen/arch/x86/hvm/vmx/vmcs.c
> > +++ b/xen/arch/x86/hvm/vmx/vmcs.c
> > @@ -1788,7 +1788,10 @@ void vmcs_dump_vcpu(struct vcpu *v)
> >  vmentry_ctl = vmr32(VM_ENTRY_CONTROLS),
> >  vmexit_ctl = vmr32(VM_EXIT_CONTROLS);
> >  cr4 = vmr(GUEST_CR4);
> > -efer = vmr(GUEST_EFER);
> > +
> > +/* EFER.LMA is read as zero, and is loaded from vmentry_ctl on entry. */
> > +BUILD_BUG_ON(VM_ENTRY_IA32E_MODE << 1 != EFER_LMA);
> > +efer = vmr(GUEST_EFER) | ((vmentry_ctl & VM_ENTRY_IA32E_MODE) << 1);
> 
> I have to admit that - despite the BUILD_BUG_ON() - I dislike the
> literal 1 here, which would better be
> (_EFER_LMA - _VM_ENTRY_IA32E_MODE), albeit the latter doesn't
> exist, so perhaps
> 
> efer = vmr(GUEST_EFER) | ((vmentry_ctl & VM_ENTRY_IA32E_MODE) * (EFER_LMA / VM_ENTRY_IA32E_MODE));
> 
> or the same expressed through MASK_EXTR() / MASK_INSR()? But
> it's the VMX maintainers to judge anyway.
> 

Using 1 is fine with me, with the intention well explained by the BUILD_BUG_ON.
As long as the BUILD_BUG_ON remains a valid usage, I'm OK with the current version:

Acked-by: Kevin Tian 

Thanks
kevin


[Xen-devel] [linux-3.18 test] 122145: regressions - FAIL

2018-04-10 Thread osstest service owner
flight 122145 linux-3.18 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/122145/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-armhf-pvops 6 kernel-build fail REGR. vs. 121320

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-xl-xsm   1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-xsm   1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl   1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-armhf-armhf-examine  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-credit2   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-credit2   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-rtds  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-arndale   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-vhd   1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-arm64-arm64-examine  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-cubietruck  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl   1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 121320
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stopfail like 121320
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 121320
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 121320
 test-amd64-amd64-xl-pvhv2-intel 12 guest-start fail never pass
 test-amd64-amd64-xl-pvhv2-amd 12 guest-start  fail  never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-i386-xl-pvshim12 guest-start  fail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 build-arm64-pvops 6 kernel-build fail   never pass
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop  fail never pass
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop fail never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop  fail never pass
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stop fail never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass
 test-amd64-i386-xl-qemut-win10-i386 10 windows-install fail never pass
 test-amd64-amd64-xl-qemut-win10-i386 10 windows-installfail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass

version targeted for testing:
 linux3f2968010fda1eb82de1ff79c7384e3329f96673
baseline version:
 linux9764536dc592144beee43c987fef45d2e91ca55c

Last test of basis   121320  2018-03-28 02:34:55 Z   13 days
Testing same since   122094  2018-04-08 10:18:48 Z    2 days    4 attempts


People who touched revisions under test:
  Alexander Gerasiov 
  Alexey Kodanev 
  Andrew Morton 
  Andri Yngvason 
  Andy Lutomirski 
  Arend van Spriel 
  Arkadi Sharshevsky 
  Arvind Yadav 
  Ben Hutchings 
  Boris Brezillon 
  Christophe JAILLET 
  Clemens Werther 
  Colin Ian King 
  Dan Carpenter 
  Daniel Mentz 
  Daniel Vetter 
  David Ahern 
  David Lechner 
  David S. Miller 
  Dennis Wassenberg 
  Dmitry Torokhov 
  Doug Gilbert 
  Eric Biggers 
  Eric Dumazet 

[Xen-devel] [PATCH] xen: xen-pciback: Replace GFP_ATOMIC with GFP_KERNEL in pcistub_reg_add

2018-04-10 Thread Jia-Ju Bai
pcistub_reg_add() is never called in atomic context.

pcistub_reg_add() is only called by pcistub_quirk_add(), which is
only set in DRIVER_ATTR().

Despite never being called from atomic context,
pcistub_reg_add() calls kzalloc() with GFP_ATOMIC,
which does not sleep during allocation.
GFP_ATOMIC is not necessary here and can be replaced with GFP_KERNEL,
which can sleep and improves the chances of a successful allocation.

This was found by DCNS, a static analysis tool written by myself,
and I also checked it manually.

Signed-off-by: Jia-Ju Bai 
---
 drivers/xen/xen-pciback/pci_stub.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/xen/xen-pciback/pci_stub.c 
b/drivers/xen/xen-pciback/pci_stub.c
index 9e480fd..9d92bed 100644
--- a/drivers/xen/xen-pciback/pci_stub.c
+++ b/drivers/xen/xen-pciback/pci_stub.c
@@ -1149,7 +1149,7 @@ static int pcistub_reg_add(int domain, int bus, int slot, 
int func,
}
dev = psdev->dev;
 
-   field = kzalloc(sizeof(*field), GFP_ATOMIC);
+   field = kzalloc(sizeof(*field), GFP_KERNEL);
if (!field) {
err = -ENOMEM;
goto out;
-- 
1.9.1



Re: [Xen-devel] [PATCH 1/4] xen: xen-pciback: Replace GFP_ATOMIC with GFP_KERNEL in pcistub_probe

2018-04-10 Thread Jia-Ju Bai



On 2018/4/10 23:01, Boris Ostrovsky wrote:

On 04/10/2018 10:31 AM, Jia-Ju Bai wrote:



On 2018/4/10 22:27, Boris Ostrovsky wrote:

On 04/09/2018 11:03 AM, Jia-Ju Bai wrote:

pcistub_probe() is never called in atomic context.
This function is only set as ".probe" in struct pci_driver.

Despite never being called from atomic context,
pcistub_probe() calls kmalloc() with GFP_ATOMIC,
which does not sleep during allocation.
GFP_ATOMIC is not necessary and can be replaced with GFP_KERNEL,
which can sleep and improves the chances of a successful allocation.

This was found by DCNS, a static analysis tool written by myself,
and I also checked it manually.

Signed-off-by: Jia-Ju Bai 

What about use of GFP_ATOMIC in pcistub_reg_add()?

Thanks for your reply :)
I find that pcistub_reg_add() is called by pcistub_quirk_add(),
and pcistub_quirk_add() is set up via the macro DRIVER_ATTR().
I am not sure whether DRIVER_ATTR() can cause the function to be
called in atomic context, so my tool does not analyze it.

I don't see why it needs to be ATOMIC, it's sysfs access. Can you send a
patch to fix it as well?


Okay, I will send a patch for it soon.
You can have a look :)


Best wishes,
Jia-Ju Bai


[Xen-devel] [qemu-mainline test] 122144: tolerable FAIL - PUSHED

2018-04-10 Thread osstest service owner
flight 122144 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/122144/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail  like 120095
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 120095
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 120095
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 120095
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 120095
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stopfail like 120095
 test-amd64-i386-xl-pvshim12 guest-start  fail   never pass
 test-amd64-amd64-xl-pvhv2-intel 12 guest-start fail never pass
 test-amd64-amd64-xl-pvhv2-amd 12 guest-start  fail  never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-xsm  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop  fail never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass

version targeted for testing:
 qemuu915d34c5f99b0ab91517c69f54272bfdb6ca2b32
baseline version:
 qemuu6697439794f72b3501ee16bb95d16854f9981421

Last test of basis   120095  2018-02-28 13:46:33 Z   41 days
Failing since        120146  2018-03-02 10:10:57 Z   39 days   27 attempts
Testing same since   122144  2018-04-09 19:01:09 Z    1 days    1 attempts


People who touched revisions under test:
  Alberto Garcia 
  Alex Bennée 
  Alex Bennée 
  Alex Williamson 
  Alexandro Sanchez Bach 
  Alexey Kardashevskiy 
  Alistair Francis 
  Alistair Francis 
  Andrew Jones 
  Andrey Smirnov 
  Anton Nefedov 
  BALATON Zoltan 
  Bastian Koppelmann 

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-10 Thread Dario Faggioli
[Adding Andrew, not because I expect anything, but just because we've 
 chatted about this issue on IRC :-) ]

On Tue, 2018-04-10 at 22:37 +0200, Olaf Hering wrote:
> On Tue, Apr 10, Dario Faggioli wrote:
> 
> BUG_ON(__vcpu_on_runq(CSCHED_VCPU(vc)));
>
> (XEN) Xen BUG at sched_credit.c:876
> (XEN) [ Xen-4.11.20180410T125709.50f8ba84a5-
> 3.bug1087289_411  x86_64  debug=y   Not tainted ]
> (XEN) CPU:118
> (XEN) RIP:e008:[]
> sched_credit.c#csched_vcpu_migrate+0x27/0x51
> ...
> (XEN) Xen call trace:
> (XEN)[]
> sched_credit.c#csched_vcpu_migrate+0x27/0x51
> (XEN)[] schedule.c#vcpu_move_locked+0xbb/0xc2
> (XEN)[] schedule.c#vcpu_migrate+0x226/0x25b
> (XEN)[] context_saved+0x8d/0x94
> (XEN)[] context_switch+0xe66/0xeb0
> (XEN)[] schedule.c#schedule+0x5f4/0x627
> (XEN)[] softirq.c#__do_softirq+0x85/0x90
> (XEN)[] do_softirq+0x13/0x15
> (XEN)[] vmx_asm_do_vmentry+0x2b/0x30
>
Hey... unless I've really put a totally bogus BUG_ON() there, this
looks interesting and potentially useful.

It says that the vcpu being context switched out, and on which we are
calling vcpu_migrate() because we found it to be VPF_migrating, is
already in the runqueue by the time we get to execute
vcpu_migrate()->vcpu_move_locked().

Mmm... let's see.

 CPU A  CPU B
 .  .
 schedule(current == v) vcpu_set_affinity(v)
  prev = current // == v .
  schedule_lock(CPU A)   .
   csched_schedule() schedule_lock(CPU A)
   if (runnable(v))  //YES   x
runq_insert(v)   x
   return next != v  x
  schedule_unlock(CPU A) x // takes the lock
  context_switch(prev,next)  set_bit(v, VPF_migrating)  [*]
   context_saved(prev) // still == v .
v->is_running = 0schedule_unlock(CPU A)
SMP_MB   .
if (test_bit(v, VPF_migrating)) // YES!!
 vcpu_migrate(v) .
  for {  .
   schedule_lock(CPU A)  .
   SCHED_OP(v, pick_cpu) .
set_bit(v, CSCHED_MIGRATING) .
return CPU C .
   pick_called = 1   .
   schedule_unlock(CPU A).
   schedule_lock(CPU A + CPU C)  .
   if (pick_called && ...) // YES.
break.
  }  .
  // v->is_running is 0  .
  //!test_and_clear(v, VPF_migrating)) is false!!
  clear_bit(v, VPF_migrating).
  vcpu_move_locked(v, CPU C) .
  BUG_ON(__vcpu_on_runq(v))  .

[*] after this point, and until someone manages to call vcpu_sleep(),  
  v sits in CPU A's runqueue with the VPF_migrating pause flag set

So, basically, the race is between context_saved() and
vcpu_set_affinity(): vcpu_set_affinity() sets the VPF_migrating pause
flag on a vcpu in a runqueue, with the intent of letting either a
vcpu_sleep_nosync() or a reschedule remove it from there, but
context_saved() manages to see the flag before the removal can happen.

And I think this explains also the original BUG at sched_credit.c:1694
(it's just a bit more involved).

As can be seen above (and also in the code comment) there is a
barrier (which further testifies that this is indeed a tricky passage),
but I guess it is not that effective! :-/

TBH, I have actually never fully understood what that comment really
meant, what the barrier was protecting, and how... e.g., isn't it
missing its paired one? In fact, there's another comment, clearly
related, right in vcpu_set_affinity(). But again I'm a bit at a loss at
properly figuring out what the big idea is.

George, what do you think? Does this make sense?

Well, I'll think more about this, and to a possible fix, tomorrow
morning.

Regards,
Dario
-- 
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Software Engineer @ SUSE https://www.suse.com/


[Xen-devel] [ovmf baseline-only test] 74572: all pass

2018-04-10 Thread Platform Team regression test user
This run is configured for baseline tests only.

flight 74572 ovmf real [real]
http://osstest.xs.citrite.net/~osstest/testlogs/logs/74572/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf 64797018df0cf5c1f11523bb575355aba918b940
baseline version:
 ovmf 95cc4962167572089a99be324574094ba22415ad

Last test of basis    74565  2018-04-09 12:19:59 Z    1 days
Testing same since    74572  2018-04-10 09:24:06 Z    0 days    1 attempts


People who touched revisions under test:
  Carsey, Jaben 
  Jaben Carsey 
  Yonghong Zhu 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass



sg-report-flight on osstest.xs.citrite.net
logs: /home/osstest/logs
images: /home/osstest/images

Logs, config files, etc. are available at
http://osstest.xs.citrite.net/~osstest/testlogs/logs

Test harness code can be found at
http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary


Push not applicable.


commit 64797018df0cf5c1f11523bb575355aba918b940
Author: Yonghong Zhu 
Date:   Mon Apr 2 11:18:40 2018 +0800

BaseTools: Pcds in [Components] are not display correct in the report

For a PCD used in the [Components] section, the PCD value is displayed
incorrectly in the build report because the PCD default value was not
overridden.

Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Yonghong Zhu 
Reviewed-by: Liming Gao 

commit b91b8ee4c97a8ac52986f850e05765bffcfc9479
Author: Yonghong Zhu 
Date:   Mon Apr 2 11:15:27 2018 +0800

BaseTools: Pcd not used info should not in Module PCD section

PCDs in conditional directives and PCDs that are not used are
platform-level info; they should not be displayed in the Module PCD
section.

Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Yonghong Zhu 
Reviewed-by: Liming Gao 

commit 175a4b5db39f57721022990ac8b92cf33015fa0b
Author: Carsey, Jaben 
Date:   Thu Apr 5 22:00:24 2018 +0800

BaseTools: dont make temporary dict

just make the key list directly

Cc: Liming Gao 
Cc: Yonghong Zhu 
Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Jaben Carsey 
Reviewed-by: Yonghong Zhu 


[Xen-devel] [linux-linus test] 122143: regressions - FAIL

2018-04-10 Thread osstest service owner
flight 122143 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/122143/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-xl-xsm7 xen-boot fail REGR. vs. 118324
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 7 xen-boot fail REGR. vs. 118324
 test-amd64-i386-libvirt   7 xen-boot fail REGR. vs. 118324
 test-amd64-i386-xl-qemuu-ovmf-amd64  7 xen-boot  fail REGR. vs. 118324
 test-amd64-i386-xl-qemuu-win10-i386  7 xen-boot  fail REGR. vs. 118324
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm  7 xen-boot fail REGR. vs. 118324
 test-amd64-i386-qemuu-rhel6hvm-amd  7 xen-boot   fail REGR. vs. 118324
 test-amd64-i386-qemut-rhel6hvm-amd  7 xen-boot   fail REGR. vs. 118324
 test-amd64-i386-xl-qemut-debianhvm-amd64  7 xen-boot fail REGR. vs. 118324
 test-amd64-i386-xl-raw7 xen-boot fail REGR. vs. 118324
 test-amd64-i386-examine   8 reboot   fail REGR. vs. 118324
 test-amd64-i386-xl-qemut-ws16-amd64  7 xen-boot  fail REGR. vs. 118324
 test-amd64-i386-libvirt-pair 10 xen-boot/src_hostfail REGR. vs. 118324
 test-amd64-i386-pair 10 xen-boot/src_hostfail REGR. vs. 118324
 test-amd64-i386-libvirt-pair 11 xen-boot/dst_hostfail REGR. vs. 118324
 test-amd64-i386-pair 11 xen-boot/dst_hostfail REGR. vs. 118324
 test-amd64-i386-xl-qemuu-ws16-amd64  7 xen-boot  fail REGR. vs. 118324
 test-amd64-i386-qemuu-rhel6hvm-intel  7 xen-boot fail REGR. vs. 118324
 test-amd64-i386-qemut-rhel6hvm-intel  7 xen-boot fail REGR. vs. 118324
 test-amd64-i386-xl-qemut-win10-i386  7 xen-boot  fail REGR. vs. 118324
 test-amd64-i386-rumprun-i386  7 xen-boot fail REGR. vs. 118324
 test-amd64-i386-xl-qemuu-win7-amd64  7 xen-boot  fail REGR. vs. 118324
 test-amd64-i386-libvirt-xsm   7 xen-boot fail REGR. vs. 118324
 test-amd64-i386-xl7 xen-boot fail REGR. vs. 118324
 test-amd64-i386-freebsd10-i386  7 xen-boot   fail REGR. vs. 118324
 test-amd64-i386-xl-qemuu-debianhvm-amd64  7 xen-boot fail REGR. vs. 118324
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm  7 xen-boot fail REGR. vs. 118324
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 7 xen-boot fail REGR. vs. 118324
 test-amd64-i386-freebsd10-amd64  7 xen-boot  fail REGR. vs. 118324
 test-amd64-i386-xl-qemut-win7-amd64  7 xen-boot  fail REGR. vs. 118324

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail  like 118324
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 118324
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stopfail like 118324
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 118324
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 118324
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stopfail like 118324
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stopfail like 118324
 test-amd64-i386-xl-pvshim 7 xen-boot fail   never pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow  7 xen-bootfail never pass
 test-amd64-i386-xl-shadow 7 xen-boot fail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-10 Thread Dario Faggioli
On Tue, 10 Apr 2018 at 22:16, Olaf Hering wrote:

> On Tue, Apr 10, Olaf Hering wrote:
>
> > On Tue, Apr 10, Dario Faggioli wrote:
> >
> > > In the meanwhile --let me repeat myself-- just go ahead with "node:2",
> > > "node:3", etc. :-D
> >
> > I did, and that fails.
>
> I think the man page is not that clear to me. If there is a difference
> between 'node' and 'nodes' for a single digit, it may need a dedicated
> sentence stating that fact.


Mmm... I honestly don't recall, and I don't have the code in front of me
any longer.

I remember specifically wanting for it to support not only "nodes:", but
also "node:", because I thought that, e.g., "nodes:3" would have sound
weird to users.

I'd also say, however, that both "node:0-4" and "nodes:3" should work, but
I may be wrong.

Sorry for the manpage not being clear... I tried hard, back then, to come
up with a nice interface, and to describe it properly, but it is very much
possible that I failed. :-/

Regards,
Dario


> I will try that once it comes back from reboot.
>
> Olaf

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-10 Thread Olaf Hering
On Tue, Apr 10, Dario Faggioli wrote:

> So, Olaf, if you fancy giving this a try anyway, well, go ahead.

BUG_ON(__vcpu_on_runq(CSCHED_VCPU(vc)));

(XEN) Xen BUG at sched_credit.c:876
(XEN) [ Xen-4.11.20180410T125709.50f8ba84a5-3.bug1087289_411  x86_64  
debug=y   Not tainted ]
(XEN) CPU:118
(XEN) RIP:e008:[] 
sched_credit.c#csched_vcpu_migrate+0x27/0x51
(XEN) RFLAGS: 00010006   CONTEXT: hypervisor
(XEN) rax: 83087b8f5010   rbx: 830779cc6188   rcx: 82d080803640
(XEN) rdx: 005f   rsi: 83007ba37000   rdi: 82d080803640
(XEN) rbp: 831c7d877d18   rsp: 831c7d877d18   r8:  0004
(XEN) r9:     r10:    r11: 
(XEN) r12: 830779cc6188   r13: 005f   r14: 0076
(XEN) r15: 83007ba37000   cr0: 80050033   cr4: 001526e0
(XEN) cr3: 000bf4af5000   cr2: 7f377e8fd594
(XEN) fsb:    gsb:    gss: 
(XEN) ds:    es:    fs:    gs:    ss:    cs: e008
(XEN) Xen code around  
(sched_credit.c#csched_vcpu_migrate+0x27/0x51):
(XEN)  00 00 00 48 3b 00 74 02 <0f> 0b 48 8d 15 43 56 73 00 48 63 76 04 48 8d 0d
(XEN) Xen stack trace from rsp=831c7d877d18:
(XEN)831c7d877d28 82d080236348 831c7d877da8 82d08023764c
(XEN) 82d08095f0e0 82d08095f100 830779da8188
(XEN)83007ba37000 005f0100  0296
(XEN)830779cc602c 83007ba37000 83007ba37000 83077a6c4000
(XEN)0076 83087bb8b000 831c7d877dc8 82d08023935f
(XEN)83077a6c4000 83005d1d 831c7d877e18 82d08027797d
(XEN)831c7d877de8 82d0802a4f50 831c7d877e18 83007ba37000
(XEN)83005d1d 830779cc6188 0baa8fa4f354 0001
(XEN)831c7d877ea8 82d080236943 82d08031f411 830779cc61a0
(XEN)007600b8b000 830779cc6180 831c7d877e68 82d0802f8fd3
(XEN)83007ba37000 83005d1d  
(XEN)831c7d877ee8 82d080937700 82d080933c00 
(XEN)831c7d877fff  831c7d877ed8 82d080239f15
(XEN)83007ba37000   
(XEN)831c7d877ee8 82d080239f6a 7ce3827880e7 82d08031f5db
(XEN)88011e034000 88011e034000 88011e034000 
(XEN)000d 81d4c180 0008 0013bb9ba8f8
(XEN)0001  81020e50 
(XEN)   beefbeef
(XEN)81060182 00bfbeef 0246 88011e037ed8
(XEN) Xen call trace:
(XEN)[] sched_credit.c#csched_vcpu_migrate+0x27/0x51
(XEN)[] schedule.c#vcpu_move_locked+0xbb/0xc2
(XEN)[] schedule.c#vcpu_migrate+0x226/0x25b
(XEN)[] context_saved+0x8d/0x94
(XEN)[] context_switch+0xe66/0xeb0
(XEN)[] schedule.c#schedule+0x5f4/0x627
(XEN)[] softirq.c#__do_softirq+0x85/0x90
(XEN)[] do_softirq+0x13/0x15
(XEN)[] vmx_asm_do_vmentry+0x2b/0x30
(XEN) 
(XEN) Panic on CPU 118:
(XEN) Xen BUG at sched_credit.c:876
(XEN) 
(XEN) Reboot in five seconds...


Olaf



Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-10 Thread Olaf Hering
On Tue, Apr 10, Olaf Hering wrote:

> On Tue, Apr 10, Dario Faggioli wrote:
> 
> > In the meanwhile --let me repeat myself-- just go ahead with "node:2",
> > "node:3", etc. :-D
> 
> I did, and that fails.

I think the man page is not that clear, to me. If there is a difference
between 'node' vs. 'nodes' for a single digit it may need a dedicated
sentence to state that fact. I will try that once it comes back from reboot.

Olaf



Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-10 Thread Olaf Hering
On Tue, Apr 10, Dario Faggioli wrote:

> In the meanwhile --let me repeat myself-- just go ahead with "node:2",
> "node:3", etc. :-D

I did, and that fails.

Olaf



Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-10 Thread Dario Faggioli
On Tue, 2018-04-10 at 21:03 +0200, Olaf Hering wrote:
> On Tue, Apr 10, Dario Faggioli wrote:
> 
> > As said, its cpus= and cpus_soft=, and you probably just need
> > cpus="node:1"
> > cpus_soft="node:1"
> > Or, even just:
> > cpus="node:1"
> > as, if soft-affinity is set to be equal to hard, it is just
> > ignored.
> 
> Well, that was a noop. But xl.cfg states "nodes:0-3,^node:2", so this
> should work:
> cpus="nodes:3,^node:0"
> cpus_soft="nodes:3,^node:0"
> 
Well, but "nodes:0-3,^node:2" is a way to say that you want nodes 0, 1
and 3. I.e., you are defining a set made up of 0,1,2,3, and then you
remove 2.

With "nodes:3,^node:0", you're saying that you want node 3, but not
node 0. I.e., basically, you are creating a set with 3 in it, and then
trying to remove 0... I agree this is not technically wrong, but it
does not make much sense. Why aren't you using
"node:3,^node:0,^node:1,^node:2" then?

So, really, the way to achieve what you seem to want is:

cpus="node:3"

All that being said, yes, "nodes:3,^node:0" should work (and behave
exactly as "node:3" :-) ).

And in fact...

> xl create -f fv_sles12sp1.f.tst.cfg
>
... parsing, at the xl level, worked, or xl itself would have errored
out, with its own message.

> libxl: error: libxl_sched.c:62:libxl__set_vcpuaffinity: Domain
> 16:Setting vcpu affinity: Invalid argument
> libxl: error: libxl_dom.c:461:libxl__build_pre: setting affinity
> failed on vcpu `0'
>
This is xc_vcpu_setaffinity() failing with EINVAL, in
libxl__set_vcpuaffinity().

> Same for nodes:2..., just nodes:1... works.
> 
> And after some attempts, cpus="nodes:2/3" fails too.
>
"nodes:2/3" is not supported.

> There is no indication what is invalid.
> 
Mmm... I seem to recall having tested the parser against various corner
cases and/or ill-defined input. Still, my guess is that using ^ like
that (i.e., excluding something which was not there in the first
place), may result in a weird/corrupted cpumask.

If that is the case, it indeed would be a bug. I'll check the code
tomorrow.

In the meanwhile --let me repeat myself-- just go ahead with "node:2",
"node:3", etc. :-D

Regards,
Dario
-- 
<> (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Software Engineer @ SUSE https://www.suse.com/


Re: [Xen-devel] [PATCH v3 1/2] libxl: Implement the handler to handle unrecoverable AER errors

2018-04-10 Thread Venu Busireddy
On 2017-09-21 18:12:54 +0100, Ian Jackson wrote:
> Venu Busireddy writes ("Re: [PATCH v3 1/2] libxl: Implement the handler to handle unrecoverable AER errors"):
> > On 2017-08-08 15:33:01 +0100, Wei Liu wrote:
> > > I think a bigger question is whether you agree with Ian's comments
> > > regarding API design and whether you have more questions?
> > 
> > Ian suggested that I document the use of the API (about the event loop),
> > and I believe I addressed it. I don't have any more questions. Just
> > waiting for Ian's "Ack", or more comments.
> 
> I'm afraid that I still have reservations about the design questions.
> Evidently I didn't make my questions clear enough.
> 
> The most important question that seems unanswered to me is this:
> 
>   Why is this only sometimes the right thing to do ?  On what basis
>   might a user choose ?
> 
> To which you answered:
> 
>   This is not an "only sometimes" thing. User doesn't choose it. We always
>   want to watch for AER errors.
> 
> But this leads to more fundamental questions.
> 
> If this behaviour is always required, why do we have an API call to
> request it ?  It sounds like not calling this new function of yours is
> always a mistake.  Ie this function (which has an obscure name) is

Yes, this behavior is always required. I will remove the API calls,
and in their place, create wrapper functions that can be called from
different places in libxenlight as needed.

libxl_reg_aer_events_handler() will be created as a wrapper function
that hides the internal details of calling libxl__ev_xswatch_register().

libxl_unreg_aer_events_handler() will be created as a wrapper function
that hides the internal details of calling libxl__ev_xswatch_deregister().

> like "IAC DONT RANDOMLY-LOSE" (see RFC748, from 1st April 1978)
> except that you are making DO RANDOMLY-LOSE the default (in violation
> of the RFC, should anyone talk to the server over telnet...)
> 
> If you are inventing a new kind of monitoring process that must be run
> for all domains, that is a thing that libxl does not have right now.
> At least, it doesn't have it in this form.  (xl has the reboot
> monitor, and this is done differently in libvirt.)

I am not inventing a new kind of monitoring process. My understanding
is that each domain already has its own monitoring process.

If the toolstack used is xl, I am proposing to use the 'xl' command as
the monitoring process (whether daemonized or not).

If the toolstack used is libvirt/virsh, I am proposing to use the daemon
'libvirtd' as the monitoring process. That process always stays around.

> It was indeed a design principle of libxl that it should (at least,
> wherever possible) be possible to run a domain _without_ a monitoring
> process imposed by libxl.
> 
> So: why is what this API call requests, not done automatically by
> pciback or by Xen ?

When you say "Xen", I am assuming you meant "libxenlight." If so, the
answer is Yes. I am proposing to register/unregister the event handler
in libxenlight, so that all toolstacks can make use of the monitoring.

> And: if you are inventing a new monitoring process that must be run
> for every domain, you should call this out much more explicitly as a
> fundamental design change.
> 
> We will then have to think about more questions: should this process
> be run automatically by libxl, without special application request
> (like the way that libxl runs qemu) ?

I am not inventing a new monitoring process, and hence, this isn't
applicable.

> If not, how do we ensure that exactly one of these processes is
> running for each guest ?

As I mentioned in the earlier paragraphs above, there will only be one
process per domain. Never more than one process per domain. When using
the xl toolstack, that one process per domain is the 'xl' command. When
using the libvirt/virsh toolstack, then there is only one process (the
'libvirtd' daemon) that is monitoring all the domains. Therefore, there
will never be more than one process per domain.

> If your new design involves new behaviour in callers of libxl, do you
> intend to send patches for libvirt to enable it ?

The new design does not involve any new behavior in the callers of
libxl. Registration of the event handler happens transparently, without
any explicit request from the libxl callers.

> Looking at the code:
> 
> You handle errors by logging and continuing.  Why is that correct ?

In the unlikely case of failure to register the event handler, we felt
that it is better to continue with the creation of the guest without
monitoring (which is what happens today). But if you would like to see the
creation of the guest fail, that is an easy change to accommodate. Would
you like to fail the creation of the guest?

> If we are to keep the current API for the client, it needs to have
> better doc comments.

I will be removing the API, and hence, no doc comments.

> Is the xenstore watch implementation vulnerable to unexpected paths
> appearing in watch events ?

The 

Re: [Xen-devel] [PATCH v3 1/2] libxl: Implement the handler to handle unrecoverable AER errors [and 1 more messages]

2018-04-10 Thread Venu Busireddy
On 2018-04-03 17:51:50 +0100, Ian Jackson wrote:
> Venu Busireddy writes ("Re: [PATCH v3 1/2] libxl: Implement the handler to handle unrecoverable AER errors [and 1 more messages]"):
> > On 2018-04-03 16:06:17 +0100, Ian Jackson wrote:
> > > Ian Jackson writes ("Re: [PATCH v3 1/2] libxl: Implement the handler to handle unrecoverable AER errors"):
> > > > I'm afraid that I still have reservations about the design questions.
> > > > Evidently I didn't make my questions clear enough.
> > > > 
> > > > [ 64 lines of detailed discussion elided ]
> > > 
> > > I haven't seen a reply to that.
> > 
> > Reply to that is the v5 patch. Your concern in v4 was, "why is this
> > error handling done only in some cases?" Meaning, the error handling
> > happens only for guests created using xl, but it does not happen for
> > guests created using libvirt. I addressed that in the v5 patch. Please
> > see below for more details.
> 
> Oh.  I see.
> 
> > > I'm confused by the responses in the thread which relate to libvirt.
> > > ISTM that a libvirt patch is also required.  Do you mean that in v5
> > > there is also a libvirt patch ?
> > 
> > libvirt ends up calling do_domain_create() in tools/libxl/libxl_create.c,
> > and that is where I am registering the error handler. That change takes
> > care of guests created using xl command as well as libvirt. Hence there
> > is no change in libvirt.
> 
> I'm sorry to say that this is completely wrong.  I didn't spot that
> hunk in the v5 2/2 patch.  I don't think your description in your v4
> to v5 changes summary really highlights the substantial design change.
> 
> I think it would have been better to reply to my prose email.  We
> would have been able to explore the design possibilities.
> 
> What you have done is wrong because:
> 
>  * You have removed the libxl__aer_watch from the
>libxl_reg_aer_events_handler API which means the effect is now
>global for the ctx.  This is not correct for a libxl event
>generation request function.  (Although this isn't one.)
> 
>  * Not all callers of libxl will necessarily retain the process, or
>the ctx, in which they called libxl_domain_create.  I think libvirt
>does (but I'm not sure), and xl usually does, but it's not
>guaranteed.
> 
>  * It's quite unclear why this function is a public one.
> 
> The entire approach is wrong, I'm afraid.
> 
> We need to go back to the design.  Please would you reply to my mail
> from September.

Sure. I will reply to your email from September.

Venu

> Thanks,
> Ian.


Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-10 Thread Olaf Hering
On Tue, Apr 10, Dario Faggioli wrote:

> On Tue, 2018-04-10 at 17:59 +0200, Olaf Hering wrote:
> > memory=
> > vcpus=36
> > cpu="nodes:1,^node:0"
> > cpu_soft="nodes:1,^node:0"
> As said, its cpus= and cpus_soft=, and you probably just need
> cpus="node:1"
> cpus_soft="node:1"
> Or, even just:
> cpus="node:1"
> as, if soft-affinity is set to be equal to hard, it is just ignored.

Well, that was a noop. But xl.cfg states "nodes:0-3,^node:2", so this
should work:
cpus="nodes:3,^node:0"
cpus_soft="nodes:3,^node:0"

xl create -f fv_sles12sp1.f.tst.cfg
libxl: error: libxl_sched.c:62:libxl__set_vcpuaffinity: Domain 16:Setting vcpu affinity: Invalid argument
libxl: error: libxl_dom.c:461:libxl__build_pre: setting affinity failed on vcpu `0'
libxl: error: libxl_create.c:1265:domcreate_rebuild_done: Domain 16:cannot (re-)build domain: -3
libxl: error: libxl_domain.c:1034:libxl__destroy_domid: Domain 16:Non-existant domain
libxl: error: libxl_domain.c:993:domain_destroy_callback: Domain 16:Unable to destroy guest
libxl: error: libxl_domain.c:920:domain_destroy_cb: Domain 16:Destruction of domain failed

Same for nodes:2..., just nodes:1... works.

And after some attempts, cpus="nodes:2/3" fails too.
There is no indication what is invalid.

Olaf



[Xen-devel] [PATCH v4 2/7] add current_time function to time manager

2018-04-10 Thread Paul Semel
this function returns the "epoch" time

Signed-off-by: Paul Semel 
---

Notes:
v4:
- new patch version

 common/time.c  | 39 +++
 include/xtf/time.h |  5 +
 2 files changed, 44 insertions(+)

diff --git a/common/time.c b/common/time.c
index 79abc7e..c1b7cd1 100644
--- a/common/time.c
+++ b/common/time.c
@@ -4,6 +4,7 @@
 
 #include 
 #include 
+#include 
 
 /* This function was taken from mini-os source code */
 /* It returns ((delta << shift) * mul_frac) >> 32 */
@@ -70,6 +71,44 @@ uint64_t since_boot_time(void)
 return system_time;
 }
 
+static void get_time_info(uint64_t *boot_time, uint64_t *sec, uint32_t *nsec)
+{
+uint32_t ver1, ver2;
+do {
+ver1 = ACCESS_ONCE(shared_info.wc_version);
+smp_rmb();
+*boot_time = since_boot_time();
+#if defined(__i386__)
+*sec = (uint64_t)ACCESS_ONCE(shared_info.wc_sec);
+#else
+*sec = ((uint64_t)ACCESS_ONCE(shared_info.wc_sec_hi) << 32)
+| ACCESS_ONCE(shared_info.wc_sec);
+#endif
+*nsec = (uint64_t)ACCESS_ONCE(shared_info.wc_nsec);
+smp_rmb();
+ver2 = ACCESS_ONCE(shared_info.wc_version);
+smp_rmb();
+} while ( (ver1 & 1) != 0 || ver1 != ver2 );
+}
+
+/* This function returns the epoch time (number of seconds elapsed
+ * since January 1, 1970) */
+uint64_t current_time(void)
+{
+uint32_t nsec;
+uint64_t boot_time, sec;
+
+get_time_info(&boot_time, &sec, &nsec);
+
+#if defined(__i386__)
+divmod64(&boot_time, SEC_TO_NSEC(1));
+#else
+boot_time /= SEC_TO_NSEC(1);
+#endif
+
+return sec + boot_time;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/include/xtf/time.h b/include/xtf/time.h
index 8180e07..e33dc8a 100644
--- a/include/xtf/time.h
+++ b/include/xtf/time.h
@@ -8,9 +8,14 @@
 
 #include 
 
+#define SEC_TO_NSEC(x) ((x) * 1000000000ul)
+
+
 /* Time from boot in nanoseconds */
 uint64_t since_boot_time(void);
 
+uint64_t current_time(void);
+
 #endif /* XTF_TIME_H */
 
 /*
-- 
2.16.1



[Xen-devel] [PATCH v4 5/7] add spin_sleep function to time manager

2018-04-10 Thread Paul Semel
this function uses nspin_sleep to spin sleep for t seconds

Signed-off-by: Paul Semel 
---

Notes:
v4:
- new patch version

 common/time.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/common/time.c b/common/time.c
index 232e134..87db124 100644
--- a/common/time.c
+++ b/common/time.c
@@ -151,6 +151,12 @@ static inline void nspin_sleep(uint64_t t)
 asm volatile ("pause");
 }
 
+static inline void spin_sleep(uint64_t t)
+{
+uint64_t nsec = SEC_TO_NSEC(t);
+nspin_sleep(nsec);
+}
+
 /*
  * Local variables:
  * mode: C
-- 
2.16.1



[Xen-devel] [PATCH v4 6/7] add mspin_sleep function to time manager

2018-04-10 Thread Paul Semel
this function uses nspin_sleep to spin sleep for t milliseconds

Signed-off-by: Paul Semel 
---

Notes:
v4:
- new patch version

 common/time.c  | 6 ++
 include/xtf/time.h | 1 +
 2 files changed, 7 insertions(+)

diff --git a/common/time.c b/common/time.c
index 87db124..7515eb0 100644
--- a/common/time.c
+++ b/common/time.c
@@ -157,6 +157,12 @@ static inline void spin_sleep(uint64_t t)
 nspin_sleep(nsec);
 }
 
+static inline void mspin_sleep(uint64_t t)
+{
+uint64_t nsec = MSEC_TO_NSEC(t);
+nspin_sleep(nsec);
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/include/xtf/time.h b/include/xtf/time.h
index ce4d6db..d9cecdb 100644
--- a/include/xtf/time.h
+++ b/include/xtf/time.h
@@ -15,6 +15,7 @@ struct timeval {
 
 
 #define SEC_TO_NSEC(x) ((x) * 1000000000ul)
+#define MSEC_TO_NSEC(x) ((x) * 1000000ul)
 
 
 /* Time from boot in nanoseconds */
-- 
2.16.1



[Xen-devel] [PATCH v4 4/7] add nspin_sleep function to time manager

2018-04-10 Thread Paul Semel
this function spin sleeps for t nanoseconds

Signed-off-by: Paul Semel 
---

Notes:
v4:
- new patch version

 common/time.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/common/time.c b/common/time.c
index 8489f3b..232e134 100644
--- a/common/time.c
+++ b/common/time.c
@@ -139,6 +139,18 @@ int gettimeofday(struct timeval *tp, void *restrict tzp)
 return 0;
 }
 
+static inline void nspin_sleep(uint64_t t)
+{
+uint64_t curr = since_boot_time();
+uint64_t end = curr + t;
+
+if ( end < curr )
+panic("end value overflows counter\n");
+
+while ( since_boot_time() < end )
+asm volatile ("pause");
+}
+
 /*
  * Local variables:
  * mode: C
-- 
2.16.1



[Xen-devel] [PATCH v4 3/7] add gettimeofday function to time management

2018-04-10 Thread Paul Semel
this function acts as the POSIX gettimeofday function

Signed-off-by: Paul Semel 
---

Notes:
v4:
- new patch version

 common/time.c  | 30 ++
 include/xtf/time.h |  8 
 2 files changed, 38 insertions(+)

diff --git a/common/time.c b/common/time.c
index c1b7cd1..8489f3b 100644
--- a/common/time.c
+++ b/common/time.c
@@ -1,6 +1,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -109,6 +110,35 @@ uint64_t current_time(void)
 return sec + boot_time;
 }
 
+/* The POSIX gettimeofday syscall normally takes a second argument, the
+ * timezone (struct timezone). It should be NULL because Linux no longer
+ * uses it, but we keep the parameter for API compatibility.
+ */
+int gettimeofday(struct timeval *tp, void *restrict tzp)
+{
+uint64_t boot_time, sec;
+uint32_t mod, nsec;
+
+if ( tzp != NULL )
+return -EOPNOTSUPP;
+
+if ( tp == NULL )
+return -EINVAL;
+
+get_time_info(&boot_time, &sec, &nsec);
+
+#if defined(__i386__)
+mod = divmod64(&boot_time, SEC_TO_NSEC(1));
+#else
+mod = boot_time % SEC_TO_NSEC(1);
+boot_time /= SEC_TO_NSEC(1);
+#endif
+
+tp->sec = sec + boot_time;
+tp->nsec = nsec + mod;
+return 0;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/include/xtf/time.h b/include/xtf/time.h
index e33dc8a..ce4d6db 100644
--- a/include/xtf/time.h
+++ b/include/xtf/time.h
@@ -8,6 +8,12 @@
 
 #include 
 
+struct timeval {
+uint64_t sec;
+uint64_t nsec;
+};
+
+
 #define SEC_TO_NSEC(x) ((x) * 1000000000ul)
 
 
@@ -16,6 +22,8 @@ uint64_t since_boot_time(void);
 
 uint64_t current_time(void);
 
+int gettimeofday(struct timeval *tp, void *restrict tzp);
+
 #endif /* XTF_TIME_H */
 
 /*
-- 
2.16.1



[Xen-devel] [PATCH v4 7/7] add sleep, msleep and NOW() macros to time manager

2018-04-10 Thread Paul Semel
these are helpful macros for using the time manager correctly

Signed-off-by: Paul Semel 
---

Notes:
v4:
- new patch version

 common/time.c  | 10 ++
 include/xtf/time.h | 12 
 2 files changed, 22 insertions(+)

diff --git a/common/time.c b/common/time.c
index 7515eb0..e2779b9 100644
--- a/common/time.c
+++ b/common/time.c
@@ -163,6 +163,16 @@ static inline void mspin_sleep(uint64_t t)
 nspin_sleep(nsec);
 }
 
+void sleep(uint64_t t)
+{
+spin_sleep(t);
+}
+
+void msleep(uint64_t t)
+{
+mspin_sleep(t);
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/include/xtf/time.h b/include/xtf/time.h
index d9cecdb..545da25 100644
--- a/include/xtf/time.h
+++ b/include/xtf/time.h
@@ -23,8 +23,20 @@ uint64_t since_boot_time(void);
 
 uint64_t current_time(void);
 
+/* This function takes seconds as its parameter */
+void sleep(uint64_t f);
+
+/* Be careful: this function takes milliseconds as its parameter,
+ * not microseconds!
+ */
+void msleep(uint64_t f);
+
 int gettimeofday(struct timeval *tp, void *restrict tzp);
 
+
+/* This returns the current epoch time */
+#define NOW() current_time()
+
 #endif /* XTF_TIME_H */
 
 /*
-- 
2.16.1



[Xen-devel] [PATCH v4 1/7] introduce time management in xtf

2018-04-10 Thread Paul Semel
this file is introduced to make it possible to implement an inter-domain
communication protocol over xenstore. For synchronization purposes, we
really want to be able to "control" time

common/time.c: since_boot_time gets the time in nanoseconds from the
moment the VM has booted

Signed-off-by: Paul Semel 
---

Notes:
v4:
- moved rdtsc to arch/x86/include/arch/lib.h
- added a rdtsc_ordered implementation to serialize rdtsc
- simplified since_boot_time function
- still need to have Andrew's scale_delta version

 arch/x86/include/arch/lib.h | 18 ++
 build/files.mk  |  1 +
 common/time.c   | 81 +
 include/xtf/time.h  | 24 ++
 4 files changed, 124 insertions(+)
 create mode 100644 common/time.c
 create mode 100644 include/xtf/time.h

diff --git a/arch/x86/include/arch/lib.h b/arch/x86/include/arch/lib.h
index 0045902..510cdb1 100644
--- a/arch/x86/include/arch/lib.h
+++ b/arch/x86/include/arch/lib.h
@@ -6,6 +6,7 @@
 #include 
 #include 
 #include 
+#include 
 
 static inline void cpuid(uint32_t leaf,
  uint32_t *eax, uint32_t *ebx,
@@ -374,6 +375,23 @@ static inline void write_xcr0(uint64_t xcr0)
 xsetbv(0, xcr0);
 }
 
+static inline uint64_t rdtsc(void)
+{
+uint32_t lo, hi;
+
+asm volatile("rdtsc": "=a"(lo), "=d"(hi));
+
+return ((uint64_t)hi << 32) | lo;
+}
+
+static inline uint64_t rdtsc_ordered(void)
+{
+rmb();
+mb();
+
+return rdtsc();
+}
+
 #endif /* XTF_X86_LIB_H */
 
 /*
diff --git a/build/files.mk b/build/files.mk
index 46b42d6..55ed1ca 100644
--- a/build/files.mk
+++ b/build/files.mk
@@ -16,6 +16,7 @@ obj-perarch += $(ROOT)/common/libc/vsnprintf.o
 obj-perarch += $(ROOT)/common/report.o
 obj-perarch += $(ROOT)/common/setup.o
 obj-perarch += $(ROOT)/common/xenbus.o
+obj-perarch += $(ROOT)/common/time.o
 
 obj-perenv += $(ROOT)/arch/x86/decode.o
 obj-perenv += $(ROOT)/arch/x86/desc.o
diff --git a/common/time.c b/common/time.c
new file mode 100644
index 000..79abc7e
--- /dev/null
+++ b/common/time.c
@@ -0,0 +1,81 @@
+#include 
+#include 
+#include 
+
+#include 
+#include 
+
+/* This function was taken from mini-os source code */
+/* It returns ((delta << shift) * mul_frac) >> 32 */
+static inline uint64_t scale_delta(uint64_t delta, uint32_t mul_frac, int shift)
+{
+uint64_t product;
+#ifdef __i386__
+uint32_t tmp1, tmp2;
+#endif
+
+if ( shift < 0 )
+delta >>= -shift;
+else
+delta <<= shift;
+
+#ifdef __i386__
+__asm__ (
+"mul  %5   ; "
+"mov  %4,%%eax ; "
+"mov  %%edx,%4 ; "
+"mul  %5   ; "
+"add  %4,%%eax ; "
+"xor  %5,%5; "
+"adc  %5,%%edx ; "
+: "=A" (product), "=r" (tmp1), "=r" (tmp2)
+: "a" ((uint32_t)delta), "1" ((uint32_t)(delta >> 32)), "2" (mul_frac) );
+#else
+__asm__ (
+"mul %%rdx ; shrd $32,%%rdx,%%rax"
+: "=a" (product) : "0" (delta), "d" ((uint64_t)mul_frac) );
+#endif
+
+return product;
+}
+
+
+uint64_t since_boot_time(void)
+{
+uint32_t ver1, ver2;
+uint64_t tsc_timestamp, system_time, tsc;
+uint32_t tsc_to_system_mul;
+int8_t tsc_shift;
+
+do
+{
+ver1 = ACCESS_ONCE(shared_info.vcpu_info[0].time.version);
+smp_rmb();
+
+system_time = shared_info.vcpu_info[0].time.system_time;
+tsc_timestamp = shared_info.vcpu_info[0].time.tsc_timestamp;
+tsc_to_system_mul = shared_info.vcpu_info[0].time.tsc_to_system_mul;
+tsc_shift = shared_info.vcpu_info[0].time.tsc_shift;
+tsc = rdtsc_ordered();
+smp_rmb();
+
+ver2 = ACCESS_ONCE(shared_info.vcpu_info[0].time.version);
+} while ( ver2 & 1 || ver1 != ver2 );
+
+
+system_time += scale_delta(tsc - tsc_timestamp,
+   tsc_to_system_mul,
+   tsc_shift);
+
+return system_time;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/include/xtf/time.h b/include/xtf/time.h
new file mode 100644
index 000..8180e07
--- /dev/null
+++ b/include/xtf/time.h
@@ -0,0 +1,24 @@
+/**
+ * @file include/xtf/time.h
+ *
+ * Time management
+ */
+#ifndef XTF_TIME_H
+# define XTF_TIME_H
+
+#include 
+
+/* Time from boot in nanoseconds */
+uint64_t since_boot_time(void);
+
+#endif /* XTF_TIME_H */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
2.16.1



[Xen-devel] [xen-4.7-testing baseline-only test] 74570: trouble: blocked/broken/fail/pass

2018-04-10 Thread Platform Team regression test user
This run is configured for baseline tests only.

flight 74570 xen-4.7-testing real [real]
http://osstest.xs.citrite.net/~osstest/testlogs/logs/74570/

Failures and problems with tests :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-arm64  broken
 build-arm64-pvopsbroken
 build-arm64-xsm  broken

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl   1 build-check(1)   blocked  n/a
 build-arm64-libvirt   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-credit2   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-xsm   1 build-check(1)   blocked  n/a
 build-arm64-pvops 2 hosts-allocate   broken never pass
 build-arm64   2 hosts-allocate   broken never pass
 build-arm64-xsm   2 hosts-allocate   broken never pass
 build-arm64   3 capture-logs broken never pass
 build-arm64-pvops 3 capture-logs broken never pass
 build-arm64-xsm   3 capture-logs broken never pass
 test-amd64-amd64-xl-qemut-win10-i386 10 windows-install fail baseline untested
 test-armhf-armhf-libvirt 12 guest-start fail baseline untested
 test-armhf-armhf-xl-midway   12 guest-start fail baseline untested
 test-armhf-armhf-libvirt-xsm 12 guest-start fail baseline untested
 test-armhf-armhf-xl  12 guest-start fail baseline untested
 test-armhf-armhf-xl-xsm  12 guest-start fail baseline untested
 test-armhf-armhf-xl-multivcpu 12 guest-startfail baseline untested
 test-armhf-armhf-xl-credit2  12 guest-start fail baseline untested
 test-armhf-armhf-xl-rtds 12 guest-start fail baseline untested
 test-amd64-amd64-qemuu-nested-intel 14 xen-boot/l1  fail baseline untested
 test-amd64-amd64-xl-qemut-ws16-amd64 16 guest-localmigrate/x10 fail baseline untested
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-localmigrate/x10 fail baseline untested
 test-armhf-armhf-xl-vhd  10 debian-di-install   fail baseline untested
 test-armhf-armhf-libvirt-raw 10 debian-di-install   fail baseline untested
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass
 test-amd64-i386-xl-qemut-win10-i386 10 windows-install fail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass
 test-amd64-amd64-xl-qemuu-ws16-amd64 10 windows-installfail never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 10 windows-install fail never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop  fail never pass
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop  fail never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop fail never pass
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop  fail never pass

version targeted for testing:
 xen  9680710bed1c174ced7a170cb94e30b4ae4fff5e
baseline version:
 xen  dca80abc2075a54fec58344751357021b3b5b39e

Last test of basis74489  2018-04-05 12:24:56 Z5 days
Testing same since74570  2018-04-10 07:18:24 Z0 days1 attempts


People who touched revisions under test:
  Andrew Cooper 

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  broken  
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64-xtf  pass
 build-amd64  pass
 build-arm64  broken  
 build-armhf  pass
 build-i386   pass
 

[Xen-devel] [PATCH 6/7] docs/Makefile: Introduce GENERATE_PANDOC_RULE_RAW

2018-04-10 Thread Ian Jackson
We are going to want to format SUPPORT.md which does not match the
filename patterns in docs/.  So provide a way to make an ad-hoc rule
using pandoc with the standard options.

No functional change in this patch.

Signed-off-by: Ian Jackson 
---
 docs/Makefile | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/docs/Makefile b/docs/Makefile
index 6743fa3..d82463f 100644
--- a/docs/Makefile
+++ b/docs/Makefile
@@ -237,17 +237,18 @@ txt/%.txt: %.markdown
$(INSTALL_DATA) $< $@
 
 # Metarule for generating pandoc rules.
-define GENERATE_PANDOC_RULE
-# $(1) is the target documentation format. $(2) is the source format.
-
-$(1)/%.$(1): %.$(2)
+define GENERATE_PANDOC_RULE_RAW
+$(1): $(2)
 ifneq ($(PANDOC),)
@$(INSTALL_DIR) $$(@D)
$(PANDOC) --number-sections --toc --standalone $$< --output $$@
 else
@echo "pandoc not installed; skipping $$@"
 endif
-
+endef
+define GENERATE_PANDOC_RULE
+# $(1) is the target documentation format. $(2) is the source format.
+$(call GENERATE_PANDOC_RULE_RAW,$(1)/%.$(1),%.$(2))
 endef
 $(eval $(call GENERATE_PANDOC_RULE,pdf,pandoc))   # pdf/%.pdf: %.pandoc
 $(eval $(call GENERATE_PANDOC_RULE,txt,pandoc))   # txt/%.txt: %.pandoc
-- 
2.1.4



[Xen-devel] [PATCH 4/7] docs/gen-html-index: Extract titles from HTML documents

2018-04-10 Thread Ian Jackson
Signed-off-by: Ian Jackson 
---
 docs/gen-html-index | 13 +
 1 file changed, 13 insertions(+)

diff --git a/docs/gen-html-index b/docs/gen-html-index
index e9792bf..5b43b42 100644
--- a/docs/gen-html-index
+++ b/docs/gen-html-index
@@ -10,6 +10,7 @@ use warnings;
 use Getopt::Long;
 use IO::File;
 use File::Basename;
+use HTML::TreeBuilder::XPath;
 
 Getopt::Long::Configure('bundling');
 
@@ -64,6 +65,18 @@ sub make_linktext ($) {
 return "$1($2)" if $l =~ m,^man/(.*)\.([0-9].*)\.html,;
 $l =~ s/.(?:html|txt)$//g;
 return $index{$l} if exists $index{$l};
+
+my $from_html;
+eval {
+my $tree = new HTML::TreeBuilder::XPath;
+my $f = "$outdir/$l.html";
+open F, '<', $f or die "$l $f $!";
+$tree->parse_file(\*F) or die;
+close F;
+$from_html = $tree->findvalue("/html/head/title");
+};
+return $from_html if $from_html;
+
 return basename($l);
 }
 
-- 
2.1.4



[Xen-devel] [PATCH 5/7] docs/gen-html-index: Support documents at the toplevel

2018-04-10 Thread Ian Jackson
There are none yet.

Signed-off-by: Ian Jackson 
---
 docs/gen-html-index | 4 
 1 file changed, 4 insertions(+)

diff --git a/docs/gen-html-index b/docs/gen-html-index
index 5b43b42..8258e2b 100644
--- a/docs/gen-html-index
+++ b/docs/gen-html-index
@@ -137,6 +137,10 @@ sub dirs($)
 return @dirs;
 }
 
+foreach my $of (grep { !m{/} } @docs) {
+$top .= make_link($of,'');
+}
+
 foreach my $od (sort { $a cmp $b } uniq map { dirs($_) } @docs) {
 my @d = (grep /^\Q$od\E/, @docs);
 if ( @d == 1 and $d[0] eq "$od/index.html" )
-- 
2.1.4



[Xen-devel] [PATCH 3/7] SUPPORT.md: Syntax: Provide a title rather than a spurious empty section

2018-04-10 Thread Ian Jackson
Signed-off-by: Ian Jackson 
---
 SUPPORT.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/SUPPORT.md b/SUPPORT.md
index e447069..264b23f 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -1,4 +1,4 @@
-# Support statement for this release
+% Support statement for this release
 
 This document describes the support status
 and in particular the security support status of the Xen branch
-- 
2.1.4



[Xen-devel] [PATCH for-4.11 0/7] SUPPORT.md: Format as part of html docs

2018-04-10 Thread Ian Jackson
The SUPPORT.md document (introduced in 4.10) does not appear here
  http://xenbits.xen.org/docs/
In this series I fix this.

This is a prerequisite for my work to generate a matrix representing
the cross-version feature support status, because that cross-version
matrix wants to contain hyperlinks into the appropriate bits of (html)
SUPPORT.md.

This series should be backported to 4.10.  If and when a SUPPORT.md is
provided for earlier releases, it should be backported to those too.

There are three patches fixing minor syntax trouble in SUPPORT.md, and
four build system changes.  I hope the release ack will be a formality
:-).

Ian.


[Xen-devel] [PATCH 2/7] SUPPORT.md: Syntax: Fix a typo "States"

2018-04-10 Thread Ian Jackson
Signed-off-by: Ian Jackson 
---
 SUPPORT.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/SUPPORT.md b/SUPPORT.md
index 1c5220b..e447069 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -360,7 +360,7 @@ Guest-side driver capable of speaking the Xen PV block protocol
 Status, FreeBSD: Supported, Security support external
 Status, NetBSD: Supported, Security support external
 Status, OpenBSD: Supported, Security support external
-States, Windows: Supported
+Status, Windows: Supported
 
 Guest-side driver capable of speaking the Xen PV networking protocol
 
-- 
2.1.4



[Xen-devel] [PATCH 1/7] SUPPORT.md: Syntax: Fix some bullet lists

2018-04-10 Thread Ian Jackson
Continuations of bullet list items must be indented by exactly 4
spaces (according to pandoc_markdown(5) on Debian jessie).

This is most easily achieved by making the bullet list items have two
spaces before the `*'.

Signed-off-by: Ian Jackson 
---
 SUPPORT.md | 36 ++--
 1 file changed, 18 insertions(+), 18 deletions(-)

diff --git a/SUPPORT.md b/SUPPORT.md
index c72a25b..1c5220b 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -783,40 +783,40 @@ What is the risk of it exhibiting bugs?
 
 General answers to the above:
 
- * **Here be dragons**
+  * **Here be dragons**
 
-   Pretty likely to still crash / fail to work.
-   Not recommended unless you like life on the bleeding edge.
+Pretty likely to still crash / fail to work.
+Not recommended unless you like life on the bleeding edge.
 
- * **Quirky**
+  * **Quirky**
 
-   Mostly works but may have odd behavior here and there.
-   Recommended for playing around or for non-production use cases.
+Mostly works but may have odd behavior here and there.
+Recommended for playing around or for non-production use cases.
 
- * **Normal**
+  * **Normal**
 
-   Ready for production use
+Ready for production use
 
 ### Interface stability
 
 If I build a system based on the current interfaces,
 will they still work when I upgrade to the next version?
 
- * **Not stable**
+  * **Not stable**
 
-   Interface is still in the early stages and
-   still fairly likely to be broken in future updates.
+Interface is still in the early stages and
+still fairly likely to be broken in future updates.
 
- * **Provisionally stable**
+  * **Provisionally stable**
 
-   We're not yet promising backwards compatibility,
-   but we think this is probably the final form of the interface.
-   It may still require some tweaks.
+We're not yet promising backwards compatibility,
+but we think this is probably the final form of the interface.
+It may still require some tweaks.
 
- * **Stable**
+  * **Stable**
 
-   We will try very hard to avoid breaking backwards  compatibility,
-   and to fix any regressions that are reported.
+We will try very hard to avoid breaking backwards  compatibility,
+and to fix any regressions that are reported.
 
 ### Security supported
 
-- 
2.1.4



[Xen-devel] [PATCH 7/7] docs/Makefile: Format SUPPORT.md into the toplevel

2018-04-10 Thread Ian Jackson
Signed-off-by: Ian Jackson 
---
 docs/Makefile | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/docs/Makefile b/docs/Makefile
index d82463f..b300bb6 100644
--- a/docs/Makefile
+++ b/docs/Makefile
@@ -28,7 +28,8 @@ DOC_MAN7 := $(patsubst man/%.pod.7,man7/%.7,$(MAN7SRC-y)) \
$(patsubst man/%.markdown.7,man7/%.7,$(MAN7SRC-y))
 DOC_MAN8 := $(patsubst man/%.pod.8,man8/%.8,$(MAN8SRC-y)) \
$(patsubst man/%.markdown.8,man8/%.8,$(MAN8SRC-y))
-DOC_HTML := $(patsubst %.markdown,html/%.html,$(MARKDOWNSRC-y)) \
+DOC_HTML := html/SUPPORT.html \
+$(patsubst %.markdown,html/%.html,$(MARKDOWNSRC-y)) \
 $(patsubst %.pandoc,html/%.html,$(PANDOCSRC-y)) \
 $(patsubst man/%.markdown.1,html/man/%.1.html,$(MAN1SRC-y)) \
 $(patsubst man/%.markdown.5,html/man/%.5.html,$(MAN5SRC-y)) \
@@ -255,6 +256,8 @@ $(eval $(call GENERATE_PANDOC_RULE,txt,pandoc))   # txt/%.txt: %.pandoc
 $(eval $(call GENERATE_PANDOC_RULE,html,pandoc))  # html/%.html: %.pandoc
 $(eval $(call GENERATE_PANDOC_RULE,pdf,markdown)) # pdf/%.pdf: %.markdown
 
+$(eval $(call GENERATE_PANDOC_RULE_RAW,html/SUPPORT.html,$(XEN_ROOT)/SUPPORT.md)) # pdf/%.pdf: %.markdown
+
 ifeq (,$(findstring clean,$(MAKECMDGOALS)))
 $(XEN_ROOT)/config/Docs.mk:
$(error You have to run ./configure before building docs)
-- 
2.1.4



Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-10 Thread Dario Faggioli
On Tue, 2018-04-10 at 16:25 +0100, George Dunlap wrote:
> On 04/10/2018 12:29 PM, Dario Faggioli wrote:
> > 
> One thing we might consider doing is implementing the migrate()
> callback
> for the Credit scheduler, and just have it make a bunch of sanity
> checks
> (v->processor lock held, new_cpu lock held, vcpu not on any runqueue,
> etc.).
> 
So, it turns out that I have to run. :-/

I hacked the attached patch rather quickly, but only compile tested
it... And I'm pretty tired, so I can't guarantee all the BUG_ON()s are
correct, nor that they are the proper ones to (potentially) catch the
issue.

So, Olaf, if you fancy giving this a try anyway, well, go ahead.

If it does not work, though, get rid of it... and I'll craft a better
one tomorrow. :-(

Regards,
Dario
-- 
<> (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Software Engineer @ SUSE https://www.suse.com/

commit 52bd39c760ce6664186bc9d67bcc6a8eed11f792
Author: Dario Faggioli 
Date:   Tue Apr 10 18:59:28 2018 +0200

xen: credit: implement SCHED_OP(migrate)

with just sanity checking in it, to catch a race.

Signed-off-by: Dario Faggioli 

diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
index 9bc638c09c..5dcf530c24 100644
--- a/xen/common/sched_credit.c
+++ b/xen/common/sched_credit.c
@@ -867,6 +867,16 @@ _csched_cpu_pick(const struct scheduler *ops, struct vcpu *vc, bool_t commit)
 return cpu;
 }
 
+static void
+csched_vcpu_migrate(const struct scheduler *ops, struct vcpu *vc,
+		unsigned int new_cpu)
+{
+BUG_ON(vc->is_running);
+BUG_ON(test_bit(_VPF_migrating, &vc->pause_flags));
+BUG_ON(__vcpu_on_runq(CSCHED_VCPU(vc)));
+BUG_ON(CSCHED_VCPU(vc) == CSCHED_VCPU(curr_on_cpu(vc->processor)));
+}
+
 static int
 csched_cpu_pick(const struct scheduler *ops, struct vcpu *vc)
 {
@@ -2278,6 +2288,7 @@ static const struct scheduler sched_credit_def = {
 .adjust_global  = csched_sys_cntl,
 
 .pick_cpu   = csched_cpu_pick,
+.migrate= csched_vcpu_migrate,
 .do_schedule= csched_schedule,
 
 .dump_cpu_state = csched_dump_pcpu,



Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-10 Thread Dario Faggioli
On Tue, 2018-04-10 at 17:59 +0200, Olaf Hering wrote:
> On Tue, Apr 10, Olaf Hering wrote:
> 
> > (XEN) Xen BUG at sched_credit.c:1694
> 
> And another one with debug=y and this config:
>
Wow...

> memory=
> vcpus=36
> cpu="nodes:1,^node:0"
> cpu_soft="nodes:1,^node:0"
>
As said, it's cpus= and cpus_soft=, and you probably just need

cpus="node:1"
cpus_soft="node:1"

Or, even just:

cpus="node:1"

as, if soft-affinity is set to be equal to hard, it is just ignored.

> (nodes=1 cycles between 1-3 for each following domU).
> 
> (XEN) Assertion 'CSCHED_PCPU(cpu)->nr_runnable >= 1' failed at
> sched_credit.c:269
> (XEN) [ Xen-4.11.20180407T144959.e62e140daa-
> 4.bug1087289_411  x86_64  debug=y   Not tainted ]
> (XEN) CPU:18
> (XEN) RIP:e008:[]
> sched_credit.c#csched_schedule+0x8fe/0xd42
> (XEN) RFLAGS: 00010046   CONTEXT: hypervisor (d0v18)
> ...
> (XEN) Xen call trace:
> (XEN)[]
> sched_credit.c#csched_schedule+0x8fe/0xd42
> (XEN)[] schedule.c#schedule+0x107/0x627
> (XEN)[] softirq.c#__do_softirq+0x85/0x90
> (XEN)[] do_softirq+0x13/0x15
> (XEN)[]
> x86_64/entry.S#process_softirqs+0x6/0x10
>
Yeah, thanks for trying with debugging on. Unfortunately, stack traces
in these cases are not very helpful, as they only tell us that
schedule() is being called by do_softirq()... :-P

Still...

> (XEN) 
> (XEN) Panic on CPU 18:
> (XEN) Assertion 'CSCHED_PCPU(cpu)->nr_runnable >= 1' failed at
> sched_credit.c:269
>
...it is another, different, one, this time when removing (or not
reinserting) the vcpu from the runqueue.

What would be helpful, would be to catch the other side of the race,
i.e., the point when the vcpu is being re-inserted in the runqueue, or
when v->processor of a vcpu in the runqueue is changed. Let's see if
the debug patch will help with this.

Dario
-- 
<> (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Software Engineer @ SUSE https://www.suse.com/


[Xen-devel] [xen-unstable-smoke test] 122162: tolerable all pass - PUSHED

2018-04-10 Thread osstest service owner
flight 122162 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/122162/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  50f8ba84a50ebf80dd22067a04062dbaaf2621ff
baseline version:
 xen  451004603247205467ec34b366b4cfa3814a5d95

Last test of basis   121876  2018-04-05 10:04:25 Z5 days
Failing since121889  2018-04-05 13:02:10 Z5 days   46 attempts
Testing same since   122162  2018-04-10 14:01:23 Z0 days1 attempts


People who touched revisions under test:
  Amit Singh Tomar 
  Andre Przywara 
  Andre Pzywara 
  Andrew Cooper 
  Boris Ostrovsky 
  George Dunlap 
  Jan Beulich 
  Juergen Gross 
  Julien Grall 
  Kevin Tian 
  Marcello Seri 
  Marcus of Wetware Labs 
  Marek Marczykowski-Górecki 
  Petre Eftime 
  Razvan Cojocaru 
  Stefano Stabellini 
  Tim Deegan 
  Wei Liu 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-i386 pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   4510046032..50f8ba84a5  50f8ba84a50ebf80dd22067a04062dbaaf2621ff -> smoke


Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-10 Thread Olaf Hering
On Tue, Apr 10, Olaf Hering wrote:

> (XEN) Xen BUG at sched_credit.c:1694

And another one with debug=y and this config:
memory=
vcpus=36
cpu="nodes:1,^node:0"
cpu_soft="nodes:1,^node:0"
(nodes=1 cycles between 1-3 for each following domU).

(XEN) Assertion 'CSCHED_PCPU(cpu)->nr_runnable >= 1' failed at 
sched_credit.c:269
(XEN) [ Xen-4.11.20180407T144959.e62e140daa-4.bug1087289_411  x86_64  
debug=y   Not tainted ]
(XEN) CPU:18
(XEN) RIP:e008:[] 
sched_credit.c#csched_schedule+0x8fe/0xd42
(XEN) RFLAGS: 00010046   CONTEXT: hypervisor (d0v18)
(XEN) rax: 830779e9e970   rbx: 83007ba44000   rcx: 0046
(XEN) rdx: 0036f953b080   rsi: 83077a738140   rdi: 830779e9a18e
(XEN) rbp: 83077a737e18   rsp: 83077a737d18   r8:  000b
(XEN) r9:  83077a7383c0   r10:    r11: 017e70349000
(XEN) r12: 8309d55879f0   r13: 0044   r14: 830779eae188
(XEN) r15: 8309d55879f0   cr0: 8005003b   cr4: 001526e0
(XEN) cr3: 000dd1056000   cr2: 557e1f370028
(XEN) fsb:    gsb: 88088508   gss: 
(XEN) ds: 002b   es: 002b   fs:    gs:    ss: e010   cs: e008
(XEN) Xen code around  
(sched_credit.c#csched_schedule+0x8fe/0xd42):
(XEN)  10 18 83 78 18 00 75 02 <0f> 0b 48 8d 05 0f 3e 73 00 48 8b 44 10 18 83 68
(XEN) Xen stack trace from rsp=83077a737d18:
(XEN)0004 ef047000 82d08095f0e0 830779eae188
(XEN)017e6f25c71c 82d08095f0c0 00010044 82d08095f0c0
(XEN)01c9c380 83077a737e60 83077ffe7720 82d08095f100
(XEN)83077a6c59e0 82d08095f0c0 0012 83077a73c570
(XEN)82d08095f100 00010028 83070046 00440012
(XEN)82d08023d5f0 83077a7381a0 7ffb5fe0 00bd
(XEN)  0092 830060ae3000
(XEN)82d08095f100 83077a738188 017e6f25c71c 0012
(XEN)83077a737ea8 82d080236406 82d080372434 83077a7381a0
(XEN)001200737ef8 83077a738180 83077a737ee8 82d08036a04a
(XEN)02ff82d080372434 0001  deadbeefdeadf00d
(XEN)deadbeefdeadf00d 82d080934500 82d080933c00 
(XEN)83077a737fff  83077a737ed8 82d080239ec5
(XEN)830060ae3000   
(XEN)83077a737ee8 82d080239f1a 7cf8858c80e7 82d08036e566
(XEN)88018171 88018171 88018171 
(XEN)0012 81d4c180 0246 7ff0
(XEN)0001   810013aa
(XEN)0012 deadbeefdeadf00d deadbeefdeadf00d 0100
(XEN)810013aa e033 0246 880181713ee0
(XEN) Xen call trace:
(XEN)[] sched_credit.c#csched_schedule+0x8fe/0xd42
(XEN)[] schedule.c#schedule+0x107/0x627
(XEN)[] softirq.c#__do_softirq+0x85/0x90
(XEN)[] do_softirq+0x13/0x15
(XEN)[] x86_64/entry.S#process_softirqs+0x6/0x10
(XEN) 
(XEN) Panic on CPU 18:
(XEN) Assertion 'CSCHED_PCPU(cpu)->nr_runnable >= 1' failed at 
sched_credit.c:269
(XEN) 
(XEN) Reboot in five seconds...



dom0 is still alive after that attempt to reboot and for some reason triple
ctrl-a appears to work. But it seems 'R' still fails.

Olaf



Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-10 Thread Dario Faggioli
On Tue, 2018-04-10 at 16:25 +0100, George Dunlap wrote:
> On 04/10/2018 12:29 PM, Dario Faggioli wrote:
> > 
> whenever that is.  (Possibly at the end of the current call to
> vcpu_migrate(), possibly at the end of a vcpu_migrate() triggered in
> context_saved() due to VPF_migrating.)
> 
> vcpu_migrate() is called from:
>  - vcpu_force_reschedule(), which is called from
> VCPUOP_{set,stop}_periodic_timer
>  - cpu_disable_scheduler(), when doing hotplug or cpupool operations
> on a cpu
>  - vcpu_set_affinity()
>  - vcpu_pin_override()
> 
> But in any case, v->processor is only set from vcpu_move_locked(),
> which
> is only called if v->is_running is false; if v->is_running is false,
> then one way or another v can't be on any runqueue.  And if v isn't
> on
> any runqueue, and we hold v's current processor lock, then it's safe
> to
> modify v->processor.
> 
Indeed.

> But obviously there's a flaw in that logic somewhere. :-)
> 
frustratingly, yes. :-/

> One thing we might consider doing is implementing the migrate()
> callback
> for the Credit scheduler, and just have it make a bunch of sanity
> checks
> (v->processor lock held, new_cpu lock held, vcpu not on any runqueue,
> etc.).
> 
Yep, and in fact, this is exactly what the debug patch that I will send
to Olaf (after I'll be out of a meeting) does. :-)

Dario
-- 
<> (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Software Engineer @ SUSE https://www.suse.com/


Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-10 Thread George Dunlap
On 04/10/2018 04:18 PM, Olaf Hering wrote:
> On Tue, Apr 10, Olaf Hering wrote:
> 
>> (XEN) Xen BUG at sched_credit.c:1694
> 
> Another variant:
> 
> This time the domUs had just vcpus=36 and 
> cpus=nodes:N,node:^0/cpus_soft=nodes:N,node:^0
> 
> (XEN) Xen BUG at sched_credit.c:280
> (XEN) [ Xen-4.11.20180407T144959.e62e140daa-2.bug1087289_411  x86_64  
> debug=n   Not tainted ]
> (XEN) CPU:54
> (XEN) RIP:e008:[] 
> sched_credit.c#__runq_insert.part.13+0/0x2
> (XEN) RFLAGS: 00010087   CONTEXT: hypervisor (d96v20)
> (XEN) rax: 82d08095f100   rbx: 830670506ea0   rcx: 830779f4ae80
> (XEN) rdx: 0036f95d7080   rsi:    rdi: 830670506ea0
> (XEN) rbp: 82d08094a480   rsp: 830e7ab2fd30   r8:  830779f361a0
> (XEN) r9:  82d080227cf0   r10:    r11: 
> (XEN) r12: 033c2684bb20   r13: 830779f4ae80   r14: 830779f36180
> (XEN) r15: 033c269c6f66   cr0: 8005003b   cr4: 001526e0
> (XEN) cr3: 00067058e000   cr2: 7f1299b17000
> (XEN) fsb:    gsb:    gss: 
> (XEN) ds:    es:    fs:    gs:    ss:    cs: e008
> (XEN) Xen code around  
> (sched_credit.c#__runq_insert.part.13):
> (XEN)  f1 ff 5a 5b 31 c0 5d c3 <0f> 0b 0f 0b 0f 0b 48 89 e2 48 8d 05 eb 5d 60 
> 00
> (XEN) Xen stack trace from rsp=830e7ab2fd30:
> (XEN)82d080228845 82e030ac7f80 0036563fc000 
> (XEN)00a3 00c0 83077a6c59e0 830e7ab2fe70
> (XEN)82d0802354b5 82d0802fff50  01c9c380
> (XEN)8027bcd8 82d0802255d0 0036 033c269c6f66
> (XEN)8307798d4f30  830779f361a0 0036
> (XEN)82d0802386cc 830779f361a0 0046 82d08023827b
> (XEN)0096 0036 830779f361c8 82d08030f9ab
> (XEN)0036 83007ba3 830779f36188 033c269c6f66
> (XEN)830779f36180 82d08094a480 82d08023153d 82d0
> (XEN)830779f361a0  82d0802e13d5 83007ba3
> (XEN)83007ba3  82d08030bef6 82d08030f9ab
> (XEN)  830e7ab2 82d080933c00
> (XEN)  82d080234cb2 
> (XEN)83007ba3   
> (XEN)82d08030fb6b  0100 0054
> (XEN)0001 88011ff16c80 8800e1e2 
> (XEN)88011f000858 88011f0006c8  
> (XEN)0001 0001 00ad 00a5
> (XEN)00fb 810c8da3  0046
> (XEN)8800ea3af910   
> (XEN) Xen call trace:
> (XEN)[] sched_credit.c#__runq_insert.part.13+0/0x2
> (XEN)[] sched_credit.c#csched_schedule+0xb55/0xba0
> (XEN)[] smp_call_function_interrupt+0x85/0xa0
> (XEN)[] vmcs.c#__vmx_clear_vmcs+0/0xe0
> (XEN)[] sched_credit.c#csched_vcpu_yield+0/0x10
> (XEN)[] timer.c#remove_entry+0x7c/0x90
> (XEN)[] timer.c#add_entry+0x4b/0xb0
> (XEN)[] vmx_asm_vmexit_handler+0xab/0x240
> (XEN)[] schedule.c#schedule+0xdd/0x5d0
> (XEN)[] hvm_interrupt_blocked+0x15/0xd0
> (XEN)[] nvmx_switch_guest+0x86/0x1a00
> (XEN)[] vmx_asm_vmexit_handler+0xab/0x240
> (XEN)[] softirq.c#__do_softirq+0x62/0x90
> (XEN)[] vmx_asm_do_vmentry+0x2b/0x30
> (XEN) 
> (XEN) Panic on CPU 54:
> (XEN) Xen BUG at sched_credit.c:280
> (XEN) 
> (XEN) Reboot in five seconds...

Ooh:

BUG_ON( __vcpu_on_runq(svc) );

So we're trying to insert a vcpu onto a runqueue, but someone's already
put it on a runqueue.  Which still doesn't quite make sense...

 -George


Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-10 Thread George Dunlap
On 04/10/2018 12:29 PM, Dario Faggioli wrote:
> On Tue, 2018-04-10 at 11:59 +0100, George Dunlap wrote:
>> On 04/10/2018 11:33 AM, Dario Faggioli wrote:
>>> On Tue, 2018-04-10 at 09:34 +, George Dunlap wrote:
 Assuming the bug is this one:

 BUG_ON( cpu != snext->vcpu->processor );

>>>
>>> Yes, it is that one.
>>>
>>> Another stack trace, this time from a debug=y built hypervisor, of
>>> what
>>> we are thinking it is the same bug (although reproduced in a
>>> slightly
>>> different way) is this:
>>>
>>> (XEN) [ Xen-4.7.2_02-36.1.12847.11.PTF  x86_64  debug=y  Not
>>> tainted ]
>>> (XEN) CPU:45
>>> (XEN) RIP:e008:[]
>>> sched_credit.c#csched_schedule+0x361/0xaa9
>>> ...
>>> (XEN) Xen call trace:
>>> (XEN)[]
>>> sched_credit.c#csched_schedule+0x361/0xaa9
>>> (XEN)[] schedule.c#schedule+0x109/0x5d6
>>> (XEN)[] softirq.c#__do_softirq+0x7f/0x8a
>>> (XEN)[] do_softirq+0x13/0x15
>>> (XEN)[] vmx_asm_do_vmentry+0x25/0x2a
>>>
>>> (I can provide it all, if necessary.)
>>>
>>> I've done some analysis, although when we still were not entirely
>>> sure
>>> that changing the affinities was the actual cause (or, at least,
>>> what
>>> is triggering the whole thing).
>>>
>>> In the specific case of this stack trace, the current vcpu running
>>> on
>>> CPU 45 is d3v11. It is not in the runqueue, because it has been
>>> removed, and not added back to it, and the reason is it is not
>>> runnable
>>> (it has VPF_migrating on in pause_flags).
>>>
>>> The runqueue of pcpu 45 looks fine (i.e., it is not corrupt or
>>> anything
>>> like that), it has d3v10,d9v1,d32767v45 in it (in this order)
>>>
>>> d3v11->processor is 45, so that is also fine.
>>>
>>> Basically, d3v11 wants to move away from pcpu 45, and this might
>>> (but
>>> that's not certain) be the reson because we're rescheduling. The
>>> fact
>>> that there are vcpus wanting to migrate can very well be the cause
>>> of
>>> affinity being changed.
>>>
>>> Now, the problem is that, looking into the runqueue, I found out
>>> that
>>> d3v10->processor=32. I.e., d3v10 is queued in pcpu 45's runqueue,
>>> with
>>> processor=32, which really shouldn't happen.
>>>
>>> This leads to the bug triggering, as, in csched_schedule(), we read
>>> the
>>> head of the runqueue with:
>>>
>>> snext = __runq_elem(runq->next);
>>>
>>> and then we pass snext to csched_load_balance(), where the BUG_ON
>>> is.
>>>
>>> Another thing that I've found out, is that all "misplaced" vcpus
>>> (i.e.,
>>> in this and also in other manifestations of this bug) have their
>>> csched_vcpu.flags=4, which is CSCHED_FLAGS_VCPU_MIGRATING.
>>>
>>> This, basically, is again a sign of vcpu_migrate() having been
>>> called,
>>> on d3v10 as well, which in turn has called csched_vcpu_pick().

Right; csched_cpu_pick() is only called from csched_vcpu_insert(), and
from vcpu_migrate() and restore_vcpu_affinity().

Assuming we haven't been messing around with suspend / resume or
cpupools, that means it must have happened as a result of vcpu_migrate().

If it happened as a result of vcpu_migrate(), then it can only be set
between the very first call to pick_cpu(), and the next vcpu_wake() --
whenever that is.  (Possibly at the end of the current call to
vcpu_migrate(), possibly at the end of a vcpu_migrate() triggered in
context_saved() due to VPF_migrating.)

vcpu_migrate() is called from:
 - vcpu_force_reschedule(), which is called from
VCPUOP_{set,stop}_periodic_timer
 - cpu_disable_scheduler(), when doing hotplug or cpupool operations on a cpu
 - vcpu_set_affinity()
 - vcpu_pin_override()

But in any case, v->processor is only set from vcpu_move_locked(), which
is only called if v->is_running is false; if v->is_running is false,
then one way or another v can't be on any runqueue.  And if v isn't on
any runqueue, and we hold v's current processor lock, then it's safe to
modify v->processor.

But obviously there's a flaw in that logic somewhere. :-)

One thing we might consider doing is implementing the migrate() callback
for the Credit scheduler, and just have it make a bunch of sanity checks
(v->processor lock held, new_cpu lock held, vcpu not on any runqueue, etc.).

 -George


Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-10 Thread Olaf Hering
On Tue, Apr 10, Olaf Hering wrote:

> (XEN) Xen BUG at sched_credit.c:1694

Another variant:

This time the domUs had just vcpus=36 and 
cpus=nodes:N,node:^0/cpus_soft=nodes:N,node:^0

(XEN) Xen BUG at sched_credit.c:280
(XEN) [ Xen-4.11.20180407T144959.e62e140daa-2.bug1087289_411  x86_64  
debug=n   Not tainted ]
(XEN) CPU:54
(XEN) RIP:e008:[] 
sched_credit.c#__runq_insert.part.13+0/0x2
(XEN) RFLAGS: 00010087   CONTEXT: hypervisor (d96v20)
(XEN) rax: 82d08095f100   rbx: 830670506ea0   rcx: 830779f4ae80
(XEN) rdx: 0036f95d7080   rsi:    rdi: 830670506ea0
(XEN) rbp: 82d08094a480   rsp: 830e7ab2fd30   r8:  830779f361a0
(XEN) r9:  82d080227cf0   r10:    r11: 
(XEN) r12: 033c2684bb20   r13: 830779f4ae80   r14: 830779f36180
(XEN) r15: 033c269c6f66   cr0: 8005003b   cr4: 001526e0
(XEN) cr3: 00067058e000   cr2: 7f1299b17000
(XEN) fsb:    gsb:    gss: 
(XEN) ds:    es:    fs:    gs:    ss:    cs: e008
(XEN) Xen code around  (sched_credit.c#__runq_insert.part.13):
(XEN)  f1 ff 5a 5b 31 c0 5d c3 <0f> 0b 0f 0b 0f 0b 48 89 e2 48 8d 05 eb 5d 60 00
(XEN) Xen stack trace from rsp=830e7ab2fd30:
(XEN)82d080228845 82e030ac7f80 0036563fc000 
(XEN)00a3 00c0 83077a6c59e0 830e7ab2fe70
(XEN)82d0802354b5 82d0802fff50  01c9c380
(XEN)8027bcd8 82d0802255d0 0036 033c269c6f66
(XEN)8307798d4f30  830779f361a0 0036
(XEN)82d0802386cc 830779f361a0 0046 82d08023827b
(XEN)0096 0036 830779f361c8 82d08030f9ab
(XEN)0036 83007ba3 830779f36188 033c269c6f66
(XEN)830779f36180 82d08094a480 82d08023153d 82d0
(XEN)830779f361a0  82d0802e13d5 83007ba3
(XEN)83007ba3  82d08030bef6 82d08030f9ab
(XEN)  830e7ab2 82d080933c00
(XEN)  82d080234cb2 
(XEN)83007ba3   
(XEN)82d08030fb6b  0100 0054
(XEN)0001 88011ff16c80 8800e1e2 
(XEN)88011f000858 88011f0006c8  
(XEN)0001 0001 00ad 00a5
(XEN)00fb 810c8da3  0046
(XEN)8800ea3af910   
(XEN) Xen call trace:
(XEN)[] sched_credit.c#__runq_insert.part.13+0/0x2
(XEN)[] sched_credit.c#csched_schedule+0xb55/0xba0
(XEN)[] smp_call_function_interrupt+0x85/0xa0
(XEN)[] vmcs.c#__vmx_clear_vmcs+0/0xe0
(XEN)[] sched_credit.c#csched_vcpu_yield+0/0x10
(XEN)[] timer.c#remove_entry+0x7c/0x90
(XEN)[] timer.c#add_entry+0x4b/0xb0
(XEN)[] vmx_asm_vmexit_handler+0xab/0x240
(XEN)[] schedule.c#schedule+0xdd/0x5d0
(XEN)[] hvm_interrupt_blocked+0x15/0xd0
(XEN)[] nvmx_switch_guest+0x86/0x1a00
(XEN)[] vmx_asm_vmexit_handler+0xab/0x240
(XEN)[] softirq.c#__do_softirq+0x62/0x90
(XEN)[] vmx_asm_do_vmentry+0x2b/0x30
(XEN) 
(XEN) Panic on CPU 54:
(XEN) Xen BUG at sched_credit.c:280
(XEN) 
(XEN) Reboot in five seconds...


Olaf



Re: [Xen-devel] [PATCH 1/4] xen: xen-pciback: Replace GFP_ATOMIC with GFP_KERNEL in pcistub_probe

2018-04-10 Thread Boris Ostrovsky
On 04/10/2018 10:31 AM, Jia-Ju Bai wrote:
>
>
>
> On 2018/4/10 22:27, Boris Ostrovsky wrote:
>> On 04/09/2018 11:03 AM, Jia-Ju Bai wrote:
>>> pcistub_probe() is never called in atomic context.
>>> This function is only set as ".probe" in struct pci_driver.
>>>
>>> Despite never getting called from atomic context,
>>> pcistub_probe() calls kmalloc() with GFP_ATOMIC,
>>> which does not sleep for allocation.
>>> GFP_ATOMIC is not necessary and can be replaced with GFP_KERNEL,
>>> which can sleep and improve the possibility of successful allocation.
>>>
>>> This is found by a static analysis tool named DCNS written by myself.
>>> And I also manually check it.
>>>
>>> Signed-off-by: Jia-Ju Bai 
>> What about use of GFP_ATOMIC in pcistub_reg_add()?
>
> Thanks for your reply :)
> I find pcistub_reg_add() is called by pcistub_quirk_add().
> And pcistub_quirk_add() is called in the macro DRIVER_ATTR().
> I am not sure whether DRIVER_ATTR() can make the function called in
> atomic context,
> so I do not analyze it in my tool.

I don't see why it needs to be ATOMIC, it's sysfs access. Can you send a
patch to fix it as well?


Thanks.
-boris





Re: [Xen-devel] [PATCH 1/4] xen: xen-pciback: Replace GFP_ATOMIC with GFP_KERNEL in pcistub_probe

2018-04-10 Thread Jia-Ju Bai



On 2018/4/10 22:27, Boris Ostrovsky wrote:

On 04/09/2018 11:03 AM, Jia-Ju Bai wrote:

pcistub_probe() is never called in atomic context.
This function is only set as ".probe" in struct pci_driver.

Despite never getting called from atomic context,
pcistub_probe() calls kmalloc() with GFP_ATOMIC,
which does not sleep for allocation.
GFP_ATOMIC is not necessary and can be replaced with GFP_KERNEL,
which can sleep and improve the possibility of successful allocation.

This is found by a static analysis tool named DCNS written by myself.
And I also manually check it.

Signed-off-by: Jia-Ju Bai

What about use of GFP_ATOMIC in pcistub_reg_add()?


Thanks for your reply :)
I find pcistub_reg_add() is called by pcistub_quirk_add().
And pcistub_quirk_add() is called in the macro DRIVER_ATTR().
I am not sure whether DRIVER_ATTR() can make the function called in 
atomic context,

so I do not analyze it in my tool.


Best wishes,
Jia-Ju Bai
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 1/4] xen: xen-pciback: Replace GFP_ATOMIC with GFP_KERNEL in pcistub_probe

2018-04-10 Thread Boris Ostrovsky
On 04/09/2018 11:03 AM, Jia-Ju Bai wrote:
> pcistub_probe() is never called in atomic context.
> This function is only set as ".probe" in struct pci_driver.
>
> Despite never getting called from atomic context,
> pcistub_probe() calls kmalloc() with GFP_ATOMIC,
> which does not sleep for allocation.
> GFP_ATOMIC is not necessary and can be replaced with GFP_KERNEL,
> which can sleep and improve the possibility of successful allocation.
>
> This is found by a static analysis tool named DCNS written by myself.
> And I also manually check it.
>
> Signed-off-by: Jia-Ju Bai 

What about use of GFP_ATOMIC in pcistub_reg_add()?

-boris

> ---
>  drivers/xen/xen-pciback/pci_stub.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/xen/xen-pciback/pci_stub.c 
> b/drivers/xen/xen-pciback/pci_stub.c
> index 9e480fd..95e6ddd 100644
> --- a/drivers/xen/xen-pciback/pci_stub.c
> +++ b/drivers/xen/xen-pciback/pci_stub.c
> @@ -577,7 +577,7 @@ static int pcistub_probe(struct pci_dev *dev, const 
> struct pci_device_id *id)
>   }
>  
>   if (!match) {
> - pci_dev_id = kmalloc(sizeof(*pci_dev_id), GFP_ATOMIC);
> + pci_dev_id = kmalloc(sizeof(*pci_dev_id), GFP_KERNEL);
>   if (!pci_dev_id) {
>   err = -ENOMEM;
>   goto out;


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] xen/pvh: Indicate XENFEAT_linux_rsdp_unrestricted to Xen

2018-04-10 Thread Boris Ostrovsky
On 04/09/2018 02:51 PM, Boris Ostrovsky wrote:
> Pre-4.17 kernels ignored start_info's rsdp_paddr pointer and instead
> relied on finding RSDP in standard location in BIOS RO memory. This
> has worked since that's where Xen used to place it.
>
> However, with recent Xen change (commit 4a5733771e6f ("libxl: put RSDP
> for PVH guest near 4GB")) it prefers to keep RSDP at a "non-standard"
> address. Even though as of commit b17d9d1df3c3 ("x86/xen: Add pvh
> specific rsdp address retrieval function") Linux is able to find RSDP,
> for back-compatibility reasons we need to indicate to Xen that we can
> handle this, and we do so by setting XENFEAT_linux_rsdp_unrestricted
> flag in ELF notes.
>
> (Also take this opportunity and sync features.h header file with Xen)
>
> Signed-off-by: Boris Ostrovsky 



Committed to for-linus-4.17.

-boris


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [xen-unstable-smoke test] 122159: regressions - FAIL

2018-04-10 Thread osstest service owner
flight 122159 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/122159/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-arm64-xsm   6 xen-build fail REGR. vs. 121876
 build-armhf       6 xen-build fail REGR. vs. 121876

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-xl-xsm   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl   1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt 13 migrate-support-check fail never pass

version targeted for testing:
 xen  21b5d48cf471709c933055adf3fe22fa0fbc3f85
baseline version:
 xen  451004603247205467ec34b366b4cfa3814a5d95

Last test of basis   121876  2018-04-05 10:04:25 Z    5 days
Failing since        121889  2018-04-05 13:02:10 Z    4 days   45 attempts
Testing same since   122159  2018-04-10 11:09:16 Z    0 days    1 attempts


People who touched revisions under test:
  Amit Singh Tomar 
  Andre Przywara 
  Andre Pzywara 
  Andrew Cooper 
  Boris Ostrovsky 
  George Dunlap 
  Jan Beulich 
  Juergen Gross 
  Julien Grall 
  Kevin Tian 
  Marcello Seri 
  Marcus of Wetware Labs 
  Marek Marczykowski-Górecki 
  Petre Eftime 
  Razvan Cojocaru 
  Stefano Stabellini 
  Tim Deegan 
  Wei Liu 

jobs:
 build-arm64-xsm  fail
 build-amd64  pass
 build-armhf  fail
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  blocked 
 test-arm64-arm64-xl-xsm  blocked 
 test-amd64-amd64-xl-qemuu-debianhvm-i386 pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.

(No revision log; it would be 789 lines long.)

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v6.1 9/9] xen/x86: use PCID feature

2018-04-10 Thread Juergen Gross
Avoid flushing the complete TLB when switching %cr3 for mitigation of
Meltdown by using the PCID feature if available.

We are using 4 PCID values for a 64 bit pv domain subject to XPTI and
2 values for the non-XPTI case:

- guest active and in kernel mode
- guest active and in user mode
- hypervisor active and guest in user mode (XPTI only)
- hypervisor active and guest in kernel mode (XPTI only)

We use PCID only if PCID _and_ INVPCID are supported. With PCID in use
we disable global pages in cr4. A command line parameter controls in
which cases PCID is being used.

As the non-XPTI case has shown not to perform better with PCID at least
on some machines the default is to use PCID only for domains subject to
XPTI.

With PCID enabled we always disable global pages. This avoids having to
either flush the complete TLB or do a cycle through all PCID values
when invalidating a single global page.

Signed-off-by: Juergen Gross 
Reviewed-by: Jan Beulich 
---
V6.1:
- address some minor comments (Jan Beulich)

V6:
- split off pv_guest_cr4_to_real_cr4() conversion to function into new
  patch (Andrew Cooper)
- changed some comments (Jan Beulich, Andrew Cooper)

V5:
- use X86_CR3_ADDR_MASK instead of ~X86_CR3_PCID_MASK (Jan Beulich)
- add some const qualifiers (Jan Beulich)
- mask X86_CR3_ADDR_MASK with PADDR_MASK (Jan Beulich)
- add flushing the TLB from old PCID related entries in write_cr3_cr4()
  (Jan Beulich)

V4:
- add cr3 mask for page table address and use that in dbg_pv_va2mfn()
  (Jan Beulich)
- use invpcid_flush_all_nonglobals() instead of invpcid_flush_all()
  (Jan Beulich)
- use PCIDs 0/1 when running in Xen or without XPTI, 2/3 with XPTI in
  guest (Jan Beulich)
- ASSERT cr4.pge and cr4.pcide are never active at the same time
  (Jan Beulich)
- make pv_guest_cr4_to_real_cr4() a real function

V3:
- support PCID for non-XPTI case, too
- add command line parameter for controlling usage of PCID
- check PCID active by using cr4.pcide (Jan Beulich)
---
 docs/misc/xen-command-line.markdown | 14 +++
 xen/arch/x86/flushtlb.c | 47 -
 xen/arch/x86/mm.c   | 16 +++-
 xen/arch/x86/pv/dom0_build.c|  1 +
 xen/arch/x86/pv/domain.c| 81 -
 xen/include/asm-x86/domain.h|  4 +-
 xen/include/asm-x86/processor.h |  3 ++
 xen/include/asm-x86/pv/domain.h | 31 ++
 8 files changed, 191 insertions(+), 6 deletions(-)

diff --git a/docs/misc/xen-command-line.markdown 
b/docs/misc/xen-command-line.markdown
index 451a4fa566..f8950a3bb2 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -1451,6 +1451,20 @@ All numbers specified must be hexadecimal ones.
 
 This option can be specified more than once (up to 8 times at present).
 
+### pcid (x86)
+> `= <boolean> | xpti=<bool>`
+
+> Default: `xpti`
+
+> Can be modified at runtime (change takes effect only for domains created
+  afterwards)
+
+If available, control usage of the PCID feature of the processor for
+64-bit pv-domains. PCID can be used either for no domain at all (`false`),
+for all of them (`true`), only for those subject to XPTI (`xpti`) or for
+those not subject to XPTI (`no-xpti`). The feature is used only in case
+INVPCID is supported and not disabled via `invpcid=false`.
+
 ### ple\_gap
> `= <integer>`
 
diff --git a/xen/arch/x86/flushtlb.c b/xen/arch/x86/flushtlb.c
index e28bf04a37..8dd184d8be 100644
--- a/xen/arch/x86/flushtlb.c
+++ b/xen/arch/x86/flushtlb.c
@@ -12,6 +12,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /* Debug builds: Wrap frequently to stress-test the wrap logic. */
 #ifdef NDEBUG
@@ -93,6 +94,7 @@ void switch_cr3_cr4(unsigned long cr3, unsigned long cr4)
 {
 unsigned long flags, old_cr4;
 u32 t;
+unsigned long old_pcid = cr3_pcid(read_cr3());
 
 /* This non-reentrant function is sometimes called in interrupt context. */
 local_irq_save(flags);
@@ -102,14 +104,34 @@ void switch_cr3_cr4(unsigned long cr3, unsigned long cr4)
 old_cr4 = read_cr4();
 if ( old_cr4 & X86_CR4_PGE )
 {
+/*
+ * X86_CR4_PGE set means PCID is inactive.
+ * We have to purge the TLB via flipping cr4.pge.
+ */
 old_cr4 = cr4 & ~X86_CR4_PGE;
 write_cr4(old_cr4);
 }
+else if ( use_invpcid )
+/*
+ * Flushing the TLB via INVPCID is necessary only in case PCIDs are
+ * in use, which is true only with INVPCID being available.
+ * Without PCID usage the following write_cr3() will purge the TLB
+ * (we are in the cr4.pge off path) of all entries.
+ * Using invpcid_flush_all_nonglobals() seems to be faster than
+ * invpcid_flush_all(), so use that.
+ */
+invpcid_flush_all_nonglobals();
 
 write_cr3(cr3);
 
 if ( old_cr4 != cr4 )
 write_cr4(cr4);
+else if ( old_pcid != cr3_pcid(cr3) )
+/*
+ * Make sure 

[Xen-devel] [PATCH v6.1 3/9] xen/x86: support per-domain flag for xpti

2018-04-10 Thread Juergen Gross
Instead of switching XPTI globally on or off add a per-domain flag for
that purpose. This allows to modify the xpti boot parameter to support
running dom0 without Meltdown mitigations. Using "xpti=nodom0" as boot
parameter will achieve that.

Move the xpti boot parameter handling to xen/arch/x86/pv/domain.c as
it is pv-domain specific.

Signed-off-by: Juergen Gross 
Reviewed-by: Jan Beulich 
---
V6.1:
- address some minor comments (Jan Beulich)

V6:
- modify xpti boot parameter options (Andrew Cooper)
- move xpti_init() code to spec_ctrl.c (Andrew Cooper)
- rework init of per-domain xpti flag (Andrew Cooper)

V3:
- latch get_cpu_info() return value in variable (Jan Beulich)
- call always xpti_domain_init() for pv dom0 (Jan Beulich)
- add __init annotations (Jan Beulich)
- drop per domain XPTI message (Jan Beulich)
- document xpti=default support (Jan Beulich)
- move domain xpti flag into a padding hole (Jan Beulich)
---
 docs/misc/xen-command-line.markdown | 14 ++--
 xen/arch/x86/mm.c   | 17 +++--
 xen/arch/x86/pv/dom0_build.c|  1 +
 xen/arch/x86/pv/domain.c|  6 
 xen/arch/x86/setup.c| 19 --
 xen/arch/x86/smpboot.c  |  4 +--
 xen/arch/x86/spec_ctrl.c| 70 +
 xen/include/asm-x86/current.h   |  3 +-
 xen/include/asm-x86/domain.h|  3 ++
 xen/include/asm-x86/spec_ctrl.h |  4 +++
 10 files changed, 115 insertions(+), 26 deletions(-)

diff --git a/docs/misc/xen-command-line.markdown 
b/docs/misc/xen-command-line.markdown
index b353352adf..d4f758487a 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -1955,14 +1955,24 @@ clustered mode.  The default, given no hint from the 
**FADT**, is cluster
 mode.
 
 ### xpti
-> `= <boolean>`
+> `= List of [ default | <boolean> | dom0=<bool> | domu=<bool> ]`
 
-> Default: `false` on AMD hardware
+> Default: `false` on hardware not vulnerable to Meltdown (e.g. AMD)
 > Default: `true` everywhere else
 
 Override default selection of whether to isolate 64-bit PV guest page
 tables.
 
+`true` activates page table isolation even on hardware not vulnerable by
+Meltdown for all domains.
+
+`false` deactivates page table isolation on all systems for all domains.
+
+`default` sets the default behaviour.
+
+With `dom0` and `domu` it is possible to control page table isolation
+for dom0 or guest domains only.
+
 ### xsave
> `= <boolean>`
 
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index e245d96a97..9c36614099 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -502,8 +502,21 @@ void make_cr3(struct vcpu *v, mfn_t mfn)
 
 void write_ptbase(struct vcpu *v)
 {
-get_cpu_info()->root_pgt_changed = true;
-switch_cr3(v->arch.cr3);
+struct cpu_info *cpu_info = get_cpu_info();
+
+if ( is_pv_vcpu(v) && v->domain->arch.pv_domain.xpti )
+{
+cpu_info->root_pgt_changed = true;
+cpu_info->pv_cr3 = __pa(this_cpu(root_pgt));
+switch_cr3(v->arch.cr3);
+}
+else
+{
+/* Make sure to clear xen_cr3 before pv_cr3; switch_cr3() serializes. */
+cpu_info->xen_cr3 = 0;
+switch_cr3(v->arch.cr3);
+cpu_info->pv_cr3 = 0;
+}
 }
 
 /*
diff --git a/xen/arch/x86/pv/dom0_build.c b/xen/arch/x86/pv/dom0_build.c
index 5b4325b87f..d148395919 100644
--- a/xen/arch/x86/pv/dom0_build.c
+++ b/xen/arch/x86/pv/dom0_build.c
@@ -387,6 +387,7 @@ int __init dom0_construct_pv(struct domain *d,
 if ( compat32 )
 {
 d->arch.is_32bit_pv = d->arch.has_32bit_shinfo = 1;
+d->arch.pv_domain.xpti = false;
 v->vcpu_info = (void *)>shared_info->compat.vcpu_info[0];
 if ( setup_compat_arg_xlat(v) != 0 )
 BUG();
diff --git a/xen/arch/x86/pv/domain.c b/xen/arch/x86/pv/domain.c
index be40843b05..ce1a1a9d35 100644
--- a/xen/arch/x86/pv/domain.c
+++ b/xen/arch/x86/pv/domain.c
@@ -9,6 +9,7 @@
 #include 
 #include 
 
+#include 
 #include 
 
 static void noreturn continue_nonidle_domain(struct vcpu *v)
@@ -75,6 +76,8 @@ int switch_compat(struct domain *d)
 
 d->arch.x87_fip_width = 4;
 
+d->arch.pv_domain.xpti = false;
+
 return 0;
 
  undo_and_fail:
@@ -205,6 +208,9 @@ int pv_domain_initialise(struct domain *d)
 /* 64-bit PV guest by default. */
 d->arch.is_32bit_pv = d->arch.has_32bit_shinfo = 0;
 
+d->arch.pv_domain.xpti = opt_xpti & (is_hardware_domain(d)
+ ? OPT_XPTI_DOM0 : OPT_XPTI_DOMU);
+
 return 0;
 
   fail:
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index b521db25a8..887d75a981 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -169,9 +169,6 @@ static int __init parse_smap_param(const char *s)
 }
 custom_param("smap", parse_smap_param);
 
-static int8_t __initdata opt_xpti = -1;
-boolean_param("xpti", opt_xpti);
-
 bool __read_mostly acpi_disabled;
 bool __initdata acpi_force;
 static char __initdata 

Re: [Xen-devel] [PATCH governance.git] Make Security Policy Doc ready to become a CNA

2018-04-10 Thread Lars Kurth


On 10/04/2018, 09:12, "Juergen Gross"  wrote:

On 09/04/18 17:02, Lars Kurth wrote:
> Note: this time with html disabled
> 
> To become a CNA, we need to more clearly specifiy the scope of
> security support. This change updates the document and points
> to SUPPORT.md and pages generated from SUPPORT.md
>  
> Also fixed a typo in the following paragraph.
>  
> Signed-off-by: Lars Kurth 
> ---
> security-policy.pandoc | 12 ++--
> 1 file changed, 10 insertions(+), 2 deletions(-)
>  
> diff --git a/security-policy.pandoc b/security-policy.pandoc
> index 5783183..6796220 100644
> --- a/security-policy.pandoc
> +++ b/security-policy.pandoc
> @@ -19,7 +19,15 @@ Scope of this process
>  This process primarily covers the [Xen Hypervisor
> 
Project](index.php?option=com_content=article=82:xen-hypervisor=80:developers=484).
> -Vulnerabilties reported against other Xen Project teams will be handled 
on a
> +Specific information about features with security support can be found in
> +
> +1.  
[SUPPORT.md](http://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=SUPPORT.md)
> +in the releases' tar ball and its xen.git tree and on
> +[web pages generated from the SUPPORT.md 
file](http://xenbits.xenproject.org/docs/support/)
> +2.  For releases that do not contain SUPPORT.md, this information can be 
found
> +on the [Release Feature wiki 
page](https://wiki.xenproject.org/wiki/Xen_Project_Release_Features)
> +
> +Vulnerabilities reported against other Xen Project teams will be handled 
on a
> best effort basis by the relevant Project Lead together with the Security
> Response Team.
> @@ -401,7 +409,7 @@ Change History
> --
>  
> -
> +-   **v3.18 April 9th 2017:** Added reference to SUPPORT.md

 ^ 2018?

Oh, yes. Will fix when I commit, as I will fix the exact date then anyway. I 
don’t think I need another review cycle for this one issue.
Lars
 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 1/2] x86/HVM: suppress I/O completion for port output

2018-04-10 Thread Juergen Gross
On 09/04/18 15:23, Jan Beulich wrote:
> We don't break up port requests in case they cross emulation entity
> boundaries, and a write to an I/O port is necessarily the last
> operation of an instruction instance, so there's no need to re-invoke
> the full emulation path upon receiving the result from an external
> emulator.
> 
> In case we want to properly split port accesses in the future, this
> change will need to be reverted, as it would prevent things working
> correctly when e.g. the first part needs to go to an external emulator,
> while the second part is to be handled internally.
> 
> While this addresses the reported problem of Windows paging out the
> buffer underneath an in-process REP OUTS, it does not address the wider
> problem of the re-issued insn (to the insn emulator) being prone to
> raise an exception (#PF) during a replayed, previously successful memory
> access (we only record prior MMIO accesses).
> 
> Leaving aside the problem tried to be worked around here, I think the
> performance aspect alone is a good reason to change the behavior.
> 
> Also take the opportunity and change bool_t -> bool as
> hvm_vcpu_io_need_completion()'s return type.
> 
> Signed-off-by: Jan Beulich 

Release-acked-by: Juergen Gross 


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [xen-4.8-testing test] 122132: regressions - FAIL

2018-04-10 Thread osstest service owner
flight 122132 xen-4.8-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/122132/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm 16 guest-localmigrate/x10 fail REGR. vs. 121318

Tests which did not succeed, but are not blocking:
 test-xtf-amd64-amd64-2  50 xtf/test-hvm64-lbr-tsx-vmentry fail like 121291
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop fail like 121291
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 121318
 test-armhf-armhf-xl-rtds 16 guest-start/debian.repeat fail like 121318
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stop fail like 121318
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 121318
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop fail like 121318
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop fail like 121318
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stop fail like 121318
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop fail like 121318
 build-amd64-prev  7 xen-build/dist-test fail never pass
 build-i386-prev   7 xen-build/dist-test fail never pass
 test-amd64-amd64-libvirt 13 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-check fail never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-check fail never pass
 test-amd64-i386-libvirt  13 migrate-support-check fail never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-check fail never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-check fail never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-check fail never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-check fail never pass
 test-arm64-arm64-xl  13 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-check fail never pass
 test-arm64-arm64-xl  14 saverestore-support-check fail never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2 fail never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-check fail never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-check fail never pass
 test-armhf-armhf-xl  13 migrate-support-check fail never pass
 test-armhf-armhf-xl  14 saverestore-support-check fail never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-check fail never pass
 test-armhf-armhf-libvirt 13 migrate-support-check fail never pass
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt 14 saverestore-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-check fail never pass
 test-armhf-armhf-xl-xsm  13 migrate-support-check fail never pass
 test-armhf-armhf-xl-xsm  14 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-check fail never pass
 test-armhf-armhf-libvirt-raw 13 saverestore-support-check fail never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-check fail never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-check fail never pass
 test-amd64-amd64-xl-qemut-win10-i386 10 windows-install fail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-install fail never pass
 test-amd64-i386-xl-qemut-win10-i386 10 windows-install fail never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass

version targeted for testing:
 xen  08647952260725344f4e67d2190c2c4c8457cea2
baseline version:
 xen  866dedabb3e51a56c1b9ad4206ee0ffaf0b5c4b3

Last test of basis   121318  2018-03-27 22:19:36 Z   13 days
Testing same since   122132  2018-04-09 10:53:19 Z    1 days

Re: [Xen-devel] [for-4.11][PATCH] libxl: arm: Fix build after c/s 74fd984ae

2018-04-10 Thread Juergen Gross
On 10/04/18 13:24, Julien Grall wrote:
> c/s 74fd984ae "tools/libxl: Drop xc_domain_configuration_t from
> libxl__domain_build_state" removed state->config completely but missed
> some conversion libxl_arm.c.
> 
> Furthermore, not all the fields of xc_domain_configuration_t have a
> corresponding field in libxl_domain_build_info. This is the case of
> clock_frequency. As the field should not be exposed to the user, add a
> corresponding field in libxl__domain_build_state. This require some
> modification in the prototype of libxl__domain_make in order to have the
> state.
> 
> For all the other fields, use the up-to-date version in
> libxl_domain_build-info.
> 
> Signed-off-by: Julien Grall 

Release-acked-by: Juergen Gross 


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [for-4.11][PATCH] libxl: arm: Fix build after c/s 74fd984ae

2018-04-10 Thread Wei Liu
On Tue, Apr 10, 2018 at 12:24:53PM +0100, Julien Grall wrote:
> c/s 74fd984ae "tools/libxl: Drop xc_domain_configuration_t from
> libxl__domain_build_state" removed state->config completely but missed
> some conversion libxl_arm.c.
> 
> Furthermore, not all the fields of xc_domain_configuration_t have a
> corresponding field in libxl_domain_build_info. This is the case of
> clock_frequency. As the field should not be exposed to the user, add a
> corresponding field in libxl__domain_build_state. This require some
> modification in the prototype of libxl__domain_make in order to have the
> state.
> 
> For all the other fields, use the up-to-date version in
> libxl_domain_build-info.

Typo here.

> 
> Signed-off-by: Julien Grall 

Acked-by: Wei Liu 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-10 Thread Dario Faggioli
On Tue, 2018-04-10 at 11:59 +0100, George Dunlap wrote:
> On 04/10/2018 11:33 AM, Dario Faggioli wrote:
> > On Tue, 2018-04-10 at 09:34 +, George Dunlap wrote:
> > > Assuming the bug is this one:
> > > 
> > > BUG_ON( cpu != snext->vcpu->processor );
> > > 
> > 
> > Yes, it is that one.
> > 
> > Another stack trace, this time from a debug=y built hypervisor, of
> > what
> > we are thinking it is the same bug (although reproduced in a
> > slightly
> > different way) is this:
> > 
> > (XEN) [ Xen-4.7.2_02-36.1.12847.11.PTF  x86_64  debug=y  Not
> > tainted ]
> > (XEN) CPU:45
> > (XEN) RIP:e008:[]
> > sched_credit.c#csched_schedule+0x361/0xaa9
> > ...
> > (XEN) Xen call trace:
> > (XEN)[]
> > sched_credit.c#csched_schedule+0x361/0xaa9
> > (XEN)[] schedule.c#schedule+0x109/0x5d6
> > (XEN)[] softirq.c#__do_softirq+0x7f/0x8a
> > (XEN)[] do_softirq+0x13/0x15
> > (XEN)[] vmx_asm_do_vmentry+0x25/0x2a
> > 
> > (I can provide it all, if necessary.)
> > 
> > I've done some analysis, although when we still were not entirely
> > sure
> > that changing the affinities was the actual cause (or, at least,
> > what
> > is triggering the whole thing).
> > 
> > In the specific case of this stack trace, the current vcpu running
> > on
> > CPU 45 is d3v11. It is not in the runqueue, because it has been
> > removed, and not added back to it, and the reason is it is not
> > runnable
> > (it has VPF_migrating on in pause_flags).
> > 
> > The runqueue of pcpu 45 looks fine (i.e., it is not corrupt or
> > anything
> > like that), it has d3v10,d9v1,d32767v45 in it (in this order)
> > 
> > d3v11->processor is 45, so that is also fine.
> > 
> > Basically, d3v11 wants to move away from pcpu 45, and this might
> > (but
> > that's not certain) be the reson because we're rescheduling. The
> > fact
> > that there are vcpus wanting to migrate can very well be the cause
> > of
> > affinity being changed.
> > 
> > Now, the problem is that, looking into the runqueue, I found out
> > that
> > d3v10->processor=32. I.e., d3v10 is queued in pcpu 45's runqueue,
> > with
> > processor=32, which really shouldn't happen.
> > 
> > This leads to the bug triggering, as, in csched_schedule(), we read
> > the
> > head of the runqueue with:
> > 
> > snext = __runq_elem(runq->next);
> > 
> > and then we pass snext to csched_load_balance(), where the BUG_ON
> > is.
> > 
> > Another thing that I've found out, is that all "misplaced" vcpus
> > (i.e.,
> > in this and also in other manifestations of this bug) have their
> > csched_vcpu.flags=4, which is CSCHED_FLAGS_VCPU_MIGRATING.
> > 
> > This, basically, is again a sign of vcpu_migrate() having been
> > called,
> > on d3v10 as well, which in turn has called csched_vcpu_pick().
> > 
> > > a nasty race condition… a vcpu has just been taken off the
> > > runqueue
> > > of the current pcpu, but it’s apparently been assigned to a
> > > different
> > > cpu.
> > > 
> > 
> > Nasty indeed. I've been looking into this on and off, but so far I
> > haven't found the root cause.
> > 
> > Now that we know for sure that it is changing affinity that trigger
> > it,
> > the field of the investigation can be narrowed a little bit... But
> > I
> > still am finding hard to spot where the race happens.
> > 
> > I'll look more into this later in the afternoon. I'll let know if
> > something comes to mind.
> 
> Actually, it looks quite simple:  schedule.c:vcpu_move_locked() is
> supposed to actually do the moving; if vcpu_scheduler()->migrate is
> defined, it calls that; otherwise, it just sets v-
> >processor.  Credit1
> doesn't define migrate.  So when changing the vcpu affinity on
> credit1,
> v->processor is simply modified without it changing runqueues.
> 
> The real question is why it's so hard to actually trigger any
> problems!
> 
Wait, but when vcpu_move_locked() is called, the vcpu being moved
should not be in any runqueue.

In fact, it is called from vcpu_migrate() which, in its turn, is always
 preceded by a call to vcpu_sleep_nosync(), that removes the vcpu from
the runqueue.

The only exception is when it is called from context_saved(). But then
again, the vcpu on which it is called is not on the runqueue, because
it was found not runnable.

That is why things work... well, apart from this bug. :-)

I mean, the root cause of this bug may very well be that there is a
code path that leads to calling vcpu_move_locked() on a vcpu that is
still in a runqueue... but have you actually identified it?

> But as a quick fix, implementing csched_vcpu_migrate() is probably
> the
> best solution.  Do you want to pick that up, or should I?
> 
And what should csched_vcpu_migrate() do, apart from changing
vc->processor?

Regards,
Dario
-- 
<> (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Software Engineer @ SUSE https://www.suse.com/

signature.asc
Description: This is a digitally signed message part

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-10 Thread Dario Faggioli
On Tue, 2018-04-10 at 11:59 +0100, George Dunlap wrote:
> On 04/10/2018 11:33 AM, Dario Faggioli wrote:
> > On Tue, 2018-04-10 at 09:34 +, George Dunlap wrote:
> > > Assuming the bug is this one:
> > > 
> > > BUG_ON( cpu != snext->vcpu->processor );
> > > 
> > 
> > Yes, it is that one.
> > 
> > Another stack trace, this time from a debug=y built hypervisor, of
> > what
> > we are thinking it is the same bug (although reproduced in a
> > slightly
> > different way) is this:
> > 
> > (XEN) [ Xen-4.7.2_02-36.1.12847.11.PTF  x86_64  debug=y  Not
> > tainted ]
> > (XEN) CPU:45
> > (XEN) RIP:e008:[]
> > sched_credit.c#csched_schedule+0x361/0xaa9
> > ...
> > (XEN) Xen call trace:
> > (XEN)[]
> > sched_credit.c#csched_schedule+0x361/0xaa9
> > (XEN)[] schedule.c#schedule+0x109/0x5d6
> > (XEN)[] softirq.c#__do_softirq+0x7f/0x8a
> > (XEN)[] do_softirq+0x13/0x15
> > (XEN)[] vmx_asm_do_vmentry+0x25/0x2a
> > 
> > (I can provide it all, if necessary.)
> > 
> > I've done some analysis, although when we still were not entirely
> > sure
> > that changing the affinities was the actual cause (or, at least,
> > what
> > is triggering the whole thing).
> > 
> > In the specific case of this stack trace, the current vcpu running
> > on
> > CPU 45 is d3v11. It is not in the runqueue, because it has been
> > removed, and not added back to it, and the reason is it is not
> > runnable
> > (it has VPF_migrating on in pause_flags).
> > 
> > The runqueue of pcpu 45 looks fine (i.e., it is not corrupt or
> > anything
> > like that), it has d3v10,d9v1,d32767v45 in it (in this order)
> > 
> > d3v11->processor is 45, so that is also fine.
> > 
> > Basically, d3v11 wants to move away from pcpu 45, and this might
> > (but
> > that's not certain) be the reson because we're rescheduling. The
> > fact
> > that there are vcpus wanting to migrate can very well be the cause
> > of
> > affinity being changed.
> > 
> > Now, the problem is that, looking into the runqueue, I found out
> > that
> > d3v10->processor=32. I.e., d3v10 is queued in pcpu 45's runqueue,
> > with
> > processor=32, which really shouldn't happen.
> > 
> > This leads to the bug triggering, as, in csched_schedule(), we read
> > the
> > head of the runqueue with:
> > 
> > snext = __runq_elem(runq->next);
> > 
> > and then we pass snext to csched_load_balance(), where the BUG_ON
> > is.
> > 
> > Another thing that I've found out, is that all "misplaced" vcpus
> > (i.e.,
> > in this and also in other manifestations of this bug) have their
> > csched_vcpu.flags=4, which is CSCHED_FLAGS_VCPU_MIGRATING.
> > 
> > This, basically, is again a sign of vcpu_migrate() having been
> > called,
> > on d3v10 as well, which in turn has called csched_vcpu_pick().
> > 
> > > a nasty race condition… a vcpu has just been taken off the
> > > runqueue
> > > of the current pcpu, but it’s apparently been assigned to a
> > > different
> > > cpu.
> > > 
> > 
> > Nasty indeed. I've been looking into this on and off, but so far I
> > haven't found the root cause.
> > 
> > Now that we know for sure that it is changing affinity that triggers
> > it, the field of the investigation can be narrowed a little bit...
> > But I still find it hard to spot where the race happens.
> > 
> > I'll look more into this later in the afternoon. I'll let you know
> > if something comes to mind.
> 
> Actually, it looks quite simple:  schedule.c:vcpu_move_locked() is
> supposed to actually do the moving; if vcpu_scheduler()->migrate is
> defined, it calls that; otherwise, it just sets v->processor.
> Credit1 doesn't define migrate.  So when changing the vcpu affinity
> on credit1, v->processor is simply modified without it changing
> runqueues.
> 
> The real question is why it's so hard to actually trigger any
> problems!
> 
Wait, but when vcpu_move_locked() is called, the vcpu being moved
should not be in any runqueue.

In fact, it is called from vcpu_migrate() which, in its turn, is always
 preceded by a call to vcpu_sleep_nosync(), that removes the vcpu from
the runqueue.

The only exception is when it is called from context_saved(). But then
again, the vcpu on which it is called is not on the runqueue, because
it was found not runnable.

That is why things work... well, apart from this bug. :-)

I mean, the root cause of this bug may very well be that there is a
code path that leads to calling vcpu_move_locked() on a vcpu that is
still in a runqueue... but have you actually identified it?

> But as a quick fix, implementing csched_vcpu_migrate() is probably
> the
> best solution.  Do you want to pick that up, or should I?
> 
And what should csched_vcpu_migrate() do, apart from changing
vc->processor?

Regards,
Dario
-- 
<> (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Software Engineer @ SUSE https://www.suse.com/

signature.asc
Description: This is a digitally signed message part
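To make the failure mode discussed above concrete, here is a minimal, self-contained C model (not Xen code; the toy_* names and the fixed-size queue are invented purely for illustration). It shows how rewriting v->processor without dequeueing leaves a runqueue whose head violates the invariant that csched_load_balance() asserts with BUG_ON( cpu != snext->vcpu->processor ):

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a per-pCPU runqueue and a vcpu (illustrative only). */
struct toy_vcpu { int processor; };
struct toy_runq { struct toy_vcpu *q[4]; int len; };

static void toy_enqueue(struct toy_runq *rq, struct toy_vcpu *v)
{
    rq->q[rq->len++] = v;
}

/* Mimics the Credit1 path described in the thread: an affinity change
 * just rewrites v->processor, without touching the runqueue the vcpu
 * is still linked on. */
static void toy_move_locked(struct toy_vcpu *v, int new_cpu)
{
    v->processor = new_cpu;
}

/* The invariant the BUG_ON checks: the runqueue head (if any) must
 * believe it belongs to this CPU. */
static int invariant_holds(int cpu, const struct toy_runq *rq)
{
    return rq->len == 0 || rq->q[0]->processor == cpu;
}
```

Running this with a vcpu queued on CPU 45 and then "moved" to CPU 32 reproduces the observed state (d3v10 on pcpu 45's runqueue with processor=32) and trips the invariant.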

[Xen-devel] [distros-debian-snapshot test] 74569: trouble: blocked/broken/fail/pass

2018-04-10 Thread Platform Team regression test user
flight 74569 distros-debian-snapshot real [real]
http://osstest.xs.citrite.net/~osstest/testlogs/logs/74569/

Failures and problems with tests :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-arm64  broken
 build-arm64-pvopsbroken

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-armhf-daily-netboot-pygrub  1 build-check(1)  blocked n/a
 build-arm64-pvops 2 hosts-allocate broken blocked in 74142
 build-arm64   2 hosts-allocate broken blocked in 74142
 build-arm64-pvops 3 capture-logs   broken blocked in 74142
 build-arm64   3 capture-logs   broken blocked in 74142
 test-amd64-i386-i386-daily-netboot-pvgrub 10 debian-di-install fail blocked in 74142
 test-amd64-amd64-i386-daily-netboot-pygrub 10 debian-di-install fail blocked in 74142
 test-amd64-i386-amd64-weekly-netinst-pygrub 10 debian-di-install fail blocked in 74142
 test-amd64-i386-i386-weekly-netinst-pygrub 10 debian-di-install fail blocked in 74142
 test-amd64-amd64-amd64-daily-netboot-pvgrub 11 guest-start fail blocked in 74142
 test-amd64-amd64-i386-weekly-netinst-pygrub 10 debian-di-install fail blocked in 74142
 test-amd64-amd64-amd64-weekly-netinst-pygrub 10 debian-di-install fail blocked in 74142
 test-amd64-amd64-amd64-current-netinst-pygrub 10 debian-di-install fail blocked in 74142
 test-armhf-armhf-armhf-daily-netboot-pygrub 10 debian-di-install fail blocked in 74142
 test-amd64-i386-amd64-current-netinst-pygrub 10 debian-di-install fail blocked in 74142
 test-amd64-i386-i386-current-netinst-pygrub 10 debian-di-install fail blocked in 74142
 test-amd64-amd64-i386-current-netinst-pygrub 10 debian-di-install fail blocked in 74142

baseline version:
 flight   74142

jobs:
 build-amd64  pass
 build-arm64  broken  
 build-armhf  pass
 build-i386   pass
 build-amd64-pvopspass
 build-arm64-pvopsbroken  
 build-armhf-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-amd64-daily-netboot-pvgrub  fail
 test-amd64-i386-i386-daily-netboot-pvgrubfail
 test-amd64-i386-amd64-daily-netboot-pygrub   pass
 test-arm64-arm64-armhf-daily-netboot-pygrub  blocked 
 test-armhf-armhf-armhf-daily-netboot-pygrub  fail
 test-amd64-amd64-i386-daily-netboot-pygrub   fail
 test-amd64-amd64-amd64-current-netinst-pygrubfail
 test-amd64-i386-amd64-current-netinst-pygrub fail
 test-amd64-amd64-i386-current-netinst-pygrub fail
 test-amd64-i386-i386-current-netinst-pygrub  fail
 test-amd64-amd64-amd64-weekly-netinst-pygrub fail
 test-amd64-i386-amd64-weekly-netinst-pygrub  fail
 test-amd64-amd64-i386-weekly-netinst-pygrub  fail
 test-amd64-i386-i386-weekly-netinst-pygrub   fail



sg-report-flight on osstest.xs.citrite.net
logs: /home/osstest/logs
images: /home/osstest/images

Logs, config files, etc. are available at
http://osstest.xs.citrite.net/~osstest/testlogs/logs

Test harness code can be found at
http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary


Push not applicable.


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [RFC, v2, 1/9] hyper_dmabuf: initial upload of hyper_dmabuf drv core framework

2018-04-10 Thread Oleksandr Andrushchenko

On 04/10/2018 01:47 PM, Julien Grall wrote:

Hi,

On 04/10/2018 09:53 AM, Oleksandr Andrushchenko wrote:

On 02/14/2018 03:50 AM, Dongwon Kim wrote:
diff --git a/drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_id.h 


[...]


+#ifndef __HYPER_DMABUF_ID_H__
+#define __HYPER_DMABUF_ID_H__
+
+#define HYPER_DMABUF_ID_CREATE(domid, cnt) \
+    ((((domid) & 0xFF) << 24) | ((cnt) & 0xFF))

I would define hyper_dmabuf_id_t.id as a union or 2 separate
fields to avoid this magic


I am not sure the union would be right here because the layout will 
differ between big and little endian.

Agree

So, will that value be passed to the other guest?

As per my understanding yes, with HYPER_DMABUF_EXPORT request


Cheers,




___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-10 Thread George Dunlap
On 04/10/2018 11:33 AM, Dario Faggioli wrote:
> On Tue, 2018-04-10 at 09:34 +, George Dunlap wrote:
>> Assuming the bug is this one:
>>
>> BUG_ON( cpu != snext->vcpu->processor );
>>
> Yes, it is that one.
> 
> Another stack trace, this time from a debug=y built hypervisor, of what
> we think is the same bug (although reproduced in a slightly
> different way) is this:
> 
> (XEN) [ Xen-4.7.2_02-36.1.12847.11.PTF  x86_64  debug=y  Not tainted ]
> (XEN) CPU:45
> (XEN) RIP:e008:[] 
> sched_credit.c#csched_schedule+0x361/0xaa9
> ...
> (XEN) Xen call trace:
> (XEN)[] sched_credit.c#csched_schedule+0x361/0xaa9
> (XEN)[] schedule.c#schedule+0x109/0x5d6
> (XEN)[] softirq.c#__do_softirq+0x7f/0x8a
> (XEN)[] do_softirq+0x13/0x15
> (XEN)[] vmx_asm_do_vmentry+0x25/0x2a
> 
> (I can provide it all, if necessary.)
> 
> I've done some analysis, from back when we were still not entirely sure
> that changing the affinities was the actual cause (or, at least, what
> is triggering the whole thing).
> 
> In the specific case of this stack trace, the current vcpu running on
> CPU 45 is d3v11. It is not in the runqueue, because it has been
> removed, and not added back to it, and the reason is it is not runnable
> (it has VPF_migrating on in pause_flags).
> 
> The runqueue of pcpu 45 looks fine (i.e., it is not corrupt or anything
> like that), it has d3v10,d9v1,d32767v45 in it (in this order)
> 
> d3v11->processor is 45, so that is also fine.
> 
> Basically, d3v11 wants to move away from pcpu 45, and this might (but
> that's not certain) be the reason we're rescheduling. The fact
> that there are vcpus wanting to migrate can very well be the cause of
> affinity being changed.
> 
> Now, the problem is that, looking into the runqueue, I found out that
> d3v10->processor=32. I.e., d3v10 is queued in pcpu 45's runqueue, with
> processor=32, which really shouldn't happen.
> 
> This leads to the bug triggering, as, in csched_schedule(), we read the
> head of the runqueue with:
> 
> snext = __runq_elem(runq->next);
> 
> and then we pass snext to csched_load_balance(), where the BUG_ON is.
> 
> Another thing that I've found out, is that all "misplaced" vcpus (i.e.,
> in this and also in other manifestations of this bug) have their
> csched_vcpu.flags=4, which is CSCHED_FLAGS_VCPU_MIGRATING.
> 
> This, basically, is again a sign of vcpu_migrate() having been called,
> on d3v10 as well, which in turn has called csched_vcpu_pick().
> 
>> a nasty race condition… a vcpu has just been taken off the runqueue
>> of the current pcpu, but it’s apparently been assigned to a different
>> cpu.
>>
> Nasty indeed. I've been looking into this on and off, but so far I
> haven't found the root cause.
> 
> Now that we know for sure that it is changing affinity that triggers it,
> the field of the investigation can be narrowed a little bit... But I
> still find it hard to spot where the race happens.
> 
> I'll look more into this later in the afternoon. I'll let you know if
> something comes to mind.

Actually, it looks quite simple:  schedule.c:vcpu_move_locked() is
supposed to actually do the moving; if vcpu_scheduler()->migrate is
defined, it calls that; otherwise, it just sets v->processor.  Credit1
doesn't define migrate.  So when changing the vcpu affinity on credit1,
v->processor is simply modified without it changing runqueues.

The real question is why it's so hard to actually trigger any problems!

All in all it looks like the migration / cpu_pick could be made a bit
more rational... we do this weird thing where we call cpu_pick, and if
it's different we call migrate; but of course if the vcpu is running, we
just set the VPF_migrating bit and raise a schedule_softirq, which will
cause cpu_pick() to be called yet another time.

But as a quick fix, implementing csched_vcpu_migrate() is probably the
best solution.  Do you want to pick that up, or should I?

 -George
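As an illustration of the quick fix suggested above, a migrate hook could dequeue the vcpu from its old CPU's runqueue before rewriting v->processor, so the "queued on CPU A with processor == B" state can never be observed. This is a simplified sketch, not the actual Xen patch; the sk_* names and the boolean queued flag are invented, and it assumes the scheduler can tell whether the vcpu is still queued:

```c
#include <assert.h>

/* Illustrative model: a vcpu with its assigned CPU and a flag saying
 * whether it is currently linked on that CPU's runqueue. */
struct sk_vcpu { int processor; int queued; };

static void sk_runq_remove(struct sk_vcpu *v)
{
    v->queued = 0;
}

/* A migrate-style hook: drop the vcpu from the old runqueue first,
 * and only then repoint v->processor at the new CPU. */
static void sketch_vcpu_migrate(struct sk_vcpu *v, int new_cpu)
{
    if (v->processor == new_cpu)
        return;
    if (v->queued)
        sk_runq_remove(v);   /* never leave it queued on the old CPU */
    v->processor = new_cpu;
}
```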

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [RFC, v2, 1/9] hyper_dmabuf: initial upload of hyper_dmabuf drv core framework

2018-04-10 Thread Julien Grall

Hi,

On 04/10/2018 09:53 AM, Oleksandr Andrushchenko wrote:

On 02/14/2018 03:50 AM, Dongwon Kim wrote:
diff --git a/drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_id.h 


[...]


+#ifndef __HYPER_DMABUF_ID_H__
+#define __HYPER_DMABUF_ID_H__
+
+#define HYPER_DMABUF_ID_CREATE(domid, cnt) \
+    ((((domid) & 0xFF) << 24) | ((cnt) & 0xFF))

I would define hyper_dmabuf_id_t.id as a union or 2 separate
fields to avoid this magic


I am not sure the union would be right here because the layout will 
differ between big and little endian. So, will that value be passed 
to the other guest?


Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
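As an aside on the union question above: shift/mask packing of the kind HYPER_DMABUF_ID_CREATE() does is byte-order independent, whereas overlaying a union of bytes is not. A hedged sketch (the 8-bit field widths are assumed from the 0xFF masks shown in the patch and may differ in the real driver):

```c
#include <assert.h>
#include <stdint.h>

/* Shift/mask packing: the same (domid, cnt) pair always yields the
 * same 32-bit value on any host, regardless of endianness. */
static uint32_t id_create(uint32_t domid, uint32_t cnt)
{
    return ((domid & 0xFF) << 24) | (cnt & 0xFF);
}

/* Matching unpack helpers, again endian-independent. */
static uint32_t id_domid(uint32_t id) { return (id >> 24) & 0xFF; }
static uint32_t id_cnt(uint32_t id)   { return id & 0xFF; }
```

A union of a uint32_t with per-field bytes would lay the fields out differently on big- and little-endian hosts, which matters once the value crosses a guest boundary.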

Re: [Xen-devel] [PATCH v3 1/7] introduce time managment in xtf

2018-04-10 Thread Paul Semel

On 04/10/2018 12:36 PM, Roger Pau Monné wrote:

this file is introduced to be able to implement an inter-domain
communication protocol over xenstore. For synchronization purposes, we
really want to be able to "control" time

common/time.c: since_boot_time gets the time in nanoseconds from the
moment the VM has booted

Signed-off-by: Paul Semel 
---


This seems to be missing a list of changes between v2 and v3. Please
add such a list when posting new versions.


+uint64_t since_boot_time(void)
+{
+uint64_t tsc;
+uint32_t ver1, ver2;
+uint64_t system_time;
+uint64_t old_tsc;
+
+do
+{
+do
+{
+ver1 = ACCESS_ONCE(shared_info.vcpu_info[0].time.version);
+smp_rmb();
+} while ( (ver1 & 1) == 1 );
+
+system_time = ACCESS_ONCE(shared_info.vcpu_info[0].time.system_time);
+old_tsc = ACCESS_ONCE(shared_info.vcpu_info[0].time.tsc_timestamp);
+smp_rmb();
+ver2 = ACCESS_ONCE(shared_info.vcpu_info[0].time.version);
+smp_rmb();
+} while ( ver1 != ver2 );


This is still overly complicated IMO, and you have not replied to my
question of whether doing the scale_delta below is OK.


About this scale_delta, we discussed with Andrew, and we are going to use
another version of the function as far as I remember. That's why I am not
taking care of it for the moment.


You should send that version then :).
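For context, the scale_delta() being debated conventionally converts a TSC delta to nanoseconds using the 32.32 fixed-point tsc_to_system_mul after applying tsc_shift, as in pvclock-style code. A sketch along those lines (not necessarily the variant being prepared with Andrew):

```c
#include <assert.h>
#include <stdint.h>

/* Convert a TSC delta to nanoseconds: apply the shift first, then
 * take the high 64 bits of a widening multiply by the 32.32
 * fixed-point multiplier (the common pvclock convention). */
static uint64_t scale_delta(uint64_t delta, uint32_t mul_frac, int8_t shift)
{
    if (shift < 0)
        delta >>= -shift;
    else
        delta <<= shift;
    /* 64x32 -> up to 96 significant bits; keep bits [95:32]. */
    return (uint64_t)(((unsigned __int128)delta * mul_frac) >> 32);
}
```

With mul_frac = 0x80000000 (i.e. 0.5 in 32.32 fixed point) and shift = 0, a delta of 1000 ticks scales to 500 ns.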



AFAICT you _cannot_ access any of the vcpu_time_info fields without
checking for the version (in order to avoid reading inconsistent data
during an update), yet below you read tsc_to_system_mul and
tsc_shift.



I'm sorry, I am probably not getting your point here, because I am already
checking for the version. I was actually checking for the wc_version too in
the first version of those patches, but after chatting with Andrew, it
appeared that it was not necessary.


AFAICT the following should work:

do
{
   ver1 = shared_info.vcpu_info[0].time.version;
   smp_rmb();

   system_time = shared_info.vcpu_info[0].time.system_time;
   tsc_timestamp = shared_info.vcpu_info[0].time.tsc_timestamp;
   tsc_to_system_mul = shared_info.vcpu_info[0].time.tsc_to_system_mul;
   tsc_shift = shared_info.vcpu_info[0].time.tsc_shift;
   tsc = rdtsc_ordered();
   /* NB: this barrier is probably not needed if rdtsc is serializing. */
   smp_rmb();

   ver2 = ACCESS_ONCE(shared_info.vcpu_info[0].time.version);
} while ( ver2 & 1 || ver1 != ver2 );



Just a (probably dumb) question. Why aren't you doing ACCESS_ONCE on every
shared_info field access?
As far as I understand, we need to do this as much as we can to avoid having
completely broken data (or security issues). Am I missing something?


ACCESS_ONCE prevents the reordering of the reads, but here AFAICT we
don't really care about the order in which they are performed as
long as they are all done before the read barrier (smp_rmb).

Note that I used ACCESS_ONCE for the last access to the version field.
I've done that to prevent the compiler from optimizing the code as:

} while ( shared_info.vcpu_info[0].time.version & 1 ||
ver1 != shared_info.vcpu_info[0].time.version );

Which would be incorrect, since we want to use the same version data
for both checks in the while loop condition.



Okay, I really thought that it was also used to ensure that the accesses are
not split into multiple instructions (for optimization, because of the
loop), and thus put us in trouble if the shared memory was modified in
between.


If the memory is modified in between either (ver2 & 1) == 1 or ver1 !=
ver2 because that's the protocol between the hypervisor and the guest
in order to update vpcu_time_info, so we will discard the read data
and start the loop again.


Oh okay, I get it now, sorry for missing this!

Thanks,

--
Paul

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
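The retry protocol discussed in this thread (an odd version means an update is in flight; unequal before/after versions mean an update happened in between) can be modelled in plain C. The writer below is hypothetical and everything runs single-threaded, purely to show why both checks in the while condition are needed:

```c
#include <assert.h>
#include <stdint.h>

/* Cut-down stand-in for vcpu_time_info (illustrative only). */
struct pv_time { uint32_t version; uint64_t system_time; };

/* Writer side: version is made odd for the duration of the update,
 * then bumped back to even (and != the pre-update value). */
static void time_update(struct pv_time *t, uint64_t st)
{
    t->version++;        /* now odd: readers must retry */
    t->system_time = st;
    t->version++;        /* even again */
}

/* Reader side: retry while an update is in flight (ver2 odd) or one
 * completed in between the two version reads (ver1 != ver2). */
static uint64_t time_read(const struct pv_time *t)
{
    uint32_t v1, v2;
    uint64_t st;
    do {
        v1 = t->version;
        st = t->system_time;
        v2 = t->version;
    } while ( (v2 & 1) || v1 != v2 );
    return st;
}
```

In the real guest code the loads must also use ACCESS_ONCE plus read barriers, as in Roger's version; this model only captures the version-check logic.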

Re: [Xen-devel] [PATCH v3 1/7] introduce time managment in xtf

2018-04-10 Thread Roger Pau Monné
On Tue, Apr 10, 2018 at 12:32:23PM +0200, Paul Semel wrote:
> On 04/10/2018 12:05 PM, Roger Pau Monné wrote:
> > > > > > > this file is introduce to be able to implement an inter domain
> > > > > > > communication protocol over xenstore. For synchronization 
> > > > > > > purpose, we do
> > > > > > > really want to be able to "control" time
> > > > > > > 
> > > > > > > common/time.c: since_boot_time gets the time in nanoseconds from 
> > > > > > > the
> > > > > > > moment the VM has booted
> > > > > > > 
> > > > > > > Signed-off-by: Paul Semel 
> > > > > > > ---
> > > > > > 
> > > > > > This seems to be missing a list of changes between v2 and v3. Please
> > > > > > add such a list when posting new versions.
> > > > > > 
> > > > > > > +uint64_t since_boot_time(void)
> > > > > > > +{
> > > > > > > +uint64_t tsc;
> > > > > > > +uint32_t ver1, ver2;
> > > > > > > +uint64_t system_time;
> > > > > > > +uint64_t old_tsc;
> > > > > > > +
> > > > > > > +do
> > > > > > > +{
> > > > > > > +do
> > > > > > > +{
> > > > > > > +ver1 = 
> > > > > > > ACCESS_ONCE(shared_info.vcpu_info[0].time.version);
> > > > > > > +smp_rmb();
> > > > > > > +} while ( (ver1 & 1) == 1 );
> > > > > > > +
> > > > > > > +system_time = 
> > > > > > > ACCESS_ONCE(shared_info.vcpu_info[0].time.system_time);
> > > > > > > +old_tsc = 
> > > > > > > ACCESS_ONCE(shared_info.vcpu_info[0].time.tsc_timestamp);
> > > > > > > +smp_rmb();
> > > > > > > +ver2 = 
> > > > > > > ACCESS_ONCE(shared_info.vcpu_info[0].time.version);
> > > > > > > +smp_rmb();
> > > > > > > +} while ( ver1 != ver2 );
> > > > > > 
> > > > > > This is still overly complicated IMO, and you have not replied to my
> > > > > > question of whether doing the scale_delta below is OK.
> > > > > 
> > > > > About this scale_delta, we discussed with Andrew, and we are going to 
> > > > > use
> > > > > another version of the function as far as I remember. That's why I am 
> > > > > not
> > > > > taking care of it for the moment.
> > > > 
> > > > You should send that version then :).
> > > > 
> > > > > > 
> > > > > > AFAICT you _cannot_ access any of the vcpu_time_info fields without
> > > > > > checking for the version (in order to avoid reading inconsistent 
> > > > > > data
> > > > > > during an update), yet below you read tsc_to_system_mul and
> > > > > > tsc_shift.
> > > > > > 
> > > > > 
> > > > > I'm sorry, I am probably not getting your point here, because I am 
> > > > > already
> > > > > checking for the version. I was actually checking for the wc_version 
> > > > > too in
> > > > > the first version of those patches, but after chatting with Andrew, it
> > > > > appeared that it was not necessary.
> > > > 
> > > > AFAICT the following should work:
> > > > 
> > > > do
> > > > {
> > > >   ver1 = shared_info.vcpu_info[0].time.version;
> > > >   smp_rmb();
> > > > 
> > > >   system_time = shared_info.vcpu_info[0].time.system_time;
> > > >   tsc_timestamp = shared_info.vcpu_info[0].time.tsc_timestamp;
> > > >   tsc_to_system_mul = 
> > > > shared_info.vcpu_info[0].time.tsc_to_system_mul;
> > > >   tsc_shift = shared_info.vcpu_info[0].time.tsc_shift;
> > > >   tsc = rdtsc_ordered();
> > > >   /* NB: this barrier is probably not needed if rdtsc is 
> > > > serializing. */
> > > >   smp_rmb();
> > > > 
> > > >   ver2 = ACCESS_ONCE(shared_info.vcpu_info[0].time.version);
> > > > } while ( ver2 & 1 || ver1 != ver2 );
> > > > 
> > > 
> > > Just a (probably dumb) question. Why aren't you using ACCESS_ONCE for
> > > every shared_info field access?
> > > As far as I understand, we need to do this as much as we can to avoid
> > > having completely broken data (or security issues). Am I missing
> > > something?
> > 
> > ACCESS_ONCE prevents the reordering of the reads, but here AFAICT we
> > don't really care about the order in which they are performed as
> > long as they are all done before the read barrier (smp_rmb).
> > 
> > Note that I used ACCESS_ONCE for the last access to the version field.
> > I've done that to prevent the compiler from optimizing the code as:
> > 
> > } while ( shared_info.vcpu_info[0].time.version & 1 ||
> >ver1 != shared_info.vcpu_info[0].time.version );
> > 
> > Which would be incorrect, since we want to use the same version data
> > for both checks in the while loop condition.
> > 
> 
> Okay, I really thought that it was also used to ensure that the accesses are
> not split into multiple instructions (as an optimization because of the
> loop), and would thus put us in trouble if the shared memory was modified in
> between.

If the memory is modified in between, either (ver2 & 1) == 1 or ver1 !=
ver2, because that's the protocol between the hypervisor and the guest
for updating vcpu_time_info, so we will discard the read data
and start the loop again.
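The protocol described here (the version field is odd while the hypervisor is mid-update, and bumps again when the update is done) can be sketched in userspace C. The struct and function names below are stand-ins for illustration, not the actual XTF or Xen identifiers:

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the fields of vcpu_time_info used by the read loop. */
struct mock_time_info {
    uint32_t version;       /* odd while the writer is mid-update */
    uint64_t system_time;
    uint64_t tsc_timestamp;
};

/* Retry until we observe an even, unchanged version across the reads,
 * i.e. a consistent snapshot of the writer-published fields. */
static uint64_t read_time_snapshot(const struct mock_time_info *ti,
                                   uint64_t *tsc_stamp)
{
    uint32_t ver1, ver2;
    uint64_t st, ts;

    do {
        ver1 = ti->version;
        /* smp_rmb() would sit here on a real SMP reader */
        st = ti->system_time;
        ts = ti->tsc_timestamp;
        /* smp_rmb() again before re-checking the version */
        ver2 = ti->version;
    } while ( (ver2 & 1) || ver1 != ver2 );

    *tsc_stamp = ts;
    return st;
}
```

A writer publishing an update would increment version before touching the fields (making it odd) and increment it again afterwards, which is exactly what makes the two conditions in the while loop fire.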

Roger.


Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-10 Thread Dario Faggioli
On Tue, 2018-04-10 at 09:34 +, George Dunlap wrote:
> Assuming the bug is this one:
> 
> BUG_ON( cpu != snext->vcpu->processor );
> 
Yes, it is that one.

Another stack trace, this time from a debug=y built hypervisor, of what
we think is the same bug (although reproduced in a slightly
different way) is this:

(XEN) [ Xen-4.7.2_02-36.1.12847.11.PTF  x86_64  debug=y  Not tainted ]
(XEN) CPU:45
(XEN) RIP:e008:[] 
sched_credit.c#csched_schedule+0x361/0xaa9
...
(XEN) Xen call trace:
(XEN)[] sched_credit.c#csched_schedule+0x361/0xaa9
(XEN)[] schedule.c#schedule+0x109/0x5d6
(XEN)[] softirq.c#__do_softirq+0x7f/0x8a
(XEN)[] do_softirq+0x13/0x15
(XEN)[] vmx_asm_do_vmentry+0x25/0x2a

(I can provide it all, if necessary.)

I've done some analysis, although back then we were still not entirely
sure that changing the affinities was the actual cause (or, at least,
the trigger of the whole thing).

In the specific case of this stack trace, the current vcpu running on
CPU 45 is d3v11. It is not in the runqueue, because it has been
removed, and not added back to it, and the reason is it is not runnable
(it has VPF_migrating on in pause_flags).

The runqueue of pcpu 45 looks fine (i.e., it is not corrupt or anything
like that); it has d3v10, d9v1 and d32767v45 in it (in this order).

d3v11->processor is 45, so that is also fine.

Basically, d3v11 wants to move away from pcpu 45, and this might (but
that's not certain) be the reason why we're rescheduling. The fact
that there are vcpus wanting to migrate can very well be the cause of
affinity being changed.

Now, the problem is that, looking into the runqueue, I found out that
d3v10->processor=32. I.e., d3v10 is queued in pcpu 45's runqueue, with
processor=32, which really shouldn't happen.

This leads to the bug triggering, as, in csched_schedule(), we read the
head of the runqueue with:

snext = __runq_elem(runq->next);

and then we pass snext to csched_load_balance(), where the BUG_ON is.

Another thing that I've found out, is that all "misplaced" vcpus (i.e.,
in this and also in other manifestations of this bug) have their
csched_vcpu.flags=4, which is CSCHED_FLAGS_VCPU_MIGRATING.

This, basically, is again a sign of vcpu_migrate() having been called,
on d3v10 as well, which in turn has called csched_vcpu_pick().

> a nasty race condition… a vcpu has just been taken off the runqueue
> of the current pcpu, but it’s apparently been assigned to a different
> cpu.
> 
Nasty indeed. I've been looking into this on and off, but so far I
haven't found the root cause.

Now that we know for sure that it is changing affinity that triggers it,
the field of the investigation can be narrowed a little bit... But I
still find it hard to spot where the race happens.

I'll look more into this later in the afternoon. I'll let you know if
something comes to mind.

> Let me take a look.
> 
Thanks! :-)
Dario
-- 
<> (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Software Engineer @ SUSE https://www.suse.com/


Re: [Xen-devel] [PATCH v3 1/7] introduce time management in xtf

2018-04-10 Thread Paul Semel

On 04/10/2018 12:05 PM, Roger Pau Monné wrote:

this file is introduced to make it possible to implement an inter-domain
communication protocol over xenstore. For synchronization purposes, we
really want to be able to "control" time

common/time.c: since_boot_time gets the time in nanoseconds from the
moment the VM has booted

Signed-off-by: Paul Semel 
---


This seems to be missing a list of changes between v2 and v3. Please
add such a list when posting new versions.


+uint64_t since_boot_time(void)
+{
+uint64_t tsc;
+uint32_t ver1, ver2;
+uint64_t system_time;
+uint64_t old_tsc;
+
+do
+{
+do
+{
+ver1 = ACCESS_ONCE(shared_info.vcpu_info[0].time.version);
+smp_rmb();
+} while ( (ver1 & 1) == 1 );
+
+system_time = ACCESS_ONCE(shared_info.vcpu_info[0].time.system_time);
+old_tsc = ACCESS_ONCE(shared_info.vcpu_info[0].time.tsc_timestamp);
+smp_rmb();
+ver2 = ACCESS_ONCE(shared_info.vcpu_info[0].time.version);
+smp_rmb();
+} while ( ver1 != ver2 );


This is still overly complicated IMO, and you have not replied to my
question of whether doing the scale_delta below is OK.


About this scale_delta, we discussed with Andrew, and we are going to use
another version of the function as far as I remember. That's why I am not
taking care of it for the moment.


You should send that version then :).



AFAICT you _cannot_ access any of the vcpu_time_info fields without
checking the version (in order to avoid reading inconsistent data
during an update), yet below you read tsc_to_system_mul and
tsc_shift.



I'm sorry, I am probably not getting your point here, because I am already
checking the version. I was actually checking wc_version too in
the first version of those patches, but after chatting with Andrew, it
appeared that it was not necessary.


AFAICT the following should work:

do
{
  ver1 = shared_info.vcpu_info[0].time.version;
  smp_rmb();

  system_time = shared_info.vcpu_info[0].time.system_time;
  tsc_timestamp = shared_info.vcpu_info[0].time.tsc_timestamp;
  tsc_to_system_mul = shared_info.vcpu_info[0].time.tsc_to_system_mul;
  tsc_shift = shared_info.vcpu_info[0].time.tsc_shift;
  tsc = rdtsc_ordered();
  /* NB: this barrier is probably not needed if rdtsc is serializing. */
  smp_rmb();

  ver2 = ACCESS_ONCE(shared_info.vcpu_info[0].time.version);
} while ( ver2 & 1 || ver1 != ver2 );



Just a (probably dumb) question. Why aren't you using ACCESS_ONCE for every
shared_info field access?
As far as I understand, we need to do this as much as we can to avoid having
completely broken data (or security issues). Am I missing something?


ACCESS_ONCE prevents the reordering of the reads, but here AFAICT we
don't really care about the order in which they are performed as
long as they are all done before the read barrier (smp_rmb).

Note that I used ACCESS_ONCE for the last access to the version field.
I've done that to prevent the compiler from optimizing the code as:

} while ( shared_info.vcpu_info[0].time.version & 1 ||
   ver1 != shared_info.vcpu_info[0].time.version );

Which would be incorrect, since we want to use the same version data
for both checks in the while loop condition.



Okay, I really thought that it was also used to ensure that the accesses
are not split into multiple instructions (as an optimization because
of the loop), and would thus put us in trouble if the shared memory was
modified in between.


Thank you very much!

--
Paul


[Xen-devel] [xen-unstable-smoke test] 122157: regressions - FAIL

2018-04-10 Thread osstest service owner
flight 122157 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/122157/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-arm64-xsm   6 xen-buildfail REGR. vs. 121876
 build-armhf   6 xen-buildfail REGR. vs. 121876

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-xl   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-xsm   1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass

version targeted for testing:
 xen  aaae6290965b1434ae41e08b808bf5a59e6cf93e
baseline version:
 xen  451004603247205467ec34b366b4cfa3814a5d95

Last test of basis   121876  2018-04-05 10:04:25 Z4 days
Failing since121889  2018-04-05 13:02:10 Z4 days   44 attempts
Testing same since   122146  2018-04-09 20:01:28 Z0 days6 attempts


People who touched revisions under test:
  Amit Singh Tomar 
  Andre Przywara 
  Andre Pzywara 
  Andrew Cooper 
  Boris Ostrovsky 
  George Dunlap 
  Jan Beulich 
  Juergen Gross 
  Julien Grall 
  Kevin Tian 
  Marcello Seri 
  Marcus of Wetware Labs 
  Marek Marczykowski-Górecki 
  Petre Eftime 
  Razvan Cojocaru 
  Stefano Stabellini 
  Tim Deegan 
  Wei Liu 

jobs:
 build-arm64-xsm  fail
 build-amd64  pass
 build-armhf  fail
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  blocked 
 test-arm64-arm64-xl-xsm  blocked 
 test-amd64-amd64-xl-qemuu-debianhvm-i386 pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.

(No revision log; it would be 776 lines long.)


Re: [Xen-devel] [PATCH v3 1/7] introduce time management in xtf

2018-04-10 Thread Roger Pau Monné
On Tue, Apr 10, 2018 at 11:47:11AM +0200, Paul Semel wrote:
> On 04/10/2018 10:08 AM, Roger Pau Monné wrote:
> > > > > this file is introduced to make it possible to implement an inter-domain
> > > > > communication protocol over xenstore. For synchronization purposes, we
> > > > > really want to be able to "control" time
> > > > > 
> > > > > common/time.c: since_boot_time gets the time in nanoseconds from the
> > > > > moment the VM has booted
> > > > > 
> > > > > Signed-off-by: Paul Semel 
> > > > > ---
> > > > 
> > > > This seems to be missing a list of changes between v2 and v3. Please
> > > > add such a list when posting new versions.
> > > > 
> > > > > +uint64_t since_boot_time(void)
> > > > > +{
> > > > > +uint64_t tsc;
> > > > > +uint32_t ver1, ver2;
> > > > > +uint64_t system_time;
> > > > > +uint64_t old_tsc;
> > > > > +
> > > > > +do
> > > > > +{
> > > > > +do
> > > > > +{
> > > > > +ver1 = 
> > > > > ACCESS_ONCE(shared_info.vcpu_info[0].time.version);
> > > > > +smp_rmb();
> > > > > +} while ( (ver1 & 1) == 1 );
> > > > > +
> > > > > +system_time = 
> > > > > ACCESS_ONCE(shared_info.vcpu_info[0].time.system_time);
> > > > > +old_tsc = 
> > > > > ACCESS_ONCE(shared_info.vcpu_info[0].time.tsc_timestamp);
> > > > > +smp_rmb();
> > > > > +ver2 = ACCESS_ONCE(shared_info.vcpu_info[0].time.version);
> > > > > +smp_rmb();
> > > > > +} while ( ver1 != ver2 );
> > > > 
> > > > This is still overly complicated IMO, and you have not replied to my
> > > > question of whether doing the scale_delta below is OK.
> > > 
> > > About this scale_delta, we discussed with Andrew, and we are going to use
> > > another version of the function as far as I remember. That's why I am not
> > > taking care of it for the moment.
> > 
> > You should send that version then :).
> > 
> > > > 
> > > > AFAICT you _cannot_ access any of the vcpu_time_info fields without
> > > > checking for the version (in order to avoid reading inconsistent data
> > > > during an update), yet below you read tsc_to_system_mul and
> > > > tsc_shift.
> > > > 
> > > 
> > > I'm sorry, I am probably not getting your point here, because I am already
> > > checking for the version. I was actually checking for the wc_version too in
> > > the first version of those patches, but after chatting with Andrew, it
> > > appeared that it was not necessary.
> > 
> > AFAICT the following should work:
> > 
> > do
> > {
> >  ver1 = shared_info.vcpu_info[0].time.version;
> >  smp_rmb();
> > 
> >  system_time = shared_info.vcpu_info[0].time.system_time;
> >  tsc_timestamp = shared_info.vcpu_info[0].time.tsc_timestamp;
> >  tsc_to_system_mul = shared_info.vcpu_info[0].time.tsc_to_system_mul;
> >  tsc_shift = shared_info.vcpu_info[0].time.tsc_shift;
> >  tsc = rdtsc_ordered();
> >  /* NB: this barrier is probably not needed if rdtsc is serializing. */
> >  smp_rmb();
> > 
> >  ver2 = ACCESS_ONCE(shared_info.vcpu_info[0].time.version);
> > } while ( ver2 & 1 || ver1 != ver2 );
> > 
> 
> Just a (probably dumb) question. Why aren't you using ACCESS_ONCE for every
> shared_info field access?
> As far as I understand, we need to do this as much as we can to avoid having
> completely broken data (or security issues). Am I missing something?

ACCESS_ONCE prevents the reordering of the reads, but here AFAICT we
don't really care about the order in which they are performed as
long as they are all done before the read barrier (smp_rmb).

Note that I used ACCESS_ONCE for the last access to the version field.
I've done that to prevent the compiler from optimizing the code as:

} while ( shared_info.vcpu_info[0].time.version & 1 ||
  ver1 != shared_info.vcpu_info[0].time.version );

Which would be incorrect, since we want to use the same version data
for both checks in the while loop condition.
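The compiler-fusing hazard described here hinges on how ACCESS_ONCE is defined. Below is a minimal userspace sketch of the usual volatile-cast definition (the real one lives in the Linux/Xen headers; the helper and variable names are made up for illustration):

```c
#include <assert.h>
#include <stdint.h>

/* Classic definition: forces a single volatile load/store that the
 * compiler may neither elide, duplicate, nor fold into other accesses. */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

static uint32_t version;  /* stands in for shared_info...time.version */

/* Load the version exactly once and reuse the value, so both checks
 * in the loop condition see the same data. */
static int version_stable(uint32_t ver1)
{
    uint32_t ver2 = ACCESS_ONCE(version);

    return !(ver2 & 1) && ver1 == ver2;
}
```

Without the single forced load, the compiler could legally re-read the shared field for each check of the condition, defeating the consistency test.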

Roger.


Re: [Xen-devel] [RFC, v2, 9/9] hyper_dmabuf: threaded interrupt in Xen-backend

2018-04-10 Thread Oleksandr Andrushchenko

On 02/14/2018 03:50 AM, Dongwon Kim wrote:

Use a threaded interrupt instead of a regular one because most of the ISR
is time-critical and possibly sleeps

Signed-off-by: Dongwon Kim 
---
  .../hyper_dmabuf/backends/xen/hyper_dmabuf_xen_comm.c | 19 +++
  1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/drivers/dma-buf/hyper_dmabuf/backends/xen/hyper_dmabuf_xen_comm.c 
b/drivers/dma-buf/hyper_dmabuf/backends/xen/hyper_dmabuf_xen_comm.c
index 30bc4b6304ac..65af5ddfb2d7 100644
--- a/drivers/dma-buf/hyper_dmabuf/backends/xen/hyper_dmabuf_xen_comm.c
+++ b/drivers/dma-buf/hyper_dmabuf/backends/xen/hyper_dmabuf_xen_comm.c
@@ -332,11 +332,14 @@ int xen_be_init_tx_rbuf(int domid)
}
  
  	/* setting up interrupt */

-   ret = bind_evtchn_to_irqhandler(alloc_unbound.port,
-   front_ring_isr, 0,
-   NULL, (void *) ring_info);
+   ring_info->irq = bind_evtchn_to_irq(alloc_unbound.port);
  
-	if (ret < 0) {

+   ret = request_threaded_irq(ring_info->irq,
+  NULL,
+  front_ring_isr,
+  IRQF_ONESHOT, NULL, ring_info);
+

Why don't you go with a threaded IRQ from the beginning instead of
changing it in patch #9?

+   if (ret != 0) {
dev_err(hy_drv_priv->dev,
"Failed to setup event channel\n");
close.port = alloc_unbound.port;
@@ -348,7 +351,6 @@ int xen_be_init_tx_rbuf(int domid)
}
  
  	ring_info->rdomain = domid;

-   ring_info->irq = ret;
ring_info->port = alloc_unbound.port;
  
	mutex_init(&ring_info->lock);

@@ -535,9 +537,10 @@ int xen_be_init_rx_rbuf(int domid)
if (!xen_comm_find_tx_ring(domid))
ret = xen_be_init_tx_rbuf(domid);
  
-	ret = request_irq(ring_info->irq,

- back_ring_isr, 0,
- NULL, (void *)ring_info);
+   ret = request_threaded_irq(ring_info->irq,
+  NULL,
+  back_ring_isr, IRQF_ONESHOT,
+  NULL, (void *)ring_info);
  

Ditto

return ret;
  







Re: [Xen-devel] [RFC, v2, 4/9] hyper_dmabuf: user private data attached to hyper_DMABUF

2018-04-10 Thread Oleksandr Andrushchenko

On 02/14/2018 03:50 AM, Dongwon Kim wrote:

Define private data (e.g. metadata for the buffer) attached to
each hyper_DMABUF structure. This data is provided by userspace via
the export_remote IOCTL and its size can be up to 192 bytes.

Signed-off-by: Dongwon Kim 
Signed-off-by: Mateusz Polrola 
---
  drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_ioctl.c  | 83 --
  drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_msg.c| 36 +-
  drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_msg.h|  2 +-
  .../dma-buf/hyper_dmabuf/hyper_dmabuf_sgl_proc.c   |  1 +
  drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_struct.h | 12 
  include/uapi/linux/hyper_dmabuf.h  |  4 ++
  6 files changed, 132 insertions(+), 6 deletions(-)

diff --git a/drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_ioctl.c 
b/drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_ioctl.c
index 020a5590a254..168ccf98f710 100644
--- a/drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_ioctl.c
+++ b/drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_ioctl.c
@@ -103,6 +103,11 @@ static int send_export_msg(struct exported_sgt_info 
*exported,
}
}
  
+	op[8] = exported->sz_priv;

+
+   /* driver/application specific private info */
+   memcpy(&op[9], exported->priv, op[8]);
+
req = kcalloc(1, sizeof(*req), GFP_KERNEL);
  
  	if (!req)

@@ -120,8 +125,9 @@ static int send_export_msg(struct exported_sgt_info 
*exported,
  
  /* Fast path exporting routine in case same buffer is already exported.

   *
- * If same buffer is still valid and exist in EXPORT LIST it returns 0 so
- * that remaining normal export process can be skipped.
+ * If same buffer is still valid and exist in EXPORT LIST, it only updates
+ * user-private data for the buffer and returns 0 so that it can skip
+ * normal export process.
   *
   * If "unexport" is scheduled for the buffer, it cancels it since the buffer
   * is being re-exported.
@@ -129,7 +135,7 @@ static int send_export_msg(struct exported_sgt_info 
*exported,
   * return '1' if reexport is needed, return '0' if succeeds, return
   * Kernel error code if something goes wrong
   */
-static int fastpath_export(hyper_dmabuf_id_t hid)
+static int fastpath_export(hyper_dmabuf_id_t hid, int sz_priv, char *priv)
  {
int reexport = 1;
int ret = 0;
@@ -155,6 +161,46 @@ static int fastpath_export(hyper_dmabuf_id_t hid)
exported->unexport_sched = false;
}
  
+	/* if there's any change in size of private data.

+* we reallocate space for private data with new size
+*/
+   if (sz_priv != exported->sz_priv) {
+   kfree(exported->priv);
+
+   /* truncating size */
+   if (sz_priv > MAX_SIZE_PRIV_DATA)
+   exported->sz_priv = MAX_SIZE_PRIV_DATA;
+   else
+   exported->sz_priv = sz_priv;
+
+   exported->priv = kcalloc(1, exported->sz_priv,
+GFP_KERNEL);
+
+   if (!exported->priv) {
+   hyper_dmabuf_remove_exported(exported->hid);
+   hyper_dmabuf_cleanup_sgt_info(exported, true);
+   kfree(exported);
+   return -ENOMEM;
+   }
+   }
+
+   /* update private data in sgt_info with new ones */
+   ret = copy_from_user(exported->priv, priv, exported->sz_priv);
+   if (ret) {
+   dev_err(hy_drv_priv->dev,
+   "Failed to load a new private data\n");
+   ret = -EINVAL;
+   } else {
+   /* send an export msg for updating priv in importer */
+   ret = send_export_msg(exported, NULL);
+
+   if (ret < 0) {
+   dev_err(hy_drv_priv->dev,
+   "Failed to send a new private data\n");
+   ret = -EBUSY;
+   }
+   }
+
return ret;
  }
  
@@ -191,7 +237,8 @@ static int hyper_dmabuf_export_remote_ioctl(struct file *filp, void *data)

 export_remote_attr->remote_domain);
  
  	if (hid.id != -1) {

-   ret = fastpath_export(hid);
+   ret = fastpath_export(hid, export_remote_attr->sz_priv,
+ export_remote_attr->priv);
  
  		/* return if fastpath_export succeeds or

 * gets some fatal error
@@ -225,6 +272,24 @@ static int hyper_dmabuf_export_remote_ioctl(struct file 
*filp, void *data)
goto fail_sgt_info_creation;
}
  
+	/* possible truncation */

+   if (export_remote_attr->sz_priv > MAX_SIZE_PRIV_DATA)
+   exported->sz_priv = MAX_SIZE_PRIV_DATA;
+   else
+   exported->sz_priv = export_remote_attr->sz_priv;
+
+   /* creating buffer for private data of buffer */
+   if (exported->sz_priv != 0) {
+   
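The truncation rule in this patch (user-supplied private data capped at MAX_SIZE_PRIV_DATA, stated as 192 bytes in the commit message) can be sketched in userspace; the helper names below are made up, and malloc/calloc stand in for the kernel allocators:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_SIZE_PRIV_DATA 192  /* limit from the commit message */

/* Cap a caller-supplied size, mirroring the "possible truncation"
 * branches in fastpath_export/export_remote. */
static size_t clamp_priv_size(size_t sz)
{
    return sz > MAX_SIZE_PRIV_DATA ? MAX_SIZE_PRIV_DATA : sz;
}

/* Allocate a private-data buffer of the clamped size and copy into it. */
static void *store_priv(const void *src, size_t sz, size_t *out_sz)
{
    size_t n = clamp_priv_size(sz);
    void *dst = calloc(1, n ? n : 1);

    if (dst != NULL)
    {
        memcpy(dst, src, n);
        *out_sz = n;
    }
    return dst;
}
```

Oversized input is silently cut to 192 bytes, which is the behaviour the review comments discuss; whether silent truncation (versus an -EINVAL) is the right policy is one of the open questions.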

Re: [Xen-devel] [RFC, v2, 2/9] hyper_dmabuf: architecture specification and reference guide

2018-04-10 Thread Oleksandr Andrushchenko

Sorry for top-posting.

Can we have all this go into some header file which will not only describe
the structures/commands/responses/etc, but will also allow drivers to use
those directly without defining the same one more time in the code? For
example, this is how it is done in Xen [1]. This way, you can keep the
documentation and the protocol implementation in sync easily.


On 02/14/2018 03:50 AM, Dongwon Kim wrote:

Reference document for hyper_DMABUF driver

Documentation/hyper-dmabuf-sharing.txt

Signed-off-by: Dongwon Kim 
---
  Documentation/hyper-dmabuf-sharing.txt | 734 +
  1 file changed, 734 insertions(+)
  create mode 100644 Documentation/hyper-dmabuf-sharing.txt

diff --git a/Documentation/hyper-dmabuf-sharing.txt 
b/Documentation/hyper-dmabuf-sharing.txt
new file mode 100644
index ..928e411931e3
--- /dev/null
+++ b/Documentation/hyper-dmabuf-sharing.txt
@@ -0,0 +1,734 @@
+Linux Hyper DMABUF Driver
+
+--
+Section 1. Overview
+--
+
+Hyper_DMABUF driver is a Linux device driver running on multiple Virtual
+Machines (VMs), which expands DMA-BUF sharing capability to the VM environment
+where multiple different OS instances need to share the same physical data without
+data-copy across VMs.
+
+To share a DMA_BUF across VMs, an instance of the Hyper_DMABUF drv on the
+exporting VM (so called, “exporter”) imports a local DMA_BUF from the original
+producer of the buffer, then re-exports it with an unique ID, hyper_dmabuf_id
+for the buffer to the importing VM (so called, “importer”).
+
+Another instance of the Hyper_DMABUF driver on importer registers
+a hyper_dmabuf_id together with reference information for the shared physical
+pages associated with the DMA_BUF to its database when the export happens.
+
+The actual mapping of the DMA_BUF on the importer’s side is done by
+the Hyper_DMABUF driver when user space issues the IOCTL command to access
+the shared DMA_BUF. The Hyper_DMABUF driver works as both an importing and
+exporting driver as is, that is, no special configuration is required.
+Consequently, only a single module per VM is needed to enable cross-VM DMA_BUF
+exchange.
+
+--
+Section 2. Architecture
+--
+
+1. Hyper_DMABUF ID
+
+hyper_dmabuf_id is a global handle for shared DMA BUFs, which is compatible
+across VMs. It is a key used by the importer to retrieve information about
+shared Kernel pages behind the DMA_BUF structure from the IMPORT list. When
+a DMA_BUF is exported to another domain, its hyper_dmabuf_id and META data
+are also kept in the EXPORT list by the exporter for further synchronization
+of control over the DMA_BUF.
+
+hyper_dmabuf_id is “targeted”, meaning it is valid only in exporting (owner of
+the buffer) and importing VMs, where the corresponding hyper_dmabuf_id is
+stored in their database (EXPORT and IMPORT lists).
+
+A user-space application specifies the targeted VM id in the user parameter
+when it calls the IOCTL command to export shared DMA_BUF to another VM.
+
+hyper_dmabuf_id_t is a data type for hyper_dmabuf_id. It is defined as 16-byte
+data structure, and it contains id and rng_key[3] as elements for
+the structure.
+
+typedef struct {
+int id;
+int rng_key[3]; /* 12bytes long random number */
+} hyper_dmabuf_id_t;
+
+The first element in the hyper_dmabuf_id structure, int id is combined data of
+a count number generated by the driver running on the exporter and
+the exporter’s ID. The VM’s ID is a one-byte value located at the field’s
+MSB in int id. The remaining three bytes in int id are reserved for a count
+number.
+
+However, there is a limit related to this count number, which is 1000.
+Therefore, only a little more than a byte starting from the LSB is actually used
+for storing this count number.
+
+#define HYPER_DMABUF_ID_CREATE(domid, id) \
+((((domid) & 0xFF) << 24) | ((id) & 0xFFFFFF))
+
+This limit on the count number directly means the maximum number of DMA BUFs
+that can be shared simultaneously by one VM. The second element of
+hyper_dmabuf_id, that is int rng_key[3], is an array of three integers. These
+numbers are generated by Linux’s native random number generation mechanism.
+This field is added to enhance the security of the Hyper DMABUF driver by
+maximizing the entropy of hyper_dmabuf_id (that is, preventing it from being
+guessed by a security attacker).
+
+Once DMA_BUF is no longer shared, the hyper_dmabuf_id associated with
+the DMA_BUF is released, but the count number in hyper_dmabuf_id is saved in
+the ID list for reuse. However, random keys stored in int rng_key[3] are not
+reused. Instead, those keys are always filled with freshly 

Re: [Xen-devel] [PATCH v6 3/9] xen/x86: support per-domain flag for xpti

2018-04-10 Thread Juergen Gross
On 10/04/18 11:36, Jan Beulich wrote:
 On 10.04.18 at 11:32,  wrote:
>> On 10/04/18 11:14, Jan Beulich wrote:
>> On 10.04.18 at 09:58,  wrote:
 --- a/docs/misc/xen-command-line.markdown
 +++ b/docs/misc/xen-command-line.markdown
 @@ -1955,14 +1955,29 @@ clustered mode.  The default, given no hint from 
 the **FADT**, is cluster
  mode.
  
  ### xpti
 -> `= `
 +> `= List of [ default |  | dom0= | domu= ]`
  
 -> Default: `false` on AMD hardware
 +> Default: `false` on hardware not vulnerable to Meltdown (e.g. AMD)
  > Default: `true` everywhere else
  
  Override default selection of whether to isolate 64-bit PV guest page
  tables.
  
 +`true` activates page table isolation even on hardware not vulnerable by
 +Meltdown for all domains.
 +
 +`false` deactivates page table isolation on all systems for all domains.
 +
 +`default` sets the default behaviour.
 +
 +`dom0=false` deactivates page table isolation for dom0.
 +
 +`dom0=true` activates page table isolation for dom0.
 +
 +`domu=false` deactivates page table isolation for guest domains.
 +
 +`domu=true` activates page table isolation for guest domains.
>>>
>>> This is too verbose / repetitive for my taste.
>>
>> So you'd like it better as:
>>
>> "With `dom0` and `domu` it is possible to control page table isolation
>> for dom0 or guest domains only." ?
> 
> Yes.
> 
 @@ -205,6 +208,10 @@ int pv_domain_initialise(struct domain *d)
  /* 64-bit PV guest by default. */
  d->arch.is_32bit_pv = d->arch.has_32bit_shinfo = 0;
  
 +d->arch.pv_domain.xpti = (d->domain_id == hardware_domid)
 + ? (opt_xpti & XPTI_DOM0)
 + : (opt_xpti & XPTI_DOMU);
>>>
>>> I would generally prefer to have as little redundancy as possible in
>>> such expressions, i.e.
>>>
>>> d->arch.pv_domain.xpti = opt_xpti & (d->domain_id == hardware_domid
>>>  ? XPTI_DOM0 : XPTI_DOMU);
>>
>> Okay.
>>
>>>
>>> Furthermore - shouldn't this cover domain 0 as well as the hardware
>>> domain, even if - in case they are different - domain 0 should be
>>> short lived?
>>
>> When domain 0 is created is _is_ the hardware domain. Only domain 0
>> creating a hardware domain will set hardware_domid to a non-zero value.
> 
> hardware_domid is set by an integer_param() afaics, so would be
> set long before creation of domain 0.

Hmm, seems I shouldn't have trusted Andrew to tell me the truth here ;-)

Using is_hardware_domain(d) for the test is the better choice here, as
hardware_domain is first set to ->dom0 and later to ->hwdom.
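For illustration, Jan's suggested form of the per-domain selection can be sketched as below, with is_hardware_domain() replaced by a plain flag and assumed bit values for XPTI_DOM0/XPTI_DOMU (the real values live in the patch, not here):

```c
#include <assert.h>
#include <stdbool.h>

#define XPTI_DOM0 (1u << 0)   /* assumed bit layout */
#define XPTI_DOMU (1u << 1)

/* Derive the per-domain XPTI setting from the parsed command-line
 * option, selecting the dom0 or domU bit as appropriate. */
static bool domain_wants_xpti(unsigned opt_xpti, bool is_hwdom)
{
    return opt_xpti & (is_hwdom ? XPTI_DOM0 : XPTI_DOMU);
}
```

This keeps the redundancy out of the expression: the option word is masked once, and only the bit selector depends on the domain type.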


Juergen


Re: [Xen-devel] [PATCH v3 1/7] introduce time management in xtf

2018-04-10 Thread Paul Semel

On 04/10/2018 10:08 AM, Roger Pau Monné wrote:

this file is introduced to make it possible to implement an inter-domain
communication protocol over xenstore. For synchronization purposes, we
really want to be able to "control" time

common/time.c: since_boot_time gets the time in nanoseconds from the
moment the VM has booted

Signed-off-by: Paul Semel 
---


This seems to be missing a list of changes between v2 and v3. Please
add such a list when posting new versions.


+uint64_t since_boot_time(void)
+{
+uint64_t tsc;
+uint32_t ver1, ver2;
+uint64_t system_time;
+uint64_t old_tsc;
+
+do
+{
+do
+{
+ver1 = ACCESS_ONCE(shared_info.vcpu_info[0].time.version);
+smp_rmb();
+} while ( (ver1 & 1) == 1 );
+
+system_time = ACCESS_ONCE(shared_info.vcpu_info[0].time.system_time);
+old_tsc = ACCESS_ONCE(shared_info.vcpu_info[0].time.tsc_timestamp);
+smp_rmb();
+ver2 = ACCESS_ONCE(shared_info.vcpu_info[0].time.version);
+smp_rmb();
+} while ( ver1 != ver2 );


This is still overly complicated IMO, and you have not replied to my
question of whether doing the scale_delta below is OK.


About this scale_delta, we discussed with Andrew, and we are going to use
another version of the function as far as I remember. That's why I am not
taking care of it for the moment.


You should send that version then :).



AFAICT you _cannot_ access any of the vcpu_time_info fields without
checking the version (in order to avoid reading inconsistent data
during an update), yet below you read tsc_to_system_mul and
tsc_shift.



I'm sorry, I am probably not getting your point here, because I am already
checking for the version. I was actually checking for the wc_version too in
the first version of those patches, but after chatting with Andrew, it
appeared that it was not necessary.


AFAICT the following should work:

do
{
    ver1 = shared_info.vcpu_info[0].time.version;
    smp_rmb();

    system_time = shared_info.vcpu_info[0].time.system_time;
    tsc_timestamp = shared_info.vcpu_info[0].time.tsc_timestamp;
    tsc_to_system_mul = shared_info.vcpu_info[0].time.tsc_to_system_mul;
    tsc_shift = shared_info.vcpu_info[0].time.tsc_shift;
    tsc = rdtsc_ordered();
    /* NB: this barrier is probably not needed if rdtsc is serializing. */
    smp_rmb();

    ver2 = ACCESS_ONCE(shared_info.vcpu_info[0].time.version);
} while ( ver2 & 1 || ver1 != ver2 );



Just a (probably dumb) question. Why aren't you doing ACCESS_ONCE on
every shared_info field access?
As far as I understand, we need to do this as much as we can to avoid
having completely broken data (or security issues). Am I missing something?



system_time += scale_delta(tsc - tsc_timestamp, tsc_to_system_mul,
                           tsc_shift);

I'm not sure the second barrier is actually needed, since
rdtsc_ordered should be serializing.


I've already pointed out the code at:

https://github.com/freebsd/freebsd/blob/master/sys/x86/x86/pvclock.c#L141

As a simpler reference implementation.
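For readers following the thread, the scaling step at the heart of both implementations can be sketched as a self-contained function. This is an editor's illustration of the usual pvclock arithmetic, not XTF's eventual helper; the 128-bit multiply stands in for the mul64 helpers real code tends to use, and the parameter names follow the vcpu_time_info fields quoted above.

```c
#include <stdint.h>

/*
 * Sketch of the pvclock scaling step: system time advances by
 * (tsc_delta * tsc_to_system_mul) >> 32, with tsc_delta first
 * shifted by tsc_shift.
 */
static uint64_t scale_delta(uint64_t delta, uint32_t mul_frac, int8_t shift)
{
    if ( shift < 0 )
        delta >>= -shift;
    else
        delta <<= shift;

    /* 64x32 -> 96 bit multiply; keep bits 95:32 of the product. */
    return (uint64_t)(((unsigned __int128)delta * mul_frac) >> 32);
}
```

As a sanity check: with mul_frac = 0x80000000 (0.5 in 32.32 fixed point) and shift = 1, a delta of 1000 cycles maps back to exactly 1000.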


+
+    rdtsc(tsc);
+
+    system_time += scale_delta(tsc - old_tsc,
+                               ACCESS_ONCE(shared_info.vcpu_info[0].time.tsc_to_system_mul),
+                               ACCESS_ONCE(shared_info.vcpu_info[0].time.tsc_shift));
+
+    return system_time;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/include/xtf/time.h b/include/xtf/time.h
new file mode 100644
index 000..b88da63
--- /dev/null
+++ b/include/xtf/time.h
@@ -0,0 +1,31 @@
+/**
+ * @file include/xtf/time.h
+ *
+ * Time management
+ */
+#ifndef XTF_TIME_H
+# define XTF_TIME_H
+
+#include 
+
+#define rdtsc(tsc) {\
+    uint32_t lo, hi;\
+    __asm__ volatile("rdtsc": "=a"(lo), "=d"(hi));\


Please make sure you only send a new version after having fixed all
the comments, this is still missing the serialization requirements
mentioned in the review, and it's also the wrong file to place this
helper:



I am sorry, I was really convinced that this version didn't need revision
anymore (and I still don't see what I should change).


rdtsc is not a serializing instruction, and as such there's no
guarantee it's not executed before or after any of its preceding or
following instructions. In order to make it serializing you need to
add an lfence or mfence, or if you are not sure about the architecture
you likely need to add both, see:

https://marc.info/?l=xen-devel&m=151983511212795&w=2
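A sketch of what such a serializing helper might look like on x86-64. This is an illustration under the assumptions just described, not the actual XTF code; whether lfence alone suffices depends on the CPU vendor, hence the belt-and-braces mfence.

```c
#include <stdint.h>

/*
 * rdtsc with ordering: the fences keep the TSC read from being
 * reordered with surrounding memory accesses. lfence serializes
 * rdtsc on Intel (and on AMD with the relevant MSR bit set);
 * mfence covers the remaining AMD cases.
 */
static inline uint64_t rdtsc_ordered(void)
{
    uint32_t lo, hi;

    __asm__ volatile ( "mfence; lfence; rdtsc"
                       : "=a" (lo), "=d" (hi) :: "memory" );

    return ((uint64_t)hi << 32) | lo;
}
```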

Also, as said in the previous review, the rdtsc helper should be
placed in a x86 specific file because it's a x86 specific instruction.
IMO it should be placed in arch/x86/include/arch/lib.h with the rest
of the x86 specific instructions.



Thanks for clarifying this point! I will put the function in the
correct place, and add an implementation of rdtsc_ordered with
serialization.



Hope 

Re: [Xen-devel] [PATCH v6 3/9] xen/x86: support per-domain flag for xpti

2018-04-10 Thread Jan Beulich
>>> On 10.04.18 at 11:32,  wrote:
> On 10/04/18 11:14, Jan Beulich wrote:
> On 10.04.18 at 09:58,  wrote:
>>> --- a/docs/misc/xen-command-line.markdown
>>> +++ b/docs/misc/xen-command-line.markdown
>>> @@ -1955,14 +1955,29 @@ clustered mode.  The default, given no hint from 
>>> the **FADT**, is cluster
>>>  mode.
>>>  
>>>  ### xpti
>>> -> `= `
>>> +> `= List of [ default |  | dom0= | domu= ]`
>>>  
>>> -> Default: `false` on AMD hardware
>>> +> Default: `false` on hardware not vulnerable to Meltdown (e.g. AMD)
>>>  > Default: `true` everywhere else
>>>  
>>>  Override default selection of whether to isolate 64-bit PV guest page
>>>  tables.
>>>  
>>> +`true` activates page table isolation even on hardware not vulnerable to
>>> +Meltdown for all domains.
>>> +
>>> +`false` deactivates page table isolation on all systems for all domains.
>>> +
>>> +`default` sets the default behaviour.
>>> +
>>> +`dom0=false` deactivates page table isolation for dom0.
>>> +
>>> +`dom0=true` activates page table isolation for dom0.
>>> +
>>> +`domu=false` deactivates page table isolation for guest domains.
>>> +
>>> +`domu=true` activates page table isolation for guest domains.
>> 
>> This is too verbose / repetitive for my taste.
> 
> So you'd like it better as:
> 
> "With `dom0` and `domu` it is possible to control page table isolation
> for dom0 or guest domains only." ?

Yes.

>>> @@ -205,6 +208,10 @@ int pv_domain_initialise(struct domain *d)
>>>  /* 64-bit PV guest by default. */
>>>  d->arch.is_32bit_pv = d->arch.has_32bit_shinfo = 0;
>>>  
>>> +d->arch.pv_domain.xpti = (d->domain_id == hardware_domid)
>>> + ? (opt_xpti & XPTI_DOM0)
>>> + : (opt_xpti & XPTI_DOMU);
>> 
>> I would generally prefer to have as little redundancy as possible in
>> such expressions, i.e.
>> 
>> d->arch.pv_domain.xpti = opt_xpti & (d->domain_id == hardware_domid
>>  ? XPTI_DOM0 : XPTI_DOMU);
> 
> Okay.
> 
>> 
>> Furthermore - shouldn't this cover domain 0 as well as the hardware
>> domain, even if - in case they are different - domain 0 should be
>> short lived?
> 
> When domain 0 is created it _is_ the hardware domain. Only domain 0
> creating a hardware domain will set hardware_domid to a non-zero value.

hardware_domid is set by an integer_param() afaics, so would be
set long before creation of domain 0.

Jan



Re: [Xen-devel] [PATCH v2 1/2] x86/vpt: execute callbacks for masked interrupts

2018-04-10 Thread Jan Beulich
>>> On 10.04.18 at 10:53,  wrote:
> On Mon, Apr 09, 2018 at 09:34:57AM -0600, Jan Beulich wrote:
>> >>> On 30.03.18 at 14:35,  wrote:
>> > Execute periodic_time callbacks even if the interrupt is not actually
>> > injected because the IRQ is masked.
>> > 
>> > Current callbacks from emulated timer devices only update emulated
>> > registers, which from my reading of the specs should happen regardless
>> > of whether the interrupt has been injected or not.
>> 
>> While generally I agree, it also means extra work done. Looking
>> at the PIT case, for example, there's no strict need to do the
>> update when the IRQ is masked, as the value being updated is
>> only used to subtract from get_guest_time()'s return value.
>> Similarly for the LAPIC case.
>> 
>> In the RTC case your change actually looks risky, due to the
>> pt_dead_ticks logic. I can't help getting the impression that the
>> IRQ being off for 10 ticks would lead to no RTC interrupts at all
>> anymore for the guest (until something resets that counter),
>> which seems wrong to me.
> 
> Hm, right. The RTC is already handled specially in order to not
> disable the timer but also don't call the handler if the IRQ is
> masked.
> 
> Maybe the right solution is to add some flags to the vpt code,
> something like:
> 
>  - DISABLE_ON_MASKED: only valid for periodic interrupts. Destroy the
>timer if the IRQ is masked when the timer fires.
>  - SKIP_CALLBACK_ON_MASKED: do not execute the timer callback if the
>IRQ is masked when the timer fires.
> 
> That AFAICT should allow Xen to keep the previous behaviour for
> existing timer code (and remove the RTC special casing).
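A rough sketch of how such flags could drive the decision when a timer fires. The flag names are the ones proposed above; the numeric values and the dispatch helper are made up purely for illustration.

```c
#include <stdbool.h>

/* Hypothetical flag encoding for the proposal above. */
#define PT_DISABLE_ON_MASKED        0x1  /* periodic timers only */
#define PT_SKIP_CALLBACK_ON_MASKED  0x2

enum pt_action { PT_RUN_CALLBACK, PT_SKIP_CALLBACK, PT_DESTROY_TIMER };

/* Decide what to do when a periodic_time fires. */
static enum pt_action pt_fire_action(unsigned int flags, bool irq_masked)
{
    if ( !irq_masked )
        return PT_RUN_CALLBACK;
    if ( flags & PT_DISABLE_ON_MASKED )
        return PT_DESTROY_TIMER;
    if ( flags & PT_SKIP_CALLBACK_ON_MASKED )
        return PT_SKIP_CALLBACK;
    /* New default behaviour: run the callback even if masked. */
    return PT_RUN_CALLBACK;
}
```

Under this scheme the RTC would simply set the skip-callback flag, removing the current special casing in the vpt code.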

Something like this, yes (I don't really like the names you suggest,
but I also can't suggest any better ones right away).

Jan



Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-10 Thread George Dunlap


> On Apr 10, 2018, at 9:57 AM, Olaf Hering  wrote:
> 
> While hunting some other bug we run into the single BUG in
> sched_credit.c:csched_load_balance(). This happens with all versions
> since 4.7, staging is also affected. Testsystem is a Haswell model 63
> system with 4 NUMA nodes and 144 threads.
> 
> (XEN) Xen BUG at sched_credit.c:1694
> (XEN) [ Xen-4.11.20180407T144959.e62e140daa-2.bug1087289_411  x86_64  
> debug=n   Not tainted ]
> (XEN) CPU:30
> (XEN) RIP:e008:[] 
> sched_credit.c#csched_schedule+0xaad/0xba0
> (XEN) RFLAGS: 00010087   CONTEXT: hypervisor
> (XEN) rax: 83077ffe76d0   rbx: 83077fe571d0   rcx: 001e
> (XEN) rdx: 83005d082000   rsi:    rdi: 83077fe575b0
> (XEN) rbp: 82d08094a480   rsp: 83077fe4fd00   r8:  83077fe581a0
> (XEN) r9:  82d080227cf0   r10:    r11: 830060b62060
> (XEN) r12: 14f4e864c2d4   r13: 83077fe575b0   r14: 83077fe58180
> (XEN) r15: 82d08094a480   cr0: 8005003b   cr4: 001526e0
> (XEN) cr3: 49416000   cr2: 7fb24e1b7277
> (XEN) fsb:    gsb:    gss: 
> (XEN) ds:    es:    fs:    gs:    ss:    cs: e008
> (XEN) Xen code around  
> (sched_credit.c#csched_schedule+0xaad/0xba0):
> (XEN)  18 01 00 e9 73 f7 ff ff <0f> 0b 48 8b 43 28 be 01 00 00 00 bf 0a 20 02 
> 00
> (XEN) Xen stack trace from rsp=83077fe4fd00:
> (XEN)82d0803577ef 001e 8000803577ef 830f9d5b2aa0
> (XEN)82d0803577ef 83077a6c59e0 83077fe4fe38 82d0803577fb
> (XEN)  01c9c380 
> (XEN)83077fe4 001e 14f4e86c885e 83077fe4
> (XEN)82d08094a480 14f4e86c73be 80230c80 830060b38000
> (XEN)83077fe58300 0046 830f9d4f6018 0082
> (XEN)001e 83077fe581c8 0001 001e
> (XEN)83005d1f 83077fe58188 14f4e86c885e 83077fe58180
> (XEN)82d08094a480 82d08023153d 8307 83077fe581a0
> (XEN)0206 82d080268705 83077fe58300 830060b38060
> (XEN)830845d83010 82d080238578 83077fe4 
> (XEN) 83077fe4 82d080933c00 82d08094a480
> (XEN)83077fe4 82d080234cb2 82d08095f1f0 82d080934b00
> (XEN)82d08095f1f0 001e 001e 82d08026daf5
> (XEN)83005d1f 83005d1f 83005d1f 83077fe58188
> (XEN)14f4e86a43ab 83077fe58180 82d08094a480 88011dd88000
> (XEN)88011dd88000 88011dd88000  002b
> (XEN)81d4c180  0013fe969894 0001
> (XEN) 81020e50  
> (XEN)  00fc 81060182
> (XEN) Xen call trace:
> (XEN)[] sched_credit.c#csched_schedule+0xaad/0xba0
> (XEN)[] common_interrupt+0x8f/0x110
> (XEN)[] common_interrupt+0x8f/0x110
> (XEN)[] common_interrupt+0x9b/0x110
> (XEN)[] schedule.c#schedule+0xdd/0x5d0
> (XEN)[] reprogram_timer+0x75/0xe0
> (XEN)[] timer.c#timer_softirq_action+0x138/0x210
> (XEN)[] softirq.c#__do_softirq+0x62/0x90
> (XEN)[] domain.c#idle_loop+0x45/0xb0
> (XEN) 
> (XEN) Panic on CPU 30:
> (XEN) Xen BUG at sched_credit.c:1694
> (XEN) 
> (XEN) Reboot in five seconds...
> 
> But after that the system hangs hard, one has to pull the plug.
> Running the debug version of xen.efi did not trigger any ASSERT.
> 
> 
> This happens if there are many busy backend/frontend pairs in a number
> of domUs. I think more domUs will trigger it sooner, overcommit helps as
> well. It was not seen with a single domU.
> 
> The testcase is like that:
> - boot dom0 with "dom0_max_vcpus=30 dom0_mem=32G dom0_vcpus_pin"
> - create a tmpfs in dom0
> - create files in that tmpfs to be exported to domUs via file://path,xvdtN,w
> - assign these files to HVM domUs
> - inside the domUs, create a filesystem on the xvdtN devices
> - mount the filesystem
> - run fio(1) on the filesystem
> - in dom0, run 'xl vcpu-pin domU $node1-3 $nodeN' in a loop to move domU 
> between node 1 to 3.
> 
> After a low number of iterations Xen crashes in csched_load_balance.
> 
> In my setup I had 16 HVM domUs with 64 vcpus, each one had 3 vbd devices.
> It was reported also with fewer and smaller domUs.
> Scripts exist to recreate the setup easily.
> 
> 
[snip]
> 
> Any idea what might causing this crash?

Assuming the bug is this one:

BUG_ON( cpu != snext->vcpu->processor );

a nasty race condition… a vcpu has just been taken off the runqueue of the 
current pcpu, but it’s apparently 

Re: [Xen-devel] [PATCH v6 9/9] xen/x86: use PCID feature

2018-04-10 Thread Juergen Gross
On 10/04/18 11:29, Jan Beulich wrote:
 On 10.04.18 at 09:58,  wrote:
>> @@ -102,14 +104,34 @@ void switch_cr3_cr4(unsigned long cr3, unsigned long 
>> cr4)
>>  old_cr4 = read_cr4();
>>  if ( old_cr4 & X86_CR4_PGE )
>>  {
>> +/*
>> + * X86_CR4_PGE set means PCID is inactive.
>> + * We have to purge the TLB via flipping cr4.pge.
>> + */
>>  old_cr4 = cr4 & ~X86_CR4_PGE;
>>  write_cr4(old_cr4);
>>  }
>> +else if ( use_invpcid )
>> +/*
>> + * Flushing the TLB via INVPCID is necessary only in case PCIDs are
>> + * in use, which is true only with INVPCID being available.
>> + * Without PCID usage the following write_cr3() will purge the TLB
>> + * (we are in the cr4.pge off path) from all entries.
> 
> s/from/of/ ?
> 
>> @@ -136,11 +158,32 @@ unsigned int flush_area_local(const void *va, unsigned 
>> int flags)
>>  /*
>>   * We don't INVLPG multi-page regions because the 2M/4M/1G
>>   * region may not have been mapped with a superpage. Also there
>> - * are various errata surrounding INVLPG usage on superpages, 
>> and
>> - * a full flush is in any case not *that* expensive.
>> + * are various errata surrounding INVLPG usage on superpages,
>> + * and a full flush is in any case not *that* expensive.
>>   */
> 
> Stray change?
> 
>> @@ -508,7 +513,8 @@ unsigned long pv_guest_cr4_to_real_cr4(const struct vcpu 
>> *v)
>>  cr4 = v->arch.pv_vcpu.ctrlreg[4] & ~X86_CR4_DE;
>>  cr4 |= mmu_cr4_features & (X86_CR4_PSE | X86_CR4_SMEP | X86_CR4_SMAP |
>> X86_CR4_OSXSAVE | X86_CR4_FSGSBASE);
>> -cr4 |= d->arch.pv_domain.xpti  ? 0 : X86_CR4_PGE;
>> +cr4 |= (d->arch.pv_domain.xpti || d->arch.pv_domain.pcid) ? 0 : 
>> X86_CR4_PGE;
>> +cr4 |= d->arch.pv_domain.pcid ? X86_CR4_PCIDE : 0;
> 
> I think this would be more clear to follow as
> 
> if ( d->arch.pv_domain.pcid )
> cr4 |= X86_CR4_PCIDE;
> else if ( !d->arch.pv_domain.xpti )
> cr4 |= X86_CR4_PGE;
> 
> Anyway, with or without these addressed (which could probably
> also be done while committing, as long as you agree)

I do agree.

> Reviewed-by: Jan Beulich 


Juergen


Re: [Xen-devel] [PATCH v6 3/9] xen/x86: support per-domain flag for xpti

2018-04-10 Thread Juergen Gross
On 10/04/18 11:14, Jan Beulich wrote:
 On 10.04.18 at 09:58,  wrote:
>> --- a/docs/misc/xen-command-line.markdown
>> +++ b/docs/misc/xen-command-line.markdown
>> @@ -1955,14 +1955,29 @@ clustered mode.  The default, given no hint from the 
>> **FADT**, is cluster
>>  mode.
>>  
>>  ### xpti
>> -> `= `
>> +> `= List of [ default |  | dom0= | domu= ]`
>>  
>> -> Default: `false` on AMD hardware
>> +> Default: `false` on hardware not vulnerable to Meltdown (e.g. AMD)
>>  > Default: `true` everywhere else
>>  
>>  Override default selection of whether to isolate 64-bit PV guest page
>>  tables.
>>  
>> +`true` activates page table isolation even on hardware not vulnerable to
>> +Meltdown for all domains.
>> +
>> +`false` deactivates page table isolation on all systems for all domains.
>> +
>> +`default` sets the default behaviour.
>> +
>> +`dom0=false` deactivates page table isolation for dom0.
>> +
>> +`dom0=true` activates page table isolation for dom0.
>> +
>> +`domu=false` deactivates page table isolation for guest domains.
>> +
>> +`domu=true` activates page table isolation for guest domains.
> 
> This is too verbose / repetitive for my taste.

So you'd like it better as:

"With `dom0` and `domu` it is possible to control page table isolation
for dom0 or guest domains only." ?

> 
>> @@ -205,6 +208,10 @@ int pv_domain_initialise(struct domain *d)
>>  /* 64-bit PV guest by default. */
>>  d->arch.is_32bit_pv = d->arch.has_32bit_shinfo = 0;
>>  
>> +d->arch.pv_domain.xpti = (d->domain_id == hardware_domid)
>> + ? (opt_xpti & XPTI_DOM0)
>> + : (opt_xpti & XPTI_DOMU);
> 
> I would generally prefer to have as little redundancy as possible in
> such expressions, i.e.
> 
> d->arch.pv_domain.xpti = opt_xpti & (d->domain_id == hardware_domid
>  ? XPTI_DOM0 : XPTI_DOMU);

Okay.

> 
> Furthermore - shouldn't this cover domain 0 as well as the hardware
> domain, even if - in case they are different - domain 0 should be
> short lived?

When domain 0 is created it _is_ the hardware domain. Only domain 0
creating a hardware domain will set hardware_domid to a non-zero value.

> 
>> --- a/xen/arch/x86/spec_ctrl.c
>> +++ b/xen/arch/x86/spec_ctrl.c
>> @@ -193,6 +193,68 @@ static bool __init retpoline_safe(void)
>> }
>> }
>>
>> +#define XPTI_DEFAULT  0xff
>> +uint8_t opt_xpti = XPTI_DEFAULT;
> 
> __read_mostly

Okay.

> 
>> --- a/xen/include/asm-x86/spec_ctrl.h
>> +++ b/xen/include/asm-x86/spec_ctrl.h
>> @@ -29,6 +29,10 @@ void init_speculation_mitigations(void);
>>  extern bool opt_ibpb;
>>  extern uint8_t default_bti_ist_info;
>>  
>> +extern uint8_t opt_xpti;
>> +#define XPTI_DOM0  0x01
>> +#define XPTI_DOMU  0x02
> 
> OPT_XPTI_DOM{0,U} would perhaps have been better.
> 
> Anyway, in the interest of getting done with this
> Reviewed-by: Jan Beulich 
> with or without some or all of the suggestions addressed.

I can send a followup patch.


Juergen


Re: [Xen-devel] [PATCH v6 9/9] xen/x86: use PCID feature

2018-04-10 Thread Jan Beulich
>>> On 10.04.18 at 09:58,  wrote:
> @@ -102,14 +104,34 @@ void switch_cr3_cr4(unsigned long cr3, unsigned long 
> cr4)
>  old_cr4 = read_cr4();
>  if ( old_cr4 & X86_CR4_PGE )
>  {
> +/*
> + * X86_CR4_PGE set means PCID is inactive.
> + * We have to purge the TLB via flipping cr4.pge.
> + */
>  old_cr4 = cr4 & ~X86_CR4_PGE;
>  write_cr4(old_cr4);
>  }
> +else if ( use_invpcid )
> +/*
> + * Flushing the TLB via INVPCID is necessary only in case PCIDs are
> + * in use, which is true only with INVPCID being available.
> + * Without PCID usage the following write_cr3() will purge the TLB
> + * (we are in the cr4.pge off path) from all entries.

s/from/of/ ?

> @@ -136,11 +158,32 @@ unsigned int flush_area_local(const void *va, unsigned 
> int flags)
>  /*
>   * We don't INVLPG multi-page regions because the 2M/4M/1G
>   * region may not have been mapped with a superpage. Also there
> - * are various errata surrounding INVLPG usage on superpages, and
> - * a full flush is in any case not *that* expensive.
> + * are various errata surrounding INVLPG usage on superpages,
> + * and a full flush is in any case not *that* expensive.
>   */

Stray change?

> @@ -508,7 +513,8 @@ unsigned long pv_guest_cr4_to_real_cr4(const struct vcpu 
> *v)
>  cr4 = v->arch.pv_vcpu.ctrlreg[4] & ~X86_CR4_DE;
>  cr4 |= mmu_cr4_features & (X86_CR4_PSE | X86_CR4_SMEP | X86_CR4_SMAP |
> X86_CR4_OSXSAVE | X86_CR4_FSGSBASE);
> -cr4 |= d->arch.pv_domain.xpti  ? 0 : X86_CR4_PGE;
> +cr4 |= (d->arch.pv_domain.xpti || d->arch.pv_domain.pcid) ? 0 : 
> X86_CR4_PGE;
> +cr4 |= d->arch.pv_domain.pcid ? X86_CR4_PCIDE : 0;

I think this would be more clear to follow as

if ( d->arch.pv_domain.pcid )
cr4 |= X86_CR4_PCIDE;
else if ( !d->arch.pv_domain.xpti )
cr4 |= X86_CR4_PGE;

Anyway, with or without these addressed (which could probably
also be done while committing, as long as you agree)
Reviewed-by: Jan Beulich 

Jan



Re: [Xen-devel] [RFC, v2, 5/9] hyper_dmabuf: default backend for XEN hypervisor

2018-04-10 Thread Oleksandr Andrushchenko

On 02/14/2018 03:50 AM, Dongwon Kim wrote:

From: "Matuesz Polrola" 

The default backend for XEN hypervisor. This backend contains actual
implementation of individual methods defined in "struct hyper_dmabuf_bknd_ops"
defined as:

struct hyper_dmabuf_bknd_ops {
 /* backend initialization routine (optional) */
 int (*init)(void);

 /* backend cleanup routine (optional) */
 int (*cleanup)(void);

 /* retrieving id of current virtual machine */
 int (*get_vm_id)(void);

 /* get pages shared via hypervisor-specific method */
 int (*share_pages)(struct page **, int, int, void **);

 /* make shared pages unshared via hypervisor specific method */
 int (*unshare_pages)(void **, int);

 /* map remotely shared pages on importer's side via
  * hypervisor-specific method
  */
 struct page ** (*map_shared_pages)(unsigned long, int, int, void **);

 /* unmap and free shared pages on importer's side via
  * hypervisor-specific method
  */
 int (*unmap_shared_pages)(void **, int);

 /* initialize communication environment */
 int (*init_comm_env)(void);

 void (*destroy_comm)(void);

 /* upstream ch setup (receiving and responding) */
 int (*init_rx_ch)(int);

 /* downstream ch setup (transmitting and parsing responses) */
 int (*init_tx_ch)(int);

 int (*send_req)(int, struct hyper_dmabuf_req *, int);
};

First two methods are for extra initialization or cleaning up possibly
required for the current hypervisor (optional). The third method
(.get_vm_id) provides a way to get the current VM's id, which will be used
as an identification of the source VM of a shared hyper_DMABUF later.

All other methods are related to either memory sharing or inter-VM
communication, which are minimum requirement for hyper_DMABUF driver.
(Brief description of role of each method is embedded as a comment in the
definition of the structure above and header file.)

Actual implementation of each of these methods specific to XEN is under
backends/xen/. Their mappings are done as followed:

struct hyper_dmabuf_bknd_ops xen_bknd_ops = {
 .init = NULL, /* not needed for xen */
 .cleanup = NULL, /* not needed for xen */
 .get_vm_id = xen_be_get_domid,
 .share_pages = xen_be_share_pages,
 .unshare_pages = xen_be_unshare_pages,
 .map_shared_pages = (void *)xen_be_map_shared_pages,
 .unmap_shared_pages = xen_be_unmap_shared_pages,
 .init_comm_env = xen_be_init_comm_env,
 .destroy_comm = xen_be_destroy_comm,
 .init_rx_ch = xen_be_init_rx_rbuf,
 .init_tx_ch = xen_be_init_tx_rbuf,
 .send_req = xen_be_send_req,
};

A section for Hypervisor Backend has been added to

"Documentation/hyper-dmabuf-sharing.txt" accordingly

Signed-off-by: Dongwon Kim 
Signed-off-by: Mateusz Polrola 
---
  drivers/dma-buf/hyper_dmabuf/Kconfig   |   7 +
  drivers/dma-buf/hyper_dmabuf/Makefile  |   7 +
  .../backends/xen/hyper_dmabuf_xen_comm.c   | 941 +
  .../backends/xen/hyper_dmabuf_xen_comm.h   |  78 ++
  .../backends/xen/hyper_dmabuf_xen_comm_list.c  | 158 
  .../backends/xen/hyper_dmabuf_xen_comm_list.h  |  67 ++
  .../backends/xen/hyper_dmabuf_xen_drv.c|  46 +
  .../backends/xen/hyper_dmabuf_xen_drv.h|  53 ++
  .../backends/xen/hyper_dmabuf_xen_shm.c| 525 
  .../backends/xen/hyper_dmabuf_xen_shm.h|  46 +
  drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_drv.c|  10 +
  11 files changed, 1938 insertions(+)
  create mode 100644 
drivers/dma-buf/hyper_dmabuf/backends/xen/hyper_dmabuf_xen_comm.c
  create mode 100644 
drivers/dma-buf/hyper_dmabuf/backends/xen/hyper_dmabuf_xen_comm.h
  create mode 100644 
drivers/dma-buf/hyper_dmabuf/backends/xen/hyper_dmabuf_xen_comm_list.c
  create mode 100644 
drivers/dma-buf/hyper_dmabuf/backends/xen/hyper_dmabuf_xen_comm_list.h
  create mode 100644 
drivers/dma-buf/hyper_dmabuf/backends/xen/hyper_dmabuf_xen_drv.c
  create mode 100644 
drivers/dma-buf/hyper_dmabuf/backends/xen/hyper_dmabuf_xen_drv.h
  create mode 100644 
drivers/dma-buf/hyper_dmabuf/backends/xen/hyper_dmabuf_xen_shm.c
  create mode 100644 
drivers/dma-buf/hyper_dmabuf/backends/xen/hyper_dmabuf_xen_shm.h

diff --git a/drivers/dma-buf/hyper_dmabuf/Kconfig 
b/drivers/dma-buf/hyper_dmabuf/Kconfig
index 5ebf516d65eb..68f3d6ce2c1f 100644
--- a/drivers/dma-buf/hyper_dmabuf/Kconfig
+++ b/drivers/dma-buf/hyper_dmabuf/Kconfig
@@ -20,4 +20,11 @@ config HYPER_DMABUF_SYSFS
  
  	  The location of sysfs is under ""
  
+config HYPER_DMABUF_XEN

+bool "Configure hyper_dmabuf for XEN hypervisor"
+default y

n?

+depends on HYPER_DMABUF && XEN && XENFS
+help
+  

Re: [Xen-devel] [PATCH v6 1/9] x86/xpti: avoid copying L4 page table contents when possible

2018-04-10 Thread Juergen Gross
On 10/04/18 11:00, Jan Beulich wrote:
 On 10.04.18 at 09:58,  wrote:
>> For mitigation of Meltdown the current L4 page table is copied to the
>> cpu local root page table each time a 64 bit pv guest is entered.
>>
>> Copying can be avoided in cases where the guest L4 page table hasn't
>> been modified while running the hypervisor, e.g. when handling
>> interrupts or any hypercall not modifying the L4 page table or %cr3.
>>
>> So add a per-cpu flag indicating whether the copying should be
>> performed and set that flag only when loading a new %cr3 or modifying
>> the L4 page table.  This includes synchronization of the cpu local
>> root page table with other cpus, so add a special synchronization flag
>> for that case.
>>
>> A simple performance check (compiling the hypervisor via "make -j 4")
>> in dom0 with 4 vcpus shows a significant improvement:
>>
>> - real time drops from 112 seconds to 103 seconds
>> - system time drops from 142 seconds to 131 seconds
>>
>> Signed-off-by: Juergen Gross 
> 
> Reviewed-by: Jan Beulich 
> 
>> ---
>> V6:
>> - correct an error from rebasing to staging in assembly part
> 
> I have to admit that without digging out v5 I can't spot the
> change.

It is subtle, even more as patch 3 corrected the error again.

In restore_all_guest pv_cr3 must be tested for being zero before
testing whether to copy the root page table.
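In C terms, the ordering described above amounts to the following. This helper is hypothetical, purely to illustrate the dependency between the two checks; the real logic lives in the restore_all_guest assembly.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * pv_cr3 == 0 means XPTI is inactive for this vcpu, so the
 * root-page-table copy must not even be considered in that case.
 */
static bool must_copy_root_pgt(uint64_t pv_cr3, bool root_pgt_changed)
{
    if ( !pv_cr3 )
        return false;
    return root_pgt_changed;
}
```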


Juergen


[Xen-devel] [xtf test] 122138: all pass - PUSHED

2018-04-10 Thread osstest service owner
flight 122138 xtf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/122138/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 xtf  6f45086733cc1ce92ec093533097900a0de1c7b4
baseline version:
 xtf  1498952b2417271ac4767cbcb550bf75eba24492

Last test of basis   122049  2018-04-07 20:16:38 Z2 days
Testing same since   122138  2018-04-09 14:47:26 Z0 days1 attempts


People who touched revisions under test:
  Andrew Cooper 

jobs:
 build-amd64-xtf  pass
 build-amd64  pass
 build-amd64-pvopspass
 test-xtf-amd64-amd64-1   pass
 test-xtf-amd64-amd64-2   pass
 test-xtf-amd64-amd64-3   pass
 test-xtf-amd64-amd64-4   pass
 test-xtf-amd64-amd64-5   pass





Pushing revision :

To xenbits.xen.org:/home/xen/git/xtf.git
   1498952..6f45086  6f45086733cc1ce92ec093533097900a0de1c7b4 -> 
xen-tested-master


Re: [Xen-devel] [PATCH v6 8/9] xen/x86: add some cr3 helpers

2018-04-10 Thread Jan Beulich
>>> On 10.04.18 at 09:58,  wrote:
> --- a/xen/include/asm-x86/processor.h
> +++ b/xen/include/asm-x86/processor.h
> @@ -288,6 +288,16 @@ static inline void write_cr3(unsigned long val)
>  asm volatile ( "mov %0, %%cr3" : : "r" (val) : "memory" );
>  }
>  
> +static inline unsigned long cr3_pa(unsigned long cr3)
> +{
> +return cr3 & X86_CR3_ADDR_MASK;
> +}
> +
> +static inline unsigned long cr3_pcid(unsigned long cr3)

This would perhaps better return unsigned int, but anyway
Reviewed-by: Jan Beulich 

Jan
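For context, with the conventional x86 CR3 layout (PCID in bits 11:0 when CR4.PCIDE is set, page-aligned table address above) the two helpers reduce to simple masking. A sketch with the mask values that layout implies; the return type of cr3_pcid follows the review suggestion:

```c
#include <stdint.h>

/* Assumed mask values, following the standard x86 CR3 layout. */
#define X86_CR3_ADDR_MASK  0x000ffffffffff000UL
#define X86_CR3_PCID_MASK  0x0000000000000fffUL

static inline unsigned long cr3_pa(unsigned long cr3)
{
    return cr3 & X86_CR3_ADDR_MASK;
}

/* unsigned int, since a PCID is only 12 bits wide. */
static inline unsigned int cr3_pcid(unsigned long cr3)
{
    return cr3 & X86_CR3_PCID_MASK;
}
```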



Re: [Xen-devel] [PATCH v6 7/9] xen/x86: convert pv_guest_cr4_to_real_cr4() to a function

2018-04-10 Thread Jan Beulich
>>> On 10.04.18 at 09:58,  wrote:
> pv_guest_cr4_to_real_cr4() is becoming more and more complex. Convert
> it from a macro to an ordinary function.
> 
> Signed-off-by: Juergen Gross 

Reviewed-by: Jan Beulich 




Re: [Xen-devel] [PATCH v6 5/9] xen/x86: disable global pages for domains with XPTI active

2018-04-10 Thread Jan Beulich
>>> On 10.04.18 at 09:58,  wrote:
> Instead of flushing the TLB from global pages when switching address
> spaces with XPTI being active just disable global pages via %cr4
> completely when a domain subject to XPTI is active. This avoids the
> need for extra TLB flushes as loading %cr3 will remove all TLB
> entries.
> 
> In order to avoid states with cr3/cr4 having inconsistent values
> (e.g. global pages being activated while cr3 already specifies a XPTI
> address space) move loading of the new cr4 value to write_ptbase()
> (actually to switch_cr3_cr4() called by write_ptbase()).
> 
> Signed-off-by: Juergen Gross 

Reviewed-by: Jan Beulich 




Re: [Xen-devel] [PATCH v6 4/9] xen/x86: use invpcid for flushing the TLB

2018-04-10 Thread Jan Beulich
>>> On 10.04.18 at 09:58,  wrote:
> If possible use the INVPCID instruction for flushing the TLB instead of
> toggling cr4.pge for that purpose.
> 
> While at it remove the dependency on cr4.pge being required for mtrr
> loading, as this will be required later anyway.
> 
> Add a command line option "invpcid" for controlling the use of
> INVPCID (default to true).
> 
> Signed-off-by: Juergen Gross 

Reviewed-by: Jan Beulich 




Re: [Xen-devel] [PATCH v6 3/9] xen/x86: support per-domain flag for xpti

2018-04-10 Thread Jan Beulich
>>> On 10.04.18 at 09:58,  wrote:
> --- a/docs/misc/xen-command-line.markdown
> +++ b/docs/misc/xen-command-line.markdown
> @@ -1955,14 +1955,29 @@ clustered mode.  The default, given no hint from the 
> **FADT**, is cluster
>  mode.
>  
>  ### xpti
> -> `= `
> +> `= List of [ default |  | dom0= | domu= ]`
>  
> -> Default: `false` on AMD hardware
> +> Default: `false` on hardware not vulnerable to Meltdown (e.g. AMD)
>  > Default: `true` everywhere else
>  
>  Override default selection of whether to isolate 64-bit PV guest page
>  tables.
>  
> +`true` activates page table isolation even on hardware not vulnerable to
> +Meltdown for all domains.
> +
> +`false` deactivates page table isolation on all systems for all domains.
> +
> +`default` sets the default behaviour.
> +
> +`dom0=false` deactivates page table isolation for dom0.
> +
> +`dom0=true` activates page table isolation for dom0.
> +
> +`domu=false` deactivates page table isolation for guest domains.
> +
> +`domu=true` activates page table isolation for guest domains.

This is too verbose / repetitive for my taste.

> @@ -205,6 +208,10 @@ int pv_domain_initialise(struct domain *d)
>  /* 64-bit PV guest by default. */
>  d->arch.is_32bit_pv = d->arch.has_32bit_shinfo = 0;
>  
> +d->arch.pv_domain.xpti = (d->domain_id == hardware_domid)
> + ? (opt_xpti & XPTI_DOM0)
> + : (opt_xpti & XPTI_DOMU);

I would generally prefer to have as little redundancy as possible in
such expressions, i.e.

d->arch.pv_domain.xpti = opt_xpti & (d->domain_id == hardware_domid
 ? XPTI_DOM0 : XPTI_DOMU);

Furthermore - shouldn't this cover domain 0 as well as the hardware
domain, even if - in case they are different - domain 0 should be
short lived?
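The condensed expression can be exercised in isolation. A minimal sketch, with the flag values taken from the patch's spec_ctrl.h hunk and a hypothetical helper name:

```c
#include <stdbool.h>
#include <stdint.h>

#define XPTI_DOM0  0x01
#define XPTI_DOMU  0x02

/* Mirrors the suggested expression in pv_domain_initialise(). */
static bool xpti_wanted(uint8_t opt_xpti, int domain_id, int hardware_domid)
{
    return opt_xpti & (domain_id == hardware_domid ? XPTI_DOM0 : XPTI_DOMU);
}
```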

> --- a/xen/arch/x86/spec_ctrl.c
> +++ b/xen/arch/x86/spec_ctrl.c
> @@ -193,6 +193,68 @@ static bool __init retpoline_safe(void)
> }
> }
> 
> +#define XPTI_DEFAULT  0xff
> +uint8_t opt_xpti = XPTI_DEFAULT;

__read_mostly

> --- a/xen/include/asm-x86/spec_ctrl.h
> +++ b/xen/include/asm-x86/spec_ctrl.h
> @@ -29,6 +29,10 @@ void init_speculation_mitigations(void);
>  extern bool opt_ibpb;
>  extern uint8_t default_bti_ist_info;
>  
> +extern uint8_t opt_xpti;
> +#define XPTI_DOM0  0x01
> +#define XPTI_DOMU  0x02

OPT_XPTI_DOM{0,U} would perhaps have been better.

Anyway, in the interest of getting done with this
Reviewed-by: Jan Beulich 
with or without some or all of the suggestions addressed.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [ovmf test] 122135: all pass - PUSHED

2018-04-10 Thread osstest service owner
flight 122135 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/122135/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf 64797018df0cf5c1f11523bb575355aba918b940
baseline version:
 ovmf 95cc4962167572089a99be324574094ba22415ad

Last test of basis   122120  2018-04-09 03:22:26 Z1 days
Testing same since   122135  2018-04-09 12:51:49 Z0 days1 attempts


People who touched revisions under test:
  Carsey, Jaben 
  Jaben Carsey 
  Yonghong Zhu 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/osstest/ovmf.git
   95cc496216..64797018df  64797018df0cf5c1f11523bb575355aba918b940 -> xen-tested-master


Re: [Xen-devel] [PATCH v6 2/9] xen/x86: add a function for modifying cr3

2018-04-10 Thread Jan Beulich
>>> On 10.04.18 at 09:58,  wrote:
> Instead of having multiple places with more or less identical asm
> statements just have one function doing a write to cr3.
> 
> As this function should be named write_cr3() rename the current
> write_cr3() function to switch_cr3().
> 
> Suggested-by: Andrew Cooper 
> Signed-off-by: Juergen Gross 

Reviewed-by: Jan Beulich 




Re: [Xen-devel] [PATCH v6 1/9] x86/xpti: avoid copying L4 page table contents when possible

2018-04-10 Thread Jan Beulich
>>> On 10.04.18 at 09:58,  wrote:
> For mitigation of Meltdown the current L4 page table is copied to the
> cpu local root page table each time a 64 bit pv guest is entered.
> 
> Copying can be avoided in cases where the guest L4 page table hasn't
> been modified while running the hypervisor, e.g. when handling
> interrupts or any hypercall not modifying the L4 page table or %cr3.
> 
> So add a per-cpu flag indicating whether the copying should be
> performed and set that flag only when loading a new %cr3 or modifying
> the L4 page table.  This includes synchronization of the cpu local
> root page table with other cpus, so add a special synchronization flag
> for that case.
> 
> A simple performance check (compiling the hypervisor via "make -j 4")
> in dom0 with 4 vcpus shows a significant improvement:
> 
> - real time drops from 112 seconds to 103 seconds
> - system time drops from 142 seconds to 131 seconds
> 
> Signed-off-by: Juergen Gross 

Reviewed-by: Jan Beulich 

> ---
> V6:
> - correct an error from rebasing to staging in assembly part

I have to admit that without digging out v5 I can't spot the
change.

Jan



Re: [Xen-devel] BUG - 'xl restore' does not overwrite HVM

2018-04-10 Thread Wei Liu
(Add back xen-devel)

Hello Peter

Please don't top-post.

On Tue, Apr 10, 2018 at 08:52:10AM +, Peter McLaren wrote:
> Hi Wei
> I would like the restore command to return the HVM to the exact state it was 
> at when the save command was performed.
> Thanks
> Peter

I think what you need is disk snapshot -- because you want your disk to
return to its previous state.

I don't think xen 4.4 supports that. IIRC even the latest version of Xen
doesn't have disk snapshot support. You will have to manually snapshot
your disk (like using lvm snapshot).

Wei.

> 
> 
> 
> From: Wei Liu 
> Sent: Tuesday, April 10, 2018 6:44:26 PM
> To: Peter McLaren
> Cc: xen-de...@lists.xen.org; Wei Liu
> Subject: Re: [Xen-devel] BUG - 'xl restore' does not overwrite HVM
> 
> On Tue, Apr 10, 2018 at 06:07:12AM +, Peter McLaren wrote:
> > Hi
> > with at least 1 version of Windows 10 (build 16299), the 'xl restore' 
> > command does not overwrite the previously running HVM. The symptoms are:
> > 1) the restore appears to rapidly complete after approx 50% of the time
> > 2)  files created after the save in the running HVM are present after the 
> > restore
> > 3) the Windows system tries to recover.
> >
> > I have tried restoring both with a shutdown of the Windows system or a 
> > destroy. In both cases the results are the same.
> >
> > I have listed some relevant info below. Any help would be appreciated.
> 
> What do you mean by "overwrite HVM"? What do you want to achieve?
> 
> Wei.


[Xen-devel] crash in csched_load_balance after xl vcpu-pin

2018-04-10 Thread Olaf Hering
While hunting some other bug we ran into the single BUG in
sched_credit.c:csched_load_balance(). This happens with all versions
since 4.7; staging is also affected. The test system is a Haswell model
63 machine with 4 NUMA nodes and 144 threads.

(XEN) Xen BUG at sched_credit.c:1694
(XEN) [ Xen-4.11.20180407T144959.e62e140daa-2.bug1087289_411  x86_64  
debug=n   Not tainted ]
(XEN) CPU:30
(XEN) RIP:e008:[] 
sched_credit.c#csched_schedule+0xaad/0xba0
(XEN) RFLAGS: 00010087   CONTEXT: hypervisor
(XEN) rax: 83077ffe76d0   rbx: 83077fe571d0   rcx: 001e
(XEN) rdx: 83005d082000   rsi:    rdi: 83077fe575b0
(XEN) rbp: 82d08094a480   rsp: 83077fe4fd00   r8:  83077fe581a0
(XEN) r9:  82d080227cf0   r10:    r11: 830060b62060
(XEN) r12: 14f4e864c2d4   r13: 83077fe575b0   r14: 83077fe58180
(XEN) r15: 82d08094a480   cr0: 8005003b   cr4: 001526e0
(XEN) cr3: 49416000   cr2: 7fb24e1b7277
(XEN) fsb:    gsb:    gss: 
(XEN) ds:    es:    fs:    gs:    ss:    cs: e008
(XEN) Xen code around  
(sched_credit.c#csched_schedule+0xaad/0xba0):
(XEN)  18 01 00 e9 73 f7 ff ff <0f> 0b 48 8b 43 28 be 01 00 00 00 bf 0a 20 02 00
(XEN) Xen stack trace from rsp=83077fe4fd00:
(XEN)82d0803577ef 001e 8000803577ef 830f9d5b2aa0
(XEN)82d0803577ef 83077a6c59e0 83077fe4fe38 82d0803577fb
(XEN)  01c9c380 
(XEN)83077fe4 001e 14f4e86c885e 83077fe4
(XEN)82d08094a480 14f4e86c73be 80230c80 830060b38000
(XEN)83077fe58300 0046 830f9d4f6018 0082
(XEN)001e 83077fe581c8 0001 001e
(XEN)83005d1f 83077fe58188 14f4e86c885e 83077fe58180
(XEN)82d08094a480 82d08023153d 8307 83077fe581a0
(XEN)0206 82d080268705 83077fe58300 830060b38060
(XEN)830845d83010 82d080238578 83077fe4 
(XEN) 83077fe4 82d080933c00 82d08094a480
(XEN)83077fe4 82d080234cb2 82d08095f1f0 82d080934b00
(XEN)82d08095f1f0 001e 001e 82d08026daf5
(XEN)83005d1f 83005d1f 83005d1f 83077fe58188
(XEN)14f4e86a43ab 83077fe58180 82d08094a480 88011dd88000
(XEN)88011dd88000 88011dd88000  002b
(XEN)81d4c180  0013fe969894 0001
(XEN) 81020e50  
(XEN)  00fc 81060182
(XEN) Xen call trace:
(XEN)[] sched_credit.c#csched_schedule+0xaad/0xba0
(XEN)[] common_interrupt+0x8f/0x110
(XEN)[] common_interrupt+0x8f/0x110
(XEN)[] common_interrupt+0x9b/0x110
(XEN)[] schedule.c#schedule+0xdd/0x5d0
(XEN)[] reprogram_timer+0x75/0xe0
(XEN)[] timer.c#timer_softirq_action+0x138/0x210
(XEN)[] softirq.c#__do_softirq+0x62/0x90
(XEN)[] domain.c#idle_loop+0x45/0xb0
(XEN) 
(XEN) Panic on CPU 30:
(XEN) Xen BUG at sched_credit.c:1694
(XEN) 
(XEN) Reboot in five seconds...

But after that the system hangs hard, one has to pull the plug.
Running the debug version of xen.efi did not trigger any ASSERT.


This happens if there are many busy backend/frontend pairs in a number
of domUs. I think more domUs will trigger it sooner, overcommit helps as
well. It was not seen with a single domU.

The testcase is like that:
- boot dom0 with "dom0_max_vcpus=30 dom0_mem=32G dom0_vcpus_pin"
- create a tmpfs in dom0
- create files in that tmpfs to be exported to domUs via file://path,xvdtN,w
- assign these files to HVM domUs
- inside the domUs, create a filesystem on the xvdtN devices
- mount the filesystem
- run fio(1) on the filesystem
- in dom0, run 'xl vcpu-pin domU $node1-3 $nodeN' in a loop to move the
domU between nodes 1 and 3.

After a low number of iterations Xen crashes in csched_load_balance.

In my setup I had 16 HVM domUs with 64 vcpus, each one had 3 vbd devices.
It was reported also with fewer and smaller domUs.
Scripts exist to recreate the setup easily.


In one case I have seen this:

(XEN) d32v60 VMRESUME error: 0x5
(XEN) domain_crash_sync called from vmcs.c:1673
(XEN) Domain 32 (vcpu#60) crashed on cpu#139:
(XEN) [ Xen-4.11.20180407T144959.e62e140daa-2.bug1087289_411  x86_64  
debug=n   Not tainted ]


Any idea what might causing this crash?

Olaf



Re: [Xen-devel] [RFC, v2, 1/9] hyper_dmabuf: initial upload of hyper_dmabuf drv core framework

2018-04-10 Thread Oleksandr Andrushchenko

On 02/14/2018 03:50 AM, Dongwon Kim wrote:

Upload of initial version of core framework in hyper_DMABUF driver
enabling DMA_BUF exchange between two different VMs in virtualized
platform based on Hypervisor such as XEN.

Hyper_DMABUF drv's primary role is to import a DMA_BUF from originator
then re-export it to another Linux VM so that it can be mapped and
accessed in there.

This driver has two layers, one is so called, "core framework", which
contains driver interface and core functions handling export/import of
new hyper_DMABUF and its maintenance. This part of the driver is
independent from Hypervisor so can work as is with any Hypervisor.

The other layer is called "Hypervisor Backend". This layer represents
the interface between "core framework" and actual Hypervisor, handling
memory sharing and communication. Unlike the "core framework", every
hypervisor needs its own backend interface designed using its native
mechanism for memory sharing and inter-VM communication.

This patch contains the first part, "core framework", which consists of
7 source files and 11 header files. Some brief description of these
source code are attached below:

hyper_dmabuf_drv.c

- Linux driver interface and initialization/cleaning-up routines

hyper_dmabuf_ioctl.c

- IOCTLs calls for export/import of DMA-BUF comm channel's creation and
   destruction.

hyper_dmabuf_sgl_proc.c

- Provides methods for managing DMA-BUFs for exporting and importing. For
   exporting: extracting pages, sharing pages via procedures in the
   "Backend", and notifying the importing VM. For importing: all
   operations related to reconstructing the DMA-BUF (with shared
   pages) on the importer's side.

hyper_dmabuf_ops.c

- Standard DMA-BUF operations for hyper_DMABUF reconstructed on
   importer's side.

hyper_dmabuf_list.c

- Lists for storing exported and imported hyper_DMABUF to keep track of
   remote usage of hyper_DMABUF currently being shared.

hyper_dmabuf_msg.c

- Defines messages exchanged between VMs (exporter and importer) and
   function calls for sending and parsing (when received) those.

hyper_dmabuf_id.c

- Contains methods to generate and manage "hyper_DMABUF id" for each
   hyper_DMABUF being exported. It is a global handle for a hyper_DMABUF,
   which another VM needs to know to import it.

hyper_dmabuf_struct.h

- Contains data structures of importer or exporter hyper_DMABUF

include/uapi/linux/hyper_dmabuf.h

- Contains definition of data types and structures referenced by user
   application to interact with driver

Signed-off-by: Dongwon Kim 
Signed-off-by: Mateusz Polrola 
---
  drivers/dma-buf/Kconfig|   2 +
  drivers/dma-buf/Makefile   |   1 +
  drivers/dma-buf/hyper_dmabuf/Kconfig   |  23 +
  drivers/dma-buf/hyper_dmabuf/Makefile  |  34 ++
  drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_drv.c| 254 
  drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_drv.h| 111 
  drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_id.c | 135 +
  drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_id.h |  53 ++
  drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_ioctl.c  | 672 +
  drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_ioctl.h  |  52 ++
  drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_list.c   | 294 +
  drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_list.h   |  73 +++
  drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_msg.c| 320 ++
  drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_msg.h|  87 +++
  drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_ops.c| 264 
  drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_ops.h|  34 ++
  .../dma-buf/hyper_dmabuf/hyper_dmabuf_sgl_proc.c   | 256 
  .../dma-buf/hyper_dmabuf/hyper_dmabuf_sgl_proc.h   |  43 ++
  drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_struct.h | 131 
  include/uapi/linux/hyper_dmabuf.h  |  87 +++
  20 files changed, 2926 insertions(+)
  create mode 100644 drivers/dma-buf/hyper_dmabuf/Kconfig
  create mode 100644 drivers/dma-buf/hyper_dmabuf/Makefile
  create mode 100644 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_drv.c
  create mode 100644 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_drv.h
  create mode 100644 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_id.c
  create mode 100644 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_id.h
  create mode 100644 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_ioctl.c
  create mode 100644 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_ioctl.h
  create mode 100644 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_list.c
  create mode 100644 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_list.h
  create mode 100644 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_msg.c
  create mode 100644 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_msg.h
  create mode 100644 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_ops.c
  create mode 100644 drivers/dma-buf/hyper_dmabuf/hyper_dmabuf_ops.h
  create mode 100644 

Re: [Xen-devel] [PATCH v2 1/2] x86/vpt: execute callbacks for masked interrupts

2018-04-10 Thread Roger Pau Monné
On Mon, Apr 09, 2018 at 09:34:57AM -0600, Jan Beulich wrote:
> >>> On 30.03.18 at 14:35,  wrote:
> > Execute periodic_time callbacks even if the interrupt is not actually
> > injected because the IRQ is masked.
> > 
> > Current callbacks from emulated timer devices only update emulated
> > registers, which from my reading of the specs should happen regardless
> > of whether the interrupt has been injected or not.
> 
> While generally I agree, it also means extra work done. Looking
> at the PIT case, for example, there's no strict need to do the
> update when the IRQ is masked, as the value being updated is
> only used to subtract from get_guest_time()'s return value.
> Similarly for the LAPIC case.
> 
> In the RTC case your change actually looks risky, due to the
> pt_dead_ticks logic. I can't help getting the impression that the
> IRQ being off for 10 ticks would lead to no RTC interrupts at all
> anymore for the guest (until something resets that counter),
> which seems wrong to me.

Hm, right. The RTC is already handled specially in order to not
disable the timer but also don't call the handler if the IRQ is
masked.

Maybe the right solution is to add some flags to the vpt code,
something like:

 - DISABLE_ON_MASKED: only valid for periodic interrupts. Destroy the
   timer if the IRQ is masked when the timer fires.
 - SKIP_CALLBACK_ON_MASKED: do not execute the timer callback if the
   IRQ is masked when the timer fires.

That AFAICT should allow Xen to keep the previous behaviour for
existing timer code (and remove the RTC special casing).

> > @@ -282,6 +305,12 @@ int pt_update_irq(struct vcpu *v)
> >  
> >  if ( earliest_pt == NULL )
> >  {
> > +/*
> > + * NB: although the to_purge list is local, calls to
> > + * destroy_periodic_time can still remove items from the list, hence
> 
> pt_adjust_vcpu() as well as it looks.
> 
> > + * the need to hold the lock while accessing it.
> > + */
> > +execute_callbacks(v, &to_purge);
> >  spin_unlock(&v->arch.hvm_vcpu.tm_lock);
> >  return -1;
> >  }
> > @@ -290,6 +319,8 @@ int pt_update_irq(struct vcpu *v)
> >  irq = earliest_pt->irq;
> >  is_lapic = (earliest_pt->source == PTSRC_lapic);
> >  
> > +execute_callbacks(v, &to_purge);
> > +
> >  spin_unlock(&v->arch.hvm_vcpu.tm_lock);
> 
> It seems to me that with your addition some code restructuring
> would actually be desirable, such that execute_callbacks() (and
> the lock release) would occur just once. Perhaps the mid-function
> return could be avoided altogether.

OK, I can do that. Let's first agree on the interface though.

Thanks, Roger.


Re: [Xen-devel] BUG - 'xl restore' does not overwrite HVM

2018-04-10 Thread Wei Liu
On Tue, Apr 10, 2018 at 06:07:12AM +, Peter McLaren wrote:
> Hi
> with at least 1 version of Windows 10 (build 16299), the 'xl restore' command 
> does not overwrite the previously running HVM. The symptoms are:
> 1) the restore appears to rapidly complete after approx 50% of the time
> 2)  files created after the save in the running HVM are present after the 
> restore
> 3) the Windows system tries to recover.
> 
> I have tried restoring both with a shutdown of the Windows system or a 
> destroy. In both cases the results are the same.
> 
> I have listed some relevant info below. Any help would be appreciated.

What do you mean by "overwrite HVM"? What do you want to achieve?

Wei.


Re: [Xen-devel] [PATCH for-4.11] x86/VT-x: Fix determination of EFER.LMA in vmcs_dump_vcpu()

2018-04-10 Thread Jan Beulich
>>> On 09.04.18 at 19:56,  wrote:
> --- a/xen/arch/x86/hvm/vmx/vmcs.c
> +++ b/xen/arch/x86/hvm/vmx/vmcs.c
> @@ -1788,7 +1788,10 @@ void vmcs_dump_vcpu(struct vcpu *v)
>  vmentry_ctl = vmr32(VM_ENTRY_CONTROLS),
>  vmexit_ctl = vmr32(VM_EXIT_CONTROLS);
>  cr4 = vmr(GUEST_CR4);
> -efer = vmr(GUEST_EFER);
> +
> +/* EFER.LMA is read as zero, and is loaded from vmentry_ctl on entry. */
> +BUILD_BUG_ON(VM_ENTRY_IA32E_MODE << 1 != EFER_LMA);
> +efer = vmr(GUEST_EFER) | ((vmentry_ctl & VM_ENTRY_IA32E_MODE) << 1);

I have to admit that - despite the BUILD_BUG_ON() - I dislike the
literal 1 here, which would better be
(_EFER_LMA - _VM_ENTRY_IA32E_MODE), albeit the latter doesn't
exist, so perhaps

efer = vmr(GUEST_EFER) | ((vmentry_ctl & VM_ENTRY_IA32E_MODE) * (EFER_LMA / VM_ENTRY_IA32E_MODE));

or the same expressed through MASK_EXTR() / MASK_INSR()? But
it's the VMX maintainers to judge anyway.

Jan



Re: [Xen-devel] [PATCH] xen/pvh: Indicate XENFEAT_linux_rsdp_unrestricted to Xen

2018-04-10 Thread Wei Liu
On Mon, Apr 09, 2018 at 02:51:44PM -0400, Boris Ostrovsky wrote:
> Pre-4.17 kernels ignored start_info's rsdp_paddr pointer and instead
> relied on finding RSDP in standard location in BIOS RO memory. This
> has worked since that's where Xen used to place it.
> 
> However, with recent Xen change (commit 4a5733771e6f ("libxl: put RSDP
> for PVH guest near 4GB")) it prefers to keep RSDP at a "non-standard"
> address. Even though as of commit b17d9d1df3c3 ("x86/xen: Add pvh
> specific rsdp address retrieval function") Linux is able to find RSDP,
> for back-compatibility reasons we need to indicate to Xen that we can
> handle this, an we do so by setting XENFEAT_linux_rsdp_unrestricted
> flag in ELF notes.
> 
> (Also take this opportunity and sync features.h header file with Xen)
> 
> Signed-off-by: Boris Ostrovsky 

Reviewed-by: Wei Liu 


Re: [Xen-devel] [PATCH governance.git] Make Security Policy Doc ready to become a CNA

2018-04-10 Thread Juergen Gross
On 09/04/18 17:02, Lars Kurth wrote:
> Note: this time with html disabled
> 
> To become a CNA, we need to more clearly specify the scope of
> security support. This change updates the document and points
> to SUPPORT.md and pages generated from SUPPORT.md
>  
> Also fixed a typo in the following paragraph.
>  
> Signed-off-by: Lars Kurth 
> ---
> security-policy.pandoc | 12 ++--
> 1 file changed, 10 insertions(+), 2 deletions(-)
>  
> diff --git a/security-policy.pandoc b/security-policy.pandoc
> index 5783183..6796220 100644
> --- a/security-policy.pandoc
> +++ b/security-policy.pandoc
> @@ -19,7 +19,15 @@ Scope of this process
>  This process primarily covers the [Xen Hypervisor
> Project](index.php?option=com_content=article=82:xen-hypervisor=80:developers=484).
> -Vulnerabilties reported against other Xen Project teams will be handled on a
> +Specific information about features with security support can be found in
> +
> +1.  
> [SUPPORT.md](http://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=SUPPORT.md)
> +    in the releases' tar ball and its xen.git tree and on
> +    [web pages generated from the SUPPORT.md 
> file](http://xenbits.xenproject.org/docs/support/)
> +2.  For releases that do not contain SUPPORT.md, this information can be 
> found
> +    on the [Release Feature wiki 
> page](https://wiki.xenproject.org/wiki/Xen_Project_Release_Features)
> +
> +Vulnerabilities reported against other Xen Project teams will be handled on a
> best effort basis by the relevant Project Lead together with the Security
> Response Team.
> @@ -401,7 +409,7 @@ Change History
> --
>  
> -
> +-   **v3.18 April 9th 2017:** Added reference to SUPPORT.md

 ^ 2018?

Juergen

