[Xen-devel] [qemu-mainline test] 148483: regressions - FAIL

2020-03-13 Thread osstest service owner
flight 148483 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/148483/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 10 debian-hvm-install 
fail REGR. vs. 144861
 test-amd64-amd64-xl-qemuu-win7-amd64 10 windows-install  fail REGR. vs. 144861
 test-amd64-i386-xl-qemuu-debianhvm-amd64 10 debian-hvm-install fail REGR. vs. 
144861
 test-amd64-i386-qemuu-rhel6hvm-intel 10 redhat-install   fail REGR. vs. 144861
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 debian-hvm-install fail 
REGR. vs. 144861
 test-amd64-i386-freebsd10-i386 11 guest-startfail REGR. vs. 144861
 test-amd64-amd64-xl-qemuu-ovmf-amd64 10 debian-hvm-install fail REGR. vs. 
144861
 test-amd64-amd64-qemuu-nested-amd 10 debian-hvm-install  fail REGR. vs. 144861
 test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow 10 debian-hvm-install fail 
REGR. vs. 144861
 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 10 debian-hvm-install 
fail REGR. vs. 144861
 test-amd64-amd64-xl-qemuu-ws16-amd64 10 windows-install  fail REGR. vs. 144861
 test-amd64-i386-xl-qemuu-debianhvm-i386-xsm 10 debian-hvm-install fail REGR. 
vs. 144861
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 debian-hvm-install fail 
REGR. vs. 144861
 test-amd64-amd64-xl-qemuu-debianhvm-i386-xsm 10 debian-hvm-install fail REGR. 
vs. 144861
 test-amd64-amd64-xl-qemuu-debianhvm-amd64 10 debian-hvm-install fail REGR. vs. 
144861
 test-amd64-i386-qemuu-rhel6hvm-amd 10 redhat-install fail REGR. vs. 144861
 test-amd64-i386-xl-qemuu-ws16-amd64 10 windows-install   fail REGR. vs. 144861
 test-amd64-amd64-qemuu-nested-intel 10 debian-hvm-install fail REGR. vs. 144861
 test-amd64-i386-xl-qemuu-ovmf-amd64 10 debian-hvm-install fail REGR. vs. 144861
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-shadow 10 debian-hvm-install fail 
REGR. vs. 144861
 test-amd64-amd64-libvirt-xsm 12 guest-start  fail REGR. vs. 144861
 test-amd64-i386-freebsd10-amd64 11 guest-start   fail REGR. vs. 144861
 test-amd64-i386-xl-qemuu-win7-amd64 10 windows-install   fail REGR. vs. 144861
 test-amd64-amd64-libvirt-pair 21 guest-start/debian  fail REGR. vs. 144861
 test-amd64-i386-libvirt-pair 21 guest-start/debian   fail REGR. vs. 144861
 test-amd64-i386-libvirt-xsm  12 guest-start  fail REGR. vs. 144861
 test-amd64-i386-libvirt  12 guest-start  fail REGR. vs. 144861
 test-amd64-amd64-libvirt 12 guest-start  fail REGR. vs. 144861
 test-arm64-arm64-libvirt-xsm 12 guest-start  fail REGR. vs. 144861
 test-armhf-armhf-libvirt 12 guest-start  fail REGR. vs. 144861
 test-amd64-amd64-libvirt-vhd 10 debian-di-installfail REGR. vs. 144861
 test-amd64-amd64-xl-qcow210 debian-di-installfail REGR. vs. 144861
 test-armhf-armhf-xl-vhd  10 debian-di-installfail REGR. vs. 144861
 test-armhf-armhf-libvirt-raw 10 debian-di-installfail REGR. vs. 144861

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-rtds 16 guest-localmigrate   fail REGR. vs. 144861
 test-armhf-armhf-xl-rtds16 guest-start/debian.repeat fail REGR. vs. 144861

Tests which did not succeed, but are not blocking:
 test-amd64-i386-xl-pvshim12 guest-start  fail   never pass
 test-arm64-arm64-xl-seattle  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  14 saverestore-support-check

[Xen-devel] [linux-5.4 test] 148499: regressions - FAIL

2020-03-13 Thread osstest service owner
flight 148499 linux-5.4 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/148499/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-qemuu-nested-intel 17 debian-hvm-install/l1/l2 fail REGR. vs. 
146121

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-rtds 18 guest-localmigrate/x10   fail REGR. vs. 146121
 test-armhf-armhf-xl-rtds16 guest-start/debian.repeat fail REGR. vs. 146121

Tests which did not succeed, but are not blocking:
 test-amd64-i386-xl-pvshim12 guest-start  fail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-arm64-arm64-xl-thunderx 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop fail never pass
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop  fail never pass
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stop fail never pass
 test-armhf-armhf-xl-credit1  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop fail never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop  fail never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop  fail never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stop fail never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop  fail never pass

version targeted for testing:
 linux18fe53f6dfbc5ad4ff2164bff841b56d61b22720
baseline version:
 linux122179cb7d648a6f36b20dd6bf34f953cb384c30

Last test of basis   146121  2020-01-15 17:42:04 Z   58 days
Failing since146178  2020-01-17 02:59:07 Z   57 days   79 attempts
Testing same since   

[Xen-devel] [ovmf test] 148507: all pass - PUSHED

2020-03-13 Thread osstest service owner
flight 148507 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/148507/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf d42fdd6f8384bb4681d93e4a25d8f57db1e63adb
baseline version:
 ovmf 5e75c4d1fe4fd641abc9c15404e65a1dffe70e3e

Last test of basis   148461  2020-03-12 00:09:49 Z2 days
Testing same since   148507  2020-03-13 07:03:56 Z0 days1 attempts


People who touched revisions under test:
  Bob Feng 
  Daniel Schaefer 
  Daniel Schaefer 
  Fan, ZhijuX 
  Laszlo Ersek 
  Leif Lindholm 
  Nickle Wang 
  Zhiju.Fan 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/osstest/ovmf.git
   5e75c4d1fe..d42fdd6f83  d42fdd6f8384bb4681d93e4a25d8f57db1e63adb -> 
xen-tested-master

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [libvirt test] 148503: regressions - FAIL

2020-03-13 Thread osstest service owner
flight 148503 libvirt real [real]
http://logs.test-lab.xenproject.org/osstest/logs/148503/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64-libvirt   6 libvirt-buildfail REGR. vs. 146182
 build-i386-libvirt6 libvirt-buildfail REGR. vs. 146182
 build-arm64-libvirt   6 libvirt-buildfail REGR. vs. 146182
 build-armhf-libvirt   6 libvirt-buildfail REGR. vs. 146182

Tests which did not succeed, but are not blocking:
 test-amd64-i386-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-vhd  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt   1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-arm64-arm64-libvirt  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-arm64-arm64-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-xsm  1 build-check(1)   blocked  n/a

version targeted for testing:
 libvirt  52532073d80776aa80257d5a1509524228da228d
baseline version:
 libvirt  a1cd25b919509be2645dbe6f952d5263e0d4e4e5

Last test of basis   146182  2020-01-17 06:00:23 Z   56 days
Failing since146211  2020-01-18 04:18:52 Z   55 days   52 attempts
Testing same since   148503  2020-03-13 04:30:25 Z0 days1 attempts


People who touched revisions under test:
  Andrea Bolognani 
  Arnaud Patard 
  Boris Fiuczynski 
  Christian Ehrhardt 
  Collin Walling 
  Daniel Henrique Barboza 
  Daniel P. Berrangé 
  Daniel Veillard 
  Dario Faggioli 
  Erik Skultety 
  Gaurav Agrawal 
  Han Han 
  Jim Fehlig 
  Jiri Denemark 
  Jonathon Jongsma 
  Julio Faracco 
  Ján Tomko 
  Laine Stump 
  Lin Ma 
  Marek Marczykowski-Górecki 
  Michal Privoznik 
  Nikolay Shirokovskiy 
  Pavel Hrdina 
  Pavel Mores 
  Peter Krempa 
  Richard W.M. Jones 
  Rikard Falkeborn 
  Ryan Moeller 
  Sahid Orentino Ferdjaoui 
  Stefan Berger 
  Stefan Berger 
  Stefan Hajnoczi 
  Thomas Huth 
  Your Name 
  zhenwei pi 
  Zhimin Feng 

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-arm64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  fail
 build-arm64-libvirt  fail
 build-armhf-libvirt  fail
 build-i386-libvirt   fail
 build-amd64-pvopspass
 build-arm64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm   blocked 
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsmblocked 
 test-amd64-amd64-libvirt-xsm blocked 
 test-arm64-arm64-libvirt-xsm blocked 
 test-amd64-i386-libvirt-xsm  blocked 
 test-amd64-amd64-libvirt blocked 
 test-arm64-arm64-libvirt blocked 
 test-armhf-armhf-libvirt blocked 
 test-amd64-i386-libvirt  blocked 
 test-amd64-amd64-libvirt-pairblocked 
 test-amd64-i386-libvirt-pair blocked 
 test-arm64-arm64-libvirt-qcow2   blocked 
 test-armhf-armhf-libvirt-raw blocked 
 test-amd64-amd64-libvirt-vhd blocked 



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: 

[Xen-devel] [xen-unstable test] 148479: regressions - FAIL

2020-03-13 Thread osstest service owner
flight 148479 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/148479/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-qemut-rhel6hvm-amd 12 guest-start/redhat.repeat fail REGR. vs. 
148098
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 10 debian-hvm-install 
fail REGR. vs. 148098

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-xl-rtds   16 guest-start/debian.repeat fail blocked in 148098
 test-amd64-amd64-xl-rtds 18 guest-localmigrate/x10   fail  like 148098
 test-amd64-amd64-qemuu-nested-intel 17 debian-hvm-install/l1/l2 fail like 
148098
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 148098
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 148098
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 148098
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 148098
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stopfail like 148098
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 148098
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stopfail like 148098
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stopfail like 148098
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop fail like 148098
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-i386-xl-pvshim12 guest-start  fail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 14 saverestore-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop  fail never pass

version targeted for testing:
 xen  a9b6dacf88fe99fbb69a2ee505833851ffdc9cec
baseline version:
 xen  0d99c909d7e1cbe69329a00f7772946f10a7865b

Last test of basis   148098  

Re: [Xen-devel] [XEN PATCH v3 2/2] xen/arm: Configure early printk via Kconfig

2020-03-13 Thread Julien Grall
On Fri, 13 Mar 2020 at 23:19, Julien Grall  wrote:
>
> On Fri, 13 Mar 2020 at 23:12, Stefano Stabellini  
> wrote:
> >
> > On Mon, 9 Mar 2020, Anthony PERARD wrote:
> > > At the moment, early printk can only be configured on the make command
> > > line. It is not very handy because a user has to remove the option
> > > everytime it is using another command other than compiling the
> > > hypervisor.
> > >
> > > Furthermore, early printk is one of the few odds one that are not
> > > using Kconfig.
> > >
> > > So this is about time to move it to Kconfig.
> > >
> > > The new kconfigs options allow a user to eather select a UART driver
> > > to use at boot time, and set the parameters, or it is still possible
> > > to select a platform which will set the parameters.
> > >
> > > If CONFIG_EARLY_PRINTK is present in the environment or on the make
> > > command line, make will return an error.
> > >
> > > Signed-off-by: Julien Grall 
> > > Signed-off-by: Anthony PERARD 
> > >
> > > ---
> > >
> > > Original patch:
> > > [PATCH for-4.13] xen/arm: Add Skeleton for using configuring early 
> > > printk using Kconfig
> > > <20190913103953.8182-1-julien.gr...@arm.com>
> > > ---
> > >
> > > Notes:
> > > v3:
> > > - rename EARLY_PRINK to CONFIG_EARLY_PRINTK in makefile here (which
> > >   select which object to build).
> > > - rename EARLY_UART_BAUD_RATE to EARLY_UART_PL011_BAUD_RATE
> > > - typos
> > > - drop the list of aliases in early-printk.txt. Kconfig choice menu
> > >   should be enough.
> > > - reword early-printk.txt.
> > > - rework how EARLY_PRINTK is set to Y
> > >   and use that instead of a list of all EARLY_UART_*
> > > - Add a check to ask user to use Kconfig to set early printk.
> > > - rework the possible choice to have all uart driver and platform
> > >   specific option together.
> > > - have added or reword prompt and help messages of the different
> > >   options. The platform specific option don't have extended help, the
> > >   prompt is probably enough.
> > >   (The non-platform specific options have the help message that Julien
> > >   have written in the first version.)
> > > - have made EARLY_UART_INIT dependent on the value of
> > >   EARLY_UART_PL011_BAUD_RATE so that there is no need to expose _INIT 
> > > to
> > >   users.
> > >
> >
> > The patch is fine by me. I only have one very minor comment below.
> >
> >
> > > + config EARLY_UART_CHOICE_CADENCE
> > > + select EARLY_UART_CADENCE
> > > + depends on ARM_64
> > > + bool "Early printk via Cadence UART"
> > > + help
> > > + Say Y here if you wish the early printk to direct 
> > > their
> > > + output to a Cadence UART. You can use this option to
> > > + provide the parameters for the Cadence UART rather 
> > > than
> > > + selecting one of the platform specific options 
> > > below if
> > > + you know the parameters for the port.
> > > +
> > > + This option is preferred over the platform specific
> > > + options; the platform specific options are 
> > > deprecated
> > > + and will soon be removed.
> >
> > [...]
> >
> > > + config EARLY_PRINTK_ZYNQMP
> > > + bool "Early printk with Cadence UART for Xilinx ZynqMP SoCs"
> > > + select EARLY_UART_CADENCE
> > > + depends on ARM_64
> > > +endchoice
> >
> > [...]
> >
> > > +config EARLY_UART_BASE_ADDRESS
> > > + depends on EARLY_PRINTK
> > > + hex "Early printk, physical base address of debug UART"
> > > + default 0xF040AB00 if EARLY_PRINTK_BRCM
> > > + default 0x4806A000 if EARLY_PRINTK_DRA7
> > > + default 0x1c09 if EARLY_PRINTK_FASTMODEL
> > > + default 0x12c2 if EARLY_PRINTK_EXYNOS5250
> > > + default 0xfff32000 if EARLY_PRINTK_HIKEY960
> > > + default 0x7ff8 if EARLY_PRINTK_JUNO
> > > + default 0xe6e6 if EARLY_PRINTK_LAGER
> > > + default 0xfff36000 if EARLY_PRINTK_MIDWAY
> > > + default 0xd0012000 if EARLY_PRINTK_MVEBU
> > > + default 0x4802 if EARLY_PRINTK_OMAP5432
> > > + default 0xe6e88000 if EARLY_PRINTK_RCAR3
> > > + default 0xe101 if EARLY_PRINTK_SEATTLE
> > > + default 0x01c28000 if EARLY_PRINTK_SUN6I
> > > + default 0x01c28000 if EARLY_PRINTK_SUN7I
> > > + default 0x87e02400 if EARLY_PRINTK_THUNDERX
> > > + default 0x1c09 if EARLY_PRINTK_VEXPRESS
> > > + default 0x1c021000 if EARLY_PRINTK_XGENE_MCDIVITT
> > > + default 0x1c02 if EARLY_PRINTK_XGENE_STORM
> > > + default 0xff00 if EARLY_PRINTK_ZYNQMP
> >
> > Today we only have one board with CADENCE UART which is ZynqMP. However,
> > only if EARLY_PRINTK_ZYNQMP is selected the BASE_ADDRESS is default to
> > 0xff00.
> >
> > Ideally, BASE_ADDRESS would default to 0xff00 

Re: [Xen-devel] [XEN PATCH v3 2/2] xen/arm: Configure early printk via Kconfig

2020-03-13 Thread Julien Grall
On Fri, 13 Mar 2020 at 23:12, Stefano Stabellini  wrote:
>
> On Mon, 9 Mar 2020, Anthony PERARD wrote:
> > At the moment, early printk can only be configured on the make command
> > line. It is not very handy because a user has to remove the option
> > everytime it is using another command other than compiling the
> > hypervisor.
> >
> > Furthermore, early printk is one of the few odds one that are not
> > using Kconfig.
> >
> > So this is about time to move it to Kconfig.
> >
> > The new kconfigs options allow a user to eather select a UART driver
> > to use at boot time, and set the parameters, or it is still possible
> > to select a platform which will set the parameters.
> >
> > If CONFIG_EARLY_PRINTK is present in the environment or on the make
> > command line, make will return an error.
> >
> > Signed-off-by: Julien Grall 
> > Signed-off-by: Anthony PERARD 
> >
> > ---
> >
> > Original patch:
> > [PATCH for-4.13] xen/arm: Add Skeleton for using configuring early 
> > printk using Kconfig
> > <20190913103953.8182-1-julien.gr...@arm.com>
> > ---
> >
> > Notes:
> > v3:
> > - rename EARLY_PRINK to CONFIG_EARLY_PRINTK in makefile here (which
> >   select which object to build).
> > - rename EARLY_UART_BAUD_RATE to EARLY_UART_PL011_BAUD_RATE
> > - typos
> > - drop the list of aliases in early-printk.txt. Kconfig choice menu
> >   should be enough.
> > - reword early-printk.txt.
> > - rework how EARLY_PRINTK is set to Y
> >   and use that instead of a list of all EARLY_UART_*
> > - Add a check to ask user to use Kconfig to set early printk.
> > - rework the possible choice to have all uart driver and platform
> >   specific option together.
> > - have added or reword prompt and help messages of the different
> >   options. The platform specific option don't have extended help, the
> >   prompt is probably enough.
> >   (The non-platform specific options have the help message that Julien
> >   have written in the first version.)
> > - have made EARLY_UART_INIT dependent on the value of
> >   EARLY_UART_PL011_BAUD_RATE so that there is no need to expose _INIT to
> >   users.
> >
>
> The patch is fine by me. I only have one very minor comment below.
>
>
> > + config EARLY_UART_CHOICE_CADENCE
> > + select EARLY_UART_CADENCE
> > + depends on ARM_64
> > + bool "Early printk via Cadence UART"
> > + help
> > + Say Y here if you wish the early printk to direct 
> > their
> > + output to a Cadence UART. You can use this option to
> > + provide the parameters for the Cadence UART rather 
> > than
> > + selecting one of the platform specific options below 
> > if
> > + you know the parameters for the port.
> > +
> > + This option is preferred over the platform specific
> > + options; the platform specific options are deprecated
> > + and will soon be removed.
>
> [...]
>
> > + config EARLY_PRINTK_ZYNQMP
> > + bool "Early printk with Cadence UART for Xilinx ZynqMP SoCs"
> > + select EARLY_UART_CADENCE
> > + depends on ARM_64
> > +endchoice
>
> [...]
>
> > +config EARLY_UART_BASE_ADDRESS
> > + depends on EARLY_PRINTK
> > + hex "Early printk, physical base address of debug UART"
> > + default 0xF040AB00 if EARLY_PRINTK_BRCM
> > + default 0x4806A000 if EARLY_PRINTK_DRA7
> > + default 0x1c09 if EARLY_PRINTK_FASTMODEL
> > + default 0x12c2 if EARLY_PRINTK_EXYNOS5250
> > + default 0xfff32000 if EARLY_PRINTK_HIKEY960
> > + default 0x7ff8 if EARLY_PRINTK_JUNO
> > + default 0xe6e6 if EARLY_PRINTK_LAGER
> > + default 0xfff36000 if EARLY_PRINTK_MIDWAY
> > + default 0xd0012000 if EARLY_PRINTK_MVEBU
> > + default 0x4802 if EARLY_PRINTK_OMAP5432
> > + default 0xe6e88000 if EARLY_PRINTK_RCAR3
> > + default 0xe101 if EARLY_PRINTK_SEATTLE
> > + default 0x01c28000 if EARLY_PRINTK_SUN6I
> > + default 0x01c28000 if EARLY_PRINTK_SUN7I
> > + default 0x87e02400 if EARLY_PRINTK_THUNDERX
> > + default 0x1c09 if EARLY_PRINTK_VEXPRESS
> > + default 0x1c021000 if EARLY_PRINTK_XGENE_MCDIVITT
> > + default 0x1c02 if EARLY_PRINTK_XGENE_STORM
> > + default 0xff00 if EARLY_PRINTK_ZYNQMP
>
> Today we only have one board with CADENCE UART which is ZynqMP. However,
> only if EARLY_PRINTK_ZYNQMP is selected the BASE_ADDRESS is default to
> 0xff00.
>
> Ideally, BASE_ADDRESS would default to 0xff00 if EARLY_PRINTK_ZYNQMP
> or if EARLY_UART_CADENCE. (There is one more similar example which is
> EARLY_UART_EXYNOS4210.)

As you say *today*. How about in the future? There are no promise any
platform using cadence UART will be wired at the same address.
If you specify a default address, then the 

Re: [Xen-devel] [XEN PATCH v3 2/2] xen/arm: Configure early printk via Kconfig

2020-03-13 Thread Stefano Stabellini
On Fri, 13 Mar 2020, Stefano Stabellini wrote:
> On Mon, 9 Mar 2020, Anthony PERARD wrote:
> > At the moment, early printk can only be configured on the make command
> > line. It is not very handy because a user has to remove the option
> > everytime it is using another command other than compiling the
> > hypervisor.
> > 
> > Furthermore, early printk is one of the few odds one that are not
> > using Kconfig.
> > 
> > So this is about time to move it to Kconfig.
> > 
> > The new kconfigs options allow a user to eather select a UART driver
> > to use at boot time, and set the parameters, or it is still possible
> > to select a platform which will set the parameters.
> > 
> > If CONFIG_EARLY_PRINTK is present in the environment or on the make
> > command line, make will return an error.
> > 
> > Signed-off-by: Julien Grall 
> > Signed-off-by: Anthony PERARD 
> > 
> > ---
> > 
> > Original patch:
> > [PATCH for-4.13] xen/arm: Add Skeleton for using configuring early 
> > printk using Kconfig
> > <20190913103953.8182-1-julien.gr...@arm.com>
> > ---
> > 
> > Notes:
> > v3:
> > - rename EARLY_PRINK to CONFIG_EARLY_PRINTK in makefile here (which
> >   select which object to build).
> > - rename EARLY_UART_BAUD_RATE to EARLY_UART_PL011_BAUD_RATE
> > - typos
> > - drop the list of aliases in early-printk.txt. Kconfig choice menu
> >   should be enough.
> > - reword early-printk.txt.
> > - rework how EARLY_PRINTK is set to Y
> >   and use that instead of a list of all EARLY_UART_*
> > - Add a check to ask user to use Kconfig to set early printk.
> > - rework the possible choice to have all uart driver and platform
> >   specific option together.
> > - have added or reword prompt and help messages of the different
> >   options. The platform specific option don't have extended help, the
> >   prompt is probably enough.
> >   (The non-platform specific options have the help message that Julien
> >   have written in the first version.)
> > - have made EARLY_UART_INIT dependent on the value of
> >   EARLY_UART_PL011_BAUD_RATE so that there is no need to expose _INIT to
> >   users.
> > 
> 
> The patch is fine by me. I only have one very minor comment below.

I forgot to add

Tested-by: Stefano Stabellini 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [XEN PATCH v3 2/2] xen/arm: Configure early printk via Kconfig

2020-03-13 Thread Stefano Stabellini
On Mon, 9 Mar 2020, Anthony PERARD wrote:
> At the moment, early printk can only be configured on the make command
> line. It is not very handy because a user has to remove the option
> everytime it is using another command other than compiling the
> hypervisor.
> 
> Furthermore, early printk is one of the few odds one that are not
> using Kconfig.
> 
> So this is about time to move it to Kconfig.
> 
> The new kconfigs options allow a user to eather select a UART driver
> to use at boot time, and set the parameters, or it is still possible
> to select a platform which will set the parameters.
> 
> If CONFIG_EARLY_PRINTK is present in the environment or on the make
> command line, make will return an error.
> 
> Signed-off-by: Julien Grall 
> Signed-off-by: Anthony PERARD 
> 
> ---
> 
> Original patch:
> [PATCH for-4.13] xen/arm: Add Skeleton for using configuring early printk 
> using Kconfig
> <20190913103953.8182-1-julien.gr...@arm.com>
> ---
> 
> Notes:
> v3:
> - rename EARLY_PRINK to CONFIG_EARLY_PRINTK in makefile here (which
>   select which object to build).
> - rename EARLY_UART_BAUD_RATE to EARLY_UART_PL011_BAUD_RATE
> - typos
> - drop the list of aliases in early-printk.txt. Kconfig choice menu
>   should be enough.
> - reword early-printk.txt.
> - rework how EARLY_PRINTK is set to Y
>   and use that instead of a list of all EARLY_UART_*
> - Add a check to ask user to use Kconfig to set early printk.
> - rework the possible choice to have all uart driver and platform
>   specific option together.
> - have added or reword prompt and help messages of the different
>   options. The platform specific option don't have extended help, the
>   prompt is probably enough.
>   (The non-platform specific options have the help message that Julien
>   have written in the first version.)
> - have made EARLY_UART_INIT dependent on the value of
>   EARLY_UART_PL011_BAUD_RATE so that there is no need to expose _INIT to
>   users.
> 

The patch is fine by me. I only have one very minor comment below.


> + config EARLY_UART_CHOICE_CADENCE
> + select EARLY_UART_CADENCE
> + depends on ARM_64
> + bool "Early printk via Cadence UART"
> + help
> + Say Y here if you wish the early printk to direct their
> + output to a Cadence UART. You can use this option to
> + provide the parameters for the Cadence UART rather than
> + selecting one of the platform specific options below if
> + you know the parameters for the port.
> +
> + This option is preferred over the platform specific
> + options; the platform specific options are deprecated
> + and will soon be removed.

[...]

> + config EARLY_PRINTK_ZYNQMP
> + bool "Early printk with Cadence UART for Xilinx ZynqMP SoCs"
> + select EARLY_UART_CADENCE
> + depends on ARM_64
> +endchoice

[...]

> +config EARLY_UART_BASE_ADDRESS
> + depends on EARLY_PRINTK
> + hex "Early printk, physical base address of debug UART"
> + default 0xF040AB00 if EARLY_PRINTK_BRCM
> + default 0x4806A000 if EARLY_PRINTK_DRA7
> + default 0x1c09 if EARLY_PRINTK_FASTMODEL
> + default 0x12c2 if EARLY_PRINTK_EXYNOS5250
> + default 0xfff32000 if EARLY_PRINTK_HIKEY960
> + default 0x7ff8 if EARLY_PRINTK_JUNO
> + default 0xe6e6 if EARLY_PRINTK_LAGER
> + default 0xfff36000 if EARLY_PRINTK_MIDWAY
> + default 0xd0012000 if EARLY_PRINTK_MVEBU
> + default 0x4802 if EARLY_PRINTK_OMAP5432
> + default 0xe6e88000 if EARLY_PRINTK_RCAR3
> + default 0xe101 if EARLY_PRINTK_SEATTLE
> + default 0x01c28000 if EARLY_PRINTK_SUN6I
> + default 0x01c28000 if EARLY_PRINTK_SUN7I
> + default 0x87e02400 if EARLY_PRINTK_THUNDERX
> + default 0x1c09 if EARLY_PRINTK_VEXPRESS
> + default 0x1c021000 if EARLY_PRINTK_XGENE_MCDIVITT
> + default 0x1c02 if EARLY_PRINTK_XGENE_STORM
> + default 0xff00 if EARLY_PRINTK_ZYNQMP

Today we only have one board with CADENCE UART which is ZynqMP. However,
only if EARLY_PRINTK_ZYNQMP is selected the BASE_ADDRESS is default to
0xff00.

Ideally, BASE_ADDRESS would default to 0xff00 if EARLY_PRINTK_ZYNQMP
or if EARLY_UART_CADENCE. (There is one more similar example which is
EARLY_UART_EXYNOS4210.)

I don't know if it is worth optimizing, I'll let you and Julien decide.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [XEN PATCH v3 1/2] xen/arm: Rename all early printk macro

2020-03-13 Thread Stefano Stabellini
On Thu, 12 Mar 2020, Julien Grall wrote:
> On 11/03/2020 14:46, Anthony PERARD wrote:
> > On Wed, Mar 11, 2020 at 01:57:37PM +, Julien Grall wrote:
> > > Hi Anthony,
> > > 
> > > On 09/03/2020 17:45, Anthony PERARD wrote:
> > > > diff --git a/xen/arch/arm/arm32/head.S b/xen/arch/arm/arm32/head.S
> > > > index e9d356f05c2b..2b593c5ef99a 100644
> > > > --- a/xen/arch/arm/arm32/head.S
> > > > +++ b/xen/arch/arm/arm32/head.S
> > > > @@ -36,8 +36,8 @@
> > > >#define XEN_FIRST_SLOT  first_table_offset(XEN_VIRT_START)
> > > >#define XEN_SECOND_SLOT second_table_offset(XEN_VIRT_START)
> > > > -#if (defined (CONFIG_EARLY_PRINTK)) && (defined (EARLY_PRINTK_INC))
> > > > -#include EARLY_PRINTK_INC
> > > > +#if (defined (CONFIG_EARLY_PRINTK)) && (defined
> > > > (CONFIG_EARLY_PRINTK_INC))
> > > 
> > > NIT: I would also take the opportunity to clean-up the line by remove the
> > > extra () and the space before (. Something like:
> > > 
> > > #if define(CONFIG_EARLY_PRINTK) && defined(CONFIG_EARLY_PRINTK_INC)
> > > 
> > > [...]
> > > 
> > > > diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
> > > > index e5015f93a2d8..4d45ea3dac3c 100644
> > > > --- a/xen/arch/arm/arm64/head.S
> > > > +++ b/xen/arch/arm/arm64/head.S
> > > > @@ -45,8 +45,8 @@
> > > >#define __HEAD_FLAGS((__HEAD_FLAG_PAGE_SIZE << 1) | \
> > > > (__HEAD_FLAG_PHYS_BASE << 3))
> > > > -#if (defined (CONFIG_EARLY_PRINTK)) && (defined (EARLY_PRINTK_INC))
> > > > -#include EARLY_PRINTK_INC
> > > > +#if (defined (CONFIG_EARLY_PRINTK)) && (defined
> > > > (CONFIG_EARLY_PRINTK_INC))
> > > 
> > > Same here.
> > 
> > Those clean-up sounds good.
> 
> I will give a couple of days so Stefano can have an opportunity to comment. If
> I don't hear anything by Monday, I will commit it.

It's fine by me. Also:

Tested-by: Stefano Stabellini 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v9 02/10] scripts: Coccinelle script to use ERRP_AUTO_PROPAGATE()

2020-03-13 Thread Eric Blake

On 3/13/20 4:54 PM, Markus Armbruster wrote:



I append my hacked up version of auto-propagated-errp.cocci.  It
produces the same patch as yours for the complete tree.



// Use ERRP_AUTO_PROPAGATE (see include/qapi/error.h)
//



//
// Usage example:
// spatch --sp-file scripts/coccinelle/auto-propagated-errp.cocci \
//  --macro-file scripts/cocci-macro-file.h --in-place \
//  --no-show-diff --max-width 80 FILES...
//
// Note: --max-width 80 is needed because coccinelle default is less
// than 80, and without this parameter coccinelle may reindent some
// lines which fit into 80 characters but not to coccinelle default,
// which in turn produces extra patch hunks for no reason.


Do we really need this note?  And/or should we update other Coccinelle 
script examples to mention --max-width?




// Switch unusual Error ** parameter names to errp
// (this is necessary to use ERRP_AUTO_PROPAGATE).
//
// Disable optional_qualifier to skip functions with
// "Error *const *errp" parameter.
//
// Skip functions with "assert(_errp && *_errp)" statement, because
// that signals unusual semantics, and the parameter name may well
// serve a purpose. (like nbd_iter_channel_error()).
//
// Skip util/error.c to not touch, for example, error_propagate() and
// error_propagate_prepend().
@ depends on !(file in "util/error.c") disable optional_qualifier@


The comments are definitely helpful.


identifier fn;
identifier _errp != errp;
@@

  fn(...,
-   Error **_errp
+   Error **errp
 ,...)
  {
(
  ... when != assert(_errp && *_errp)
&
  <...
-_errp
+errp
  ...>
)
  }

// Add invocation of ERRP_AUTO_PROPAGATE to errp-functions where
// necessary
//
// Note, that without "when any" the final "..." does not mach
// something matched by previous pattern, i.e. the rule will not match
// double error_prepend in control flow like in
// vfio_set_irq_signaling().


How likely are we that this comment might go stale over time?  But I'm 
not opposed to it.



//
// Note, "exists" says that we want apply rule even if it matches not
// on all possible control flows (otherwise, it will not match


s/matches not on/does not match on/


// standard pattern when error_propagate() call is in if branch).
@ disable optional_qualifier exists@
identifier fn, local_err;
symbol errp;
@@

  fn(..., Error **errp, ...)
  {
+   ERRP_AUTO_PROPAGATE();
 ...  when != ERRP_AUTO_PROPAGATE();
(
(
 error_append_hint(errp, ...);
|
 error_prepend(errp, ...);
|
 error_vprepend(errp, ...);
)
 ... when any
|
 Error *local_err = NULL;
 ...
(
 error_propagate_prepend(errp, local_err, ...);
|
 error_propagate(errp, local_err);
)
 ...
)
  }


// Match functions with propagation of local error to errp.
// We want to refer these functions in several following rules, but I
// don't know a proper way to inherit a function, not just its name
// (to not match another functions with same name in following rules).
// Not-proper way is as follows: rename errp parameter in functions
// header and match it in following rules. Rename it back after all
// transformations.
//
// The simplest case of propagation scheme is single definition of
// local_err with at most one error_propagate_prepend or
// error_propagate on each control-flow. Still, we want to match more
// complex schemes too. We'll warn them with help of further rules.


We'll warn for those with the help of further rules.


@rule1 disable optional_qualifier exists@
identifier fn, local_err;
symbol errp;
@@

  fn(..., Error **
-errp
+
 , ...)
  {
  ...
  Error *local_err = NULL;
  ...
(
  error_propagate_prepend(errp, local_err, ...);
|
  error_propagate(errp, local_err);
)
  ...
  }


// Warn several Error * definitions.


// Warn when there are several Error * definitions.



@check1 disable optional_qualifier exists@
identifier fn, _errp, local_err, local_err2;
position p1, p2;
@@

  fn(..., Error **_errp, ...)
  {
  ...
  Error *local_err = NULL;@p1
  ... when any
  Error *local_err2 = NULL;@p2
  ... when any
  }

@ script:python @
fn << check1.fn;
p1 << check1.p1;
p2 << check1.p2;
@@

print('Warning: function {} has several definitions of '
   'Error * local variable: at {}:{} and then at {}:{}'.format(
   fn, p1[0].file, p1[0].line, p2[0].file, p2[0].line))

// Warn several propagations in control flow.


// Warn when several propagations are in the control flow.


@check2 disable optional_qualifier exists@
identifier fn, _errp;
position p1, p2;
@@

  fn(..., Error **_errp, ...)
  {
  ...
(
  error_propagate_prepend(_errp, ...);@p1
|
  error_propagate(_errp, ...);@p1
)
  ...
(
  error_propagate_prepend(_errp, ...);@p2
|
  error_propagate(_errp, ...);@p2
)
  ... when any
  }

@ script:python @
fn << check2.fn;
p1 << check2.p1;
p2 << check2.p2;
@@

print('Warning: function {} propagates to errp several times in '
   'one control flow: at {}:{} 

Re: [Xen-devel] [PATCH v9 02/10] scripts: Coccinelle script to use ERRP_AUTO_PROPAGATE()

2020-03-13 Thread Markus Armbruster
Vladimir Sementsov-Ogievskiy  writes:

> 13.03.2020 18:42, Markus Armbruster wrote:
>> Vladimir Sementsov-Ogievskiy  writes:
>>
>>> 12.03.2020 19:36, Markus Armbruster wrote:
 I may have a second look tomorrow with fresher eyes, but let's get this
 out now as is.

 Vladimir Sementsov-Ogievskiy  writes:

> Script adds ERRP_AUTO_PROPAGATE macro invocation where appropriate and
> does corresponding changes in code (look for details in
> include/qapi/error.h)
>
> Usage example:
> spatch --sp-file scripts/coccinelle/auto-propagated-errp.cocci \
>--macro-file scripts/cocci-macro-file.h --in-place --no-show-diff \
>--max-width 80 FILES...
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>
> Cc: Eric Blake 
> Cc: Kevin Wolf 
> Cc: Max Reitz 
> Cc: Greg Kurz 
> Cc: Christian Schoenebeck 
> Cc: Stefano Stabellini 
> Cc: Anthony Perard 
> Cc: Paul Durrant 
> Cc: Stefan Hajnoczi 
> Cc: "Philippe Mathieu-Daudé" 
> Cc: Laszlo Ersek 
> Cc: Gerd Hoffmann 
> Cc: Stefan Berger 
> Cc: Markus Armbruster 
> Cc: Michael Roth 
> Cc: qemu-de...@nongnu.org
> Cc: qemu-bl...@nongnu.org
> Cc: xen-devel@lists.xenproject.org
>
>scripts/coccinelle/auto-propagated-errp.cocci | 327 ++
>include/qapi/error.h  |   3 +
>MAINTAINERS   |   1 +
>3 files changed, 331 insertions(+)
>create mode 100644 scripts/coccinelle/auto-propagated-errp.cocci
>
> diff --git a/scripts/coccinelle/auto-propagated-errp.cocci 
> b/scripts/coccinelle/auto-propagated-errp.cocci
> new file mode 100644
> index 00..7dac2dcfa4
> --- /dev/null
> +++ b/scripts/coccinelle/auto-propagated-errp.cocci
> @@ -0,0 +1,327 @@
> +// Use ERRP_AUTO_PROPAGATE (see include/qapi/error.h)
> +//
> +// Copyright (c) 2020 Virtuozzo International GmbH.
> +//
> +// This program is free software; you can redistribute it and/or
> +// modify it under the terms of the GNU General Public License as
> +// published by the Free Software Foundation; either version 2 of the
> +// License, or (at your option) any later version.
> +//
> +// This program is distributed in the hope that it will be useful,
> +// but WITHOUT ANY WARRANTY; without even the implied warranty of
> +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +// GNU General Public License for more details.
> +//
> +// You should have received a copy of the GNU General Public License
> +// along with this program.  If not, see
> +// .
> +//
> +// Usage example:
> +// spatch --sp-file scripts/coccinelle/auto-propagated-errp.cocci \
> +//  --macro-file scripts/cocci-macro-file.h --in-place \
> +//  --no-show-diff --max-width 80 FILES...
> +//
> +// Note: --max-width 80 is needed because coccinelle default is less
> +// than 80, and without this parameter coccinelle may reindent some
> +// lines which fit into 80 characters but not to coccinelle default,
> +// which in turn produces extra patch hunks for no reason.

 This is about unwanted reformatting of parameter lists due to the ___
 chaining hack.  --max-width 80 makes that less likely, but not
 impossible.

 We can search for unwanted reformatting of parameter lists.  I think
 grepping diffs for '^\+.*Error \*\*' should do the trick.  For the whole
 tree, I get one false positive (not a parameter list), and one hit:

   @@ -388,8 +388,10 @@ static void object_post_init_with_type(O
}
}

   -void object_apply_global_props(Object *obj, const GPtrArray *props, 
 Error **errp)
   +void object_apply_global_props(Object *obj, const GPtrArray *props,
   +   Error **errp)
{
   +ERRP_AUTO_PROPAGATE();
int i;

if (!props) {

 Reformatting, but not unwanted.
>>>
>>> Yes, I saw it. This line is 81 character length, so it's OK to fix it in 
>>> one hunk with
>>> ERRP_AUTO_PROPAGATE addition even for non-automatic patch.
>>
>> Agree.
>>

 The --max-width 80 hack is good enough for me.

 It does result in slightly long transformed lines, e.g. this one in
 replication.c:

   @@ -113,7 +113,7 @@ static int replication_open(BlockDriverS
s->mode = REPLICATION_MODE_PRIMARY;
top_id = qemu_opt_get(opts, REPLICATION_TOP_ID);
if (top_id) {
   -error_setg(_err, "The primary side does not 
 support option top-id");
   +error_setg(errp, "The primary side does not support 
 option top-id");
goto fail;

[Xen-devel] [linux-linus test] 148469: regressions - FAIL

2020-03-13 Thread osstest service owner
flight 148469 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/148469/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-qemuu-nested-intel 17 debian-hvm-install/l1/l2 fail REGR. vs. 
133580

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl-rtds 12 guest-start  fail REGR. vs. 133580

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stopfail like 133580
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 133580
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stopfail like 133580
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 133580
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 133580
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 133580
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 133580
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stopfail like 133580
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-i386-xl-pvshim12 guest-start  fail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop  fail never pass
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop  fail never pass

version targeted for testing:
 linuxe6e6ec48dd0fa12e8a2d1ff6b55cd907401bd7fe
baseline version:
 linux736706bee3298208343a76096370e4f6a5c55915

Last test of basis   133580  2019-03-04 19:53:09 Z  374 days
Failing since133605  2019-03-05 20:03:14 Z  373 days  219 attempts
Testing same since   148469  2020-03-12 08:47:43 Z1 days1 attempts


6360 people touched revisions under test,
not listing them all

jobs:
 build-amd64-xsm 

[Xen-devel] [xen-unstable-smoke test] 148522: tolerable all pass - PUSHED

2020-03-13 Thread osstest service owner
flight 148522 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/148522/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  d094e95fb7c61c5f46d8e446b4bdc028438dea1c
baseline version:
 xen  76416c459c6e0b3579c5177df414e0633b8b9565

Last test of basis   148515  2020-03-13 11:01:13 Z0 days
Testing same since   148522  2020-03-13 15:05:36 Z0 days1 attempts


People who touched revisions under test:
  Paweł Marczewski 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   76416c459c..d094e95fb7  d094e95fb7c61c5f46d8e446b4bdc028438dea1c -> smoke

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [RFC PATCH v3 06/12] xen-blkfront: add callbacks for PM suspend and hibernation

2020-03-13 Thread Anchal Agarwal
On Thu, Mar 12, 2020 at 10:04:35AM +0100, Roger Pau Monné wrote:
> CAUTION: This email originated from outside of the organization. Do not click 
> links or open attachments unless you can confirm the sender and know the 
> content is safe.
> 
> 
> 
> On Wed, Mar 11, 2020 at 10:25:15PM +, Agarwal, Anchal wrote:
> > Hi Roger,
> > I am trying to understand your comments on indirect descriptors specially 
> > without polluting the mailing list hence emailing you personally.
> 
> IMO it's better to send to the mailing list. The issues or questions
> you have about indirect descriptors can be helpful to others in the
> future. If there's no confidential information please send to the
> list next time.
> 
> Feel free to forward this reply to the list also.
>
Sure no problem at all.
> > Hope that's ok by you.  Please see my response inline.
> >
> > On Fri, Mar 06, 2020 at 06:40:33PM +, Anchal Agarwal wrote:
> > > On Fri, Feb 21, 2020 at 03:24:45PM +0100, Roger Pau Monné wrote:
> > > > On Fri, Feb 14, 2020 at 11:25:34PM +, Anchal Agarwal wrote:
> > > > >   blkfront_gather_backend_features(info);
> > > > >   /* Reset limits changed by blk_mq_update_nr_hw_queues(). */
> > > > >   blkif_set_queue_limits(info);
> > > > > @@ -2046,6 +2063,9 @@ static int blkif_recover(struct 
> > blkfront_info *info)
> > > > >   kick_pending_request_queues(rinfo);
> > > > >   }
> > > > >
> > > > > + if (frozen)
> > > > > + return 0;
> > > >
> > > > I have to admit my memory is fuzzy here, but don't you need to
> > > > re-queue requests in case the backend has different limits of 
> > indirect
> > > > descriptors per request for example?
> > > >
> > > > Or do we expect that the frontend is always going to be resumed on 
> > the
> > > > same backend, and thus features won't change?
> > > >
> > > So to understand your question better here, AFAIU the  maximum number 
> > of indirect
> > > grefs is fixed by the backend, but the frontend can issue requests 
> > with any
> > > number of indirect segments as long as it's less than the number 
> > provided by
> > > the backend. So by your question you mean this max number of 
> > MAX_INDIRECT_SEGMENTS
> > > 256 on backend can change ?
> >
> > Yes, number of indirect descriptors supported by the backend can
> > change, because you moved to a different backend, or because the
> > maximum supported by the backend has changed. It's also possible to
> > resume on a backend that has no indirect descriptors support at all.
> >
> > AFAIU, the code for requeuing the requests is only for xen suspend/resume. 
> > These request in the queue are
> > same that gets added to queuelist in blkfront_resume. Also, even if 
> > indirect descriptors change on resume,
> > they just need to be broadcasted to frontend and which means we could just 
> > mean that a request can process
> > more data.
> 
> Or less data. You could legitimately migrate from a host that has
> indirect descriptors to one without, in which case requests would need
> to be smaller to fit the ring slots.
> 
> > We do setup indirect descriptors on front end on blkif_recover before 
> > returning and queue limits are
> > setup accordingly.
> > Am I missing anything here?
> 
> Calling blkif_recover should take care of it AFAICT. As it resets the
> queue limits according to the data announced on xenstore.
> 
> I think I got confused, using blkif_recover should be fine, sorry.
> 
Ok. Thanks for confirming. I will fixup other suggestions in the patch and send
out a v4.
> >
> > > > > @@ -2625,6 +2671,62 @@ static void blkif_release(struct gendisk 
> > *disk, fmode_t mode)
> > > > >   mutex_unlock(_mutex);
> > > > >  }
> > > > >
> > > > > +static int blkfront_freeze(struct xenbus_device *dev)
> > > > > +{
> > > > > + unsigned int i;
> > > > > + struct blkfront_info *info = dev_get_drvdata(>dev);
> > > > > + struct blkfront_ring_info *rinfo;
> > > > > + /* This would be reasonable timeout as used in 
> > xenbus_dev_shutdown() */
> > > > > + unsigned int timeout = 5 * HZ;
> > > > > + int err = 0;
> > > > > +
> > > > > + info->connected = BLKIF_STATE_FREEZING;
> > > > > +
> > > > > + blk_mq_freeze_queue(info->rq);
> > > > > + blk_mq_quiesce_queue(info->rq);
> > > >
> > > > Don't you need to also drain the queue and make sure it's empty?
> > > >
> > > blk_mq_freeze_queue and blk_mq_quiesce_queue should take care of 
> > running HW queues synchronously
> > > and making sure all the ongoing dispatches have finished. Did I 
> > understand your question right?
> >
> > Can you please add some check to that end? (ie: that there are no
> > pending requests on any queue?)
> >
> > Well a check to see if there are any unconsumed responses could be done.
> > I haven't come across use case in my testing where this failed but maybe 

[Xen-devel] xenoprof

2020-03-13 Thread Paul Durrant
Hi,

  I'm trying to determine the status of HYPERVISOR_xenoprof_op. The code behind 
it appears to be unmaintained and I cannot find any
support statement for it. Googling around finds some mentions of Xen and 
oprofile but it's not clear whether it works and most
references I find are quite old. Is it time to remove it?

  Paul


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 4/4] x86/APIC: restrict certain messages to BSP

2020-03-13 Thread Andrew Cooper
On 13/03/2020 09:26, Jan Beulich wrote:
> All CPUs get an equal setting of EOI broadcast suppression; no need to
> log one message per CPU, even if it's only in verbose APIC mode.
>
> Only the BSP is eligible to possibly get ExtINT enabled; no need to log
> that it gets disabled on all APs, even if - again - it's only in verbose
> APIC mode.

How true is this in practice?

I know it is certainly the recommended configuration, but interrupt
remapping can definitely point ExtINT at arbitrary CPUs, and IIRC, the
MP spec simply says that "only one CPU should be configured to receive
ExtINT".

I know we definitely have bugs with ExiINT handling, but I haven't had
time to debug further.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v9 02/10] scripts: Coccinelle script to use ERRP_AUTO_PROPAGATE()

2020-03-13 Thread Vladimir Sementsov-Ogievskiy

13.03.2020 18:42, Markus Armbruster wrote:

Vladimir Sementsov-Ogievskiy  writes:


12.03.2020 19:36, Markus Armbruster wrote:

I may have a second look tomorrow with fresher eyes, but let's get this
out now as is.

Vladimir Sementsov-Ogievskiy  writes:


Script adds ERRP_AUTO_PROPAGATE macro invocation where appropriate and
does corresponding changes in code (look for details in
include/qapi/error.h)

Usage example:
spatch --sp-file scripts/coccinelle/auto-propagated-errp.cocci \
   --macro-file scripts/cocci-macro-file.h --in-place --no-show-diff \
   --max-width 80 FILES...

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---

Cc: Eric Blake 
Cc: Kevin Wolf 
Cc: Max Reitz 
Cc: Greg Kurz 
Cc: Christian Schoenebeck 
Cc: Stefano Stabellini 
Cc: Anthony Perard 
Cc: Paul Durrant 
Cc: Stefan Hajnoczi 
Cc: "Philippe Mathieu-Daudé" 
Cc: Laszlo Ersek 
Cc: Gerd Hoffmann 
Cc: Stefan Berger 
Cc: Markus Armbruster 
Cc: Michael Roth 
Cc: qemu-de...@nongnu.org
Cc: qemu-bl...@nongnu.org
Cc: xen-devel@lists.xenproject.org

   scripts/coccinelle/auto-propagated-errp.cocci | 327 ++
   include/qapi/error.h  |   3 +
   MAINTAINERS   |   1 +
   3 files changed, 331 insertions(+)
   create mode 100644 scripts/coccinelle/auto-propagated-errp.cocci

diff --git a/scripts/coccinelle/auto-propagated-errp.cocci 
b/scripts/coccinelle/auto-propagated-errp.cocci
new file mode 100644
index 00..7dac2dcfa4
--- /dev/null
+++ b/scripts/coccinelle/auto-propagated-errp.cocci
@@ -0,0 +1,327 @@
+// Use ERRP_AUTO_PROPAGATE (see include/qapi/error.h)
+//
+// Copyright (c) 2020 Virtuozzo International GmbH.
+//
+// This program is free software; you can redistribute it and/or
+// modify it under the terms of the GNU General Public License as
+// published by the Free Software Foundation; either version 2 of the
+// License, or (at your option) any later version.
+//
+// This program is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+//
+// You should have received a copy of the GNU General Public License
+// along with this program.  If not, see
+// .
+//
+// Usage example:
+// spatch --sp-file scripts/coccinelle/auto-propagated-errp.cocci \
+//  --macro-file scripts/cocci-macro-file.h --in-place \
+//  --no-show-diff --max-width 80 FILES...
+//
+// Note: --max-width 80 is needed because coccinelle default is less
+// than 80, and without this parameter coccinelle may reindent some
+// lines which fit into 80 characters but not to coccinelle default,
+// which in turn produces extra patch hunks for no reason.


This is about unwanted reformatting of parameter lists due to the ___
chaining hack.  --max-width 80 makes that less likely, but not
impossible.

We can search for unwanted reformatting of parameter lists.  I think
grepping diffs for '^\+.*Error \*\*' should do the trick.  For the whole
tree, I get one false positive (not a parameter list), and one hit:

  @@ -388,8 +388,10 @@ static void object_post_init_with_type(O
   }
   }

  -void object_apply_global_props(Object *obj, const GPtrArray *props, 
Error **errp)
  +void object_apply_global_props(Object *obj, const GPtrArray *props,
  +   Error **errp)
   {
  +ERRP_AUTO_PROPAGATE();
   int i;

   if (!props) {

Reformatting, but not unwanted.


Yes, I saw it. This line is 81 character length, so it's OK to fix it in one 
hunk with
ERRP_AUTO_PROPAGATE addition even for non-automatic patch.


Agree.



The --max-width 80 hack is good enough for me.

It does result in slightly long transformed lines, e.g. this one in
replication.c:

  @@ -113,7 +113,7 @@ static int replication_open(BlockDriverS
   s->mode = REPLICATION_MODE_PRIMARY;
   top_id = qemu_opt_get(opts, REPLICATION_TOP_ID);
   if (top_id) {
  -error_setg(_err, "The primary side does not support option 
top-id");
  +error_setg(errp, "The primary side does not support option 
top-id");
   goto fail;
   }
   } else if (!strcmp(mode, "secondary")) {

v8 did break this line (that's how I found it).  However, v9 still
shortens the line, just not below the target.  All your + lines look
quite unlikely to lengthen lines.  Let's not worry about this.


+// Switch unusual Error ** parameter names to errp
+// (this is necessary to use ERRP_AUTO_PROPAGATE).
+//
+// Disable optional_qualifier to skip functions with
+// "Error *const *errp" parameter.
+//
+// Skip functions with "assert(_errp && *_errp)" statement, because
+// that signals unusual semantics, and the parameter name may well
+// serve a purpose. (like nbd_iter_channel_error()).
+//
+// Skip util/error.c to 

Re: [Xen-devel] [linux-linus bisection] complete test-amd64-amd64-qemuu-nested-intel

2020-03-13 Thread Andrew Cooper
On 12/03/2020 23:56, osstest service owner wrote:
> branch xen-unstable
> xenbranch xen-unstable
> job test-amd64-amd64-qemuu-nested-intel
> testid debian-hvm-install/l1/l2
>
> Tree: linux 
> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
> Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git
> Tree: ovmf git://xenbits.xen.org/osstest/ovmf.git
> Tree: qemu git://xenbits.xen.org/qemu-xen-traditional.git
> Tree: qemuu git://xenbits.xen.org/qemu-xen.git
> Tree: seabios git://xenbits.xen.org/osstest/seabios.git
> Tree: xen git://xenbits.xen.org/xen.git
>
> *** Found and reproduced problem changeset ***
>
>   Bug is in tree:  xen git://xenbits.xen.org/xen.git
>   Bug introduced:  f96e1469ad06b61796c60193daaeb9f8a96d7458
>   Bug not present: 0729830cc425a8ff27a3137e87b93768ae3c853c
>   Last fail repro: http://logs.test-lab.xenproject.org/osstest/logs/148496/
>
>
>   commit f96e1469ad06b61796c60193daaeb9f8a96d7458
>   Author: Roger Pau Monné 
>   Date:   Wed Feb 5 13:49:09 2020 +0100
>   
>   x86/vvmx: fix virtual interrupt injection when Ack on exit control is 
> used
>   
>   When doing a virtual vmexit (ie: a vmexit handled by the L1 VMM)
>   interrupts shouldn't be injected using the virtual interrupt delivery
>   mechanism unless the Ack on exit vmexit control bit isn't set in the
>   nested vmcs.
>   
>   Gate the call to nvmx_update_apicv helper on whether the nested vmcs
>   has the Ack on exit bit set in the vmexit control field.
>   
>   Note that this fixes the usage of x2APIC by the L1 VMM, at least when
>   the L1 VMM is Xen.
>   
>   Signed-off-by: Roger Pau Monné 
>   Reviewed-by: Kevin Tian 

Looks like there are further problems here.  I'm afraid I haven't
investigated further, but this also might be the source of the
intermittent problems in staging.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 3/4] x86/APIC: reduce rounding errors in calculations

2020-03-13 Thread Andrew Cooper
On 13/03/2020 09:26, Jan Beulich wrote:
> --- a/xen/arch/x86/apic.c
> +++ b/xen/arch/x86/apic.c
> @@ -1249,17 +1249,16 @@ static int __init calibrate_APIC_clock(v
>  tt2 = apic_read(APIC_TMCCT);
>  t2 = rdtsc_ordered();
>  
> -result = (tt1-tt2)*APIC_DIVISOR/LOOPS;
> +bus_freq = (tt1 - tt2) * APIC_DIVISOR * LOOPS_FRAC;
>  
> -apic_printk(APIC_VERBOSE, ". CPU clock speed is %ld.%04ld MHz.\n",
> -((long)(t2 - t1) / LOOPS) / (100 / HZ),
> -((long)(t2 - t1) / LOOPS) % (100 / HZ));
> +apic_printk(APIC_VERBOSE, ". CPU clock speed is %lu.%04lu MHz.\n",
> +((unsigned long)(t2 - t1) * LOOPS_FRAC) / 100,
> +((unsigned long)(t2 - t1) * LOOPS_FRAC / 100) % 1);

If I'm not mistaken, this expression only works because of the L->R
associativity (given the same precedence of * and /) grouping it as
((t2-t1) * 10)  / 100 as opposed to (t2-t1) * (10 / 100), where the
latter would truncate to 0.  I think some extra brackets would help for
clarity.

That said, what is wrong with keeping the original form?  I realise this
is only for a printk(), but the div instruction can't be shared between
the two halves.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v9 02/10] scripts: Coccinelle script to use ERRP_AUTO_PROPAGATE()

2020-03-13 Thread Markus Armbruster
Vladimir Sementsov-Ogievskiy  writes:

> 12.03.2020 19:36, Markus Armbruster wrote:
>> I may have a second look tomorrow with fresher eyes, but let's get this
>> out now as is.
>>
>> Vladimir Sementsov-Ogievskiy  writes:
>>
>>> Script adds ERRP_AUTO_PROPAGATE macro invocation where appropriate and
>>> does corresponding changes in code (look for details in
>>> include/qapi/error.h)
>>>
>>> Usage example:
>>> spatch --sp-file scripts/coccinelle/auto-propagated-errp.cocci \
>>>   --macro-file scripts/cocci-macro-file.h --in-place --no-show-diff \
>>>   --max-width 80 FILES...
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy 
>>> ---
>>>
>>> Cc: Eric Blake 
>>> Cc: Kevin Wolf 
>>> Cc: Max Reitz 
>>> Cc: Greg Kurz 
>>> Cc: Christian Schoenebeck 
>>> Cc: Stefano Stabellini 
>>> Cc: Anthony Perard 
>>> Cc: Paul Durrant 
>>> Cc: Stefan Hajnoczi 
>>> Cc: "Philippe Mathieu-Daudé" 
>>> Cc: Laszlo Ersek 
>>> Cc: Gerd Hoffmann 
>>> Cc: Stefan Berger 
>>> Cc: Markus Armbruster 
>>> Cc: Michael Roth 
>>> Cc: qemu-de...@nongnu.org
>>> Cc: qemu-bl...@nongnu.org
>>> Cc: xen-devel@lists.xenproject.org
>>>
>>>   scripts/coccinelle/auto-propagated-errp.cocci | 327 ++
>>>   include/qapi/error.h  |   3 +
>>>   MAINTAINERS   |   1 +
>>>   3 files changed, 331 insertions(+)
>>>   create mode 100644 scripts/coccinelle/auto-propagated-errp.cocci
>>>
>>> diff --git a/scripts/coccinelle/auto-propagated-errp.cocci 
>>> b/scripts/coccinelle/auto-propagated-errp.cocci
>>> new file mode 100644
>>> index 00..7dac2dcfa4
>>> --- /dev/null
>>> +++ b/scripts/coccinelle/auto-propagated-errp.cocci
>>> @@ -0,0 +1,327 @@
>>> +// Use ERRP_AUTO_PROPAGATE (see include/qapi/error.h)
>>> +//
>>> +// Copyright (c) 2020 Virtuozzo International GmbH.
>>> +//
>>> +// This program is free software; you can redistribute it and/or
>>> +// modify it under the terms of the GNU General Public License as
>>> +// published by the Free Software Foundation; either version 2 of the
>>> +// License, or (at your option) any later version.
>>> +//
>>> +// This program is distributed in the hope that it will be useful,
>>> +// but WITHOUT ANY WARRANTY; without even the implied warranty of
>>> +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>>> +// GNU General Public License for more details.
>>> +//
>>> +// You should have received a copy of the GNU General Public License
>>> +// along with this program.  If not, see
>>> +// .
>>> +//
>>> +// Usage example:
>>> +// spatch --sp-file scripts/coccinelle/auto-propagated-errp.cocci \
>>> +//  --macro-file scripts/cocci-macro-file.h --in-place \
>>> +//  --no-show-diff --max-width 80 FILES...
>>> +//
>>> +// Note: --max-width 80 is needed because coccinelle default is less
>>> +// than 80, and without this parameter coccinelle may reindent some
>>> +// lines which fit into 80 characters but not to coccinelle default,
>>> +// which in turn produces extra patch hunks for no reason.
>>
>> This is about unwanted reformatting of parameter lists due to the ___
>> chaining hack.  --max-width 80 makes that less likely, but not
>> impossible.
>>
>> We can search for unwanted reformatting of parameter lists.  I think
>> grepping diffs for '^\+.*Error \*\*' should do the trick.  For the whole
>> tree, I get one false positive (not a parameter list), and one hit:
>>
>>  @@ -388,8 +388,10 @@ static void object_post_init_with_type(O
>>   }
>>   }
>>
>>  -void object_apply_global_props(Object *obj, const GPtrArray *props, 
>> Error **errp)
>>  +void object_apply_global_props(Object *obj, const GPtrArray *props,
>>  +   Error **errp)
>>   {
>>  +ERRP_AUTO_PROPAGATE();
>>   int i;
>>
>>   if (!props) {
>>
>> Reformatting, but not unwanted.
>
> Yes, I saw it. This line is 81 character length, so it's OK to fix it in one 
> hunk with
> ERRP_AUTO_PROPAGATE addition even for non-automatic patch.

Agree.

>>
>> The --max-width 80 hack is good enough for me.
>>
>> It does result in slightly long transformed lines, e.g. this one in
>> replication.c:
>>
>>  @@ -113,7 +113,7 @@ static int replication_open(BlockDriverS
>>   s->mode = REPLICATION_MODE_PRIMARY;
>>   top_id = qemu_opt_get(opts, REPLICATION_TOP_ID);
>>   if (top_id) {
>>  -error_setg(_err, "The primary side does not support 
>> option top-id");
>>  +error_setg(errp, "The primary side does not support option 
>> top-id");
>>   goto fail;
>>   }
>>   } else if (!strcmp(mode, "secondary")) {
>>
>> v8 did break this line (that's how I found it).  However, v9 still
>> shortens the line, just not below the target.  All your + lines look
>> quite unlikely to lengthen lines.  Let's not worry about this.
>>
>>> +// Switch unusual Error ** parameter names to errp
>>> +// (this is 

Re: [Xen-devel] [PATCH v9 02/10] scripts: Coccinelle script to use ERRP_AUTO_PROPAGATE()

2020-03-13 Thread Markus Armbruster
Vladimir Sementsov-Ogievskiy  writes:

> 13.03.2020 10:50, Markus Armbruster wrote:
>> Vladimir Sementsov-Ogievskiy  writes:
>>
>> [...]
>>> +// Warn several Error * definitions.
>>> +@check1 disable optional_qualifier exists@
>>> +identifier fn = rule1.fn, local_err, local_err2;
>>> +@@
>>> +
>>> + fn(..., Error ** , ...)
>>> + {
>>> + ...
>>> + Error *local_err = NULL;
>>> + ... when any
>>> + Error *local_err2 = NULL;
>>> + ... when any
>>> + }
>>> +
>>> +@ script:python @
>>> +fn << check1.fn;
>>> +@@
>>> +
>>> +print('Warning: function {} has several definitions of '
>>> +  'Error * local variable'.format(fn))
>>
>> Printing the positions like you do in the next rule is useful when
>> examining these warnings.
>
> I decided that searching for Error * definition is simple, and better for
> user to search all definitions by hand (may be more than too).
>
> But understanding control flows is more complex thing and better to help
> user with line positions.
>
> But if you want, we can add them of course. Note, that for some reasons some
> times coccinelle instead of original filename prints something like 
> /tmp/...original-name...
> so it don't look nice and may be a bit misleading.

I noticed when I actually tried mu idea %-}


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v9 02/10] scripts: Coccinelle script to use ERRP_AUTO_PROPAGATE()

2020-03-13 Thread Vladimir Sementsov-Ogievskiy

13.03.2020 17:58, Markus Armbruster wrote:

I tried this script on the whole tree.  Observations:

* $ git-diff --shortstat \*.[ch]
333 files changed, 3480 insertions(+), 4586 deletions(-)

* Twelve functions have "several definitions of Error * local variable".

   Eight declare such a variable within a loop.  Reported because
   Coccinelle matches along control flow, not just along text.  Ignore.

   Remaining four:

   * ivshmem_common_realize()

 Two variables (messed up in commit fe44dc91807), should be replaced
 by one.

   * qmp_query_cpu_model_expansion() two times

 Three declarations in separate blocks; two should be replaced by
 _abort, one moved to the function block.

   * xen_block_device_destroy()

 Two declarations in seperate blocks; should be replaced by a single
 one.

   Separate manual cleanup patches, ideally applied before running
   Coccinelle to keep Coccinelle's changes as simple and safe as
   possible.  I'll post patches.  Only the one for
   xen_block_device_destroy() affects by this series.

* No function "propagates to errp several times"

   I tested the rule does detect this as advertized by feeding it an
   obvious example.  We're good.

* ERRP_AUTO_PROPAGATE() inserted 744 times, always right at the
   beginning of a function.

* As far as I can tell, all changed functions have ERRP_AUTO_PROPAGATE()
   inserted.  Good.

* Almost 1100 error propagations dropped:error_propagate() removed,
   error_propagate_prepend() replaced by just error_prepend().

* Four error_propagate() are transformed.  Two instances each in
   aspeed_soc_ast2600_realize() and aspeed_soc_realize().  Pattern:

  {
 +ERRP_AUTO_PROPAGATE();
  ...
 -Error *err = NULL, *local_err = NULL;
 +Error *local_err = NULL;
  ...

  object_property_set_T(...,
 -  );
 +  errp);
  object_property_set_T(...,
_err);
 -error_propagate(, local_err);
 -if (err) {
 -error_propagate(errp, err);
 +error_propagate(errp, local_err);
 +if (*errp) {
  return;
  }

   This is what error.h calls "Receive and accumulate multiple errors
   (first one wins)".

   Result:

 ERRP_AUTO_PROPAGATE();
 ...
 Error *local_err = NULL;
 ...

 object_property_set_T(..., errp);
 object_property_set_T(..., _err);
 error_propagate(errp, local_err);
 if (*errp) {
 return;
 }

   Could be done without the accumulation:

 ERRP_AUTO_PROPAGATE();
 ...

 object_property_set_T(..., errp);
 if (*errp) {
 return;
 }
 object_property_set_T(..., errp);
 if (*errp) {
 return;
 }

   I find this a bit easier to understand.  Matter of taste.  If we want
   to change to this, do it manually and separately.  I'd do it on top.

* Some 90 propagations remain.

   Some of them could use cleanup, e.g. file_memory_backend_set_pmem(),
   css_clear_io_interrupt().  Out of scope for this series.

   Some move errors around in unusual ways, e.g. in block/nbd.c.  Could
   use review.  Out of scope for this series.

   I spotted three that should be transformed, but aren't:

   - qcrypto_block_luks_store_key()

 I believe g_autoptr() confuses Coccinelle.  Undermines all our
 Coccinelle use, not just this patch.  I think we need to update
 scripts/cocci-macro-file.h for it.

   - armsse_realize()

 Something in this huge function confuses Coccinelle, but I don't
 know what exactly.  If I delete most of it, the error_propagate()
 transforms okay.  If I delete less, Coccinelle hangs.

   - apply_cpu_model()

 Gets transformed fine if I remove the #ifndef CONFIG_USER_ONLY.  I
 have no idea why the #if spooks Coccinelle here, but not elsewhere.

   None of these three affects this series.  No need to hold it back for
   further investigation.

* 30 error_free() and two warn_reportf_err() are transformed.  Patterns:

 -error_free(local_err);
 -local_err = NULL;
 +error_free_errp(errp);

   and

 -error_free(local_err);
 +error_free_errp(errp);

   and

 -warn_report_err(local_err);
 -local_err = NULL;
 +warn_report_errp(errp);

   Good.

* Many error_free(), error_reportf_err() and warn_reportf_err() remain.
   None of them have an argument of the form *errp.  Such arguments would
   have to be reviewed for possible interference with
   ERRP_AUTO_PROPAGATE().

* Almost 700 Error *err = NULL removed.  Almost 600 remain.

* Error usage in rdma.c is questionable / wrong.  Out of scope for this
   series.

As far as I can tell, your Coccinelle script is working as intended,
except 

Re: [Xen-devel] [PATCH 2/4] x86/time: reduce rounding errors in calculations

2020-03-13 Thread Andrew Cooper
On 13/03/2020 09:25, Jan Beulich wrote:
> Plain (unsigned) integer division simply truncates the results. The
> overall errors are smaller though if we use proper rounding. (Extend
> this to the purely cosmetic aspect of time.c's freq_string(), which
> before this change I've frequently observed to report e.g. NN.999MHz
> HPET clock speeds.)
>
> Signed-off-by: Jan Beulich 
> ---
> We could switch at least the first div/rem pair in calibrate_APIC_clock()
> to use do_div(), but this would imply switching bus_freq (and then also
> result) to unsigned int (as the divisor has to be 32-bit). While I think
> there's pretty little risk for bus frequencies to go beyond 4GHz in the
> next so many years, I still wasn't certain enough this would be a welcome
> change.

Honestly, do_div() is very easy to get wrong (and in security relevant
ways in Linux).  I'd advocate for phasing its use out, irrespective of
this frequency concern.

As for 4GHz, I think its unlikely, but better to be safe in the code.

>
> --- a/xen/arch/x86/apic.c
> +++ b/xen/arch/x86/apic.c
> @@ -1261,7 +1261,9 @@ static int __init calibrate_APIC_clock(v
>  /* set up multipliers for accurate timer code */
>  bus_freq   = result*HZ;
>  bus_cycle  = (u32) (1LL/bus_freq); /* in pico seconds */
> +bus_cycle += (1UL % bus_freq) * 2 > bus_freq;

These two differ in signedness of the numerator.  GCC seems to cope with
combining the two into a single div instruction, but I think we should
be consistent with the constant nevertheless.

Otherwise, Acked-by: Andrew Cooper 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v9 02/10] scripts: Coccinelle script to use ERRP_AUTO_PROPAGATE()

2020-03-13 Thread Markus Armbruster
I tried this script on the whole tree.  Observations:

* $ git-diff --shortstat \*.[ch]
   333 files changed, 3480 insertions(+), 4586 deletions(-)

* Twelve functions have "several definitions of Error * local variable".

  Eight declare such a variable within a loop.  Reported because
  Coccinelle matches along control flow, not just along text.  Ignore.

  Remaining four:

  * ivshmem_common_realize()

Two variables (messed up in commit fe44dc91807), should be replaced
by one.

  * qmp_query_cpu_model_expansion() two times

Three declarations in separate blocks; two should be replaced by
_abort, one moved to the function block.

  * xen_block_device_destroy()

Two declarations in seperate blocks; should be replaced by a single
one.

  Separate manual cleanup patches, ideally applied before running
  Coccinelle to keep Coccinelle's changes as simple and safe as
  possible.  I'll post patches.  Only the one for
  xen_block_device_destroy() affects by this series.

* No function "propagates to errp several times"

  I tested the rule does detect this as advertized by feeding it an
  obvious example.  We're good.

* ERRP_AUTO_PROPAGATE() inserted 744 times, always right at the
  beginning of a function.

* As far as I can tell, all changed functions have ERRP_AUTO_PROPAGATE()
  inserted.  Good.

* Almost 1100 error propagations dropped:error_propagate() removed,
  error_propagate_prepend() replaced by just error_prepend().

* Four error_propagate() are transformed.  Two instances each in
  aspeed_soc_ast2600_realize() and aspeed_soc_realize().  Pattern:

 {
+ERRP_AUTO_PROPAGATE();
 ...
-Error *err = NULL, *local_err = NULL;
+Error *local_err = NULL;
 ...

 object_property_set_T(..., 
-  );
+  errp);
 object_property_set_T(...,
   _err);
-error_propagate(, local_err);
-if (err) {
-error_propagate(errp, err);
+error_propagate(errp, local_err);
+if (*errp) {
 return;
 }

  This is what error.h calls "Receive and accumulate multiple errors
  (first one wins)".

  Result:

ERRP_AUTO_PROPAGATE();
...
Error *local_err = NULL;
...

object_property_set_T(..., errp);
object_property_set_T(..., _err);
error_propagate(errp, local_err);
if (*errp) {
return;
}

  Could be done without the accumulation:

ERRP_AUTO_PROPAGATE();
...

object_property_set_T(..., errp);
if (*errp) {
return;
}
object_property_set_T(..., errp);
if (*errp) {
return;
}

  I find this a bit easier to understand.  Matter of taste.  If we want
  to change to this, do it manually and separately.  I'd do it on top.

* Some 90 propagations remain.

  Some of them could use cleanup, e.g. file_memory_backend_set_pmem(),
  css_clear_io_interrupt().  Out of scope for this series.

  Some move errors around in unusual ways, e.g. in block/nbd.c.  Could
  use review.  Out of scope for this series.

  I spotted three that should be transformed, but aren't:

  - qcrypto_block_luks_store_key()

I believe g_autoptr() confuses Coccinelle.  Undermines all our
Coccinelle use, not just this patch.  I think we need to update
scripts/cocci-macro-file.h for it.

  - armsse_realize()

Something in this huge function confuses Coccinelle, but I don't
know what exactly.  If I delete most of it, the error_propagate()
transforms okay.  If I delete less, Coccinelle hangs.

  - apply_cpu_model()

Gets transformed fine if I remove the #ifndef CONFIG_USER_ONLY.  I
have no idea why the #if spooks Coccinelle here, but not elsewhere.

  None of these three affects this series.  No need to hold it back for
  further investigation.

* 30 error_free() and two warn_reportf_err() are transformed.  Patterns:

-error_free(local_err);
-local_err = NULL;
+error_free_errp(errp);

  and

-error_free(local_err);
+error_free_errp(errp);

  and

-warn_report_err(local_err);
-local_err = NULL;
+warn_report_errp(errp);

  Good.

* Many error_free(), error_reportf_err() and warn_reportf_err() remain.
  None of them have an argument of the form *errp.  Such arguments would
  have to be reviewed for possible interference with
  ERRP_AUTO_PROPAGATE().

* Almost 700 Error *err = NULL removed.  Almost 600 remain.

* Error usage in rdma.c is questionable / wrong.  Out of scope for this
  series.

As far as I can tell, your Coccinelle script is working as intended,
except for three missed error propagations noted above.  We can proceed
with this series regardless, if we want.  I'd prefer to integrate my

[Xen-devel] [linux-4.9 test] 148465: regressions - FAIL

2020-03-13 Thread osstest service owner
flight 148465 linux-4.9 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/148465/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-qemuu-nested-intel 17 debian-hvm-install/l1/l2 fail REGR. vs. 
142947
 test-amd64-i386-xl  20 guest-start/debian.repeat fail REGR. vs. 142947

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-rtds 16 guest-localmigrate   fail  like 142893
 test-armhf-armhf-xl-rtds 16 guest-start/debian.repeatfail  like 142893
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stopfail like 142947
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 142947
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 142947
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 142947
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stopfail like 142947
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 10 debian-hvm-install 
fail never pass
 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 10 debian-hvm-install 
fail never pass
 test-amd64-i386-xl-pvshim12 guest-start  fail   never pass
 test-amd64-amd64-xl-pvhv2-amd 12 guest-start  fail  never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-amd64-xl-pvhv2-intel 12 guest-start fail never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stop fail never pass
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop  fail never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  14 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop  fail never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass

version targeted for testing:
 linux19c646f01e4ace1e5e5b3de2749de25bc86b79a1
baseline version:
 linux364ef83db0273acc89c6ba8ae1aebee70a133056

Last test of basis   142947  2019-10-20 03:26:28 Z  145 days
Failing since143328  2019-10-29 08:51:20 Z  136 days   28 attempts
Testing same since   148465  2020-03-12 06:16:27 Z1 days1 attempts


1179 people touched revisions under test,
not listing them all

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 

Re: [Xen-devel] [PATCH 1/4] x86/APIC: adjust types and comments in calibrate_APIC_clock()

2020-03-13 Thread Andrew Cooper
On 13/03/2020 09:25, Jan Beulich wrote:
> First and foremost the comment talking about potential underflow being
> taken care of by using signed long type variables was true only on
> 32-bit, which we've not been supporting for quite some time. Drop the
> comment and change all involved types to unsigned. Take the opportunity
> and also replace bus_cycle's fixed width type.
>
> Additionally there's no point using an "arbitrary (but long enough)
> timeout" here. Just use the maximum possible value; Linux does so too,
> just as an additional data point.
>
> Signed-off-by: Jan Beulich 

Reviewed-by: Andrew Cooper 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3 0/9] x86: reduce include dependencies

2020-03-13 Thread Andrew Cooper
On 13/03/2020 11:12, Jan Beulich wrote:
> In a number of cases I've noticed the x86 emulator, which is quite
> slow to build especially with not very new gcc, to re-build when
> having changed headers which I wouldn't have expected to be
> included there in the first place. Hence I've gone through the
> dependencies of that object file and tried to get rid of at least
> some of the very odd dependencies there. (Some are being addressed
> also be the separately sent mem-access and vm-event patches with a
> similar subject.)
>
> 1: HVM: reduce domain.h include dependencies
> 2: HVM: reduce vcpu.h include dependencies
> 3: HVM: reduce vpt.h include dependencies
> 4: HVM: reduce vpic.h include dependencies
> 5: HVM: reduce vioapic.h include dependencies
> 6: HVM: reduce vlapic.h include dependencies
> 7: HVM: reduce io.h include dependencies
> 8: HVM: reduce hvm.h include dependencies
> 9: reduce mce.h include dependencies
>
> v3: Address build issues in particular with XSM=y + PV_SHIM=y (patch
> 1 only).
> v2: Address build issues in particular with XSM=y.

Acked-by: Andrew Cooper 

I've tried a variety of configs and everything LGTM now.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v2 1/2] xen/rwlocks: call preempt_disable() when taking a rwlock

2020-03-13 Thread Juergen Gross
Similar to spinlocks preemption should be disabled while holding a
rwlock.

Signed-off-by: Juergen Gross 
---
V2:
- move preempt_disable()/enable() calls (Jan Beulich, Julien Grall)
---
 xen/include/xen/rwlock.h | 18 --
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/xen/include/xen/rwlock.h b/xen/include/xen/rwlock.h
index 1c221dd0d9..4d1b48c722 100644
--- a/xen/include/xen/rwlock.h
+++ b/xen/include/xen/rwlock.h
@@ -2,6 +2,7 @@
 #define __RWLOCK_H__
 
 #include 
+#include 
 #include 
 #include 
 
@@ -54,6 +55,7 @@ static inline int _read_trylock(rwlock_t *lock)
 {
 u32 cnts;
 
+preempt_disable();
 cnts = atomic_read(>cnts);
 if ( likely(_can_read_lock(cnts)) )
 {
@@ -62,6 +64,7 @@ static inline int _read_trylock(rwlock_t *lock)
 return 1;
 atomic_sub(_QR_BIAS, >cnts);
 }
+preempt_enable();
 return 0;
 }
 
@@ -73,6 +76,7 @@ static inline void _read_lock(rwlock_t *lock)
 {
 u32 cnts;
 
+preempt_disable();
 cnts = atomic_add_return(_QR_BIAS, >cnts);
 if ( likely(_can_read_lock(cnts)) )
 return;
@@ -106,6 +110,7 @@ static inline void _read_unlock(rwlock_t *lock)
  * Atomically decrement the reader count
  */
 atomic_sub(_QR_BIAS, >cnts);
+preempt_enable();
 }
 
 static inline void _read_unlock_irq(rwlock_t *lock)
@@ -137,6 +142,7 @@ static inline unsigned int _write_lock_val(void)
 static inline void _write_lock(rwlock_t *lock)
 {
 /* Optimize for the unfair lock case where the fair flag is 0. */
+preempt_disable();
 if ( atomic_cmpxchg(>cnts, 0, _write_lock_val()) == 0 )
 return;
 
@@ -168,17 +174,23 @@ static inline int _write_trylock(rwlock_t *lock)
 {
 u32 cnts;
 
+preempt_disable();
 cnts = atomic_read(>cnts);
-if ( unlikely(cnts) )
+if ( unlikely(cnts) ||
+ unlikely(atomic_cmpxchg(>cnts, 0, _write_lock_val()) != 0) )
+{
+preempt_enable();
 return 0;
+}
 
-return likely(atomic_cmpxchg(>cnts, 0, _write_lock_val()) == 0);
+return 1;
 }
 
 static inline void _write_unlock(rwlock_t *lock)
 {
 ASSERT(_is_write_locked_by_me(atomic_read(>cnts)));
 atomic_and(~(_QW_CPUMASK | _QW_WMASK), >cnts);
+preempt_enable();
 }
 
 static inline void _write_unlock_irq(rwlock_t *lock)
@@ -274,6 +286,7 @@ static inline void _percpu_read_lock(percpu_rwlock_t 
**per_cpudata,
 }
 
 /* Indicate this cpu is reading. */
+preempt_disable();
 this_cpu_ptr(per_cpudata) = percpu_rwlock;
 smp_mb();
 /* Check if a writer is waiting. */
@@ -309,6 +322,7 @@ static inline void _percpu_read_unlock(percpu_rwlock_t 
**per_cpudata,
 }
 this_cpu_ptr(per_cpudata) = NULL;
 smp_wmb();
+preempt_enable();
 }
 
 /* Don't inline percpu write lock as it's a complex function. */
-- 
2.16.4


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v2 0/2] xen/locks: fix preempt disabling in lock handling

2020-03-13 Thread Juergen Gross
Xen's rwlocks don't disable preemption at all, while spinlocks are
doing it only after obtaining the lock.

While not really critical, it is wrong.

This series fixes that.

Changes in V2:
- move the preempt_[dis|en]able() calls

Juergen Gross (2):
  xen/rwlocks: call preempt_disable() when taking a rwlock
  xen/spinlocks: fix placement of preempt_[dis|en]able()

 xen/common/spinlock.c|  9 ++---
 xen/include/xen/rwlock.h | 18 --
 2 files changed, 22 insertions(+), 5 deletions(-)

-- 
2.16.4


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v2 2/2] xen/spinlocks: fix placement of preempt_[dis|en]able()

2020-03-13 Thread Juergen Gross
In case Xen ever gains preemption support the spinlock coding's
placement of preempt_disable() and preempt_enable() should be outside
of the locked section.

Signed-off-by: Juergen Gross 
---
V2:
- move preempt_enable() to the very end of _spin_unlock() (Jan Beulich)
---
 xen/common/spinlock.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/xen/common/spinlock.c b/xen/common/spinlock.c
index 344981c54a..6c8b62beb0 100644
--- a/xen/common/spinlock.c
+++ b/xen/common/spinlock.c
@@ -160,6 +160,7 @@ void inline _spin_lock_cb(spinlock_t *lock, void (*cb)(void 
*), void *data)
 LOCK_PROFILE_VAR;
 
 check_lock(>debug);
+preempt_disable();
 tickets.head_tail = arch_fetch_and_add(>tickets.head_tail,
tickets.head_tail);
 while ( tickets.tail != observe_head(>tickets) )
@@ -171,7 +172,6 @@ void inline _spin_lock_cb(spinlock_t *lock, void (*cb)(void 
*), void *data)
 }
 got_lock(>debug);
 LOCK_PROFILE_GOT;
-preempt_disable();
 arch_lock_acquire_barrier();
 }
 
@@ -199,11 +199,11 @@ unsigned long _spin_lock_irqsave(spinlock_t *lock)
 void _spin_unlock(spinlock_t *lock)
 {
 arch_lock_release_barrier();
-preempt_enable();
 LOCK_PROFILE_REL;
 rel_lock(>debug);
 add_sized(>tickets.head, 1);
 arch_lock_signal();
+preempt_enable();
 }
 
 void _spin_unlock_irq(spinlock_t *lock)
@@ -242,15 +242,18 @@ int _spin_trylock(spinlock_t *lock)
 return 0;
 new = old;
 new.tail++;
+preempt_disable();
 if ( cmpxchg(>tickets.head_tail,
  old.head_tail, new.head_tail) != old.head_tail )
+{
+preempt_enable();
 return 0;
+}
 got_lock(>debug);
 #ifdef CONFIG_DEBUG_LOCK_PROFILE
 if (lock->profile)
 lock->profile->time_locked = NOW();
 #endif
-preempt_disable();
 /*
  * cmpxchg() is a full barrier so no need for an
  * arch_lock_acquire_barrier().
-- 
2.16.4


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] Ping: [PATCH v2] AMD/IOMMU: fix off-by-one in amd_iommu_get_paging_mode() callers

2020-03-13 Thread Andrew Cooper
On 13/03/2020 11:27, Jan Beulich wrote:
> On 03.03.2020 12:02, Jan Beulich wrote:
>> amd_iommu_get_paging_mode() expects a count, not a "maximum possible"
>> value. Prior to b4f042236ae0 dropping the reference, the use of our mis-
>> named "max_page" in amd_iommu_domain_init() may have lead to such a
>> misunderstanding. In an attempt to avoid such confusion in the future,
>> rename the function's parameter and - while at it - convert it to an
>> inline function.
>>
>> Also replace a literal 4 by an expression tying it to a wider use
>> constant, just like amd_iommu_quarantine_init() does.
>>
>> Fixes: ea38867831da ("x86 / iommu: set up a scratch page in the quarantine 
>> domain")
>> Fixes: b4f042236ae0 ("AMD/IOMMU: Cease using a dynamic height for the IOMMU 
>> pagetables")
>> Signed-off-by: Jan Beulich 
>> ---
>> v2: Convert amd_iommu_get_paging_mode() itself to inline function,
>> changing itss parameter's name.
> Ping? Anything else needed here, beyond addressing your v1 comments?

Sorry - fell through the cracks.

Acked-by: Andrew Cooper 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [xen-unstable-smoke test] 148515: tolerable all pass - PUSHED

2020-03-13 Thread osstest service owner
flight 148515 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/148515/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  76416c459c6e0b3579c5177df414e0633b8b9565
baseline version:
 xen  a9b6dacf88fe99fbb69a2ee505833851ffdc9cec

Last test of basis   148450  2020-03-11 17:00:33 Z1 days
Testing same since   148515  2020-03-13 11:01:13 Z0 days1 attempts


People who touched revisions under test:
  Jason Andryuk 
  Julien Grall 
  Roger Pau Monne 
  Roger Pau Monné 
  Wei Liu 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   a9b6dacf88..76416c459c  76416c459c6e0b3579c5177df414e0633b8b9565 -> smoke

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v6 2/4] xen: don't process rcu callbacks when holding a rcu_read_lock()

2020-03-13 Thread Juergen Gross
Some keyhandlers are calling process_pending_softirqs() while holding
a rcu_read_lock(). This is wrong, as process_pending_softirqs() might
activate rcu calls which should not happen inside a rcu_read_lock().

For that purpose modify process_pending_softirqs() to not allow rcu
callback processing when a rcu_read_lock() is being held.

Signed-off-by: Juergen Gross 
---
V3:
- add RCU_SOFTIRQ to ignore in process_pending_softirqs_norcu()
  (Roger Pau Monné)

V5:
- block rcu processing depending on rch_read_lock() being held or not
  (Jan Beulich)
---
 xen/common/softirq.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/xen/common/softirq.c b/xen/common/softirq.c
index b83ad96d6c..00d676b62c 100644
--- a/xen/common/softirq.c
+++ b/xen/common/softirq.c
@@ -29,6 +29,7 @@ static void __do_softirq(unsigned long ignore_mask)
 {
 unsigned int i, cpu;
 unsigned long pending;
+bool rcu_allowed = !(ignore_mask & (1ul << RCU_SOFTIRQ));
 
 for ( ; ; )
 {
@@ -38,7 +39,7 @@ static void __do_softirq(unsigned long ignore_mask)
  */
 cpu = smp_processor_id();
 
-if ( rcu_pending(cpu) )
+if ( rcu_allowed && rcu_pending(cpu) )
 rcu_check_callbacks(cpu);
 
 if ( ((pending = (softirq_pending(cpu) & ~ignore_mask)) == 0)
@@ -53,9 +54,16 @@ static void __do_softirq(unsigned long ignore_mask)
 
 void process_pending_softirqs(void)
 {
+unsigned long ignore_mask = (1ul << SCHEDULE_SOFTIRQ) |
+(1ul << SCHED_SLAVE_SOFTIRQ);
+
+/* Block RCU processing in case of rcu_read_lock() held. */
+if ( preempt_count() )
+ignore_mask |= 1ul << RCU_SOFTIRQ;
+
 ASSERT(!in_irq() && local_irq_is_enabled());
 /* Do not enter scheduler as it can preempt the calling context. */
-__do_softirq((1ul << SCHEDULE_SOFTIRQ) | (1ul << SCHED_SLAVE_SOFTIRQ));
+__do_softirq(ignore_mask);
 }
 
 void do_softirq(void)
-- 
2.16.4


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v6 4/4] xen/rcu: add per-lock counter in debug builds

2020-03-13 Thread Juergen Gross
Add a lock specific counter to rcu read locks in debug builds. This
allows to test for matching lock/unlock calls.

This will help to avoid cases like the one fixed by commit
98ed1f43cc2c89 where different rcu read locks were referenced in the
lock and unlock calls.

Signed-off-by: Juergen Gross 
---
V5:
- updated commit message (Jan Beulich)
---
 xen/include/xen/rcupdate.h | 46 +-
 1 file changed, 33 insertions(+), 13 deletions(-)

diff --git a/xen/include/xen/rcupdate.h b/xen/include/xen/rcupdate.h
index d3c2b7b093..e0c3b16e7d 100644
--- a/xen/include/xen/rcupdate.h
+++ b/xen/include/xen/rcupdate.h
@@ -37,21 +37,50 @@
 #include 
 #include 
 #include 
+#include 
 
 #define __rcu
 
+#ifndef NDEBUG
+/* * Lock type for passing to rcu_read_{lock,unlock}. */
+struct _rcu_read_lock {
+atomic_t cnt;
+};
+typedef struct _rcu_read_lock rcu_read_lock_t;
+#define DEFINE_RCU_READ_LOCK(x) rcu_read_lock_t x = { .cnt = ATOMIC_INIT(0) }
+#define RCU_READ_LOCK_INIT(x)   atomic_set(&(x)->cnt, 0)
+
+#else
+/*
+ * Dummy lock type for passing to rcu_read_{lock,unlock}. Currently exists
+ * only to document the reason for rcu_read_lock() critical sections.
+ */
+struct _rcu_read_lock {};
+typedef struct _rcu_read_lock rcu_read_lock_t;
+#define DEFINE_RCU_READ_LOCK(x) rcu_read_lock_t x
+#define RCU_READ_LOCK_INIT(x)
+
+#endif
+
 DECLARE_PER_CPU(unsigned int, rcu_lock_cnt);
 
-static inline void rcu_quiesce_disable(void)
+static inline void rcu_quiesce_disable(rcu_read_lock_t *lock)
 {
 preempt_disable();
 this_cpu(rcu_lock_cnt)++;
+#ifndef NDEBUG
+atomic_inc(>cnt);
+#endif
 barrier();
 }
 
-static inline void rcu_quiesce_enable(void)
+static inline void rcu_quiesce_enable(rcu_read_lock_t *lock)
 {
 barrier();
+#ifndef NDEBUG
+ASSERT(atomic_read(>cnt));
+atomic_dec(>cnt);
+#endif
 this_cpu(rcu_lock_cnt)--;
 preempt_enable();
 }
@@ -81,15 +110,6 @@ struct rcu_head {
 int rcu_pending(int cpu);
 int rcu_needs_cpu(int cpu);
 
-/*
- * Dummy lock type for passing to rcu_read_{lock,unlock}. Currently exists
- * only to document the reason for rcu_read_lock() critical sections.
- */
-struct _rcu_read_lock {};
-typedef struct _rcu_read_lock rcu_read_lock_t;
-#define DEFINE_RCU_READ_LOCK(x) rcu_read_lock_t x
-#define RCU_READ_LOCK_INIT(x)
-
 /**
  * rcu_read_lock - mark the beginning of an RCU read-side critical section.
  *
@@ -118,7 +138,7 @@ typedef struct _rcu_read_lock rcu_read_lock_t;
  */
 static inline void rcu_read_lock(rcu_read_lock_t *lock)
 {
-rcu_quiesce_disable();
+rcu_quiesce_disable(lock);
 }
 
 /**
@@ -129,7 +149,7 @@ static inline void rcu_read_lock(rcu_read_lock_t *lock)
 static inline void rcu_read_unlock(rcu_read_lock_t *lock)
 {
 ASSERT(!rcu_quiesce_allowed());
-rcu_quiesce_enable();
+rcu_quiesce_enable(lock);
 }
 
 /*
-- 
2.16.4


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v6 0/4] xen/rcu: let rcu work better with core scheduling

2020-03-13 Thread Juergen Gross
Today the RCU handling in Xen is affecting scheduling in several ways.
It is raising sched softirqs without any real need and it requires
tasklets for rcu_barrier(), which interacts badly with core scheduling.

This small series repairs those issues.

Additionally some ASSERT()s are added for verification of sane rcu
handling. In order to avoid those triggering right away the obvious
violations are fixed. This includes making rcu locking functions type
safe.

Changes in V6:
- added memory barrier in patch 1
- drop cpu_map_lock only at the end of rcu_barrier()
- re-add prempt_disable() in patch 3

Changes in V5:
- dropped already committed patches 1 and 4
- fixed race
- rework blocking of rcu processing with held rcu locks

Changes in V4:
- patch 5: use barrier()

Changes in V3:
- type safe locking functions (functions instead of macros)
- per-lock debug additions
- new patches 4 and 6
- fixed races

Changes in V2:
- use get_cpu_maps() in rcu_barrier() handling
- avoid recursion in rcu_barrier() handling
- new patches 3 and 4

Juergen Gross (4):
  xen/rcu: don't use stop_machine_run() for rcu_barrier()
  xen: don't process rcu callbacks when holding a rcu_read_lock()
  xen/rcu: add assertions to debug build
  xen/rcu: add per-lock counter in debug builds

 xen/common/rcupdate.c  | 97 +-
 xen/common/softirq.c   | 14 ++-
 xen/include/xen/rcupdate.h | 76 +---
 3 files changed, 145 insertions(+), 42 deletions(-)

-- 
2.16.4


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v6 1/4] xen/rcu: don't use stop_machine_run() for rcu_barrier()

2020-03-13 Thread Juergen Gross
Today rcu_barrier() is calling stop_machine_run() to synchronize all
physical cpus in order to ensure all pending rcu calls have finished
when returning.

As stop_machine_run() is using tasklets this requires scheduling of
idle vcpus on all cpus imposing the need to call rcu_barrier() on idle
cpus only in case of core scheduling being active, as otherwise a
scheduling deadlock would occur.

There is no need at all to do the syncing of the cpus in tasklets, as
rcu activity is started in __do_softirq() called whenever softirq
activity is allowed. So rcu_barrier() can easily be modified to use
softirq for synchronization of the cpus no longer requiring any
scheduling activity.

As there already is a rcu softirq reuse that for the synchronization.

Remove the barrier element from struct rcu_data as it isn't used.

Finally switch rcu_barrier() to return void as it now can never fail.

Partially-based-on-patch-by: Igor Druzhinin 
Signed-off-by: Juergen Gross 
---
V2:
- add recursion detection

V3:
- fix races (Igor Druzhinin)

V5:
- rename done_count to pending_count (Jan Beulich)
- fix race (Jan Beulich)

V6:
- add barrier (Julien Grall)
- add ASSERT() (Julien Grall)
- hold cpu_map lock until end of rcu_barrier() (Julien Grall)
---
 xen/common/rcupdate.c  | 95 +-
 xen/include/xen/rcupdate.h |  2 +-
 2 files changed, 69 insertions(+), 28 deletions(-)

diff --git a/xen/common/rcupdate.c b/xen/common/rcupdate.c
index 03d84764d2..ed9083d2b2 100644
--- a/xen/common/rcupdate.c
+++ b/xen/common/rcupdate.c
@@ -83,7 +83,6 @@ struct rcu_data {
 struct rcu_head **donetail;
 longblimit;   /* Upper limit on a processed batch */
 int cpu;
-struct rcu_head barrier;
 longlast_rs_qlen; /* qlen during the last resched */
 
 /* 3) idle CPUs handling */
@@ -91,6 +90,7 @@ struct rcu_data {
 bool idle_timer_active;
 
 boolprocess_callbacks;
+boolbarrier_active;
 };
 
 /*
@@ -143,51 +143,85 @@ static int qhimark = 1;
 static int qlowmark = 100;
 static int rsinterval = 1000;
 
-struct rcu_barrier_data {
-struct rcu_head head;
-atomic_t *cpu_count;
-};
+/*
+ * rcu_barrier() handling:
+ * cpu_count holds the number of cpus required to finish barrier handling.
+ * pending_count is initialized to nr_cpus + 1.
+ * Cpus are synchronized via softirq mechanism. rcu_barrier() is regarded to
+ * be active if pending_count is not zero. In case rcu_barrier() is called on
+ * multiple cpus it is enough to check for pending_count being not zero on 
entry
+ * and to call process_pending_softirqs() in a loop until pending_count drops 
to
+ * zero, before starting the new rcu_barrier() processing.
+ * In order to avoid hangs when rcu_barrier() is called multiple times on the
+ * same cpu in fast sequence and a slave cpu couldn't drop out of the
+ * barrier handling fast enough a second counter pending_count is needed.
+ * The rcu_barrier() invoking cpu will wait until pending_count reaches 1
+ * (meaning that all cpus have finished processing the barrier) and then will
+ * reset pending_count to 0 to enable entering rcu_barrier() again.
+ */
+static atomic_t cpu_count = ATOMIC_INIT(0);
+static atomic_t pending_count = ATOMIC_INIT(0);
 
 static void rcu_barrier_callback(struct rcu_head *head)
 {
-struct rcu_barrier_data *data = container_of(
-head, struct rcu_barrier_data, head);
-atomic_inc(data->cpu_count);
+smp_wmb(); /* Make all previous writes visible to other cpus. */
+atomic_dec(_count);
 }
 
-static int rcu_barrier_action(void *_cpu_count)
+static void rcu_barrier_action(void)
 {
-struct rcu_barrier_data data = { .cpu_count = _cpu_count };
-
-ASSERT(!local_irq_is_enabled());
-local_irq_enable();
+struct rcu_head head;
 
 /*
  * When callback is executed, all previously-queued RCU work on this CPU
- * is completed. When all CPUs have executed their callback, data.cpu_count
- * will have been incremented to include every online CPU.
+ * is completed. When all CPUs have executed their callback, cpu_count
+ * will have been decremented to 0.
  */
-call_rcu(, rcu_barrier_callback);
+call_rcu(, rcu_barrier_callback);
 
-while ( atomic_read(data.cpu_count) != num_online_cpus() )
+while ( atomic_read(_count) )
 {
 process_pending_softirqs();
 cpu_relax();
 }
 
-local_irq_disable();
-
-return 0;
+atomic_dec(_count);
 }
 
-/*
- * As rcu_barrier() is using stop_machine_run() it is allowed to be used in
- * idle context only (see comment for stop_machine_run()).
- */
-int rcu_barrier(void)
+void rcu_barrier(void)
 {
-atomic_t cpu_count = ATOMIC_INIT(0);
-return stop_machine_run(rcu_barrier_action, _count, NR_CPUS);
+unsigned int n_cpus;
+
+ASSERT(!in_irq() && local_irq_is_enabled());
+
+for ( ;; )
+{
+if ( !atomic_read(_count) && get_cpu_maps() )
+ 

[Xen-devel] [PATCH v6 3/4] xen/rcu: add assertions to debug build

2020-03-13 Thread Juergen Gross
Xen's RCU implementation relies on no softirq handling taking place
while being in a RCU critical section. Add ASSERT()s in debug builds
in order to catch any violations.

For that purpose modify rcu_read_[un]lock() to use a dedicated percpu
counter additional to preempt_[en|dis]able() as this enables to test
that condition in __do_softirq() (ASSERT_NOT_IN_ATOMIC() is not
usable there due to __cpu_up() calling process_pending_softirqs()
while holding the cpu hotplug lock).

While at it switch the rcu_read_[un]lock() implementation to static
inline functions instead of macros.

Signed-off-by: Juergen Gross 
---
V3:
- add barriers to rcu_[en|dis]able() (Roger Pau Monné)
- add rcu_quiesce_allowed() to ASSERT_NOT_IN_ATOMIC (Roger Pau Monné)
- convert macros to static inline functions
- add sanity check in rcu_read_unlock()

V4:
- use barrier() in rcu_[en|dis]able() (Julien Grall)

V5:
- use rcu counter even if not using a debug build

V6:
- keep preempt_[dis|en]able() calls
---
 xen/common/rcupdate.c  |  2 ++
 xen/common/softirq.c   |  4 +++-
 xen/include/xen/rcupdate.h | 36 +---
 3 files changed, 38 insertions(+), 4 deletions(-)

diff --git a/xen/common/rcupdate.c b/xen/common/rcupdate.c
index ed9083d2b2..5e7bd7196f 100644
--- a/xen/common/rcupdate.c
+++ b/xen/common/rcupdate.c
@@ -46,6 +46,8 @@
 #include 
 #include 
 
+DEFINE_PER_CPU(unsigned int, rcu_lock_cnt);
+
 /* Global control variables for rcupdate callback mechanism. */
 static struct rcu_ctrlblk {
 long cur;   /* Current batch number.  */
diff --git a/xen/common/softirq.c b/xen/common/softirq.c
index 00d676b62c..eba65c5fc0 100644
--- a/xen/common/softirq.c
+++ b/xen/common/softirq.c
@@ -31,6 +31,8 @@ static void __do_softirq(unsigned long ignore_mask)
 unsigned long pending;
 bool rcu_allowed = !(ignore_mask & (1ul << RCU_SOFTIRQ));
 
+ASSERT(!rcu_allowed || rcu_quiesce_allowed());
+
 for ( ; ; )
 {
 /*
@@ -58,7 +60,7 @@ void process_pending_softirqs(void)
 (1ul << SCHED_SLAVE_SOFTIRQ);
 
 /* Block RCU processing in case of rcu_read_lock() held. */
-if ( preempt_count() )
+if ( !rcu_quiesce_allowed() )
 ignore_mask |= 1ul << RCU_SOFTIRQ;
 
 ASSERT(!in_irq() && local_irq_is_enabled());
diff --git a/xen/include/xen/rcupdate.h b/xen/include/xen/rcupdate.h
index 31c8b86d13..d3c2b7b093 100644
--- a/xen/include/xen/rcupdate.h
+++ b/xen/include/xen/rcupdate.h
@@ -32,12 +32,35 @@
 #define __XEN_RCUPDATE_H
 
 #include 
+#include 
 #include 
 #include 
+#include 
 #include 
 
 #define __rcu
 
+DECLARE_PER_CPU(unsigned int, rcu_lock_cnt);
+
+static inline void rcu_quiesce_disable(void)
+{
+preempt_disable();
+this_cpu(rcu_lock_cnt)++;
+barrier();
+}
+
+static inline void rcu_quiesce_enable(void)
+{
+barrier();
+this_cpu(rcu_lock_cnt)--;
+preempt_enable();
+}
+
+static inline bool rcu_quiesce_allowed(void)
+{
+return !this_cpu(rcu_lock_cnt);
+}
+
 /**
  * struct rcu_head - callback structure for use with RCU
  * @next: next update requests in a list
@@ -91,16 +114,23 @@ typedef struct _rcu_read_lock rcu_read_lock_t;
  * will be deferred until the outermost RCU read-side critical section
  * completes.
  *
- * It is illegal to block while in an RCU read-side critical section.
+ * It is illegal to process softirqs while in an RCU read-side critical 
section.
  */
-#define rcu_read_lock(x)   ({ ((void)(x)); preempt_disable(); })
+static inline void rcu_read_lock(rcu_read_lock_t *lock)
+{
+rcu_quiesce_disable();
+}
 
 /**
  * rcu_read_unlock - marks the end of an RCU read-side critical section.
  *
  * See rcu_read_lock() for more information.
  */
-#define rcu_read_unlock(x) ({ ((void)(x)); preempt_enable(); })
+static inline void rcu_read_unlock(rcu_read_lock_t *lock)
+{
+ASSERT(!rcu_quiesce_allowed());
+rcu_quiesce_enable();
+}
 
 /*
  * So where is rcu_write_lock()?  It does not exist, as there is no
-- 
2.16.4


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH 1/3] Add support for generic notifier lists

2020-03-13 Thread Maximilian Heyne
From: Anthony Liguori 

Notifiers are data-less callbacks and a notifier list is a list of registered
notifiers that all are interested in a particular event.

We'll use this in a few patches to implement mouse change notification.

Signed-off-by: Anthony Liguori 
---
v1 -> v2
 - Do not do memory allocations by placing list nodes in notifier

[cherry-picked from d1e70c5e6d1472856c52969301247fe8c3c8389d
conflicts: used the sys-qeue interface and added required
LIST_REMOVE_SAFE function to that]
Signed-off-by: Maximilian Heyne 
---
 Makefile|  1 +
 notify.c| 39 +++
 notify.h| 43 +++
 sys-queue.h |  5 +
 4 files changed, 88 insertions(+)
 create mode 100644 notify.c
 create mode 100644 notify.h

diff --git a/Makefile b/Makefile
index 0fbec990b..d921bcdf8 100644
--- a/Makefile
+++ b/Makefile
@@ -93,6 +93,7 @@ OBJS+=sd.o ssi-sd.o
 OBJS+=bt.o bt-host.o bt-vhci.o bt-l2cap.o bt-sdp.o bt-hci.o bt-hid.o usb-bt.o
 OBJS+=buffered_file.o migration.o migration-tcp.o net.o qemu-sockets.o
 OBJS+=qemu-char.o aio.o net-checksum.o savevm.o cache-utils.o
+OBJS+=notify.o
 
 ifdef CONFIG_BRLAPI
 OBJS+= baum.o
diff --git a/notify.c b/notify.c
new file mode 100644
index 0..59e1e7c7d
--- /dev/null
+++ b/notify.c
@@ -0,0 +1,39 @@
+/*
+ * Notifier lists
+ *
+ * Copyright IBM, Corp. 2010
+ *
+ * Authors:
+ *  Anthony Liguori   
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu-common.h"
+#include "notify.h"
+
+void notifier_list_init(NotifierList *list)
+{
+LIST_INIT(>notifiers);
+}
+
+void notifier_list_add(NotifierList *list, Notifier *notifier)
+{
+LIST_INSERT_HEAD(>notifiers, notifier, node);
+}
+
+void notifier_list_remove(Notifier *notifier)
+{
+LIST_REMOVE(notifier, node);
+}
+
+void notifier_list_notify(NotifierList *list)
+{
+Notifier *notifier, *next;
+
+LIST_FOREACH_SAFE(notifier, >notifiers, node, next) {
+notifier->notify(notifier);
+}
+}
diff --git a/notify.h b/notify.h
new file mode 100644
index 0..093c63f19
--- /dev/null
+++ b/notify.h
@@ -0,0 +1,43 @@
+/*
+ * Notifier lists
+ *
+ * Copyright IBM, Corp. 2010
+ *
+ * Authors:
+ *  Anthony Liguori   
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#ifndef QEMU_NOTIFY_H
+#define QEMU_NOTIFY_H
+
+#include "sys-queue.h"
+
+typedef struct Notifier Notifier;
+
+struct Notifier
+{
+void (*notify)(Notifier *notifier);
+LIST_ENTRY(Notifier) node;
+};
+
+typedef struct NotifierList
+{
+LIST_HEAD(, Notifier) notifiers;
+} NotifierList;
+
+#define NOTIFIER_LIST_INITIALIZER(head) \
+{ LIST_HEAD_INITIALIZER((head).notifiers) }
+
+void notifier_list_init(NotifierList *list);
+
+void notifier_list_add(NotifierList *list, Notifier *notifier);
+
+void notifier_list_remove(Notifier *notifier);
+
+void notifier_list_notify(NotifierList *list);
+
+#endif
diff --git a/sys-queue.h b/sys-queue.h
index 55c26fe7f..81ab044a8 100644
--- a/sys-queue.h
+++ b/sys-queue.h
@@ -132,6 +132,11 @@ struct {   
 \
 (var);  \
 (var) = ((var)->field.le_next))
 
+#define LIST_FOREACH_SAFE(var, head, field, next_var)   \
+for ((var) = ((head)->lh_first);\
+(var) && ((next_var) = ((var)->field.le_next), 1);  \
+(var) = (next_var))
+
 /*
  * List access methods.
  */
-- 
2.16.6




Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879




___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH 3/3] xen: cleanup IOREQ server on exit

2020-03-13 Thread Maximilian Heyne
Use the backported Notifier interface to register an atexit handler to
cleanup the IOREQ server. This is required since Xen commit a5a180f9
("x86/domain: don't destroy IOREQ servers on soft reset") is introduced
which requires Qemu to explicitly close the IOREQ server.

This is can be seen as a backport of ba7fdd64 ("xen: cleanup IOREQ
server on exit").

Signed-off-by: Maximilian Heyne 
---
 hw/xen_machine_fv.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/hw/xen_machine_fv.c b/hw/xen_machine_fv.c
index f0989fad4..66eb4a1eb 100644
--- a/hw/xen_machine_fv.c
+++ b/hw/xen_machine_fv.c
@@ -31,6 +31,7 @@
 #include "qemu-aio.h"
 #include "xen_backend.h"
 #include "pci.h"
+#include "sysemu.h"
 
 #include 
 #include 
@@ -67,6 +68,8 @@ TAILQ_HEAD(map_cache_head, map_cache_rev) locked_entries = 
TAILQ_HEAD_INITIALIZE
 static unsigned long last_address_page = ~0UL;
 static uint8_t  *last_address_vaddr;
 
+static Notifier exit_notifier;
+
 static int qemu_map_cache_init(void)
 {
 unsigned long size;
@@ -283,6 +286,11 @@ void xen_disable_io(void)
 xc_hvm_set_ioreq_server_state(xc_handle, domid, ioservid, 0);
 }
 
+static void xen_exit_notifier(Notifier *n)
+{
+xc_hvm_destroy_ioreq_server(xc_handle, domid, ioservid);
+}
+
 static void xen_init_fv(ram_addr_t ram_size, int vga_ram_size,
const char *boot_device,
const char *kernel_filename,const char *kernel_cmdline,
@@ -317,6 +325,9 @@ static void xen_init_fv(ram_addr_t ram_size, int 
vga_ram_size,
 exit(-1);
 }
 
+exit_notifier.notify = xen_exit_notifier;
+qemu_add_exit_notifier(_notifier);
+
 if (xc_hvm_get_ioreq_server_info(xc_handle, domid, ioservid,
  _pfn, _pfn,
  _evtchn)) {
-- 
2.16.6




Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879




___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH 0/3] Cleanup IOREQ server on exit

2020-03-13 Thread Maximilian Heyne
Following up on commit 9c0eed61 ("qemu-trad: stop using the default IOREQ
server"), clean up the IOREQ server on exit. This fixes a bug with soft-reset
that shows up as "bind interdomain ioctl error 22" because the event channels
were not closed at the soft-reset and can't be bound again.

For this I used the exit notifiers from QEMU that I backported together with the
required generic notifier lists.

Anthony Liguori (1):
  Add support for generic notifier lists

Gerd Hoffmann (1):
  Add exit notifiers.

Maximilian Heyne (1):
  xen: cleanup IOREQ server on exit

 Makefile|  1 +
 hw/xen_machine_fv.c | 11 +++
 notify.c| 39 +++
 notify.h| 43 +++
 sys-queue.h |  5 +
 sysemu.h|  5 +
 vl.c| 20 
 7 files changed, 124 insertions(+)
 create mode 100644 notify.c
 create mode 100644 notify.h

-- 
2.16.6




Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879




___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH 2/3] Add exit notifiers.

2020-03-13 Thread Maximilian Heyne
From: Gerd Hoffmann 

Hook up any cleanup work which needs to be done here.  Advantages over
using atexit(3):

  (1) You get passed in a pointer to the notifier.  If you embed that
  into your state struct you can use container_of() to get get your
  state info.
  (2) You can unregister, say when un-plugging a device.

[ v2: move code out of #ifndef _WIN32 ]

Signed-off-by: Anthony Liguori 
(cherry picked from commit fd42deeb4cb42f90084046e3ebdb4383953195e3)
Signed-off-by: Maximilian Heyne 
---
 sysemu.h |  5 +
 vl.c | 20 
 2 files changed, 25 insertions(+)

diff --git a/sysemu.h b/sysemu.h
index 968258a84..759d0e9d5 100644
--- a/sysemu.h
+++ b/sysemu.h
@@ -2,6 +2,8 @@
 #define SYSEMU_H
 /* Misc. things related to the system emulator.  */
 
+#include "notify.h"
+
 /* vl.c */
 extern const char *bios_name;
 extern const char *bios_dir;
@@ -39,6 +41,9 @@ void qemu_system_powerdown(void);
 #endif
 void qemu_system_reset(void);
 
+void qemu_add_exit_notifier(Notifier *notify);
+void qemu_remove_exit_notifier(Notifier *notify);
+
 void do_savevm(const char *name);
 void do_loadvm(const char *name);
 void do_delvm(const char *name);
diff --git a/vl.c b/vl.c
index c3c5d630e..2163217ec 100644
--- a/vl.c
+++ b/vl.c
@@ -282,6 +282,9 @@ uint8_t qemu_uuid[16];
 
 #include "xen-vl-extra.c"
 
+static NotifierList exit_notifiers =
+NOTIFIER_LIST_INITIALIZER(exit_notifiers);
+
 /***/
 /* x86 ISA bus support */
 
@@ -4843,6 +4846,21 @@ static void vcpu_hex_str_to_bitmap(const char *optarg)
 }
 }
 
+void qemu_add_exit_notifier(Notifier *notify)
+{
+notifier_list_add(_notifiers, notify);
+}
+
+void qemu_remove_exit_notifier(Notifier *notify)
+{
+notifier_list_remove(notify);
+}
+
+static void qemu_run_exit_notifiers(void)
+{
+notifier_list_notify(_notifiers);
+}
+
 int main(int argc, char **argv, char **envp)
 {
 #ifdef CONFIG_GDBSTUB
@@ -4887,6 +4905,8 @@ int main(int argc, char **argv, char **envp)
 const char *chroot_dir = NULL;
 const char *run_as = NULL;
 
+atexit(qemu_run_exit_notifiers);
+
 qemu_cache_utils_init(envp);
 logfile = stderr; /* initial value */
 
-- 
2.16.6




Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879




___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4] x86/cpu: Sync any remaining RCU callbacks before CPU up/down

2020-03-13 Thread Igor Druzhinin
On 13/03/2020 11:23, Jan Beulich wrote:
> On 10.03.2020 19:06, Igor Druzhinin wrote:
>> During CPU down operation RCU callbacks are scheduled to finish
>> off some actions later as soon as CPU is fully dead (the same applies
>> to CPU up operation in case error path is taken). If in the same grace
>> period another CPU up operation is performed on the same CPU, RCU callback
>> will be called later on a CPU in a potentially wrong (already up again
>> instead of still being down) state leading to eventual state inconsistency
>> and/or crash.
>>
>> In order to avoid it - flush RCU callbacks explicitly before starting the
>> next CPU up/down operation.
>>
>> Reviewed-by: Juergen Gross 
>> Signed-off-by: Igor Druzhinin 
>> ---
>> This got discovered trying to resume PV shim with multiple vCPUs on AMD
>> machine (where park_offline_cpus == 0). RCU callback responsible for
>> freeing percpu area on CPU offline got finally called after CPU went
>> online again as the guest performed regular vCPU offline/online operations
>> on resume.
>>
>> Note: this patch requires RCU series v4 from Juergen to be applied -
>> https://lists.xenproject.org/archives/html/xen-devel/2020-03/msg00668.html
> 
> I was about to apply the patch yesterday (I think) when I stumbled
> across this note. Is this actually still true? If so, would you
> mind helping me see the dependency?

Yes, that's the case otherwise you're risking to crash near 100% of
installations as rcu_barrirer without Juergen's fixes is simply broken.

Igor

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] kconfig: expose all{yes,no}config targets

2020-03-13 Thread Andrew Cooper
On 13/03/2020 12:14, Wei Liu wrote:
> On Fri, Mar 13, 2020 at 12:05:01PM +0100, Jan Beulich wrote:
>> Without having them at least at the xen/Makefile level they're (close
>> to?) inaccessible. As I'm uncertain about their utility at the top
>> level, I'm leaving it at that for now.
>>
>> Signed-off-by: Jan Beulich 
> Acked-by: Wei Liu 

Acked-by: Andrew Cooper 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v3 2/2] libxl: make creation of xenstore 'suspend event channel' node optional...

2020-03-13 Thread paul
From: Paul Durrant 

... and make the top level 'device' node in xenstore writable by the
guest

The purpose and semantics of the suspend event channel node are explained
in xenstore-paths.pandoc [1]. It was originally introduced in xend by
commit 17636f47a474 "Teach xc_save to use event-channel-based domain
suspend if available.". Note that, because, the top-level frontend
'device' node was created writable by the guest in xend, there was no
need to explicitly create the 'suspend-event-channel' node as writable
node.

However, libxl creates the 'device' node as read-only by the guest and so
explicit creation of the 'suspend-event-channel' node is necessary to make
it usable. This unfortunately has the side-effect of making some old
Windows PV drivers [2] cease to function. This is because they scan the top
level 'device' node, find the 'suspend' node and expect it to contain the
usual sub-nodes describing a PV frontend. When this is found not to be the
case, enumeration ceases and (because the 'suspend' node is observed before
the 'vbd' node) no system disk is enumerated. Windows will then crash with
bugcheck code 0x7B.

This patch adds a boolean 'suspend_event_channel' field into
libxl_create_info to control whether the xenstore node is created and a
similarly named option in xl.cfg which, for compatibility with previous
libxl behaviour, defaults to true. It also makes the top level device node
writable, as xend did, and updates xenstore-paths.pandoc to say that the
suspend event channel node may not exist and that the guest may create it
if it does not exist.

[1] 
https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=docs/misc/xenstore-paths.pandoc;hb=HEAD#l177
[2] 
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/5/html/para-virtualized_windows_drivers_guide/sect-para-virtualized_windows_drivers_guide-installing_and_configuring_the_para_virtualized_drivers-installing_the_para_virtualized_drivers

NOTE: While adding the new LIBXL_HAVE_CREATEINFO_SUSPEND_EVENT_CHANNEL
  definition into libxl.h, this patch corrects the previous stanza
  which erroneously implies libxl_domain_create_info is a function.

Signed-off-by: Paul Durrant 
---
Cc: Ian Jackson 
Cc: Wei Liu 
Cc: Andrew Cooper 
Cc: George Dunlap 
Cc: Jan Beulich 
Cc: Julien Grall 
Cc: Stefano Stabellini 
Cc: Anthony PERARD 

v3:
 - Actually define LIBXL_HAVE_CREATEINFO_SUSPEND_EVENT_CHANNEL as well
   as commenting on it

v2:
 - Update xenstore-paths.pandoc and squash patch #3
---
 docs/man/xl.cfg.5.pod.in|  7 +++
 docs/misc/xenstore-paths.pandoc |  7 ---
 tools/libxl/libxl.h | 14 +-
 tools/libxl/libxl_create.c  | 14 ++
 tools/libxl/libxl_types.idl |  1 +
 tools/xl/xl_parse.c |  3 +++
 6 files changed, 38 insertions(+), 8 deletions(-)

diff --git a/docs/man/xl.cfg.5.pod.in b/docs/man/xl.cfg.5.pod.in
index 0cad561375..5f476f1e1d 100644
--- a/docs/man/xl.cfg.5.pod.in
+++ b/docs/man/xl.cfg.5.pod.in
@@ -668,6 +668,13 @@ file.
 
 =back
 
+=item B
+
+Create the xenstore path for the domain's suspend event channel. The
+existence of this path can cause problems with older PV drivers running
+in the guest. If this option is not specified then it will default to
+B.
+
 =back
 
 =head2 Devices
diff --git a/docs/misc/xenstore-paths.pandoc b/docs/misc/xenstore-paths.pandoc
index e2ab5da54e..a8eecdb7ed 100644
--- a/docs/misc/xenstore-paths.pandoc
+++ b/docs/misc/xenstore-paths.pandoc
@@ -176,10 +176,11 @@ The size of the video RAM this domain is configured with.
 
  ~/device/suspend/event-channel = ""|EVTCHN [w]
 
-The domain's suspend event channel. The toolstack will create this
-path with an empty value which the guest may choose to overwrite.
+The domain's suspend event channel. The toolstack may create this
+path with an empty value which the guest may choose to overwrite. If
+the path does not exist then the guest may create it.
 
-If the guest overwrites this, it will be with the number of an unbound
+If the guest writes this, it will be with the number of an unbound
 event channel port it has acquired.  The toolstack is expected to use
 an interdomain bind, and then, when it wishes to ask the guest to
 suspend, to signal the event channel.
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 35e13428b2..b97823823b 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -1272,10 +1272,22 @@ void libxl_mac_copy(libxl_ctx *ctx, libxl_mac *dst, 
const libxl_mac *src);
  * LIBXL_HAVE_CREATEINFO_DOMID
  *
  * libxl_domain_create_new() and libxl_domain_create_restore() will use
- * a domid specified in libxl_domain_create_info().
+ * a domid specified in libxl_domain_create_info.
  */
 #define LIBXL_HAVE_CREATEINFO_DOMID
 
+/*
+ * LIBXL_HAVE_CREATEINFO_SUSPEND_EVENT_CHANNEL
+ *
+ * libxl_domain_create_info contains a boolean 'suspend_event_channel'
+ * value to control whether the xenstore path:
+ *
+ * 

[Xen-devel] [PATCH v3 0/2] PV driver compatibility fixes

2020-03-13 Thread paul
From: Paul Durrant 

Paul Durrant (2):
  libxl: create domain 'error' node in xenstore
  libxl: make creation of xenstore 'suspend event channel' node
optional...

 docs/man/xl.cfg.5.pod.in|  7 +++
 docs/misc/xenstore-paths.pandoc | 12 +---
 tools/libxl/libxl.h | 14 +-
 tools/libxl/libxl_create.c  | 17 +
 tools/libxl/libxl_types.idl |  1 +
 tools/xl/xl_parse.c |  3 +++
 6 files changed, 46 insertions(+), 8 deletions(-)

-- 
2.20.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v3 1/2] libxl: create domain 'error' node in xenstore

2020-03-13 Thread paul
From: Paul Durrant 

Several PV drivers (both historically and currently [1]) report errors
by writing text into /local/domain/$DOMID/error. This patch creates the
node in libxl and makes it writable by the domain, and also adds some
text into xenstore-paths.pandoc to state what the node is for.

[1] 
https://xenbits.xen.org/gitweb/?p=pvdrivers/win/xenvif.git;a=blob;f=src/xenvif/frontend.c;hb=HEAD#l459

Signed-off-by: Paul Durrant 
Acked-by: Ian Jackson 
---
Cc: Andrew Cooper 
Cc: George Dunlap 
Cc: Jan Beulich 
Cc: Julien Grall 
Cc: Konrad Rzeszutek Wilk 
Cc: Stefano Stabellini 
Cc: Wei Liu 
Cc: Anthony PERARD 
---
 docs/misc/xenstore-paths.pandoc | 5 +
 tools/libxl/libxl_create.c  | 3 +++
 2 files changed, 8 insertions(+)

diff --git a/docs/misc/xenstore-paths.pandoc b/docs/misc/xenstore-paths.pandoc
index 0a6b36146e..e2ab5da54e 100644
--- a/docs/misc/xenstore-paths.pandoc
+++ b/docs/misc/xenstore-paths.pandoc
@@ -539,6 +539,11 @@ address written in one of these paths to, for example, 
establish a VNC
 session to the guest (although clearly some level of trust is placed
 in the value supplied by the guest in this case).
 
+ ~/error [w]
+
+A domain writable path used by some PV drivers to pass error messages
+to the toolstack.
+
 ### Paths private to the toolstack
 
  ~/device-model/$DOMID/state [w]
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 7891fae426..fb7b3999ae 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -797,6 +797,9 @@ retry_transaction:
 libxl__xs_mknod(gc, t,
 GCSPRINTF("%s/attr", dom_path),
 rwperm, ARRAY_SIZE(rwperm));
+libxl__xs_mknod(gc, t,
+GCSPRINTF("%s/error", dom_path),
+rwperm, ARRAY_SIZE(rwperm));
 
 if (libxl_defbool_val(info->driver_domain)) {
 /*
-- 
2.20.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] kconfig: expose all{yes,no}config targets

2020-03-13 Thread Wei Liu
On Fri, Mar 13, 2020 at 12:05:01PM +0100, Jan Beulich wrote:
> Without having them at least at the xen/Makefile level they're (close
> to?) inaccessible. As I'm uncertain about their utility at the top
> level, I'm leaving it at that for now.
> 
> Signed-off-by: Jan Beulich 

Acked-by: Wei Liu 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] tools/helpers: xen-init-dom0: Mark clear_domid_history() static

2020-03-13 Thread Paul Durrant
> -Original Message-
> From: jul...@xen.org 
> Sent: 12 March 2020 20:24
> To: xen-devel@lists.xenproject.org
> Cc: Julien Grall ; Ian Jackson 
> ; Wei Liu ;
> p...@xen.org
> Subject: [PATCH] tools/helpers: xen-init-dom0: Mark clear_domid_history() 
> static
> 
> From: Julien Grall 
> 
> xen-init-dom0 is a standalone binary, so all the functions but the
> main() should be static.
> 
> Signed-off-by: Julien Grall 
> Cc: p...@xen.org

Reviewed-by: Paul Durrant 


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 1/2] xen/rwlocks: call preempt_disable() when taking a rwlock

2020-03-13 Thread Jürgen Groß

On 13.03.20 12:39, Julien Grall wrote:

Hi Juergen,

On 13/03/2020 11:23, Jürgen Groß wrote:

On 13.03.20 11:40, Julien Grall wrote:

Hi Juergen,

On 13/03/2020 10:15, Jürgen Groß wrote:

On 13.03.20 11:02, Julien Grall wrote:

Hi Juergen,

On 13/03/2020 08:05, Juergen Gross wrote:

Similar to spinlocks preemption should be disabled while holding a
rwlock.

Signed-off-by: Juergen Gross 
---
  xen/include/xen/rwlock.h | 18 +-
  1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/xen/include/xen/rwlock.h b/xen/include/xen/rwlock.h
index 1c221dd0d9..4ee341a182 100644
--- a/xen/include/xen/rwlock.h
+++ b/xen/include/xen/rwlock.h
@@ -2,6 +2,7 @@
  #define __RWLOCK_H__
  #include 
+#include 
  #include 
  #include 
@@ -57,10 +58,12 @@ static inline int _read_trylock(rwlock_t *lock)
  cnts = atomic_read(>cnts);
  if ( likely(_can_read_lock(cnts)) )
  {


If you get preempted here, then it means the check below is likely 
going to fail. So I think it would be best to disable preemption 
before, to give more chance to succeed.


As preemption probability at this very point should be much lower than
that of held locks I think that is optimizing the wrong path.


Why so? Lock contention should be fairly limited or you already have 
a problem on your system. So preemption is more likely.


Today probability of preemption is 0.


I am aware of that...



Even with preemption added the probability to be preempted in a sequence
of about a dozen instructions is _very_ low.


... but I am not convinced of the low probability here.






I'm not
opposed doing the modification you are requesting, but would like to
hear a second opinion on that topic, especially as I'd need to add
another preempt_enable() call when following your advice.


I don't really see the problem with adding a new preemption_enable() 
call. But the code can also be reworked to have only one call...


Its the dynamical path I'm speaking of. Accessing a local cpu variable
is not that cheap, and in case we add preemption in the future
preempt_enable() will become even more expensive.


Do you realize that the lock is most likely be uncontented? And if it 
were, the caller would likely not spin in a tight loop, otherwise it 
would have used read_lock().


So until you proved me otherwise (with numbers), this is 
micro-optimization that is not going to be seen in a workload.


Fine, in case you feeling so strong about that, I'll change it.


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 1/2] xen/rwlocks: call preempt_disable() when taking a rwlock

2020-03-13 Thread Julien Grall

Hi Juergen,

On 13/03/2020 11:23, Jürgen Groß wrote:

On 13.03.20 11:40, Julien Grall wrote:

Hi Juergen,

On 13/03/2020 10:15, Jürgen Groß wrote:

On 13.03.20 11:02, Julien Grall wrote:

Hi Juergen,

On 13/03/2020 08:05, Juergen Gross wrote:

Similar to spinlocks preemption should be disabled while holding a
rwlock.

Signed-off-by: Juergen Gross 
---
  xen/include/xen/rwlock.h | 18 +-
  1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/xen/include/xen/rwlock.h b/xen/include/xen/rwlock.h
index 1c221dd0d9..4ee341a182 100644
--- a/xen/include/xen/rwlock.h
+++ b/xen/include/xen/rwlock.h
@@ -2,6 +2,7 @@
  #define __RWLOCK_H__
  #include 
+#include 
  #include 
  #include 
@@ -57,10 +58,12 @@ static inline int _read_trylock(rwlock_t *lock)
  cnts = atomic_read(>cnts);
  if ( likely(_can_read_lock(cnts)) )
  {


If you get preempted here, then it means the check below is likely 
going to fail. So I think it would be best to disable preemption 
before, to give more chance to succeed.


As preemption probability at this very point should be much lower than
that of held locks I think that is optimizing the wrong path.


Why so? Lock contention should be fairly limited or you already have a 
problem on your system. So preemption is more likely.


Today probability of preemption is 0.


I am aware of that...



Even with preemption added the probability to be preempted in a sequence
of about a dozen instructions is _very_ low.


... but I am not convinced of the low probability here.






I'm not
opposed doing the modification you are requesting, but would like to
hear a second opinion on that topic, especially as I'd need to add
another preempt_enable() call when following your advice.


I don't really see the problem with adding a new preemption_enable() 
call. But the code can also be reworked to have only one call...


Its the dynamical path I'm speaking of. Accessing a local cpu variable
is not that cheap, and in case we add preemption in the future
preempt_enable() will become even more expensive.


Do you realize that the lock is most likely be uncontented? And if it 
were, the caller would likely not spin in a tight loop, otherwise it 
would have used read_lock().


So until you proved me otherwise (with numbers), this is 
micro-optimization that is not going to be seen in a workload.


Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [XEN PATCH] libxl: fix cleanup bug in initiate_domain_create()

2020-03-13 Thread Ian Jackson
Paweł Marczewski writes ("[XEN PATCH] libxl: fix cleanup bug in 
initiate_domain_create()"):
> In case of errors, we immediately call domcreate_complete()
> which cleans up the console_xswait object. Make sure it is initialized
> before we start cleanup.
> 
> Signed-off-by: Paweł Marczewski 

Reviewed-by: Ian Jackson 

I will push this in a moment.

Thanks,
Ian.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] Ping: [PATCH v2] AMD/IOMMU: fix off-by-one in amd_iommu_get_paging_mode() callers

2020-03-13 Thread Jan Beulich
On 03.03.2020 12:02, Jan Beulich wrote:
> amd_iommu_get_paging_mode() expects a count, not a "maximum possible"
> value. Prior to b4f042236ae0 dropping the reference, the use of our mis-
> named "max_page" in amd_iommu_domain_init() may have lead to such a
> misunderstanding. In an attempt to avoid such confusion in the future,
> rename the function's parameter and - while at it - convert it to an
> inline function.
> 
> Also replace a literal 4 by an expression tying it to a wider use
> constant, just like amd_iommu_quarantine_init() does.
> 
> Fixes: ea38867831da ("x86 / iommu: set up a scratch page in the quarantine 
> domain")
> Fixes: b4f042236ae0 ("AMD/IOMMU: Cease using a dynamic height for the IOMMU 
> pagetables")
> Signed-off-by: Jan Beulich 
> ---
> v2: Convert amd_iommu_get_paging_mode() itself to inline function,
> changing itss parameter's name.

Ping? Anything else needed here, beyond addressing your v1 comments?

Thanks, Jan

> ---
> Note: I'm not at the same time adding error checking here, despite
>   amd_iommu_get_paging_mode() possibly returning one, as I think
>   that's a sufficiently orthogonal aspect.
> 
> --- a/xen/drivers/passthrough/amd/iommu.h
> +++ b/xen/drivers/passthrough/amd/iommu.h
> @@ -218,7 +218,6 @@ int amd_iommu_init_late(void);
>  int amd_iommu_update_ivrs_mapping_acpi(void);
>  int iov_adjust_irq_affinities(void);
>  
> -int amd_iommu_get_paging_mode(unsigned long entries);
>  int amd_iommu_quarantine_init(struct domain *d);
>  
>  /* mapping functions */
> @@ -341,6 +340,22 @@ static inline unsigned long region_to_pa
>  return (PAGE_ALIGN(addr + size) - (addr & PAGE_MASK)) >> PAGE_SHIFT;
>  }
>  
> +static inline int amd_iommu_get_paging_mode(unsigned long max_frames)
> +{
> +int level = 1;
> +
> +BUG_ON(!max_frames);
> +
> +while ( max_frames > PTE_PER_TABLE_SIZE )
> +{
> +max_frames = PTE_PER_TABLE_ALIGN(max_frames) >> PTE_PER_TABLE_SHIFT;
> +if ( ++level > 6 )
> +return -ENOMEM;
> +}
> +
> +return level;
> +}
> +
>  static inline struct page_info *alloc_amd_iommu_pgtable(void)
>  {
>  struct page_info *pg = alloc_domheap_page(NULL, 0);
> --- a/xen/drivers/passthrough/amd/iommu_map.c
> +++ b/xen/drivers/passthrough/amd/iommu_map.c
> @@ -445,9 +445,9 @@ int amd_iommu_reserve_domain_unity_map(s
>  int __init amd_iommu_quarantine_init(struct domain *d)
>  {
>  struct domain_iommu *hd = dom_iommu(d);
> -unsigned long max_gfn =
> -PFN_DOWN((1ul << DEFAULT_DOMAIN_ADDRESS_WIDTH) - 1);
> -unsigned int level = amd_iommu_get_paging_mode(max_gfn);
> +unsigned long end_gfn =
> +1ul << (DEFAULT_DOMAIN_ADDRESS_WIDTH - PAGE_SHIFT);
> +unsigned int level = amd_iommu_get_paging_mode(end_gfn);
>  struct amd_iommu_pte *table;
>  
>  if ( hd->arch.root_table )
> --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
> +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
> @@ -228,22 +228,6 @@ static int __must_check allocate_domain_
>  return rc;
>  }
>  
> -int amd_iommu_get_paging_mode(unsigned long entries)
> -{
> -int level = 1;
> -
> -BUG_ON( !entries );
> -
> -while ( entries > PTE_PER_TABLE_SIZE )
> -{
> -entries = PTE_PER_TABLE_ALIGN(entries) >> PTE_PER_TABLE_SHIFT;
> -if ( ++level > 6 )
> -return -ENOMEM;
> -}
> -
> -return level;
> -}
> -
>  static int amd_iommu_domain_init(struct domain *d)
>  {
>  struct domain_iommu *hd = dom_iommu(d);
> @@ -256,8 +240,10 @@ static int amd_iommu_domain_init(struct
>   *   physical address space we give it, but this isn't known yet so use 4
>   *   unilaterally.
>   */
> -hd->arch.paging_mode = is_hvm_domain(d)
> -? 4 : amd_iommu_get_paging_mode(get_upper_mfn_bound());
> +hd->arch.paging_mode = amd_iommu_get_paging_mode(
> +is_hvm_domain(d)
> +? 1ul << (DEFAULT_DOMAIN_ADDRESS_WIDTH - PAGE_SHIFT)
> +: get_upper_mfn_bound() + 1);
>  
>  return 0;
>  }
> 
> ___
> Xen-devel mailing list
> Xen-devel@lists.xenproject.org
> https://lists.xenproject.org/mailman/listinfo/xen-devel
> 


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [XEN PATCH] libxl: fix cleanup bug in initiate_domain_create()

2020-03-13 Thread Paweł Marczewski
In case of errors, we immediately call domcreate_complete()
which cleans up the console_xswait object. Make sure it is initialized
before we start cleanup.

Signed-off-by: Paweł Marczewski 
---
This is a follow up to my previous patch, 'libxl: wait for console path before
firing console_available'. We discovered the bug when running integration tests
for Qubes OS (and verified that this patch helps). Sorry for the trouble.

 tools/libxl/libxl_create.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index ada942bc8d..fc36c4263d 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -1102,6 +1102,8 @@ static void initiate_domain_create(libxl__egc *egc,
 libxl_domain_config *const d_config = dcs->guest_config;
 const int restore_fd = dcs->restore_fd;
 
+libxl__xswait_init(>console_xswait);
+
 domid = dcs->domid;
 libxl__domain_build_state_init(>build_state);
 
@@ -1153,8 +1155,6 @@ static void initiate_domain_create(libxl__egc *egc,
 if (ret)
 goto error_out;
 
-libxl__xswait_init(>console_xswait);
-
 if (restore_fd >= 0 || dcs->soft_reset) {
 LOGD(DEBUG, domid, "restoring, not running bootloader");
 domcreate_bootloader_done(egc, >bl, 0);
-- 
2.24.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4] x86/cpu: Sync any remaining RCU callbacks before CPU up/down

2020-03-13 Thread Jan Beulich
On 10.03.2020 19:06, Igor Druzhinin wrote:
> During CPU down operation RCU callbacks are scheduled to finish
> off some actions later as soon as CPU is fully dead (the same applies
> to CPU up operation in case error path is taken). If in the same grace
> period another CPU up operation is performed on the same CPU, RCU callback
> will be called later on a CPU in a potentially wrong (already up again
> instead of still being down) state leading to eventual state inconsistency
> and/or crash.
> 
> In order to avoid it - flush RCU callbacks explicitly before starting the
> next CPU up/down operation.
> 
> Reviewed-by: Juergen Gross 
> Signed-off-by: Igor Druzhinin 
> ---
> This got discovered trying to resume PV shim with multiple vCPUs on AMD
> machine (where park_offline_cpus == 0). RCU callback responsible for
> freeing percpu area on CPU offline got finally called after CPU went
> online again as the guest performed regular vCPU offline/online operations
> on resume.
> 
> Note: this patch requires RCU series v4 from Juergen to be applied -
> https://lists.xenproject.org/archives/html/xen-devel/2020-03/msg00668.html

I was about to apply the patch yesterday (I think) when I stumbled
across this note. Is this actually still true? If so, would you
mind helping me see the dependency?

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 1/2] xen/rwlocks: call preempt_disable() when taking a rwlock

2020-03-13 Thread Jürgen Groß

On 13.03.20 11:40, Julien Grall wrote:

Hi Juergen,

On 13/03/2020 10:15, Jürgen Groß wrote:

On 13.03.20 11:02, Julien Grall wrote:

Hi Juergen,

On 13/03/2020 08:05, Juergen Gross wrote:

Similar to spinlocks preemption should be disabled while holding a
rwlock.

Signed-off-by: Juergen Gross 
---
  xen/include/xen/rwlock.h | 18 +-
  1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/xen/include/xen/rwlock.h b/xen/include/xen/rwlock.h
index 1c221dd0d9..4ee341a182 100644
--- a/xen/include/xen/rwlock.h
+++ b/xen/include/xen/rwlock.h
@@ -2,6 +2,7 @@
  #define __RWLOCK_H__
  #include 
+#include 
  #include 
  #include 
@@ -57,10 +58,12 @@ static inline int _read_trylock(rwlock_t *lock)
  cnts = atomic_read(>cnts);
  if ( likely(_can_read_lock(cnts)) )
  {


If you get preempted here, then it means the check below is likely 
going to fail. So I think it would be best to disable preemption 
before, to give more chance to succeed.


As preemption probability at this very point should be much lower than
that of held locks I think that is optimizing the wrong path.


Why so? Lock contention should be fairly limited or you already have a 
problem on your system. So preemption is more likely.


Today probability of preemption is 0.

Even with preemption added the probability to be preempted in a sequence
of about a dozen instructions is _very_ low.




I'm not
opposed doing the modification you are requesting, but would like to
hear a second opinion on that topic, especially as I'd need to add
another preempt_enable() call when following your advice.


I don't really see the problem with adding a new preemption_enable() 
call. But the code can also be reworked to have only one call...


Its the dynamical path I'm speaking of. Accessing a local cpu variable
is not that cheap, and in case we add preemption in the future
preempt_enable() will become even more expensive.


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v5 1/4] xen/rcu: don't use stop_machine_run() for rcu_barrier()

2020-03-13 Thread Jan Beulich
On 13.03.2020 12:18, Jürgen Groß wrote:
> On 13.03.20 12:02, Julien Grall wrote:
>> On 12/03/2020 08:28, Juergen Gross wrote:
>>> +    for ( ;; )
>>> +    {
>>> +    if ( !atomic_read(_count) && get_cpu_maps() )
>>> +    {
>>> +    n_cpus = num_online_cpus();
>>> +
>>> +    if ( atomic_cmpxchg(_count, 0, n_cpus + 1) == 0 )
>>> +    break;
>>> +
>>> +    put_cpu_maps();
>>> +    }
>>> +
>>> +    process_pending_softirqs();
>>> +    cpu_relax();
>>> +    }
>>> +
>>> +    atomic_set(_count, n_cpus);
>>> +    cpumask_raise_softirq(_online_map, RCU_SOFTIRQ);
>>> +
>>> +    put_cpu_maps();
>>
>> If you put the CPU maps, wouldn't it be possible to have a CPU turned 
>> off? If not, can you add a comment in the code why this is safe?
> 
> Yes, you are right. This might be possible, even if rather
> unlikely as a cpu being removed has to be in idle already, so
> the pending softirq should be picked up rather fast.

I think that's not the main aspect. The CPU to be removed may
already be spinning in cpu_hotplug_begin() (and may in particular
also already be past the rcu_barrier() that Igor's patch is going
to put there).

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v5 1/4] xen/rcu: don't use stop_machine_run() for rcu_barrier()

2020-03-13 Thread Jürgen Groß

On 13.03.20 12:02, Julien Grall wrote:



On 12/03/2020 08:28, Juergen Gross wrote:

Today rcu_barrier() is calling stop_machine_run() to synchronize all
physical cpus in order to ensure all pending rcu calls have finished
when returning.

As stop_machine_run() is using tasklets this requires scheduling of
idle vcpus on all cpus imposing the need to call rcu_barrier() on idle
cpus only in case of core scheduling being active, as otherwise a
scheduling deadlock would occur.

There is no need at all to do the syncing of the cpus in tasklets, as
rcu activity is started in __do_softirq() called whenever softirq
activity is allowed. So rcu_barrier() can easily be modified to use
softirq for synchronization of the cpus no longer requiring any
scheduling activity.

As there already is a rcu softirq reuse that for the synchronization.

Remove the barrier element from struct rcu_data as it isn't used.

Finally switch rcu_barrier() to return void as it now can never fail.

Partially-based-on-patch-by: Igor Druzhinin 
Signed-off-by: Juergen Gross 
---
V2:
- add recursion detection

V3:
- fix races (Igor Druzhinin)

V5:
- rename done_count to pending_count (Jan Beulich)
- fix race (Jan Beulich)
---
  xen/common/rcupdate.c  | 92 
--

  xen/include/xen/rcupdate.h |  2 +-
  2 files changed, 66 insertions(+), 28 deletions(-)

diff --git a/xen/common/rcupdate.c b/xen/common/rcupdate.c
index 03d84764d2..c5ef6acb1e 100644
--- a/xen/common/rcupdate.c
+++ b/xen/common/rcupdate.c
@@ -83,7 +83,6 @@ struct rcu_data {
  struct rcu_head **donetail;
  long    blimit;   /* Upper limit on a processed 
batch */

  int cpu;
-    struct rcu_head barrier;
  long    last_rs_qlen; /* qlen during the last 
resched */

  /* 3) idle CPUs handling */
@@ -91,6 +90,7 @@ struct rcu_data {
  bool idle_timer_active;
  bool    process_callbacks;
+    bool    barrier_active;
  };
  /*
@@ -143,51 +143,82 @@ static int qhimark = 1;
  static int qlowmark = 100;
  static int rsinterval = 1000;
-struct rcu_barrier_data {
-    struct rcu_head head;
-    atomic_t *cpu_count;
-};
+/*
+ * rcu_barrier() handling:
+ * cpu_count holds the number of cpu required to finish barrier 
handling.


NIT: the number of cpus (I think)


+ * pending_count is initialized to nr_cpus + 1.
+ * Cpus are synchronized via softirq mechanism. rcu_barrier() is 
regarded to
+ * be active if pending_count is not zero. In case rcu_barrier() is 
called on
+ * multiple cpus it is enough to check for pending_count being not 
zero on entry
+ * and to call process_pending_softirqs() in a loop until 
pending_count drops to

+ * zero, before starting the new rcu_barrier() processing.
+ * In order to avoid hangs when rcu_barrier() is called multiple 
times on the

+ * same cpu in fast sequence and a slave cpu couldn't drop out of the
+ * barrier handling fast enough a second counter pending_count is 
needed.


As an aside question, don't we miss a memory barrier in 
rcu_barrier_callback or rcu_barrier_action()? This barrier would ensure 
that the RCU changes have been seen before we tell the "master" CPU we 
are done.


Sounds like a sensible idea.



+ * The rcu_barrier() invoking cpu will wait until pending_count 
reaches 1
+ * (meaning that all cpus have finished processing the barrier) and 
then will

+ * reset pending_count to 0 to enable entering rcu_barrier() again.
+ */
+static atomic_t cpu_count = ATOMIC_INIT(0);
+static atomic_t pending_count = ATOMIC_INIT(0);
  static void rcu_barrier_callback(struct rcu_head *head)
  {
-    struct rcu_barrier_data *data = container_of(
-    head, struct rcu_barrier_data, head);
-    atomic_inc(data->cpu_count);
+    atomic_dec(_count);
  }
-static int rcu_barrier_action(void *_cpu_count)
+static void rcu_barrier_action(void)
  {
-    struct rcu_barrier_data data = { .cpu_count = _cpu_count };
-
-    ASSERT(!local_irq_is_enabled());
-    local_irq_enable();
+    struct rcu_head head;
  /*
   * When callback is executed, all previously-queued RCU work on 
this CPU
- * is completed. When all CPUs have executed their callback, 
data.cpu_count

- * will have been incremented to include every online CPU.
+ * is completed. When all CPUs have executed their callback, 
cpu_count

+ * will have been decremented to 0.
   */
-    call_rcu(, rcu_barrier_callback);
+    call_rcu(, rcu_barrier_callback);
-    while ( atomic_read(data.cpu_count) != num_online_cpus() )
+    while ( atomic_read(_count) )
  {
  process_pending_softirqs();
  cpu_relax();
  }
-    local_irq_disable();
-
-    return 0;
+    atomic_dec(_count);
  }
-/*
- * As rcu_barrier() is using stop_machine_run() it is allowed to be 
used in

- * idle context only (see comment for stop_machine_run()).
- */
-int rcu_barrier(void)
+void rcu_barrier(void)
  {
-    atomic_t cpu_count = ATOMIC_INIT(0);
-    return 

[Xen-devel] [PATCH v3 8/9] x86/HVM: reduce hvm.h include dependencies

2020-03-13 Thread Jan Beulich
Drop #include-s not needed by the header itself, and add smaller scope
ones instead. Put the ones needed into whichever other files actually
need them.

Signed-off-by: Jan Beulich 
Reviewed-by: Paul Durrant 
Reviewed-by: Kevin Tian 
---
v2: Also make things build with PV_SHIM_EXCLUSIVE=y.

--- a/xen/include/asm-x86/hvm/emulate.h
+++ b/xen/include/asm-x86/hvm/emulate.h
@@ -13,6 +13,7 @@
 #define __ASM_X86_HVM_EMULATE_H__
 
 #include 
+#include 
 #include 
 #include 
 
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -20,12 +20,11 @@
 #ifndef __ASM_X86_HVM_HVM_H__
 #define __ASM_X86_HVM_HVM_H__
 
+#include 
+#include 
 #include 
 #include 
 #include 
-#include 
-#include 
-#include 
 
 #ifdef CONFIG_HVM_FEP
 /* Permit use of the Forced Emulation Prefix in HVM guests */
@@ -326,6 +325,7 @@ int hvm_debug_op(struct vcpu *v, int32_t
 void hvm_toggle_singlestep(struct vcpu *v);
 void hvm_fast_singlestep(struct vcpu *v, uint16_t p2midx);
 
+struct npfec;
 int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
   struct npfec npfec);
 
--- a/xen/include/asm-x86/hvm/vmx/vmcs.h
+++ b/xen/include/asm-x86/hvm/vmx/vmcs.h
@@ -18,6 +18,8 @@
 #ifndef __ASM_X86_HVM_VMX_VMCS_H__
 #define __ASM_X86_HVM_VMX_VMCS_H__
 
+#include 
+
 extern void vmcs_dump_vcpu(struct vcpu *v);
 extern void setup_vmcs_dump(void);
 extern int  vmx_cpu_up_prepare(unsigned int cpu);
--- a/xen/include/asm-x86/hvm/vpt.h
+++ b/xen/include/asm-x86/hvm/vpt.h
@@ -21,6 +21,7 @@
 
 #include 
 #include 
+#include 
 #include 
 
 /*


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v3 9/9] x86: reduce mce.h include dependencies

2020-03-13 Thread Jan Beulich
Drop the public header #include as not needed by the header itself. Add
one that was missing, and move all inside the inclusion guard.

Signed-off-by: Jan Beulich 

--- a/xen/include/asm-x86/mce.h
+++ b/xen/include/asm-x86/mce.h
@@ -1,8 +1,9 @@
-#include 
-#include 
 #ifndef _XEN_X86_MCE_H
 #define _XEN_X86_MCE_H
 
+#include 
+#include 
+
 /*
  * Emulate 2 banks for guest
  * Bank0: reserved for 'bank0 quirk' occur at some very old processors:
@@ -32,6 +33,9 @@ struct vmce {
 struct vmce_bank bank[GUEST_MC_BANK_NUM];
 };
 
+struct domain;
+struct vcpu;
+
 /* Guest vMCE MSRs virtualization */
 extern void vmce_init_vcpu(struct vcpu *);
 extern int vmce_restore_vcpu(struct vcpu *, const struct hvm_vmce_vcpu *);


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v3 7/9] x86/HVM: reduce io.h include dependencies

2020-03-13 Thread Jan Beulich
Drop #include-s not needed by the header itself as well as one include
of the header which isn't needed. Put the one needed into the file
actually requiring it.

Signed-off-by: Jan Beulich 
Reviewed-by: Kevin Tian 

--- a/xen/include/asm-x86/hvm/io.h
+++ b/xen/include/asm-x86/hvm/io.h
@@ -19,12 +19,8 @@
 #ifndef __ASM_X86_HVM_IO_H__
 #define __ASM_X86_HVM_IO_H__
 
-#include 
 #include 
-#include 
-#include 
 #include 
-#include 
 
 #define NR_IO_HANDLERS 32
 
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -26,6 +26,7 @@
 #include 
 #include 
 #include 
+#include 
 
 enum hvm_io_completion {
 HVMIO_no_completion,
--- a/xen/include/asm-x86/hvm/vmx/vmcs.h
+++ b/xen/include/asm-x86/hvm/vmx/vmcs.h
@@ -18,8 +18,6 @@
 #ifndef __ASM_X86_HVM_VMX_VMCS_H__
 #define __ASM_X86_HVM_VMX_VMCS_H__
 
-#include 
-
 extern void vmcs_dump_vcpu(struct vcpu *v);
 extern void setup_vmcs_dump(void);
 extern int  vmx_cpu_up_prepare(unsigned int cpu);


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v3 4/9] x86/HVM: reduce vpic.h include dependencies

2020-03-13 Thread Jan Beulich
Drop an #include not needed by the header itself.

Signed-off-by: Jan Beulich 

--- a/xen/include/asm-x86/hvm/vpic.h
+++ b/xen/include/asm-x86/hvm/vpic.h
@@ -27,7 +27,8 @@
 #ifndef __ASM_X86_HVM_VPIC_H__
 #define __ASM_X86_HVM_VPIC_H__
 
-#include 
+struct domain;
+struct vcpu;
 
 void vpic_irq_positive_edge(struct domain *d, int irq);
 void vpic_irq_negative_edge(struct domain *d, int irq);


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v3 6/9] x86/HVM: reduce vlapic.h include dependencies

2020-03-13 Thread Jan Beulich
Drop #include-s not needed by the header itself.

Signed-off-by: Jan Beulich 

--- a/xen/include/asm-x86/hvm/vlapic.h
+++ b/xen/include/asm-x86/hvm/vlapic.h
@@ -21,8 +21,6 @@
 #define __ASM_X86_HVM_VLAPIC_H__
 
 #include 
-#include 
-#include 
 #include 
 
 #define vcpu_vlapic(x)   (&(x)->arch.hvm.vlapic)


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v3 5/9] x86/HVM: reduce vioapic.h include dependencies

2020-03-13 Thread Jan Beulich
Drop an #include not needed by the header itself. While verifying the
header (now) builds standalone, I noticed an omission in a public header
which gets taken care of here as well.

Signed-off-by: Jan Beulich 

--- a/xen/include/asm-x86/hvm/vioapic.h
+++ b/xen/include/asm-x86/hvm/vioapic.h
@@ -25,7 +25,6 @@
 #define __ASM_X86_HVM_VIOAPIC_H__
 
 #include 
-#include 
 #include 
 
 #define VIOAPIC_VERSION_ID 0x11 /* IOAPIC version */
--- a/xen/include/public/arch-x86/hvm/save.h
+++ b/xen/include/public/arch-x86/hvm/save.h
@@ -26,6 +26,8 @@
 #ifndef __XEN_PUBLIC_HVM_SAVE_X86_H__
 #define __XEN_PUBLIC_HVM_SAVE_X86_H__
 
+#include "../../xen.h"
+
 /*
  * Save/restore header: general info about the save file.
  */


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v3 1/9] x86/HVM: reduce domain.h include dependencies

2020-03-13 Thread Jan Beulich
Drop #include-s not needed by the header itself. Put the ones needed
into whichever other files actually need them.

Signed-off-by: Jan Beulich 
Reviewed-by: Paul Durrant 
---
v3: Also make things build with all{yes,no}config.
v2: Also make things build with XSM=y.

--- a/xen/arch/x86/cpu/mcheck/vmce.c
+++ b/xen/arch/x86/cpu/mcheck/vmce.c
@@ -28,6 +28,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
--- a/xen/arch/x86/hvm/dm.c
+++ b/xen/arch/x86/hvm/dm.c
@@ -27,6 +27,8 @@
 
 #include 
 
+#include 
+
 struct dmop_args {
 domid_t domid;
 unsigned int nr_bufs;
--- a/xen/arch/x86/hvm/dom0_build.c
+++ b/xen/arch/x86/hvm/dom0_build.c
@@ -38,6 +38,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /*
  * Have the TSS cover the ISA port range, which makes it
--- a/xen/arch/x86/hvm/hpet.c
+++ b/xen/arch/x86/hvm/hpet.c
@@ -26,6 +26,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define domain_vhpet(x) (&(x)->arch.hvm.pl_time->vhpet)
 #define vcpu_vhpet(x)   (domain_vhpet((x)->domain))
--- a/xen/arch/x86/hvm/hypercall.c
+++ b/xen/arch/x86/hvm/hypercall.c
@@ -24,6 +24,9 @@
 
 #include 
 
+#include 
+#include 
+
 static long hvm_memory_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 {
 const struct vcpu *curr = current;
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -34,6 +34,7 @@
 #include 
 
 #include 
+#include 
 
 static void set_ioreq_server(struct domain *d, unsigned int id,
  struct hvm_ioreq_server *s)
--- a/xen/arch/x86/hvm/irq.c
+++ b/xen/arch/x86/hvm/irq.c
@@ -26,6 +26,7 @@
 #include 
 #include 
 #include 
+#include 
 
 bool hvm_domain_use_pirq(const struct domain *d, const struct pirq *pirq)
 {
--- a/xen/arch/x86/hvm/rtc.c
+++ b/xen/arch/x86/hvm/rtc.c
@@ -28,6 +28,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define USEC_PER_SEC100UL
 #define NS_PER_USEC 1000UL
--- a/xen/arch/x86/hvm/viridian/private.h
+++ b/xen/arch/x86/hvm/viridian/private.h
@@ -4,6 +4,7 @@
 #define X86_HVM_VIRIDIAN_PRIVATE_H
 
 #include 
+#include 
 
 int viridian_synic_wrmsr(struct vcpu *v, uint32_t idx, uint64_t val);
 int viridian_synic_rdmsr(const struct vcpu *v, uint32_t idx, uint64_t *val);
--- a/xen/arch/x86/hvm/vpt.c
+++ b/xen/arch/x86/hvm/vpt.c
@@ -22,6 +22,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define mode_is(d, name) \
 ((d)->arch.hvm.params[HVM_PARAM_TIMER_MODE] == HVMPTM_##name)
--- a/xen/arch/x86/msr.c
+++ b/xen/arch/x86/msr.c
@@ -28,6 +28,8 @@
 #include 
 #include 
 
+#include 
+
 DEFINE_PER_CPU(uint32_t, tsc_aux);
 
 struct msr_policy __read_mostly raw_msr_policy,
--- a/xen/arch/x86/pv/shim.c
+++ b/xen/arch/x86/pv/shim.c
@@ -35,6 +35,7 @@
 #include 
 
 #include 
+#include 
 
 #include 
 
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -76,6 +76,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
--- a/xen/common/vm_event.c
+++ b/xen/common/vm_event.c
@@ -29,6 +29,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /* for public/io/ring.h macros */
 #define xen_mb()   smp_mb()
--- a/xen/drivers/char/xen_pv_console.c
+++ b/xen/drivers/char/xen_pv_console.c
@@ -28,6 +28,7 @@
 #include 
 #include 
 
+#include 
 #include 
 
 static struct xencons_interface *cons_ring;
--- a/xen/include/asm-x86/hvm/domain.h
+++ b/xen/include/asm-x86/hvm/domain.h
@@ -20,20 +20,14 @@
 #ifndef __ASM_X86_HVM_DOMAIN_H__
 #define __ASM_X86_HVM_DOMAIN_H__
 
-#include 
-#include 
-#include 
-#include 
-#include 
+#include 
+#include 
+#include 
+
 #include 
-#include 
 #include 
 #include 
-#include 
-#include 
-#include 
-#include 
-#include 
+
 #include 
 
 struct hvm_ioreq_page {
--- a/xen/include/asm-x86/hvm/nestedhvm.h
+++ b/xen/include/asm-x86/hvm/nestedhvm.h
@@ -22,6 +22,7 @@
 #include  /* for uintNN_t */
 #include  /* for struct vcpu, struct domain */
 #include   /* for vcpu_nestedhvm */
+#include 
 
 enum nestedhvm_vmexits {
 NESTEDHVM_VMEXIT_ERROR = 0, /* inject VMEXIT w/ invalid VMCB */
--- a/xen/include/xsm/dummy.h
+++ b/xen/include/xsm/dummy.h
@@ -17,6 +17,7 @@
 
 #include 
 #include 
+#include 
 
 /* Cannot use BUILD_BUG_ON here because the expressions we check are not
  * considered constant at compile time. Instead, rely on constant propagation 
to
--- a/xen/xsm/flask/hooks.c
+++ b/xen/xsm/flask/hooks.c
@@ -28,7 +28,7 @@
 #include 
 #include 
 #include 
-
+#include 
 #include 
 
 #include 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v3 3/9] x86/HVM: reduce vpt.h include dependencies

2020-03-13 Thread Jan Beulich
Drop #include-s not needed by the header itself.

Signed-off-by: Jan Beulich 

--- a/xen/include/asm-x86/hvm/vpt.h
+++ b/xen/include/asm-x86/hvm/vpt.h
@@ -19,16 +19,9 @@
 #ifndef __ASM_X86_HVM_VPT_H__
 #define __ASM_X86_HVM_VPT_H__
 
-#include 
-#include 
-#include 
-#include 
-#include 
 #include 
 #include 
-#include 
-#include 
-#include 
+#include 
 
 /*
  * Abstract layer of periodic time, one short time.
@@ -145,6 +138,7 @@ struct pl_time {/* platform time */
 void pt_save_timer(struct vcpu *v);
 void pt_restore_timer(struct vcpu *v);
 int pt_update_irq(struct vcpu *v);
+struct hvm_intack;
 void pt_intr_post(struct vcpu *v, struct hvm_intack intack);
 void pt_migrate(struct vcpu *v);
 


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v3 2/9] x86/HVM: reduce vcpu.h include dependencies

2020-03-13 Thread Jan Beulich
Drop #include-s not needed by the header itself. Put the ones needed
into whichever other files actually need them.

Signed-off-by: Jan Beulich 
Reviewed-by: Paul Durrant 
---
v2: Re-base over changes to previous patch.

--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -7,6 +7,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -65,6 +65,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
--- a/xen/arch/x86/hvm/hypercall.c
+++ b/xen/arch/x86/hvm/hypercall.c
@@ -23,6 +23,7 @@
 #include 
 
 #include 
+#include 
 
 #include 
 #include 
--- a/xen/arch/x86/hvm/viridian/private.h
+++ b/xen/arch/x86/hvm/viridian/private.h
@@ -4,6 +4,7 @@
 #define X86_HVM_VIRIDIAN_PRIVATE_H
 
 #include 
+#include 
 #include 
 
 int viridian_synic_wrmsr(struct vcpu *v, uint32_t idx, uint64_t val);
--- a/xen/arch/x86/msr.c
+++ b/xen/arch/x86/msr.c
@@ -25,6 +25,7 @@
 #include 
 
 #include 
+#include 
 #include 
 #include 
 
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -20,9 +20,7 @@
 #define __ASM_X86_HVM_VCPU_H__
 
 #include 
-#include 
 #include 
-#include 
 #include 
 #include 
 #include 


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v3 0/9] x86: reduce include dependencies

2020-03-13 Thread Jan Beulich
In a number of cases I've noticed the x86 emulator, which is quite
slow to build especially with not very new gcc, to re-build when
having changed headers which I wouldn't have expected to be
included there in the first place. Hence I've gone through the
dependencies of that object file and tried to get rid of at least
some of the very odd dependencies there. (Some are being addressed
also be the separately sent mem-access and vm-event patches with a
similar subject.)

1: HVM: reduce domain.h include dependencies
2: HVM: reduce vcpu.h include dependencies
3: HVM: reduce vpt.h include dependencies
4: HVM: reduce vpic.h include dependencies
5: HVM: reduce vioapic.h include dependencies
6: HVM: reduce vlapic.h include dependencies
7: HVM: reduce io.h include dependencies
8: HVM: reduce hvm.h include dependencies
9: reduce mce.h include dependencies

v3: Address build issues in particular with XSM=y + PV_SHIM=y (patch
1 only).
v2: Address build issues in particular with XSM=y.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH] kconfig: expose all{yes,no}config targets

2020-03-13 Thread Jan Beulich
Without having them at least at the xen/Makefile level they're (close
to?) inaccessible. As I'm uncertain about their utility at the top
level, I'm leaving it at that for now.

Signed-off-by: Jan Beulich 

--- a/xen/Makefile
+++ b/xen/Makefile
@@ -277,7 +277,7 @@ $(foreach base,arch/x86/mm/guest_walk_%
arch/x86/mm/shadow/guest_%, \
 $(foreach ext,o i s,$(call build-intermediate,$(base).$(ext
 
-kconfig := oldconfig config menuconfig defconfig \
+kconfig := oldconfig config menuconfig defconfig allyesconfig allnoconfig \
nconfig xconfig gconfig savedefconfig listnewconfig olddefconfig \
randconfig $(notdir $(wildcard arch/$(SRCARCH)/configs/*_defconfig))
 .PHONY: $(kconfig)

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v5 1/4] xen/rcu: don't use stop_machine_run() for rcu_barrier()

2020-03-13 Thread Julien Grall



On 12/03/2020 08:28, Juergen Gross wrote:

Today rcu_barrier() is calling stop_machine_run() to synchronize all
physical cpus in order to ensure all pending rcu calls have finished
when returning.

As stop_machine_run() is using tasklets this requires scheduling of
idle vcpus on all cpus imposing the need to call rcu_barrier() on idle
cpus only in case of core scheduling being active, as otherwise a
scheduling deadlock would occur.

There is no need at all to do the syncing of the cpus in tasklets, as
rcu activity is started in __do_softirq() called whenever softirq
activity is allowed. So rcu_barrier() can easily be modified to use
softirq for synchronization of the cpus no longer requiring any
scheduling activity.

As there already is a rcu softirq reuse that for the synchronization.

Remove the barrier element from struct rcu_data as it isn't used.

Finally switch rcu_barrier() to return void as it now can never fail.

Partially-based-on-patch-by: Igor Druzhinin 
Signed-off-by: Juergen Gross 
---
V2:
- add recursion detection

V3:
- fix races (Igor Druzhinin)

V5:
- rename done_count to pending_count (Jan Beulich)
- fix race (Jan Beulich)
---
  xen/common/rcupdate.c  | 92 --
  xen/include/xen/rcupdate.h |  2 +-
  2 files changed, 66 insertions(+), 28 deletions(-)

diff --git a/xen/common/rcupdate.c b/xen/common/rcupdate.c
index 03d84764d2..c5ef6acb1e 100644
--- a/xen/common/rcupdate.c
+++ b/xen/common/rcupdate.c
@@ -83,7 +83,6 @@ struct rcu_data {
  struct rcu_head **donetail;
  longblimit;   /* Upper limit on a processed batch */
  int cpu;
-struct rcu_head barrier;
  longlast_rs_qlen; /* qlen during the last resched */
  
  /* 3) idle CPUs handling */

@@ -91,6 +90,7 @@ struct rcu_data {
  bool idle_timer_active;
  
  boolprocess_callbacks;

+boolbarrier_active;
  };
  
  /*

@@ -143,51 +143,82 @@ static int qhimark = 1;
  static int qlowmark = 100;
  static int rsinterval = 1000;
  
-struct rcu_barrier_data {

-struct rcu_head head;
-atomic_t *cpu_count;
-};
+/*
+ * rcu_barrier() handling:
+ * cpu_count holds the number of cpu required to finish barrier handling.


NIT: the number of cpus (I think)


+ * pending_count is initialized to nr_cpus + 1.
+ * Cpus are synchronized via softirq mechanism. rcu_barrier() is regarded to
+ * be active if pending_count is not zero. In case rcu_barrier() is called on
+ * multiple cpus it is enough to check for pending_count being not zero on 
entry
+ * and to call process_pending_softirqs() in a loop until pending_count drops 
to
+ * zero, before starting the new rcu_barrier() processing.
+ * In order to avoid hangs when rcu_barrier() is called multiple times on the
+ * same cpu in fast sequence and a slave cpu couldn't drop out of the
+ * barrier handling fast enough a second counter pending_count is needed.


As an aside question, don't we miss a memory barrier in 
rcu_barrier_callback or rcu_barrier_action()? This barrier would ensure 
that the RCU changes have been seen before we tell the "master" CPU we 
are done.



+ * The rcu_barrier() invoking cpu will wait until pending_count reaches 1
+ * (meaning that all cpus have finished processing the barrier) and then will
+ * reset pending_count to 0 to enable entering rcu_barrier() again.
+ */
+static atomic_t cpu_count = ATOMIC_INIT(0);
+static atomic_t pending_count = ATOMIC_INIT(0);
  
  static void rcu_barrier_callback(struct rcu_head *head)

  {
-struct rcu_barrier_data *data = container_of(
-head, struct rcu_barrier_data, head);
-atomic_inc(data->cpu_count);
+atomic_dec(_count);
  }
  
-static int rcu_barrier_action(void *_cpu_count)

+static void rcu_barrier_action(void)
  {
-struct rcu_barrier_data data = { .cpu_count = _cpu_count };
-
-ASSERT(!local_irq_is_enabled());
-local_irq_enable();
+struct rcu_head head;
  
  /*

   * When callback is executed, all previously-queued RCU work on this CPU
- * is completed. When all CPUs have executed their callback, data.cpu_count
- * will have been incremented to include every online CPU.
+ * is completed. When all CPUs have executed their callback, cpu_count
+ * will have been decremented to 0.
   */
-call_rcu(, rcu_barrier_callback);
+call_rcu(, rcu_barrier_callback);
  
-while ( atomic_read(data.cpu_count) != num_online_cpus() )

+while ( atomic_read(_count) )
  {
  process_pending_softirqs();
  cpu_relax();
  }
  
-local_irq_disable();

-
-return 0;
+atomic_dec(_count);
  }
  
-/*

- * As rcu_barrier() is using stop_machine_run() it is allowed to be used in
- * idle context only (see comment for stop_machine_run()).
- */
-int rcu_barrier(void)
+void rcu_barrier(void)
  {
-atomic_t cpu_count = ATOMIC_INIT(0);
-return stop_machine_run(rcu_barrier_action, _count, NR_CPUS);
+

Re: [Xen-devel] [PATCH 1/2] xen/rwlocks: call preempt_disable() when taking a rwlock

2020-03-13 Thread Julien Grall

Hi Juergen,

On 13/03/2020 10:15, Jürgen Groß wrote:

On 13.03.20 11:02, Julien Grall wrote:

Hi Juergen,

On 13/03/2020 08:05, Juergen Gross wrote:

Similar to spinlocks preemption should be disabled while holding a
rwlock.

Signed-off-by: Juergen Gross 
---
  xen/include/xen/rwlock.h | 18 +-
  1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/xen/include/xen/rwlock.h b/xen/include/xen/rwlock.h
index 1c221dd0d9..4ee341a182 100644
--- a/xen/include/xen/rwlock.h
+++ b/xen/include/xen/rwlock.h
@@ -2,6 +2,7 @@
  #define __RWLOCK_H__
  #include 
+#include 
  #include 
  #include 
@@ -57,10 +58,12 @@ static inline int _read_trylock(rwlock_t *lock)
  cnts = atomic_read(>cnts);
  if ( likely(_can_read_lock(cnts)) )
  {


If you get preempted here, then it means the check below is likely 
going to fail. So I think it would be best to disable preemption 
before, to give more chance to succeed.


As preemption probability at this very point should be much lower than
that of held locks I think that is optimizing the wrong path.


Why so? Lock contention should be fairly limited or you already have a 
problem on your system. So preemption is more likely.



I'm not
opposed doing the modification you are requesting, but would like to
hear a second opinion on that topic, especially as I'd need to add
another preempt_enable() call when following your advice.


I don't really see the problem with adding a new preemption_enable() 
call. But the code can also be reworked to have only one call...


Cheers,


--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 1/2] xen/rwlocks: call preempt_disable() when taking a rwlock

2020-03-13 Thread Jan Beulich
On 13.03.2020 11:15, Jürgen Groß wrote:
> On 13.03.20 11:02, Julien Grall wrote:
>> Hi Juergen,
>>
>> On 13/03/2020 08:05, Juergen Gross wrote:
>>> Similar to spinlocks preemption should be disabled while holding a
>>> rwlock.
>>>
>>> Signed-off-by: Juergen Gross 
>>> ---
>>>   xen/include/xen/rwlock.h | 18 +-
>>>   1 file changed, 17 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/xen/include/xen/rwlock.h b/xen/include/xen/rwlock.h
>>> index 1c221dd0d9..4ee341a182 100644
>>> --- a/xen/include/xen/rwlock.h
>>> +++ b/xen/include/xen/rwlock.h
>>> @@ -2,6 +2,7 @@
>>>   #define __RWLOCK_H__
>>>   #include 
>>> +#include 
>>>   #include 
>>>   #include 
>>> @@ -57,10 +58,12 @@ static inline int _read_trylock(rwlock_t *lock)
>>>   cnts = atomic_read(>cnts);
>>>   if ( likely(_can_read_lock(cnts)) )
>>>   {
>>
>> If you get preempted here, then it means the check below is likely going 
>> to fail. So I think it would be best to disable preemption before, to 
>> give more chance to succeed.
> 
> As preemption probability at this very point should be much lower than
> that of held locks I think that is optimizing the wrong path. I'm not
> opposed doing the modification you are requesting, but would like to
> hear a second opinion on that topic, especially as I'd need to add
> another preempt_enable() call when following your advice.

I can see arguments for both placements, and hence I'm fine either
way.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] Stopping much Linux testing in Xen Project CI

2020-03-13 Thread Jürgen Groß

On 13.03.20 10:13, Jan Beulich wrote:

On 12.03.2020 18:55, Roger Pau Monné wrote:

On Thu, Mar 12, 2020 at 04:49:51PM +, Ian Jackson wrote:

Linux stable branches, and Linux upstream tip, are badly broken and
have been for months.  Apparently no-one is able to (or has time to)
to investigate and fix.

   linux-4.4  218 days to be suspended
   linux-4.9  134 days to be suspended
   linux-4.14 134 days to be suspended
   linux-4.19 134 days to be suspended
   linux-5.4   55 days
   linux-arm-xen up to date
   linux-linus372 days to be suspended

These are times since the last push - ie, how long it has been broken.
Evidently no-one is paying any attention to this.[1]  I looked at the
reports myself and:

Nested HVM is broken on Intel in all of the 4.x branches.


FWIW, it's the Debian installer kernel the one that crashes AFAICT,
all the failures are:

[0.00] Linux version 4.9.0-6-amd64 (debian-ker...@lists.debian.org) 
(gcc version 6.3.0 20170516 (Debian 6.3.0-18+deb9u1) ) #1 SMP Debian 
4.9.82-1+deb9u3 (2018-03-02)
[...]
[0.00] clocksource: hpet: mask: 0x max_cycles: 0x, 
max_idle_ns: 30580167144 ns
[0.00] tsc: Fast TSC calibration failed
[0.00] tsc: Unable to calibrate against PIT
[0.00] tsc: HPET/PMTIMER calibration failed
[0.00] divide error:  [#1] SMP
[0.00] Modules linked in:
[0.00] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.9.0-6-amd64 #1 
Debian 4.9.82-1+deb9u3
[0.00] Hardware name: Xen HVM domU, BIOS 4.14-unstable 03/11/2020
[0.00] task: ab611500 task.stack: ab60
[0.00] RIP: 0010:[]  [] 
pvclock_tsc_khz+0xf/0x30


Seeing this and ...


[0.00] RSP: :ab603f38  EFLAGS: 00010246
[0.00] RAX: 000f4240 RBX:  RCX: 
[0.00] RDX:  RSI: 0246 RDI: ab939020
[0.00] RBP: 93806e8f1540 R08: 3a637374 R09: 6f6974617262696c
[0.00] R10: 0032f3af6dcd R11: 4d502f5445504820 R12: ab7dc920
[0.00] R13: ab7e82e0 R14: 000146f0 R15: 008e
[0.00] FS:  () GS:93806e60() 
knlGS:
[0.00] CS:  0010 DS:  ES:  CR0: 80050033
[0.00] CR2: 938065f3a000 CR3: 25c08000 CR4: 000406b0
[0.00] Stack:
[0.00]  ab74b1b6 93806e8f1540 ab7dc920 
ba81e537ba81e512
[0.00]   93806e8f1540 ab73deb6 
ab7e82e0
[0.00]   0020 ab73 

[0.00] Call Trace:
[0.00]  [] ? tsc_init+0x39/0x25b


... this and looking at xen_tsc_khz(), isn't it supposed to use
per_cpu(xen_vcpu, 0) instead, in case vCPU info got relocated?
(Code looks to be the same in 4.9 and 5.5. I'd also question
the hard-coded zero in there, but that's a different topic.)


It should use per_cpu(xen_vcpu, 0), but OTOH it shouldn't matter that
much if it doesn't, as the time information from the shared info page
wouldn't go away.

Seeing a zero divisor here indicates that HYPERVISOR_shared_info might
still point to the dummy shared info structure.


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 1/2] libfsimage: fix clang 10 build

2020-03-13 Thread Wei Liu
On Fri, Mar 13, 2020 at 09:45:57AM +0100, Roger Pau Monne wrote:
> clang complains with:
> 
> fsys_zfs.c:826:2: error: converting the enum constant to a boolean 
> [-Werror,-Wint-in-bool-context]
> VERIFY_DN_TYPE(dn, DMU_OT_PLAIN_FILE_CONTENTS);
> ^
> /wrkdirs/usr/ports/sysutils/xen-tools/work/xen-4.13.0/tools/libfsimage/zfs/../../../tools/libfsimage/zfs/fsys_zfs.h:74:11:
>  note: expanded from macro 'VERIFY_DN_TYPE'
> if (type && (dnp)->dn_type != type) { \
>  ^
> 1 error generated.
> 
> Fix this by not forcing an implicit conversion of the enum into a
> boolean and instead comparing with the 0 enumerator.
> 
> Signed-off-by: Roger Pau Monné 

Both patches:

Acked-by: Wei Liu 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 1/2] xen/rwlocks: call preempt_disable() when taking a rwlock

2020-03-13 Thread Jürgen Groß

On 13.03.20 11:02, Julien Grall wrote:

Hi Juergen,

On 13/03/2020 08:05, Juergen Gross wrote:

Similar to spinlocks preemption should be disabled while holding a
rwlock.

Signed-off-by: Juergen Gross 
---
  xen/include/xen/rwlock.h | 18 +-
  1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/xen/include/xen/rwlock.h b/xen/include/xen/rwlock.h
index 1c221dd0d9..4ee341a182 100644
--- a/xen/include/xen/rwlock.h
+++ b/xen/include/xen/rwlock.h
@@ -2,6 +2,7 @@
  #define __RWLOCK_H__
  #include 
+#include 
  #include 
  #include 
@@ -57,10 +58,12 @@ static inline int _read_trylock(rwlock_t *lock)
  cnts = atomic_read(>cnts);
  if ( likely(_can_read_lock(cnts)) )
  {


If you get preempted here, then it means the check below is likely going 
to fail. So I think it would be best to disable preemption before, to 
give more chance to succeed.


As preemption probability at this very point should be much lower than
that of held locks I think that is optimizing the wrong path. I'm not
opposed doing the modification you are requesting, but would like to
hear a second opinion on that topic, especially as I'd need to add
another preempt_enable() call when following your advice.


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] tools/helpers: xen-init-dom0: Mark clear_domid_history() static

2020-03-13 Thread Wei Liu
On Thu, Mar 12, 2020 at 08:24:07PM +, jul...@xen.org wrote:
> From: Julien Grall 
> 
> xen-init-dom0 is a standalone binary, so all the functions but the
> main() should be static.
> 
> Signed-off-by: Julien Grall 

Acked-by: Wei Liu 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 1/2] xen/rwlocks: call preempt_disable() when taking a rwlock

2020-03-13 Thread Julien Grall

Hi Juergen,

On 13/03/2020 08:05, Juergen Gross wrote:

Similar to spinlocks preemption should be disabled while holding a
rwlock.

Signed-off-by: Juergen Gross 
---
  xen/include/xen/rwlock.h | 18 +-
  1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/xen/include/xen/rwlock.h b/xen/include/xen/rwlock.h
index 1c221dd0d9..4ee341a182 100644
--- a/xen/include/xen/rwlock.h
+++ b/xen/include/xen/rwlock.h
@@ -2,6 +2,7 @@
  #define __RWLOCK_H__
  
  #include 

+#include 
  #include 
  #include 
  
@@ -57,10 +58,12 @@ static inline int _read_trylock(rwlock_t *lock)

  cnts = atomic_read(>cnts);
  if ( likely(_can_read_lock(cnts)) )
  {


If you get preempted here, then it means the check below is likely going 
to fail. So I think it would be best to disable preemption before, to 
give more chance to succeed.



+preempt_disable();
  cnts = (u32)atomic_add_return(_QR_BIAS, >cnts);
  if ( likely(_can_read_lock(cnts)) )
  return 1;
  atomic_sub(_QR_BIAS, >cnts);
+preempt_enable();
  }
  return 0;
  }
@@ -73,6 +76,7 @@ static inline void _read_lock(rwlock_t *lock)
  {
  u32 cnts;
  
+preempt_disable();

  cnts = atomic_add_return(_QR_BIAS, >cnts);
  if ( likely(_can_read_lock(cnts)) )
  return;
@@ -106,6 +110,7 @@ static inline void _read_unlock(rwlock_t *lock)
   * Atomically decrement the reader count
   */
  atomic_sub(_QR_BIAS, >cnts);
+preempt_enable();
  }
  
  static inline void _read_unlock_irq(rwlock_t *lock)

@@ -137,6 +142,7 @@ static inline unsigned int _write_lock_val(void)
  static inline void _write_lock(rwlock_t *lock)
  {
  /* Optimize for the unfair lock case where the fair flag is 0. */
+preempt_disable();
  if ( atomic_cmpxchg(>cnts, 0, _write_lock_val()) == 0 )
  return;
  
@@ -172,13 +178,21 @@ static inline int _write_trylock(rwlock_t *lock)

  if ( unlikely(cnts) )
  return 0;
  
-return likely(atomic_cmpxchg(>cnts, 0, _write_lock_val()) == 0);

+preempt_disable();


Similar remark as the read_trylock().


+if ( unlikely(atomic_cmpxchg(>cnts, 0, _write_lock_val()) != 0) )
+{
+preempt_enable();
+return 0;
+}
+
+return 1;
  }
  
  static inline void _write_unlock(rwlock_t *lock)

  {
  ASSERT(_is_write_locked_by_me(atomic_read(>cnts)));
  atomic_and(~(_QW_CPUMASK | _QW_WMASK), >cnts);
+preempt_enable();
  }
  
  static inline void _write_unlock_irq(rwlock_t *lock)

@@ -274,6 +288,7 @@ static inline void _percpu_read_lock(percpu_rwlock_t 
**per_cpudata,
  }
  
  /* Indicate this cpu is reading. */

+preempt_disable();
  this_cpu_ptr(per_cpudata) = percpu_rwlock;
  smp_mb();
  /* Check if a writer is waiting. */
@@ -308,6 +323,7 @@ static inline void _percpu_read_unlock(percpu_rwlock_t 
**per_cpudata,
  return;
  }
  this_cpu_ptr(per_cpudata) = NULL;
+preempt_enable();
  smp_wmb();
  }
  



Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 1/2] xen/rwlocks: call preempt_disable() when taking a rwlock

2020-03-13 Thread Julien Grall



On 13/03/2020 08:48, Jan Beulich wrote:

On 13.03.2020 09:05, Juergen Gross wrote:

Similar to spinlocks preemption should be disabled while holding a
rwlock.

Signed-off-by: Juergen Gross 


Just one note/question:


@@ -308,6 +323,7 @@ static inline void _percpu_read_unlock(percpu_rwlock_t 
**per_cpudata,
  return;
  }
  this_cpu_ptr(per_cpudata) = NULL;
+preempt_enable();
  smp_wmb();
  }


It would seem more logical to me to insert this after the smp_wmb().


+1


Thoughts? I'll be happy to give my R-b once we've settled on this.

Jan



--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [RFC PATCH 2/2] x86/xen: Make the secondary CPU idle tasks reliable

2020-03-13 Thread Miroslav Benes
On Fri, 13 Mar 2020, Jürgen Groß wrote:

> On 12.03.20 15:20, Miroslav Benes wrote:
> > The unwinder reports the secondary CPU idle tasks' stack on XEN PV as
> > unreliable, which affects at least live patching.
> > cpu_initialize_context() sets up the context of the CPU through
> > VCPUOP_initialise hypercall. After it is woken up, the idle task starts
> > in cpu_bringup_and_idle() function and its stack starts at the offset
> > right below pt_regs. The unwinder correctly detects the end of stack
> > there but it is confused by NULL return address in the last frame.
> > 
> > RFC: I haven't found the way to teach the unwinder about the state of
> > the stack there. Thus the ugly hack using assembly. Similar to what
> > startup_xen() has got for boot CPU.
> > 
> > It introduces objtool "unreachable instruction" warning just right after
> > the jump to cpu_bringup_and_idle(). It should show the idea what needs
> > to be done though, I think. Ideas welcome.
> > 
> > Signed-off-by: Miroslav Benes 
> > ---
> >   arch/x86/xen/smp_pv.c   |  3 ++-
> >   arch/x86/xen/xen-head.S | 10 ++
> >   2 files changed, 12 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/x86/xen/smp_pv.c b/arch/x86/xen/smp_pv.c
> > index 802ee5bba66c..6b88cdcbef8f 100644
> > --- a/arch/x86/xen/smp_pv.c
> > +++ b/arch/x86/xen/smp_pv.c
> > @@ -53,6 +53,7 @@ static DEFINE_PER_CPU(struct xen_common_irq, xen_irq_work)
> > = { .irq = -1 };
> >   static DEFINE_PER_CPU(struct xen_common_irq, xen_pmu_irq) = { .irq = -1 };
> >   
> >   static irqreturn_t xen_irq_work_interrupt(int irq, void *dev_id);
> > +extern unsigned char asm_cpu_bringup_and_idle[];
> >   
> >   static void cpu_bringup(void)
> >   {
> 
> Would adding this here work?
> 
> + asm volatile (UNWIND_HINT(ORC_REG_UNDEFINED, 0, ORC_TYPE_CALL, 1));

I tried something similar. It did not work, because than the hint is 
"bound" to the closest next call in the function which is cr4_init() in 
this case. The unwinder would not take it into account.

In my case, I placed it at the beginning of cpu_bringup_and_idle(). I also 
open coded it and played with the offset in the orc entry, but that did 
not work for some other reason.

However, now I tried this

diff --git a/arch/x86/xen/smp_pv.c b/arch/x86/xen/smp_pv.c
index 6b88cdcbef8f..39afd88309cb 100644
--- a/arch/x86/xen/smp_pv.c
+++ b/arch/x86/xen/smp_pv.c
@@ -92,6 +92,7 @@ asmlinkage __visible void cpu_bringup_and_idle(void)
 {
cpu_bringup();
boot_init_stack_canary();
+   asm volatile (UNWIND_HINT(ORC_REG_UNDEFINED, 0, ORC_TYPE_CALL, 1));
cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
 }

and that seems to work. I need to properly verify and test, but the 
explanation is that as opposed to the above, cpu_startup_entry() is on the 
idle task's stack and the hint is then taken into account. The unwound 
stack seems to be complete, so it could indeed be the fix.

Thanks
Miroslav___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 2/2] xen/spinlocks: fix placement of preempt_[dis|en]able()

2020-03-13 Thread Julien Grall

Hi,

On 13/03/2020 09:00, Jürgen Groß wrote:

On 13.03.20 09:55, Jan Beulich wrote:

On 13.03.2020 09:05, Juergen Gross wrote:

@@ -199,10 +199,10 @@ unsigned long _spin_lock_irqsave(spinlock_t *lock)
  void _spin_unlock(spinlock_t *lock)
  {
  arch_lock_release_barrier();
-    preempt_enable();
  LOCK_PROFILE_REL;
  rel_lock(>debug);
  add_sized(>tickets.head, 1);
+    preempt_enable();
  arch_lock_signal();
  }


arch_lock_signal() is a barrier on Arm, hence just like for patch 1
I wonder whether the insertion wouldn't better come after it.


The important barrier in spin_unlock() is arch_lock_release_barrier().

The one in arch_lock_signal() is just to ensure that waking up the other 
CPUs will not happen before the unlock is seen. The barrier would not 
have been necessary if the we didn't use 'sev'.




Either way is fine for me. It should be noted that preemption is only
relevant on the local cpu. So this is about trading lock state
visibility against preemption disabled time, and I agree the visible
time of the lock held should be minimized at higher priority than the
preemption disabled time.


I don't think the rationale is about "performance" here. The rationale 
is you don't know the implementation of arch_lock_signal(). If you get 
preempted by a thread trying to acquire the same lock, then it may not 
do the right thing.


Linux will also re-enable preemption only after the unlock has been 
completed. So it would be best to follow the same pattern.


Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 1/9] x86/HVM: reduce domain.h include dependencies

2020-03-13 Thread Jan Beulich
On 11.03.2020 14:09, Andrew Cooper wrote:
> On 10/03/2020 15:48, Jan Beulich wrote:
>> Drop #include-s not needed by the header itself. Put the ones needed
>> into whichever other files actually need them.
>>
>> Signed-off-by: Jan Beulich 
>> ---
>> v2: Also make things build with XSM=y.
> 
> Looking better, but still got problems.
> 
> xen_pv_console.c: In function ‘pv_console_init’:
> xen_pv_console.c:51:37: error: ‘HVM_PARAM_CONSOLE_PFN’ undeclared (first
> use in this function)
>  r = xen_hypercall_hvm_get_param(HVM_PARAM_CONSOLE_PFN, _pfn);
>  ^
> 
> and
> 
> shim.c: In function ‘pv_shim_fixup_e820’:
> shim.c:148:20: error: ‘HVM_PARAM_STORE_PFN’ undeclared (first use in
> this function)
>  MARK_PARAM_RAM(HVM_PARAM_STORE_PFN);

In a later reply (which I've lost due to mailbox problems) you've
been suggesting allyesconfig. I'd been considering to use it indeed,
but then forgot. Together with choices though I'm unconvinced this
would provide broad enough coverage. I'd also similarly wonder
whether allnoconfig might not be more telling, as it might result
in fewer things getting included here and there. I'll make sure
both build fine before sending v3, but I'm having trouble seeing
how I would invoke these - neither the top level Makefile nor
xen/Makefile look to permit its use right now. Have you found a
way to successfully use these without first patching the tree?

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH 4/4] x86/APIC: restrict certain messages to BSP

2020-03-13 Thread Jan Beulich
All CPUs get an equal setting of EOI broadcast suppression; no need to
log one message per CPU, even if it's only in verbose APIC mode.

Only the BSP is eligible to possibly get ExtINT enabled; no need to log
that it gets disabled on all APs, even if - again - it's only in verbose
APIC mode.

Take the opportunity and introduce a "bsp" parameter to the function, to
stop using smp_processor_id() to tell BSP from APs.

Signed-off-by: Jan Beulich 

--- a/xen/arch/x86/apic.c
+++ b/xen/arch/x86/apic.c
@@ -499,7 +499,7 @@ static void resume_x2apic(void)
 __enable_x2apic();
 }
 
-void setup_local_APIC(void)
+void setup_local_APIC(bool bsp)
 {
 unsigned long oldvalue, value, maxlvt;
 int i, j;
@@ -598,8 +598,8 @@ void setup_local_APIC(void)
 if ( directed_eoi_enabled )
 {
 value |= APIC_SPIV_DIRECTED_EOI;
-apic_printk(APIC_VERBOSE, "Suppress EOI broadcast on CPU#%d\n",
-smp_processor_id());
+if ( bsp )
+apic_printk(APIC_VERBOSE, "Suppressing EOI broadcast\n");
 }
 
 apic_write(APIC_SPIV, value);
@@ -615,21 +615,22 @@ void setup_local_APIC(void)
  * TODO: set up through-local-APIC from through-I/O-APIC? --macro
  */
 value = apic_read(APIC_LVT0) & APIC_LVT_MASKED;
-if (!smp_processor_id() && (pic_mode || !value)) {
+if (bsp && (pic_mode || !value)) {
 value = APIC_DM_EXTINT;
 apic_printk(APIC_VERBOSE, "enabled ExtINT on CPU#%d\n",
 smp_processor_id());
 } else {
 value = APIC_DM_EXTINT | APIC_LVT_MASKED;
-apic_printk(APIC_VERBOSE, "masked ExtINT on CPU#%d\n",
-smp_processor_id());
+if (bsp)
+apic_printk(APIC_VERBOSE, "masked ExtINT on CPU#%d\n",
+smp_processor_id());
 }
 apic_write(APIC_LVT0, value);
 
 /*
  * only the BP should see the LINT1 NMI signal, obviously.
  */
-if (!smp_processor_id())
+if (bsp)
 value = APIC_DM_NMI;
 else
 value = APIC_DM_NMI | APIC_LVT_MASKED;
@@ -663,7 +664,7 @@ void setup_local_APIC(void)
 printk("Leaving ESR disabled.\n");
 }
 
-if (nmi_watchdog == NMI_LOCAL_APIC && smp_processor_id())
+if (nmi_watchdog == NMI_LOCAL_APIC && !bsp)
 setup_apic_nmi_watchdog();
 apic_pm_activate();
 }
@@ -1474,7 +1475,7 @@ int __init APIC_init_uniprocessor (void)
 physids_clear(phys_cpu_present_map);
 physid_set(boot_cpu_physical_apicid, phys_cpu_present_map);
 
-setup_local_APIC();
+setup_local_APIC(true);
 
 if (nmi_watchdog == NMI_LOCAL_APIC)
 check_nmi_watchdog();
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -191,7 +191,7 @@ static void smp_callin(void)
  */
 Dprintk("CALLIN, before setup_local_APIC().\n");
 x2apic_ap_setup();
-setup_local_APIC();
+setup_local_APIC(false);
 
 /* Save our processor parameters. */
 if ( !smp_store_cpu_info(cpu) )
@@ -1165,7 +1165,7 @@ void __init smp_prepare_cpus(void)
 verify_local_APIC();
 
 connect_bsp_APIC();
-setup_local_APIC();
+setup_local_APIC(true);
 
 if ( !skip_ioapic_setup && nr_ioapics )
 setup_IO_APIC();
--- a/xen/include/asm-x86/apic.h
+++ b/xen/include/asm-x86/apic.h
@@ -169,7 +169,7 @@ extern int verify_local_APIC (void);
 extern void cache_APIC_registers (void);
 extern void sync_Arb_IDs (void);
 extern void init_bsp_APIC (void);
-extern void setup_local_APIC (void);
+extern void setup_local_APIC(bool bsp);
 extern void init_apic_mappings (void);
 extern void smp_local_timer_interrupt (struct cpu_user_regs *regs);
 extern void setup_boot_APIC_clock (void);


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH 3/4] x86/APIC: reduce rounding errors in calculations

2020-03-13 Thread Jan Beulich
Dividing by HZ/10 just to subsequently multiply by HZ again in all uses
of the respective variable is pretty pointlessly introducing rounding
(really: truncation) errors. While transforming the respective
expressions it became apparent that "result" would be left unused except
for its use as function return value. As the sole caller of the function
doesn't look at the returned value, simply convert the function to have
"void" return type.

Signed-off-by: Jan Beulich 

--- a/xen/arch/x86/apic.c
+++ b/xen/arch/x86/apic.c
@@ -1204,14 +1204,14 @@ static void wait_tick_pvh(void)
  * APIC irq that way.
  */
 
-static int __init calibrate_APIC_clock(void)
+static void __init calibrate_APIC_clock(void)
 {
 unsigned long long t1, t2;
-unsigned long tt1, tt2, result;
+unsigned long tt1, tt2;
 unsigned int i;
 unsigned long bus_freq; /* KAF: pointer-size avoids compile warns. */
 unsigned int bus_cycle; /* length of one bus cycle in pico-seconds */
-const unsigned int LOOPS = HZ/10;
+#define LOOPS_FRAC 10U  /* measure for one tenth of a second */
 
 apic_printk(APIC_VERBOSE, "calibrating APIC timer ...\n");
 
@@ -1238,9 +1238,9 @@ static int __init calibrate_APIC_clock(v
 tt1 = apic_read(APIC_TMCCT);
 
 /*
- * Let's wait LOOPS ticks:
+ * Let's wait HZ / LOOPS_FRAC ticks:
  */
-for (i = 0; i < LOOPS; i++)
+for (i = 0; i < HZ / LOOPS_FRAC; i++)
 if ( !xen_guest )
 wait_8254_wraparound();
 else
@@ -1249,17 +1249,16 @@ static int __init calibrate_APIC_clock(v
 tt2 = apic_read(APIC_TMCCT);
 t2 = rdtsc_ordered();
 
-result = (tt1-tt2)*APIC_DIVISOR/LOOPS;
+bus_freq = (tt1 - tt2) * APIC_DIVISOR * LOOPS_FRAC;
 
-apic_printk(APIC_VERBOSE, ". CPU clock speed is %ld.%04ld MHz.\n",
-((long)(t2 - t1) / LOOPS) / (100 / HZ),
-((long)(t2 - t1) / LOOPS) % (100 / HZ));
+apic_printk(APIC_VERBOSE, ". CPU clock speed is %lu.%04lu MHz.\n",
+((unsigned long)(t2 - t1) * LOOPS_FRAC) / 100,
+((unsigned long)(t2 - t1) * LOOPS_FRAC / 100) % 1);
 
 apic_printk(APIC_VERBOSE, ". host bus clock speed is %ld.%04ld MHz.\n",
-result / (100 / HZ), result % (100 / HZ));
+bus_freq / 100, (bus_freq / 100) % 1);
 
 /* set up multipliers for accurate timer code */
-bus_freq   = result*HZ;
 bus_cycle  = (u32) (1LL/bus_freq); /* in pico seconds */
 bus_cycle += (1UL % bus_freq) * 2 > bus_freq;
 bus_scale  = (1000*262144)/bus_cycle;
@@ -1269,7 +1268,7 @@ static int __init calibrate_APIC_clock(v
 /* reset APIC to zero timeout value */
 __setup_APIC_LVTT(0);
 
-return result;
+#undef LOOPS_FRAC
 }
 
 void __init setup_boot_APIC_clock(void)


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH 1/4] x86/APIC: adjust types and comments in calibrate_APIC_clock()

2020-03-13 Thread Jan Beulich
First and foremost the comment talking about potential underflow being
taken care of by using signed long type variables was true only on
32-bit, which we've not been supporting for quite some time. Drop the
comment and change all involved types to unsigned. Take the opportunity
and also replace bus_cycle's fixed width type.

Additionally there's no point using an "arbitrary (but long enough)
timeout" here. Just use the maximum possible value; Linux does so too,
just as an additional data point.

Signed-off-by: Jan Beulich 

--- a/xen/arch/x86/apic.c
+++ b/xen/arch/x86/apic.c
@@ -1207,21 +1207,19 @@ static void wait_tick_pvh(void)
 static int __init calibrate_APIC_clock(void)
 {
 unsigned long long t1, t2;
-long tt1, tt2;
-long result;
-int i;
+unsigned long tt1, tt2, result;
+unsigned int i;
 unsigned long bus_freq; /* KAF: pointer-size avoids compile warns. */
-u32 bus_cycle;  /* length of one bus cycle in pico-seconds */
-const int LOOPS = HZ/10;
+unsigned int bus_cycle; /* length of one bus cycle in pico-seconds */
+const unsigned int LOOPS = HZ/10;
 
 apic_printk(APIC_VERBOSE, "calibrating APIC timer ...\n");
 
 /*
- * Put whatever arbitrary (but long enough) timeout
- * value into the APIC clock, we just want to get the
- * counter running for calibration.
+ * Setup the APIC counter to maximum. There is no way the lapic
+ * can underflow in the 100ms detection time frame.
  */
-__setup_APIC_LVTT(10);
+__setup_APIC_LVTT(0x);
 
 if ( !xen_guest )
 /*
@@ -1251,14 +1249,6 @@ static int __init calibrate_APIC_clock(v
 tt2 = apic_read(APIC_TMCCT);
 t2 = rdtsc_ordered();
 
-/*
- * The APIC bus clock counter is 32 bits only, it
- * might have overflown, but note that we use signed
- * longs, thus no extra care needed.
- *
- * underflown to be exact, as the timer counts down ;)
- */
-
 result = (tt1-tt2)*APIC_DIVISOR/LOOPS;
 
 apic_printk(APIC_VERBOSE, ". CPU clock speed is %ld.%04ld MHz.\n",


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3] IOMMU: make DMA containment of quarantined devices optional

2020-03-13 Thread Paul Durrant
> -Original Message-
> From: Jan Beulich 
> Sent: 13 March 2020 08:10
> To: Tian, Kevin 
> Cc: xen-devel@lists.xenproject.org; Andrew Cooper 
> ; Paul Durrant
> 
> Subject: Re: [PATCH v3] IOMMU: make DMA containment of quarantined devices 
> optional
> 
> On 13.03.2020 04:05, Tian, Kevin wrote:
> >> From: Jan Beulich 
> >> Sent: Tuesday, March 10, 2020 6:31 PM
> >>
> >> On 10.03.2020 06:30, Tian, Kevin wrote:
>  From: Jan Beulich 
>  Sent: Monday, March 9, 2020 7:09 PM
> 
>  Containing still in flight DMA was introduced to work around certain
>  devices / systems hanging hard upon hitting a "not-present" IOMMU fault.
>  Passing through (such) devices (on such systems) is inherently insecure
>  (as guests could easily arrange for IOMMU faults of any kind to occur).
>  Defaulting to a mode where admins may not even become aware of
> >> issues
>  with devices can be considered undesirable. Therefore convert this mode
>  of operation to an optional one, not one enabled by default.
> >>>
> >>> Here is another thought. The whole point of quarantine is to contain
> >>> the device after it is deassigned from untrusted guest.
> >>
> >> I'd question the "untrusted" here. Assigning devices to untrusted
> >> guests is problematic anyway, unless you're the device manufacturer
> >> and device firmware writer, and hence you can guarantee the device
> >> to not offer any backdoors or alike. Therefore I view quarantining
> >
> > Aren't all guests untrusted from hypervisor p.o.v, which is the reason
> > why IOMMU was introduced in the first place?
> 
> "Untrusted" is always meant from the perspective of the host admin.
> 
> > I may overlook the history of quarantine feature. Based on my study
> > of quarantine related changes, looks initially Paul pointed out such
> > problem that a device may have in-fly DMAs to touch dom0 memory
> > after it is deassigned. Then he introduced the quarantine concept by
> > putting a quarantined device into dom_io w/o any valid mapping,
> > with a whitelist approach. You later extended as a default behavior
> > for all PCI devices. Now Paul found some device which cannot tolerate
> > read-fault and then came up this scratch-page idea.
> >
> > Now I wonder why we are doing such explicit quarantine in the first
> > place. Shouldn't we always seek resetting the device when it is deassigned
> > from a guest? 'reset' should cancel/quiescent all in-fly DMAs for most
> > devices if they implement 'reset' correctly.
> 
> And the important word here is "should". Paul and colleagues found
> it may not do so in reality.

Yeah... we have to live with what we've got. Yes, it's buggy as hell but we 
have to do our best to stop it wedging hosts whilst trying to prevent 
scribbling over critical parts of memory.

> 
> > If doing that way, we don't
> > need a quarantine option at all, and then the bogus device in Paul's
> > latest finding could be handled by a standalone option w/o struggling
> > whether 'full' is a right name vs. 'basic'. or we may introduce a reset
> > flag when assigning such device to indicate such special requirement,
> > so a scratch page/dom_io can be linked specifically for such device
> > post reset, assuming it is not a platform-level bug from Paul's response?
> 
> Which would imply host admins to know such properties of their
> devices, and better _without_ first having run into problems.
> 

It is a device-level bug. We could, I guess, have a per-device quirk to say 
whether it should get a context entry pointing at a scratch page or not.

  Paul


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH 2/4] x86/time: reduce rounding errors in calculations

2020-03-13 Thread Jan Beulich
Plain (unsigned) integer division simply truncates the results. The
overall errors are smaller though if we use proper rounding. (Extend
this to the purely cosmetic aspect of time.c's freq_string(), which
before this change I've frequently observed to report e.g. NN.999MHz
HPET clock speeds.)

Signed-off-by: Jan Beulich 
---
We could switch at least the first div/rem pair in calibrate_APIC_clock()
to use do_div(), but this would imply switching bus_freq (and then also
result) to unsigned int (as the divisor has to be 32-bit). While I think
there's pretty little risk for bus frequencies to go beyond 4GHz in the
next so many years, I still wasn't certain enough this would be a welcome
change.

--- a/xen/arch/x86/apic.c
+++ b/xen/arch/x86/apic.c
@@ -1261,7 +1261,9 @@ static int __init calibrate_APIC_clock(v
 /* set up multipliers for accurate timer code */
 bus_freq   = result*HZ;
 bus_cycle  = (u32) (1LL/bus_freq); /* in pico seconds */
+bus_cycle += (1UL % bus_freq) * 2 > bus_freq;
 bus_scale  = (1000*262144)/bus_cycle;
+bus_scale += ((1000 * 262144) % bus_cycle) * 2 > bus_cycle;
 
 apic_printk(APIC_VERBOSE, ". bus_scale = %#x\n", bus_scale);
 /* reset APIC to zero timeout value */
--- a/xen/arch/x86/hpet.c
+++ b/xen/arch/x86/hpet.c
@@ -799,9 +799,9 @@ u64 __init hpet_setup(void)
 hpet_resume(hpet_boot_cfg);
 
 hpet_rate = 1000ULL; /* 10^15 */
-(void)do_div(hpet_rate, hpet_period);
+last = do_div(hpet_rate, hpet_period);
 
-return hpet_rate;
+return hpet_rate + (last * 2 > hpet_period);
 }
 
 void hpet_resume(u32 *boot_cfg)
--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -275,7 +275,10 @@ static char *freq_string(u64 freq)
 {
 static char s[20];
 unsigned int x, y;
-y = (unsigned int)do_div(freq, 100) / 1000;
+
+if ( do_div(freq, 1000) > 500 )
+++freq;
+y = (unsigned int)do_div(freq, 1000);
 x = (unsigned int)freq;
 snprintf(s, sizeof(s), "%u.%03uMHz", x, y);
 return s;


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3] IOMMU: make DMA containment of quarantined devices optional

2020-03-13 Thread Paul Durrant
> -Original Message-
> From: Tian, Kevin 
> Sent: 13 March 2020 03:23
> To: p...@xen.org; 'Jan Beulich' 
> Cc: xen-devel@lists.xenproject.org; 'Andrew Cooper' 
> 
> Subject: RE: [Xen-devel] [PATCH v3] IOMMU: make DMA containment of 
> quarantined devices optional
> 
> > From: Paul Durrant 
> > Sent: Wednesday, March 11, 2020 12:05 AM
> >
> [...]
> >
> > >
> > > > However, is a really saying that things will break if any of the
> > > > PTEs has their present bit clear?
> > >
> > > Well, you said that read faults are fatal (to the host). Reads will,
> > > for any address with an unpopulated PTE, result in a fault and hence
> > > by implication be fatal.
> >
> > Oh I see. I thought there was an implication that the IOMMU could not cope
> > with non-present PTEs in some way. Agreed that, when the device is assigned
> > to the guest, then it can arrange (via ballooning) for a non-present entry 
> > to
> > be hit by a read transaction, resulting in a lock-up. But dealing with a
> > malicious guest was not the issue at hand... dealing with a buggy device 
> > that
> > still tried to DMA after reset and whilst in quarantine was the problem.
> >
> 
> More thinking on this, I wonder whether the scratch page is sufficient, or
> whether we should support such device in the first place. Looking at
> 0c35d446:
> --
> The reason for doing this is that some hardware may continue to re-try
> DMA (despite FLR) in the event of an error, or even BME being cleared, and
> will fail to deal with DMA read faults gracefully. Having a scratch page
> mapped will allow pending DMA reads to complete and thus such buggy
> hardware will eventually be quiesced.
> --
> 
> 'eventually'... what does it exactly mean?

It means after a period of time we can only determine empirically.

> How would an user know a
> device has been quiesced before he attempts to re-assign the device
> to other domU or dom0? by guess?

Yes, a guess, but an educated one.

> Note the exact behavior of such
> device, after different guest behaviors (hang, kill, bug, etc.), is not
> documented. Who knows whether a in-fly DMA may be triggered when
> the new owner starts to initialize the device again? How many stale
> states are remaining on such device which, even not triggerring in-fly
> DMAs, may change the desired behavior of the new owner? e.g. it's
> possible one control register configured by the old owner, but not
> touched by the new owner. If it cannot be reset, what's the point of
> supporting assignment of such bogus device?
> 

Because I'm afraid it is quite ubiquitous and we need to deal with it.

> Thereby I feel any support of such bogus device should be maintained
> offtree, instead of in upstream Xen. Thoughts?
> 

I don't see the harm in the code being upstream. There may well be other 
devices with similar issues and it provides an option for an admin to try.

  Paul


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH 0/4] x86: reduce errors in frequency calculations / unrelated cleanup

2020-03-13 Thread Jan Beulich
While looking into ways to increase the accuracy of the clock speeds
we work with (in particular by possibly obtaining information from
hardware rather than measuring), I first of all noticed some
avoidable rounding errors in some of our calculations. That's what
patches 2 and 3 are intended to deal with. Patch 1 is preparatory
cleanup, while patch 4 simply addresses something I got annoyed by
yet another time while doing this investigation.

1: APIC: adjust types and comments in calibrate_APIC_clock()
2: time: reduce rounding errors in calculations
3: APIC: reduce rounding errors in calculations
4: APIC: restrict certain messages to BSP

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] Stopping much Linux testing in Xen Project CI

2020-03-13 Thread Jan Beulich
On 12.03.2020 18:55, Roger Pau Monné wrote:
> On Thu, Mar 12, 2020 at 04:49:51PM +, Ian Jackson wrote:
>> Linux stable branches, and Linux upstream tip, are badly broken and
>> have been for months.  Apparently no-one is able to (or has time to)
>> to investigate and fix.
>>
>>   linux-4.4  218 days to be suspended
>>   linux-4.9  134 days to be suspended
>>   linux-4.14 134 days to be suspended
>>   linux-4.19 134 days to be suspended
>>   linux-5.4   55 days
>>   linux-arm-xen up to date
>>   linux-linus372 days to be suspended
>>
>> These are times since the last push - ie, how long it has been broken.
>> Evidently no-one is paying any attention to this.[1]  I looked at the
>> reports myself and:
>>
>> Nested HVM is broken on Intel in all of the 4.x branches.
> 
> FWIW, it's the Debian installer kernel the one that crashes AFAICT,
> all the failures are:
> 
> [0.00] Linux version 4.9.0-6-amd64 (debian-ker...@lists.debian.org) 
> (gcc version 6.3.0 20170516 (Debian 6.3.0-18+deb9u1) ) #1 SMP Debian 
> 4.9.82-1+deb9u3 (2018-03-02)
> [...]
> [0.00] clocksource: hpet: mask: 0x max_cycles: 0x, 
> max_idle_ns: 30580167144 ns
> [0.00] tsc: Fast TSC calibration failed
> [0.00] tsc: Unable to calibrate against PIT
> [0.00] tsc: HPET/PMTIMER calibration failed
> [0.00] divide error:  [#1] SMP
> [0.00] Modules linked in:
> [0.00] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.9.0-6-amd64 #1 
> Debian 4.9.82-1+deb9u3
> [0.00] Hardware name: Xen HVM domU, BIOS 4.14-unstable 03/11/2020
> [0.00] task: ab611500 task.stack: ab60
> [0.00] RIP: 0010:[]  [] 
> pvclock_tsc_khz+0xf/0x30

Seeing this and ...

> [0.00] RSP: :ab603f38  EFLAGS: 00010246
> [0.00] RAX: 000f4240 RBX:  RCX: 
> 
> [0.00] RDX:  RSI: 0246 RDI: 
> ab939020
> [0.00] RBP: 93806e8f1540 R08: 3a637374 R09: 
> 6f6974617262696c
> [0.00] R10: 0032f3af6dcd R11: 4d502f5445504820 R12: 
> ab7dc920
> [0.00] R13: ab7e82e0 R14: 000146f0 R15: 
> 008e
> [0.00] FS:  () GS:93806e60() 
> knlGS:
> [0.00] CS:  0010 DS:  ES:  CR0: 80050033
> [0.00] CR2: 938065f3a000 CR3: 25c08000 CR4: 
> 000406b0
> [0.00] Stack:
> [0.00]  ab74b1b6 93806e8f1540 ab7dc920 
> ba81e537ba81e512
> [0.00]   93806e8f1540 ab73deb6 
> ab7e82e0
> [0.00]   0020 ab73 
> 
> [0.00] Call Trace:
> [0.00]  [] ? tsc_init+0x39/0x25b

... this and looking at xen_tsc_khz(), isn't it supposed to use
per_cpu(xen_vcpu, 0) instead, in case vCPU info got relocated?
(Code looks to be the same in 4.9 and 5.5. I'd also question
the hard-coded zero in there, but that's a different topic.)

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 2/2] xen/spinlocks: fix placement of preempt_[dis|en]able()

2020-03-13 Thread Jürgen Groß

On 13.03.20 09:55, Jan Beulich wrote:

On 13.03.2020 09:05, Juergen Gross wrote:

@@ -199,10 +199,10 @@ unsigned long _spin_lock_irqsave(spinlock_t *lock)
  void _spin_unlock(spinlock_t *lock)
  {
  arch_lock_release_barrier();
-preempt_enable();
  LOCK_PROFILE_REL;
  rel_lock(>debug);
  add_sized(>tickets.head, 1);
+preempt_enable();
  arch_lock_signal();
  }


arch_lock_signal() is a barrier on Arm, hence just like for patch 1
I wonder whether the insertion wouldn't better come after it.


Either way is fine for me. It should be noted that preemption is only
relevant on the local cpu. So this is about trading lock state
visibility against preemption disabled time, and I agree the visible
time of the lock held should be minimized at higher priority than the
preemption disabled time.

I'll modify my patches accordingly, adding a note in this regard to
the commit message.


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 2/2] xen/spinlocks: fix placement of preempt_[dis|en]able()

2020-03-13 Thread Jan Beulich
On 13.03.2020 09:05, Juergen Gross wrote:
> @@ -199,10 +199,10 @@ unsigned long _spin_lock_irqsave(spinlock_t *lock)
>  void _spin_unlock(spinlock_t *lock)
>  {
>  arch_lock_release_barrier();
> -preempt_enable();
>  LOCK_PROFILE_REL;
>  rel_lock(>debug);
>  add_sized(>tickets.head, 1);
> +preempt_enable();
>  arch_lock_signal();
>  }

arch_lock_signal() is a barrier on Arm, hence just like for patch 1
I wonder whether the insertion wouldn't better come after it.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 1/2] xen/rwlocks: call preempt_disable() when taking a rwlock

2020-03-13 Thread Jan Beulich
On 13.03.2020 09:05, Juergen Gross wrote:
> Similar to spinlocks preemption should be disabled while holding a
> rwlock.
> 
> Signed-off-by: Juergen Gross 

Just one note/question:

> @@ -308,6 +323,7 @@ static inline void _percpu_read_unlock(percpu_rwlock_t 
> **per_cpudata,
>  return;
>  }
>  this_cpu_ptr(per_cpudata) = NULL;
> +preempt_enable();
>  smp_wmb();
>  }

It would seem more logical to me to insert this after the smp_wmb().
Thoughts? I'll be happy to give my R-b once we've settled on this.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH 2/2] libfsimage: fix parentheses in macro parameters

2020-03-13 Thread Roger Pau Monne
VERIFY_DN_TYPE and VERIFY_OS_TYPE should use parentheses when
accessing the type parameter. Note that none of the current usages
require this, it's just done for correctness.

Signed-off-by: Roger Pau Monné 
---
 tools/libfsimage/zfs/fsys_zfs.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/libfsimage/zfs/fsys_zfs.h b/tools/libfsimage/zfs/fsys_zfs.h
index 721972a05a..b4be51b50d 100644
--- a/tools/libfsimage/zfs/fsys_zfs.h
+++ b/tools/libfsimage/zfs/fsys_zfs.h
@@ -71,7 +71,7 @@ typedef   unsigned int size_t;
  * Can only be used in functions returning non-0 for failure.
  */
 #defineVERIFY_DN_TYPE(dnp, type) \
-   if (type != DMU_OT_NONE && (dnp)->dn_type != type) { \
+   if ((type) != DMU_OT_NONE && (dnp)->dn_type != (type)) { \
return (ERR_FSYS_CORRUPT); \
}
 
@@ -80,7 +80,7 @@ typedef   unsigned int size_t;
  * Can only be used in functions returning 0 for failure.
  */
 #defineVERIFY_OS_TYPE(osp, type) \
-   if (type != DMU_OST_NONE && (osp)->os_type != type) { \
+   if ((type) != DMU_OST_NONE && (osp)->os_type != (type)) { \
errnum = ERR_FSYS_CORRUPT; \
return (0); \
}
-- 
2.25.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH 1/2] libfsimage: fix clang 10 build

2020-03-13 Thread Roger Pau Monne
clang complains with:

fsys_zfs.c:826:2: error: converting the enum constant to a boolean 
[-Werror,-Wint-in-bool-context]
VERIFY_DN_TYPE(dn, DMU_OT_PLAIN_FILE_CONTENTS);
^
/wrkdirs/usr/ports/sysutils/xen-tools/work/xen-4.13.0/tools/libfsimage/zfs/../../../tools/libfsimage/zfs/fsys_zfs.h:74:11:
 note: expanded from macro 'VERIFY_DN_TYPE'
if (type && (dnp)->dn_type != type) { \
 ^
1 error generated.

Fix this by not forcing an implicit conversion of the enum into a
boolean and instead comparing with the 0 enumerator.

Signed-off-by: Roger Pau Monné 
---
 tools/libfsimage/zfs/fsys_zfs.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/libfsimage/zfs/fsys_zfs.h b/tools/libfsimage/zfs/fsys_zfs.h
index 5cd627dbac..721972a05a 100644
--- a/tools/libfsimage/zfs/fsys_zfs.h
+++ b/tools/libfsimage/zfs/fsys_zfs.h
@@ -71,7 +71,7 @@ typedef   unsigned int size_t;
  * Can only be used in functions returning non-0 for failure.
  */
 #defineVERIFY_DN_TYPE(dnp, type) \
-   if (type && (dnp)->dn_type != type) { \
+   if (type != DMU_OT_NONE && (dnp)->dn_type != type) { \
return (ERR_FSYS_CORRUPT); \
}
 
@@ -80,7 +80,7 @@ typedef   unsigned int size_t;
  * Can only be used in functions returning 0 for failure.
  */
 #defineVERIFY_OS_TYPE(osp, type) \
-   if (type && (osp)->os_type != type) { \
+   if (type != DMU_OST_NONE && (osp)->os_type != type) { \
errnum = ERR_FSYS_CORRUPT; \
return (0); \
}
-- 
2.25.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [RFC PATCH 2/2] x86/xen: Make the secondary CPU idle tasks reliable

2020-03-13 Thread Jürgen Groß

On 12.03.20 15:20, Miroslav Benes wrote:

The unwinder reports the secondary CPU idle tasks' stack on XEN PV as
unreliable, which affects at least live patching.
cpu_initialize_context() sets up the context of the CPU through
VCPUOP_initialise hypercall. After it is woken up, the idle task starts
in cpu_bringup_and_idle() function and its stack starts at the offset
right below pt_regs. The unwinder correctly detects the end of stack
there but it is confused by NULL return address in the last frame.

RFC: I haven't found the way to teach the unwinder about the state of
the stack there. Thus the ugly hack using assembly. Similar to what
startup_xen() has got for boot CPU.

It introduces objtool "unreachable instruction" warning just right after
the jump to cpu_bringup_and_idle(). It should show the idea what needs
to be done though, I think. Ideas welcome.

Signed-off-by: Miroslav Benes 
---
  arch/x86/xen/smp_pv.c   |  3 ++-
  arch/x86/xen/xen-head.S | 10 ++
  2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/arch/x86/xen/smp_pv.c b/arch/x86/xen/smp_pv.c
index 802ee5bba66c..6b88cdcbef8f 100644
--- a/arch/x86/xen/smp_pv.c
+++ b/arch/x86/xen/smp_pv.c
@@ -53,6 +53,7 @@ static DEFINE_PER_CPU(struct xen_common_irq, xen_irq_work) = 
{ .irq = -1 };
  static DEFINE_PER_CPU(struct xen_common_irq, xen_pmu_irq) = { .irq = -1 };
  
  static irqreturn_t xen_irq_work_interrupt(int irq, void *dev_id);

+extern unsigned char asm_cpu_bringup_and_idle[];
  
  static void cpu_bringup(void)

  {


Would adding this here work?

+   asm volatile (UNWIND_HINT(ORC_REG_UNDEFINED, 0, ORC_TYPE_CALL, 1));


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3] IOMMU: make DMA containment of quarantined devices optional

2020-03-13 Thread Jan Beulich
On 13.03.2020 04:05, Tian, Kevin wrote:
>> From: Jan Beulich 
>> Sent: Tuesday, March 10, 2020 6:31 PM
>>
>> On 10.03.2020 06:30, Tian, Kevin wrote:
 From: Jan Beulich 
 Sent: Monday, March 9, 2020 7:09 PM

 Containing still in flight DMA was introduced to work around certain
 devices / systems hanging hard upon hitting a "not-present" IOMMU fault.
 Passing through (such) devices (on such systems) is inherently insecure
 (as guests could easily arrange for IOMMU faults of any kind to occur).
 Defaulting to a mode where admins may not even become aware of
>> issues
 with devices can be considered undesirable. Therefore convert this mode
 of operation to an optional one, not one enabled by default.
>>>
>>> Here is another thought. The whole point of quarantine is to contain
>>> the device after it is deassigned from untrusted guest.
>>
>> I'd question the "untrusted" here. Assigning devices to untrusted
>> guests is problematic anyway, unless you're the device manufacturer
>> and device firmware writer, and hence you can guarantee the device
>> to not offer any backdoors or alike. Therefore I view quarantining
> 
> Aren't all guests untrusted from hypervisor p.o.v, which is the reason
> why IOMMU was introduced in the first place?

"Untrusted" is always meant from the perspective of the host admin.

> I may overlook the history of quarantine feature. Based on my study
> of quarantine related changes, looks initially Paul pointed out such 
> problem that a device may have in-fly DMAs to touch dom0 memory
> after it is deassigned. Then he introduced the quarantine concept by
> putting a quarantined device into dom_io w/o any valid mapping, 
> with a whitelist approach. You later extended as a default behavior
> for all PCI devices. Now Paul found some device which cannot tolerate
> read-fault and then came up this scratch-page idea.
> 
> Now I wonder why we are doing such explicit quarantine in the first
> place. Shouldn't we always seek resetting the device when it is deassigned
> from a guest? 'reset' should cancel/quiescent all in-fly DMAs for most
> devices if they implement 'reset' correctly.

And the important word here is "should". Paul and colleagues found
it may not do so in reality.

> If doing that way, we don't
> need a quarantine option at all, and then the bogus device in Paul's
> latest finding could be handled by a standalone option w/o struggling
> whether 'full' is a right name vs. 'basic'. or we may introduce a reset
> flag when assigning such device to indicate such special requirement,
> so a scratch page/dom_io can be linked specifically for such device 
> post reset, assuming it is not a platform-level bug from Paul's response?  

Which would imply host admins to know such properties of their
devices, and better _without_ first having run into problems.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v5 3/4] xen/rcu: add assertions to debug build

2020-03-13 Thread Jürgen Groß

On 12.03.20 09:28, Juergen Gross wrote:

Xen's RCU implementation relies on no softirq handling taking place
while being in a RCU critical section. Add ASSERT()s in debug builds
in order to catch any violations.

For that purpose modify rcu_read_[un]lock() to use a dedicated percpu
counter instead of preempt_[en|dis]able() as this enables to test
that condition in __do_softirq() (ASSERT_NOT_IN_ATOMIC() is not
usable there due to __cpu_up() calling process_pending_softirqs()
while holding the cpu hotplug lock).

Dropping the now no longer needed #include of preempt.h in rcupdate.h
requires adding it in some sources.

While at it switch the rcu_read_[un]lock() implementation to static
inline functions instead of macros.

Signed-off-by: Juergen Gross 


Depending on the acceptance of my just sent series for fixing
preemption disabling in locks I might send a fixup to this patch, too,
re-adding preempt_disable() to rcu_read_lock().


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v9 02/10] scripts: Coccinelle script to use ERRP_AUTO_PROPAGATE()

2020-03-13 Thread Vladimir Sementsov-Ogievskiy

13.03.2020 10:50, Markus Armbruster wrote:

Vladimir Sementsov-Ogievskiy  writes:

[...]

+// Warn several Error * definitions.
+@check1 disable optional_qualifier exists@
+identifier fn = rule1.fn, local_err, local_err2;
+@@
+
+ fn(..., Error ** , ...)
+ {
+ ...
+ Error *local_err = NULL;
+ ... when any
+ Error *local_err2 = NULL;
+ ... when any
+ }
+
+@ script:python @
+fn << check1.fn;
+@@
+
+print('Warning: function {} has several definitions of '
+  'Error * local variable'.format(fn))


Printing the positions like you do in the next rule is useful when
examining these warnings.


I decided that searching for Error * definition is simple, and better for
user to search all definitions by hand (may be more than too).

But understanding control flows is more complex thing and better to help
user with line positions.

But if you want, we can add them of course. Note, that for some reasons some
times coccinelle instead of original filename prints something like 
/tmp/...original-name...
so it don't look nice and may be a bit misleading.




+
+// Warn several propagations in control flow.
+@check2 disable optional_qualifier exists@
+identifier fn = rule1.fn;
+symbol errp;
+position p1, p2;
+@@
+
+ fn(..., Error ** , ...)
+ {
+ ...
+(
+ error_propagate_prepend(errp, ...);@p1
+|
+ error_propagate(errp, ...);@p1
+)
+ ...
+(
+ error_propagate_prepend(errp, ...);@p2
+|
+ error_propagate(errp, ...);@p2
+)
+ ... when any
+ }
+
+@ script:python @
+fn << check2.fn;
+p1 << check2.p1;
+p2 << check2.p2;
+@@
+
+print('Warning: function {} propagates to errp several times in '
+  'one control flow: at {}:{} and then at {}:{}'.format(
+  fn, p1[0].file, p1[0].line, p2[0].file, p2[0].line))

[...]




--
Best regards,
Vladimir

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH 2/2] xen/spinlocks: fix placement of preempt_[dis|en]able()

2020-03-13 Thread Juergen Gross
In case Xen ever gains preemption support the spinlock coding's
placement of preempt_disable() and preempt_enable() should be outside
of the locked section.

Signed-off-by: Juergen Gross 
---
 xen/common/spinlock.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/xen/common/spinlock.c b/xen/common/spinlock.c
index 344981c54a..f05fb068cd 100644
--- a/xen/common/spinlock.c
+++ b/xen/common/spinlock.c
@@ -160,6 +160,7 @@ void inline _spin_lock_cb(spinlock_t *lock, void (*cb)(void 
*), void *data)
 LOCK_PROFILE_VAR;
 
 check_lock(>debug);
+preempt_disable();
 tickets.head_tail = arch_fetch_and_add(>tickets.head_tail,
tickets.head_tail);
 while ( tickets.tail != observe_head(>tickets) )
@@ -171,7 +172,6 @@ void inline _spin_lock_cb(spinlock_t *lock, void (*cb)(void 
*), void *data)
 }
 got_lock(>debug);
 LOCK_PROFILE_GOT;
-preempt_disable();
 arch_lock_acquire_barrier();
 }
 
@@ -199,10 +199,10 @@ unsigned long _spin_lock_irqsave(spinlock_t *lock)
 void _spin_unlock(spinlock_t *lock)
 {
 arch_lock_release_barrier();
-preempt_enable();
 LOCK_PROFILE_REL;
 rel_lock(>debug);
 add_sized(>tickets.head, 1);
+preempt_enable();
 arch_lock_signal();
 }
 
@@ -242,15 +242,18 @@ int _spin_trylock(spinlock_t *lock)
 return 0;
 new = old;
 new.tail++;
+preempt_disable();
 if ( cmpxchg(>tickets.head_tail,
  old.head_tail, new.head_tail) != old.head_tail )
+{
+preempt_enable();
 return 0;
+}
 got_lock(>debug);
 #ifdef CONFIG_DEBUG_LOCK_PROFILE
 if (lock->profile)
 lock->profile->time_locked = NOW();
 #endif
-preempt_disable();
 /*
  * cmpxchg() is a full barrier so no need for an
  * arch_lock_acquire_barrier().
-- 
2.16.4


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [linux-4.14 test] 148458: regressions - FAIL

2020-03-13 Thread osstest service owner
flight 148458 linux-4.14 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/148458/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-armhf-armhf-xl-credit2   7 xen-boot fail REGR. vs. 142849
 test-armhf-armhf-examine  8 reboot   fail REGR. vs. 142849
 test-armhf-armhf-xl-arndale   7 xen-boot fail REGR. vs. 142849
 test-armhf-armhf-libvirt-raw  7 xen-boot fail REGR. vs. 142849
 test-armhf-armhf-xl-credit1   7 xen-boot fail REGR. vs. 142849
 test-armhf-armhf-xl-multivcpu  7 xen-bootfail REGR. vs. 142849
 test-armhf-armhf-xl   7 xen-boot fail REGR. vs. 142849
 test-amd64-amd64-qemuu-nested-intel 17 debian-hvm-install/l1/l2 fail REGR. vs. 
142849
 test-armhf-armhf-libvirt  7 xen-boot fail REGR. vs. 142849
 test-armhf-armhf-xl-vhd   7 xen-boot fail REGR. vs. 142849

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-rtds 18 guest-localmigrate/x10   fail  like 142849
 test-armhf-armhf-xl-rtds 16 guest-start/debian.repeatfail  like 142849
 test-arm64-arm64-xl-seattle  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-i386-xl-pvshim12 guest-start  fail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-arm64-arm64-xl-credit1  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stop fail never pass
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop  fail never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop  fail never pass
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop fail never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop fail never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop  fail never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop  fail never pass
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stop fail never pass

version targeted for testing:
 linux12cd844a39ed16aa183a820a54fe6f9a0bb4cd14
baseline version:
 linuxb98aebd298246df37b472c52a2ee1023256d02e3

Last test of basis   142849  2019-10-17 21:11:16 Z  147 days
Failing since143327  2019-10-29 08:49:30 Z  135 days   29 attempts
Testing same since   148458  2020-03-11 21:56:38 Z1 days1 attempts


1556 people touched revisions under test,
not listing them all

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-arm64  pass
 build-armhf  pass
 

  1   2   >