[Xen-devel] [qemu-mainline test] 101230: regressions - FAIL
flight 101230 qemu-mainline real [real] http://logs.test-lab.xenproject.org/osstest/logs/101230/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-armhf-armhf-xl-credit2 16 guest-start.2fail REGR. vs. 101216 Regressions which are regarded as allowable (not blocking): test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail like 101216 test-amd64-amd64-xl-rtds 9 debian-install fail like 101216 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stopfail like 101216 Tests which did not succeed, but are not blocking: test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt-xsm 14 guest-saverestorefail never pass test-armhf-armhf-libvirt-raw 11 migrate-support-checkfail never pass test-armhf-armhf-libvirt-raw 13 guest-saverestorefail never pass test-armhf-armhf-xl-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-xl-xsm 13 saverestore-support-checkfail never pass test-armhf-armhf-libvirt 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt 14 guest-saverestorefail never pass test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-vhd 11 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 12 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 12 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 13 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-qcow2 11 migrate-support-checkfail never pass test-armhf-armhf-libvirt-qcow2 13 guest-saverestorefail never pass test-armhf-armhf-xl-rtds 12 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 13 saverestore-support-checkfail never pass test-amd64-i386-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-amd64-i386-libvirt 12 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-arndale 12 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 13 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-armhf-armhf-xl 12 migrate-support-checkfail never pass test-armhf-armhf-xl 13 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail never pass test-amd64-amd64-libvirt 12 migrate-support-checkfail never pass test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-amd64-xl-pvh-amd 11 guest-start fail never pass test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2 fail never pass test-amd64-amd64-xl-pvh-intel 11 guest-start fail never pass version targeted for testing: qemuuc69e3cef21ff3207d7e0b0a96beec3374d0b48bb baseline version: qemuu49540a1f652afd419812bd4d35cc6f45a46a2afe Last test of basis 101216 2016-09-30 01:45:54 Z1 days Testing same since 101230 2016-09-30 23:14:08 Z0 days1 attempts People who touched revisions under test: Dr. David Alan GilbertHervé Poussineau Leon Alrae Peter Maydell Yongbok Kim jobs: build-amd64-xsm pass build-armhf-xsm pass build-i386-xsm pass build-amd64 pass build-armhf pass build-i386 pass build-amd64-libvirt pass build-armhf-libvirt pass build-i386-libvirt pass build-amd64-pvopspass build-armhf-pvopspass build-i386-pvops pass test-amd64-amd64-xl pass test-armhf-armhf-xl
[Xen-devel] [qemu-mainline baseline-only test] 67785: regressions - FAIL
This run is configured for baseline tests only. flight 67785 qemu-mainline real [real] http://osstest.xs.citrite.net/~osstest/testlogs/logs/67785/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-amd64-xl-credit2 6 xen-boot fail REGR. vs. 67781 Regressions which are regarded as allowable (not blocking): test-amd64-amd64-amd64-pvgrub 10 guest-start fail like 67781 test-amd64-amd64-qemuu-nested-intel 16 debian-hvm-install/l1/l2 fail like 67781 Tests which did not succeed, but are not blocking: test-armhf-armhf-xl 12 migrate-support-checkfail never pass test-armhf-armhf-xl 13 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt 14 guest-saverestorefail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-armhf-armhf-xl-credit2 12 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-midway 12 migrate-support-checkfail never pass test-armhf-armhf-xl-midway 13 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-qcow2 11 migrate-support-checkfail never pass test-armhf-armhf-libvirt-qcow2 13 guest-saverestorefail never pass test-amd64-amd64-libvirt 12 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-rtds 12 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-vhd 11 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 12 saverestore-support-checkfail never pass test-armhf-armhf-xl-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-xl-xsm 13 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt-xsm 14 guest-saverestorefail never pass test-armhf-armhf-libvirt-raw 11 migrate-support-checkfail never pass test-armhf-armhf-libvirt-raw 13 guest-saverestorefail never pass test-amd64-i386-libvirt 12 migrate-support-checkfail never pass test-amd64-i386-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail never pass test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2 fail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-amd64-amd64-xl-pvh-intel 11 guest-start fail never pass test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail never pass test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail never pass test-amd64-amd64-xl-pvh-amd 11 guest-start fail never pass version targeted for testing: qemuu49540a1f652afd419812bd4d35cc6f45a46a2afe baseline version: qemuu4a58f35b793d5d09d6cef921bf6ed7ffc39669fd Last test of basis67781 2016-09-28 19:48:03 Z2 days Testing same since67785 2016-09-30 15:17:19 Z0 days1 attempts People who touched revisions under test: Alex BennéeAnthony PERARD Ashijeet Acharya Daniel P. Berrange David Gibson David Gibson (ppc parts) Fam Zheng Felipe Franciosi Gerd Hoffmann Hervé Poussineau John Snow Laurent Vivier Lin Ma LluÃs Vilanova Marc-André Lureau Paolo Bonzini Paulina Szubarczyk Pavel Dovgalyuk Peter Maydell Peter Xu Roger Pau Monné Sergey Fedorov Sergey Fedorov Stefan Hajnoczi Stefan Weil Thomas Huth Yaowei Bai jobs: build-amd64-xsm pass build-armhf-xsm
[Xen-devel] [ovmf baseline-only test] 67786: all pass
This run is configured for baseline tests only. flight 67786 ovmf real [real] http://osstest.xs.citrite.net/~osstest/testlogs/logs/67786/ Perfect :-) All tests in this flight passed as required version targeted for testing: ovmf ed72804638c9b240477c5235d72c3823483813b2 baseline version: ovmf ab970515d2c6ec657fceab0ce571054bb43a22f2 Last test of basis67778 2016-09-28 10:50:20 Z2 days Testing same since67786 2016-09-30 15:18:53 Z0 days1 attempts People who touched revisions under test: Chao ZhangHao Wu Laszlo Ersek Liming Gao Qin Long Ruiyu Ni Zhang, Chao B jobs: build-amd64-xsm pass build-i386-xsm pass build-amd64 pass build-i386 pass build-amd64-libvirt pass build-i386-libvirt pass build-amd64-pvopspass build-i386-pvops pass test-amd64-amd64-xl-qemuu-ovmf-amd64 pass test-amd64-i386-xl-qemuu-ovmf-amd64 pass sg-report-flight on osstest.xs.citrite.net logs: /home/osstest/logs images: /home/osstest/images Logs, config files, etc. are available at http://osstest.xs.citrite.net/~osstest/testlogs/logs Test harness code can be found at http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary Push not applicable. commit ed72804638c9b240477c5235d72c3823483813b2 Author: Hao Wu Date: Thu Sep 29 22:47:38 2016 +0800 BaseTools Build: Fix build break for clean target in Linux In Linux, Command needs to be String instead of list when Command run as shell with True. Contributed-under: TianoCore Contribution Agreement 1.0 Signed-off-by: Liming Gao Reviewed-by: Yonghong Zhu commit 0eb330424c194a4869b2e33f42928715a5cad34e Author: Liming Gao Date: Thu Sep 29 22:00:42 2016 +0800 BaseTools VS Makefile: Don't include ms.common in ms.app ms.common is included by every tool Makefile. it is not necessary to be placed in ms.app again. Contributed-under: TianoCore Contribution Agreement 1.0 Signed-off-by: Liming Gao Reviewed-by: Yonghong Zhu commit 2dc547da460038e180f28161a44998e1849b2f43 Author: Liming Gao Date: Thu Sep 29 22:00:22 2016 +0800 BaseTools Makefile: Missing LFAGS in app.makefile Contributed-under: TianoCore Contribution Agreement 1.0 Signed-off-by: Liming Gao Reviewed-by: Yonghong Zhu commit dab62c5ec8a88def3ee99c04d644720cb201de08 Author: Qin Long Date: Tue Sep 27 16:54:04 2016 +0800 CryptoPkg/OpensslLib: Upgrade OpenSSL version to 1.0.2j Two official releases (OpenSSL 1.0.2i and 1.0.2j) were available with several severity fixes at 22-Sep-2016 and 26-Sep-2016. Refer to https://www.openssl.org/news/secadv/20160922.txt and https://www.openssl.org/news/secadv/20160926.txt. This patch is to upgrade the supported OpenSSL version in CryptoPkg/OpensslLib to catch the latest release 1.0.2j. Cc: Ting Ye Cc: David Woodhouse Contributed-under: TianoCore Contribution Agreement 1.0 Signed-off-by: Qin Long Reviewed-by: Ting Ye Tested-by: Laszlo Ersek commit 84bc72fb7ddaa26105bfe5bf36115099da1e60b1 Author: Ruiyu Ni Date: Thu Sep 29 10:37:44 2016 +0800 MdeModulePkg/ImageDecoderLib: Retire it due to new BootLogoLib Contributed-under: TianoCore Contribution Agreement 1.0 Signed-off-by: Ruiyu Ni Reviewed-by: Liming Gao commit 6a5974e259ec37eaf39720bdc9ac05a37f5eb6a5 Author: Ruiyu Ni Date: Thu Sep 29 10:38:35 2016 +0800 MdeModulePkg/BmpImageDecoderLib: Retire it due to new BootLogoLib Contributed-under: TianoCore Contribution Agreement 1.0 Signed-off-by: Ruiyu Ni Reviewed-by: Liming Gao commit a2674f7fcefbfdaee382f184ace3ee95f1ccfa13 Author: Ruiyu Ni Date: Thu
[Xen-devel] [xen-unstable test] 101228: tolerable FAIL - PUSHED
flight 101228 xen-unstable real [real] http://logs.test-lab.xenproject.org/osstest/logs/101228/ Failures :-/ but no regressions. Regressions which are regarded as allowable (not blocking): test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop fail like 101225 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail like 101225 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stopfail like 101225 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stopfail like 101225 test-amd64-amd64-xl-rtds 9 debian-install fail like 101225 Tests which did not succeed, but are not blocking: test-amd64-i386-rumprun-i386 1 build-check(1) blocked n/a test-amd64-amd64-rumprun-amd64 1 build-check(1) blocked n/a build-amd64-rumprun 5 rumprun-buildfail never pass test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt-xsm 14 guest-saverestorefail never pass build-i386-rumprun5 rumprun-buildfail never pass test-armhf-armhf-libvirt 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt 14 guest-saverestorefail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-armhf-armhf-libvirt-qcow2 11 migrate-support-checkfail never pass test-armhf-armhf-libvirt-qcow2 13 guest-saverestorefail never pass test-armhf-armhf-xl 12 migrate-support-checkfail never pass test-armhf-armhf-xl 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 12 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-vhd 11 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 12 saverestore-support-checkfail never pass test-armhf-armhf-xl-arndale 12 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 13 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-raw 11 migrate-support-checkfail never pass test-armhf-armhf-libvirt-raw 13 guest-saverestorefail never pass test-armhf-armhf-xl-rtds 12 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-xl-xsm 13 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail never pass test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-amd64-amd64-libvirt 12 migrate-support-checkfail never pass test-amd64-amd64-xl-pvh-amd 11 guest-start fail never pass test-amd64-i386-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-i386-libvirt 12 migrate-support-checkfail never pass test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2 fail never pass test-amd64-amd64-xl-pvh-intel 11 guest-start fail never pass version targeted for testing: xen 4b997d7aa8f82b5f7b0757bb6b73546dc98714a3 baseline version: xen 002e3ffb2156a135abbfb57374802922e162c5ae Last test of basis 101225 2016-09-30 12:13:29 Z0 days Testing same since 101228 2016-09-30 19:13:58 Z0 days1 attempts People who touched revisions under test: Andrew CooperDario Faggioli George Dunlap George Dunlap Ian Jackson Jan Beulich Julien Grall Lars Kurth Mihai DonÈu Shannon Zhao Wei Liu Zhi Wang jobs: build-amd64-xsm pass build-armhf-xsm pass build-i386-xsm pass build-amd64-xtf pass build-amd64 pass build-armhf
Re: [Xen-devel] [PATCH v2 06/10] xen: tracing: add trace records for schedule and rate-limiting.
On Thu, Sep 29, 2016 at 10:54 PM, Dario Faggioliwrote: > As far as {csched, csched2, rt}_schedule() are concerned, > an "empty" event, would already make it easier to read and > understand a trace. > > But while there, add a few useful information, like > if the cpu that is going through the scheduler has > been tickled or not, if it is currently idle, etc > (they vary, on a per-scheduler basis). > > For Credit1 and Credit2, add a record about when > rate-limiting kicks in too. > > Signed-off-by: Dario Faggioli > --- > Cc: George Dunlap > Cc: Meng Xu > Cc: Anshul Makkar > --- > Changes from v1: > * corrected the schedule record for sched_rt.c, as pointed out during review; > * pack the Credit1 records as well, as requested during review. > --- > xen/common/sched_credit.c | 32 > xen/common/sched_credit2.c | 40 +++- > xen/common/sched_rt.c | 15 +++ > 3 files changed, 86 insertions(+), 1 deletion(-) > As to sched_rt.c, Reviewed-by: Meng Xu Thanks, Meng --- Meng Xu PhD Student in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 2/2] xen: tracing: add trace records for schedule and rate-limiting.
On Fri, Sep 30, 2016 at 10:21 AM, Dario Faggioliwrote: > As far as {csched, csched2, rt}_schedule() are concerned, > an "empty" event, would already make it easier to read and > understand a trace. > > But while there, add a few useful information, like > if the cpu that is going through the scheduler has > been tickled or not, if it is currently idle, etc > (they vary, on a per-scheduler basis). > > For Credit1 and Credit2, add a record about when > rate-limiting kicks in too. > > Signed-off-by: Dario Faggioli > --- > Cc: George Dunlap > Cc: Meng Xu > Cc: Anshul Makkar > --- > Changes from v1: > * corrected the schedule record for sched_rt.c, as pointed out during review; > * pack the Credit1 records as well, as requested during review. > --- > xen/common/sched_credit.c | 32 > xen/common/sched_credit2.c | 38 +- > xen/common/sched_rt.c | 15 +++ > 3 files changed, 84 insertions(+), 1 deletion(-) As to xen/common/sched_rt.c, Reviewed-by: Meng Xu Thanks, Meng -- Meng Xu PhD Student in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [xen-unstable-smoke test] 101229: tolerable all pass - PUSHED
flight 101229 xen-unstable-smoke real [real] http://logs.test-lab.xenproject.org/osstest/logs/101229/ Failures :-/ but no regressions. Tests which did not succeed, but are not blocking: test-amd64-amd64-libvirt 12 migrate-support-checkfail never pass test-armhf-armhf-xl 12 migrate-support-checkfail never pass test-armhf-armhf-xl 13 saverestore-support-checkfail never pass version targeted for testing: xen b3d54cead6459567d9786ad415149868ee7f2f5b baseline version: xen 4b997d7aa8f82b5f7b0757bb6b73546dc98714a3 Last test of basis 101227 2016-09-30 17:01:50 Z0 days Testing same since 101229 2016-09-30 21:02:32 Z0 days1 attempts People who touched revisions under test: Andrew CooperKonrad Rzeszutek Wilk Wei Liu jobs: build-amd64 pass build-armhf pass build-amd64-libvirt pass test-armhf-armhf-xl pass test-amd64-amd64-xl-qemuu-debianhvm-i386 pass test-amd64-amd64-libvirt pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : + branch=xen-unstable-smoke + revision=b3d54cead6459567d9786ad415149868ee7f2f5b + . ./cri-lock-repos ++ . ./cri-common +++ . ./cri-getconfig +++ umask 002 +++ getrepos getconfig Repos perl -e ' use Osstest; readglobalconfig(); print $c{"Repos"} or die $!; ' +++ local repos=/home/osstest/repos +++ '[' -z /home/osstest/repos ']' +++ '[' '!' -d /home/osstest/repos ']' +++ echo /home/osstest/repos ++ repos=/home/osstest/repos ++ repos_lock=/home/osstest/repos/lock ++ '[' x '!=' x/home/osstest/repos/lock ']' ++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock ++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push xen-unstable-smoke b3d54cead6459567d9786ad415149868ee7f2f5b + branch=xen-unstable-smoke + revision=b3d54cead6459567d9786ad415149868ee7f2f5b + . ./cri-lock-repos ++ . ./cri-common +++ . ./cri-getconfig +++ umask 002 +++ getrepos getconfig Repos perl -e ' use Osstest; readglobalconfig(); print $c{"Repos"} or die $!; ' +++ local repos=/home/osstest/repos +++ '[' -z /home/osstest/repos ']' +++ '[' '!' -d /home/osstest/repos ']' +++ echo /home/osstest/repos ++ repos=/home/osstest/repos ++ repos_lock=/home/osstest/repos/lock ++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']' + . ./cri-common ++ . ./cri-getconfig ++ umask 002 + select_xenbranch + case "$branch" in + tree=xen + xenbranch=xen-unstable-smoke + qemuubranch=qemu-upstream-unstable + '[' xxen = xlinux ']' + linuxbranch= + '[' xqemu-upstream-unstable = x ']' + select_prevxenbranch ++ ./cri-getprevxenbranch xen-unstable-smoke + prevxenbranch=xen-4.7-testing + '[' xb3d54cead6459567d9786ad415149868ee7f2f5b = x ']' + : tested/2.6.39.x + . ./ap-common ++ : osst...@xenbits.xen.org +++ getconfig OsstestUpstream +++ perl -e ' use Osstest; readglobalconfig(); print $c{"OsstestUpstream"} or die $!; ' ++ : ++ : git://xenbits.xen.org/xen.git ++ : osst...@xenbits.xen.org:/home/xen/git/xen.git ++ : git://xenbits.xen.org/qemu-xen-traditional.git ++ : git://git.kernel.org ++ : git://git.kernel.org/pub/scm/linux/kernel/git ++ : git ++ : git://xenbits.xen.org/xtf.git ++ : osst...@xenbits.xen.org:/home/xen/git/xtf.git ++ : git://xenbits.xen.org/xtf.git ++ : git://xenbits.xen.org/libvirt.git ++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git ++ : git://xenbits.xen.org/libvirt.git ++ : git://xenbits.xen.org/osstest/rumprun.git ++ : git ++ : git://xenbits.xen.org/osstest/rumprun.git ++ : osst...@xenbits.xen.org:/home/xen/git/osstest/rumprun.git ++ : git://git.seabios.org/seabios.git ++ : osst...@xenbits.xen.org:/home/xen/git/osstest/seabios.git ++ : git://xenbits.xen.org/osstest/seabios.git ++ : https://github.com/tianocore/edk2.git ++ : osst...@xenbits.xen.org:/home/xen/git/osstest/ovmf.git ++ : git://xenbits.xen.org/osstest/ovmf.git ++ : git://xenbits.xen.org/osstest/linux-firmware.git ++ :
Re: [Xen-devel] [PATCH 3/3] tools & docs: add L2 CAT support in tools and docs.
On Thu, Aug 25, 2016 at 01:22:45PM +0800, Yi Sun wrote: > This patch is the xl/xc changes to support Intel L2 CAT > (Cache Allocation Technology). > > The new level option is introduced to original CAT setting > command in order to set CBM for specified level CAT. > - 'xl psr-hwinfo' is updated to show both L3 CAT and L2 CAT > info. > - 'xl psr-cat-cbm-set' is updated to set cache capacity > bitmasks(CBM) for a domain according to input cache level. > - 'xl psr-cat-show' is updated to show CBM of a domain > according to input cache level. > > Examples: > root@:~$ xl psr-hwinfo --cat > Cache Allocation Technology (CAT): L2 > Socket ID : 0 > Maximum COS : 3 > CBM length : 8 > Default CBM : 0xff > > root@:~$ xl psr-cat-cbm-set -l2 1 0x7f > > root@:~$ xl psr-cat-show -l2 1 > Socket ID : 0 > Default CBM : 0xff >ID NAME CBM > 1 ubuntu140x7f > > Signed-off-by: He Chen> Signed-off-by: Yi Sun > --- > docs/man/xl.pod.1.in | 25 +- > docs/misc/xl-psr.markdown | 10 ++- > tools/libxc/include/xenctrl.h | 7 +- > tools/libxc/xc_psr.c | 28 +-- > tools/libxl/libxl.h | 2 + > tools/libxl/libxl_psr.c | 50 ++- > tools/libxl/libxl_types.idl | 1 + > tools/libxl/xl_cmdimpl.c | 191 > +++--- > tools/libxl/xl_cmdtable.c | 4 +- > 9 files changed, 268 insertions(+), 50 deletions(-) > > diff --git a/docs/man/xl.pod.1.in b/docs/man/xl.pod.1.in > index 1adf322..47bdfc6 100644 > --- a/docs/man/xl.pod.1.in > +++ b/docs/man/xl.pod.1.in > @@ -1660,6 +1660,9 @@ occupancy monitoring share the same set of underlying > monitoring service. Once > a domain is attached to the monitoring service, monitoring data can be shown > for any of these monitoring types. > > +There is no cache monitoring and memory bandwidth monitoring on L2 cache so > +far. > + > =over 4 > > =item B [I] > @@ -1684,7 +1687,7 @@ monitor types are: > > Intel Broadwell and later server platforms offer capabilities to configure > and > make use of the Cache Allocation Technology (CAT) mechanisms, which enable > more > -cache resources (i.e. L3 cache) to be made available for high priority > +cache resources (i.e. L3/L2 cache) to be made available for high priority > applications. In the Xen implementation, CAT is used to control cache > allocation > on VM basis. To enforce cache on a specific domain, just set capacity > bitmasks > (CBM) for the domain. > @@ -1694,7 +1697,7 @@ Intel Broadwell and later server platforms also offer > Code/Data Prioritization > applications. CDP is used on a per VM basis in the Xen implementation. To > specify code or data CBM for the domain, CDP feature must be enabled and CBM > type options need to be specified when setting CBM, and the type options > (code > -and data) are mutually exclusive. > +and data) are mutually exclusive. There is no CDP support on L2 so far. > > =over 4 > > @@ -1711,6 +1714,11 @@ B > > Specify the socket to process, otherwise all sockets are processed. > > +=item B<-l LEVEL>, B<--level=LEVEL> > + > +Specify the cache level to process, otherwise the last level cache is s/last level cache/last level cache (L3)/ Because you may have L4 at some point (weren't there some CPUs with L4?) > +processed. > + > =item B<-c>, B<--code> > > Set code CBM when CDP is enabled. > @@ -1721,10 +1729,21 @@ Set data CBM when CDP is enabled. > > =back > > -=item B [I] > +=item B [I] [I] > > Show CAT settings for a certain domain or all domains. > > +B > + > +=over 4 > + > +=item B<-l LEVEL>, B<--level=LEVEL> > + > +Specify the cache level to process, otherwise the last level cache is Again, s/last level/last level (L3)// > +processed. > + > +=back > + > =back > > =head1 IGNORED FOR COMPATIBILITY WITH XM > diff --git a/docs/misc/xl-psr.markdown b/docs/misc/xl-psr.markdown > index c3c1e8e..bd2b6bd 100644 > --- a/docs/misc/xl-psr.markdown > +++ b/docs/misc/xl-psr.markdown > @@ -70,7 +70,7 @@ total-mem-bandwidth instead of cache-occupancy). E.g. after > a `xl psr-cmt-attach > > Cache Allocation Technology (CAT) is a new feature available on Intel > Broadwell and later server platforms that allows an OS or Hypervisor/VMM to > -partition cache allocation (i.e. L3 cache) based on application priority or > +partition cache allocation (i.e. L3/L2 cache) based on application priority > or > Class of Service (COS). Each COS is configured using capacity bitmasks (CBM) > which represent cache capacity and indicate the degree of overlap and > isolation between classes. System cache resource is divided into numbers of > @@ -119,13 +119,19 @@ A cbm is valid only when: > In a multi-socket system, the same cbm will be set on each socket by default. > Per socket cbm can be specified with the `--socket SOCKET` option.
Re: [Xen-devel] [v2 2/3] x86: add support for L2 CAT in hypervisor.
On Thu, Sep 22, 2016 at 10:15:44AM +0800, Yi Sun wrote: > Add L2 CAT (Cache Allocation Technology) feature support in > hypervisor: > - Implement 'struct feat_ops' callback functions for L2 CAT > and initialize L2 CAT feature and add it into feature list. > - Add new sysctl to get L2 CAT information. > - Add new domctl to set/get L2 CAT CBM. > > Signed-off-by: He Chen> Signed-off-by: Yi Sun > > --- > Changed since v1: > * psr.c > - Function and variables names are changed to express accurately. > - Fix code style issues. > - Fix imprecise comments. > - Add one callback function, get_cos_num(), to fulfill the > abstraction requirement. > - Replace custom list management to system. > - Use 'const' to make codes more safe. > --- > xen/arch/x86/domctl.c | 13 +++ > xen/arch/x86/psr.c | 216 > > xen/arch/x86/sysctl.c | 13 +++ > xen/include/asm-x86/msr-index.h | 1 + > xen/include/asm-x86/psr.h | 2 + > xen/include/public/domctl.h | 2 + > xen/include/public/sysctl.h | 6 ++ > 7 files changed, 253 insertions(+) > > diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c > index c53d819..24f85c7 100644 > --- a/xen/arch/x86/domctl.c > +++ b/xen/arch/x86/domctl.c > @@ -1424,6 +1424,19 @@ long arch_do_domctl( > copyback = 1; > break; > > +case XEN_DOMCTL_PSR_CAT_OP_SET_L2_CBM: > +ret = psr_set_val(d, domctl->u.psr_cat_op.target, > + domctl->u.psr_cat_op.data, > + PSR_MASK_TYPE_L2_CBM); > +break; > + > +case XEN_DOMCTL_PSR_CAT_OP_GET_L2_CBM: > +ret = psr_get_val(d, domctl->u.psr_cat_op.target, > + >u.psr_cat_op.data, > + PSR_MASK_TYPE_L2_CBM); > +copyback = 1; > +break; > + > default: > ret = -EOPNOTSUPP; > break; > diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c > index 99e4c78..5000a3c 100644 > --- a/xen/arch/x86/psr.c > +++ b/xen/arch/x86/psr.c > @@ -137,6 +137,7 @@ struct psr_cat_lvl_info { > struct feat_info { > union { > struct psr_cat_lvl_info l3_info; > +struct psr_cat_lvl_info l2_info; > }; > }; > > @@ -158,12 +159,15 @@ struct psr_ref { > unsigned int ref; > }; > > + > #define PSR_SOCKET_L3_CAT 0 > #define PSR_SOCKET_L3_CDP 1 > +#define PSR_SOCKET_L2_CAT 2 > > struct psr_socket_info { > /* > * bit 1~0: [01]->L3 CAT-only, [10]->L3 CDP > + * bit 2: L2 CAT > */ > unsigned int features; > unsigned int nr_feat; > @@ -190,6 +194,7 @@ static DEFINE_PER_CPU(struct psr_assoc, psr_assoc); > static struct psr_ref *temp_cos_ref; > /* Every feature has its own object. */ > static struct feat_list *feat_l3; > +static struct feat_list *feat_l2; > > /* Common functions for supporting feature callback functions. */ > static void free_feature(struct psr_socket_info *info) > @@ -212,6 +217,12 @@ static void free_feature(struct psr_socket_info *info) > xfree(feat_l3); > feat_l3 = NULL; > } > + > +if ( feat_l2 ) > +{ > +xfree(feat_l2); > +feat_l2 = NULL; > +} > } > > static bool psr_check_cbm(unsigned int cbm_len, uint64_t cbm) > @@ -241,6 +252,195 @@ static bool psr_check_cbm(unsigned int cbm_len, > uint64_t cbm) > * Features specific implementations. > */ > > +/* L2 CAT callback functions implementation. */ > +static void l2_cat_init_feature(unsigned int eax, unsigned int ebx, > +unsigned int ecx, unsigned int edx, > +struct feat_list *feat, > +struct psr_socket_info *info) > +{ > +struct psr_cat_lvl_info l2_cat; > +unsigned int socket; > + > +/* No valid value so do not enable feature. */ s/value/values/ s/feature/the feature/ > +if ( 0 == eax || 0 == edx ) > +return; > + > +l2_cat.cbm_len = (eax & 0x1f) + 1; > +l2_cat.cos_max = min(opt_cos_max, edx & 0x); Hm, these 0x1f and 0x look same as L3. Perhaps make this a #define. > + > +/* cos=0 is reserved as default cbm(all ones). */ > +feat->cos_reg_val[0] = (1ull << l2_cat.cbm_len) - 1; > + > +feat->feature = PSR_SOCKET_L2_CAT; > +__set_bit(PSR_SOCKET_L2_CAT, >features); > + > +feat->info.l2_info = l2_cat; > + > +info->nr_feat++; > + > +/* Add this feature into list. */ > +list_add_tail(>list, >feat_list); > + > +socket = cpu_to_socket(smp_processor_id()); > +printk(XENLOG_INFO "L2 CAT: enabled on socket %u, cos_max:%u, > cbm_len:%u.\n", > + socket, feat->info.l2_info.cos_max, feat->info.l2_info.cbm_len); > +} > + > +static int l2_cat_compare_val(const uint64_t val[], > +
Re: [Xen-devel] [v2 1/3] x86: refactor psr implementation in hypervisor.
On Thu, Sep 22, 2016 at 10:15:20AM +0800, Yi Sun wrote: > Current psr.c is designed for supporting L3 CAT/CDP. It has many > limitations to add new feature. Considering to support more PSR > features, we need refactor PSR implementation to make it more > flexible and fulfill the principle, open for extension but closed > for modification. > > The core of the refactoring is to abstract the common actions and > encapsulate them into "struct feat_ops". The detailed steps to add > a new feature are described at the head of psr.c. > > Signed-off-by: Yi Sun> > --- > Changed since v1: > * sysctl.c > - Interface change for abstraction requirement. > * psr.c > - Function and variables names are changed to express accurately. > - Fix code style issues. > - Fix imprecise comments. > - Add one callback function, get_cos_num(), to fulfill the > abstraction requirement. > - Divide get_old_set_new() callback functions into two functions: > get_old_val() and set_new_val() make it more primitive. > - Change feat_info type from an array to union. > - Adjust some functions to replace if to switch to make them > clearer. > - Replace custom list management to system. > - Use 'const' to make codes more safe. > * psr.h > - Change 'enum mask_type' to 'enum psr_val_type' to express > more accurate. > - Change parameters of psr_get_info() to fulfill abstraction > requirement. > --- > xen/arch/x86/domctl.c | 36 +- > xen/arch/x86/psr.c| 1105 > +++-- Whoa. 1K changes. This is a dense patch. > xen/arch/x86/sysctl.c | 14 +- > xen/include/asm-x86/psr.h | 20 +- > 4 files changed, 903 insertions(+), 272 deletions(-) > > diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c > index 2a2fe04..c53d819 100644 > --- a/xen/arch/x86/domctl.c > +++ b/xen/arch/x86/domctl.c > @@ -1386,41 +1386,41 @@ long arch_do_domctl( > switch ( domctl->u.psr_cat_op.cmd ) > { > case XEN_DOMCTL_PSR_CAT_OP_SET_L3_CBM: > -ret = psr_set_l3_cbm(d, domctl->u.psr_cat_op.target, > - domctl->u.psr_cat_op.data, > - PSR_CBM_TYPE_L3); > +ret = psr_set_val(d, domctl->u.psr_cat_op.target, > + domctl->u.psr_cat_op.data, > + PSR_MASK_TYPE_L3_CBM); > break; > > case XEN_DOMCTL_PSR_CAT_OP_SET_L3_CODE: > -ret = psr_set_l3_cbm(d, domctl->u.psr_cat_op.target, > - domctl->u.psr_cat_op.data, > - PSR_CBM_TYPE_L3_CODE); > +ret = psr_set_val(d, domctl->u.psr_cat_op.target, > + domctl->u.psr_cat_op.data, > + PSR_MASK_TYPE_L3_CODE); > break; > > case XEN_DOMCTL_PSR_CAT_OP_SET_L3_DATA: > -ret = psr_set_l3_cbm(d, domctl->u.psr_cat_op.target, > - domctl->u.psr_cat_op.data, > - PSR_CBM_TYPE_L3_DATA); > +ret = psr_set_val(d, domctl->u.psr_cat_op.target, > + domctl->u.psr_cat_op.data, > + PSR_MASK_TYPE_L3_DATA); > break; > > case XEN_DOMCTL_PSR_CAT_OP_GET_L3_CBM: > -ret = psr_get_l3_cbm(d, domctl->u.psr_cat_op.target, > - >u.psr_cat_op.data, > - PSR_CBM_TYPE_L3); > +ret = psr_get_val(d, domctl->u.psr_cat_op.target, > + >u.psr_cat_op.data, > + PSR_MASK_TYPE_L3_CBM); > copyback = 1; > break; > > case XEN_DOMCTL_PSR_CAT_OP_GET_L3_CODE: > -ret = psr_get_l3_cbm(d, domctl->u.psr_cat_op.target, > - >u.psr_cat_op.data, > - PSR_CBM_TYPE_L3_CODE); > +ret = psr_get_val(d, domctl->u.psr_cat_op.target, > + >u.psr_cat_op.data, > + PSR_MASK_TYPE_L3_CODE); > copyback = 1; > break; > > case XEN_DOMCTL_PSR_CAT_OP_GET_L3_DATA: > -ret = psr_get_l3_cbm(d, domctl->u.psr_cat_op.target, > - >u.psr_cat_op.data, > - PSR_CBM_TYPE_L3_DATA); > +ret = psr_get_val(d, domctl->u.psr_cat_op.target, > + >u.psr_cat_op.data, > + PSR_MASK_TYPE_L3_DATA); > copyback = 1; > break; > > diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c > index 0b5073c..99e4c78 100644 > --- a/xen/arch/x86/psr.c > +++ b/xen/arch/x86/psr.c > @@ -17,28 +17,159 @@ > #include > #include > #include > +#include >
Re: [Xen-devel] Adding new custom devices to Xen via QEMU
On 9/30/2016 3:47 PM, Konrad Rzeszutek Wilk wrote: On Fri, Sep 30, 2016 at 03:20:09PM -0400, Jason Dickens wrote: Thanks Konrad, [CC-ing Xen-devel again.] I think you and David have successfully answered my question and pointed me to the key code. I have already verified that the device operates if I move it into the space of the TPM, but see below for reasons why I don't really want that. The conclusion I'm drawing from your help is that to add a device where I need it, I have to modify xen at least for areas set up in xen_ram_init. I've also made a few comments inline below. Its perhaps worth the Xen team looking at why such modification is not necessary for KVM and considering supporting something more automatic. I don't know but I suspect that for KVM, RAM is anything not overridden by a hardware device. I don't know KVM enough to tell you. Keep in mind that under Xen you can launch guests without QEMU. That means the orchestration and layout of memory is not in the hands of QEMU (like it is with KVM). Hence xen_ram_init follows the suit of what the ABI expects (where the MMIO region is, etc). It makes sense. I;m aware we are doing very non-standard things and it didn't surprise me that it would expect a device where I put it. This is all good if you have an emulated IO device (which are under 1MB) or an emulated PCI device as they all follow the norm an allocate themsevles in well understood locations where there are no RAM. Jason On 9/30/2016 2:42 PM, Konrad Rzeszutek Wilk wrote: On Fri, Sep 30, 2016 at 10:29:20AM -0400, Jason Dickens wrote: Thanks David, This could very well be the issue, but could you please elaborate? The questions that come up are the following: What is the physical address range given to RAM? What range of addresses would work for my device? I am assuming that you implemented the emulation the same way as other devices - that is you picked an MMIO region for your device? Yes its essentially the same way of choosing memory as the tpm-tis.c implementation. Which AFAICT works with Xen. Actually, I think it was originally developed for a custom Xen implementation. And yes, it works. And, if this is the case, how would I unpopulate the RAM? See xen_ram_init. But I would just choose an region that is most definitly in MMIO (or IO) region for your emulation. As I said in a previous post there are important reasons why I need this device in a non-standard location. The nature the project has me searching for a sanitized but satisfying explanation for this post. Its not that I couldn't move it, as I said above I tried the setting the address range in the TPM space and it worked. I think what I'll say is the following: 1. Its for a proprietary, transparent, and invisible security feature. 2. It has to collaborate with other transparent features which help define its location (perhaps restrict the location is more correct). There are reasons for the address chosen, and it works on other hypervisors (e.g. KVM) so although it might be easiest to change the address I really What qemu call do you use to carve out the ranges for your device? The realization function uses: memory_region_add_subregion(isa_address_space(ISA_DEVICE(dev)),PORT_ADDR_BASE, >mmio); Which should have worked? It didn't? Of course this does work for the normal device space. It all depends on what PORT_ADDR_BASE is set to whether it works or not. I think David's comment about overlapping with RAM was correct in our case, and the primary problem. don't want to unless its the only way to keep from a Xen modification entirely. Jason On 9/30/2016 9:53 AM, David Vrabel wrote: On 30/09/16 14:35, Jason Dickens wrote: Hi Wei, Thanks for the response. It make sense to me that if the device were on the PCI bus (or other such bus, e.g. USB) that it could be discovered, at least by an OS. Its something to consider. I should mention that our guest VM doesn't actually use an OS. However, the device is not implemented that as PCI it is simply memory mapped. Technically, in QEMU is has type ISA because it was derived as a modification of the TPM device. Is it possible something is lacking in the QEMU model that Xen needs but KVM doesn't? If the answer is that Xen should not need modification for any new devices then this gives me hope. You've also inspired some things to try, like whether or not smaller modifications to the TPM device work. One change that is significant to mention is that the physical address range use is anomalous, by which I mean it not in the normal device range. Does device MMIO overlap with guest RAM? If so, you'll need to unpopulate the RAM first. David ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] Adding new custom devices to Xen via QEMU
On Fri, Sep 30, 2016 at 03:20:09PM -0400, Jason Dickens wrote: > Thanks Konrad, [CC-ing Xen-devel again.] > > I think you and David have successfully answered my question and pointed me > to the key code. I have already verified that the device operates if I move > it into the space of the TPM, but see below for reasons why I don't really > want that. The conclusion I'm drawing from your help is that to add a > device where I need it, I have to modify xen at least for areas set up in > xen_ram_init. I've also made a few comments inline below. Its perhaps worth > the Xen team looking at why such modification is not necessary for KVM and > considering supporting something more automatic. I don't know but I suspect > that for KVM, RAM is anything not overridden by a hardware device. I don't know KVM enough to tell you. Keep in mind that under Xen you can launch guests without QEMU. That means the orchestration and layout of memory is not in the hands of QEMU (like it is with KVM). Hence xen_ram_init follows the suit of what the ABI expects (where the MMIO region is, etc). This is all good if you have an emulated IO device (which are under 1MB) or an emulated PCI device as they all follow the norm an allocate themsevles in well understood locations where there are no RAM. > > Jason > > On 9/30/2016 2:42 PM, Konrad Rzeszutek Wilk wrote: > > On Fri, Sep 30, 2016 at 10:29:20AM -0400, Jason Dickens wrote: > > > Thanks David, > > > > > > This could very well be the issue, but could you please elaborate? > > > The questions that come up are the following: > > > What is the physical address range given to RAM? What range of addresses > > > would work for my device? > > I am assuming that you implemented the emulation the same way > > as other devices - that is you picked an MMIO region for your > > device? > Yes its essentially the same way of choosing memory as the tpm-tis.c > implementation. Which AFAICT works with Xen. > > > > > And, if this is the case, how would I unpopulate the RAM? > > See xen_ram_init. But I would just choose an region that is > > most definitly in MMIO (or IO) region for your emulation. > As I said in a previous post there are important reasons why I need this > device in a non-standard location. The nature the project has me searching > for a sanitized but satisfying explanation for this post. Its not that I > couldn't move it, as I said above I tried the setting the address range in > the TPM space and it worked. I think what I'll say is the following: > 1. Its for a proprietary, transparent, and invisible security feature. > 2. It has to collaborate with other transparent features which help define > its location (perhaps restrict the location is more correct). > > > > There are reasons for the address chosen, and it works on other > > > hypervisors > > > (e.g. KVM) so although it might be easiest to change the address I really > > What qemu call do you use to carve out the ranges for your device? > The realization function uses: > memory_region_add_subregion(isa_address_space(ISA_DEVICE(dev)),PORT_ADDR_BASE, > >mmio); Which should have worked? It didn't? > > > > > don't want to unless its the only way to keep from a Xen modification > > > entirely. > > > Jason > > > > > > On 9/30/2016 9:53 AM, David Vrabel wrote: > > > > On 30/09/16 14:35, Jason Dickens wrote: > > > > > Hi Wei, > > > > > > > > > > Thanks for the response. It make sense to me that if the device were > > > > > on > > > > > the PCI bus (or other such bus, e.g. USB) that it could be discovered, > > > > > at least by an OS. Its something to consider. I should mention that > > > > > our > > > > > guest VM doesn't actually use an OS. > > > > > > > > > > However, the device is not implemented that as PCI it is simply memory > > > > > mapped. Technically, in QEMU is has type ISA because it was derived > > > > > as a > > > > > modification of the TPM device. Is it possible something is lacking in > > > > > the QEMU model that Xen needs but KVM doesn't? > > > > > If the answer is that Xen should not need modification for any new > > > > > devices then this gives me hope. You've also inspired some things to > > > > > try, like whether or not smaller modifications to the TPM device work. > > > > > One change that is significant to mention is that the physical address > > > > > range use is anomalous, by which I mean it not in the normal device > > > > > range. > > > > Does device MMIO overlap with guest RAM? If so, you'll need to > > > > unpopulate the RAM first. > > > > > > > > David > > > > > > > > > ___ > > > Xen-devel mailing list > > > Xen-devel@lists.xen.org > > > https://lists.xen.org/xen-devel > > ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 6/9] tmem/libxc: Squash XEN_SYSCTL_TMEM_OP_[SET|SAVE]..
On Fri, Sep 30, 2016 at 07:51:05PM +0100, Andrew Cooper wrote: > On 30/09/16 19:11, Konrad Rzeszutek Wilk wrote: > > diff --git a/xen/common/tmem_control.c b/xen/common/tmem_control.c > > index fc20a9f..151d8ef 100644 > > --- a/xen/common/tmem_control.c > > +++ b/xen/common/tmem_control.c > > @@ -258,43 +258,56 @@ static int tmemc_list(domid_t cli_id, > > tmem_cli_va_param_t buf, uint32_t len, > > return 0; > > } > > > > -static int __tmemc_set_var(struct client *client, uint32_t subop, uint32_t > > arg1) > > +static int __tmemc_set_client_info(struct client *client, > > + XEN_GUEST_HANDLE(xen_tmem_client_t) buf) > > { > > domid_t cli_id = client->cli_id; > > uint32_t old_weight; > > +xen_tmem_client_t info = { }; > > > > -switch (subop) > > +ASSERT(client); > > This ASSERT() is useless after have already dereferenced client to get > cli_id. Best to defer the initialisation of the domid and keep the > assert like this. > > > @@ -303,13 +316,35 @@ static int tmemc_set_var(domid_t cli_id, uint32_t > > subop, uint32_t arg1) > > { > > client = tmem_client_from_cli_id(cli_id); > > if ( client ) > > -ret = __tmemc_set_var(client, subop, arg1); > > +ret = __tmemc_set_client_info(client, info); > > } > > return ret; > > } > > > > -static int tmemc_save_subop(int cli_id, uint32_t pool_id, > > -uint32_t subop, tmem_cli_va_param_t buf, uint32_t > > arg1) > > +static int tmemc_get_client_info(int cli_id, > > + XEN_GUEST_HANDLE(xen_tmem_client_t) info) > > +{ > > +struct client *client = tmem_client_from_cli_id(cli_id); > > + > > +if ( client ) > > +{ > > +if ( copy_to_guest(info, >info, 1) ) > > +return -EFAULT; > > +} > > +else > > +{ > > +static const xen_tmem_client_t generic = { > > + .version = > > TMEM_SPEC_VERSION, > > + .maxpools = > > MAX_POOLS_PER_DOMAIN }; > > There is some weird alignment/tabs going on here. > > With these fixed, Acked-by: Andrew CooperThank you! Pls see inline: From 1ff700e418b3090d6e230c566d7387ec47d0bad0 Mon Sep 17 00:00:00 2001 From: Konrad Rzeszutek Wilk Date: Fri, 30 Sep 2016 10:53:01 -0400 Subject: [PATCH v2.1 6/9] tmem/libxc: Squash XEN_SYSCTL_TMEM_OP_[SET|SAVE].. Specifically: XEN_SYSCTL_TMEM_OP_SET_[WEIGHT,COMPRESS] are now done via: XEN_SYSCTL_TMEM_SET_CLIENT_INFO and XEN_SYSCTL_TMEM_OP_SAVE_GET_[VERSION,MAXPOOLS, CLIENT_WEIGHT, CLIENT_FLAGS] can now be retrieved via: XEN_SYSCTL_TMEM_GET_CLIENT_INFO All this information is now in 'struct xen_tmem_client' and that is what we pass around. We also rev up the XEN_SYSCTL_INTERFACE_VERSION as we are re-using the value number of the deleted ones (and henceforth the information is retrieved differently). On the toolstack, prior to this patch, the xc_tmem_control would use the bounce buffer only when arg1 was set and the cmd was to list. With the 'XEN_SYSCTL_TMEM_OP_SET_[WEIGHT|COMPRESS]' that made sense as the 'arg1' would have the value. However for the other ones (say XEN_SYSCTL_TMEM_OP_SAVE_GET_POOL_UUID) the 'arg1' would be the length of the 'buf'. If this confusing don't despair, patch patch titled: tmem/xc_tmem_control: Rename 'arg1' to 'len' and 'arg2' to arg. takes care of that. The acute reader of the toolstack code will discover that we only used the bounce buffer for LIST, not for any other subcommands that used 'buf'!?! Which means that the contents of 'buf' would never be copied back to the calleer 'buf'! The author is not sure how this could possibly work, perhaps Xen 4.1 (when this was introduced) was more relaxed about the bounce buffer being enabled. Anyhow this fixes xc_tmem_control to do it for any subcommand that has 'arg1'. Lastly some of the checks in xc_tmem_[restore|save] are removed as they can't ever be reached (not even sure how they could have been reached in the original submission). One of them is the check for the weight against -1 when in fact the hypervisor would never have provided that value. Now the checks are simple - as the hypercall always returns ->version and ->maxpools (which is mirroring how it was done prior to this patch). But if one wants to check the if a guest has any tmem activity then the patch titled "tmem: Batch and squash XEN_SYSCTL_TMEM_OP_SAVE_GET_POOL_ [FLAGS,NPAGES,UUID] in one sub-call: XEN_SYSCTL_TMEM_OP_GET_POOLS." adds an ->nr_pools to check for that. Also we add the check for ->version and ->maxpools and remove the TODO. Acked-by: Andrew Cooper Acked-by: Wei Liu Signed-off-by: Konrad Rzeszutek Wilk --- Cc: Ian Jackson Cc: Wei Liu v1: New
Re: [Xen-devel] [PATCH v2 9/9] tmem: Batch and squash XEN_SYSCTL_TMEM_OP_SAVE_GET_POOL_[FLAGS, NPAGES, UUID] in one sub-call: XEN_SYSCTL_TMEM_OP_GET_POOLS.
On 30/09/16 19:11, Konrad Rzeszutek Wilk wrote: > These operations are used during the save process of migration. > Instead of doing 64 hypercalls lets do just one. We modify > the 'struct xen_tmem_client' structure (used in > XEN_SYSCTL_TMEM_OP_[GET|SET]_CLIENT_INFO) to have an extra field > 'nr_pools'. Armed with that the code slurping up pages from the > hypervisor can allocate a big enough structure (struct tmem_pool_info) > to contain all the active pools. And then just iterate over each > one and save it in the stream. > > We are also re-using one of the subcommands numbers for this, > as such the XEN_SYSCTL_INTERFACE_VERSION should be incremented > and that was done in the patch titled: > "tmem/libxc: Squash XEN_SYSCTL_TMEM_OP_[SET|SAVE].." > > In the xc_tmem_[save|restore] we also added proper memory handling > of the 'buf' and 'pools'. Because of the loops and to make it as > easy as possible to review we add a goto label and for almost > all error conditions jump in it. > > The include for inttypes is required for the PRId64 macro to > work (which is needed to compile this code under 32-bit). > > Acked-by: Wei Liu> Signed-off-by: Konrad Rzeszutek Wilk Acked-by: Andrew Cooper ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [xen-unstable test] 101225: tolerable FAIL - PUSHED
flight 101225 xen-unstable real [real] http://logs.test-lab.xenproject.org/osstest/logs/101225/ Failures :-/ but no regressions. Regressions which are regarded as allowable (not blocking): test-amd64-amd64-xl-rtds 9 debian-install fail like 101215 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop fail like 101219 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail like 101219 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stopfail like 101219 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stopfail like 101219 Tests which did not succeed, but are not blocking: test-amd64-i386-rumprun-i386 1 build-check(1) blocked n/a test-amd64-amd64-rumprun-amd64 1 build-check(1) blocked n/a build-amd64-rumprun 5 rumprun-buildfail never pass test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt-xsm 14 guest-saverestorefail never pass build-i386-rumprun5 rumprun-buildfail never pass test-armhf-armhf-libvirt 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt 14 guest-saverestorefail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-armhf-armhf-libvirt-qcow2 11 migrate-support-checkfail never pass test-armhf-armhf-libvirt-qcow2 13 guest-saverestorefail never pass test-armhf-armhf-xl 12 migrate-support-checkfail never pass test-armhf-armhf-xl 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 12 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-vhd 11 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 12 saverestore-support-checkfail never pass test-armhf-armhf-xl-arndale 12 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 13 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-raw 11 migrate-support-checkfail never pass test-armhf-armhf-libvirt-raw 13 guest-saverestorefail never pass test-armhf-armhf-xl-rtds 12 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-xl-xsm 13 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail never pass test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-amd64-amd64-libvirt 12 migrate-support-checkfail never pass test-amd64-amd64-xl-pvh-amd 11 guest-start fail never pass test-amd64-i386-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-i386-libvirt 12 migrate-support-checkfail never pass test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2 fail never pass test-amd64-amd64-xl-pvh-intel 11 guest-start fail never pass version targeted for testing: xen 002e3ffb2156a135abbfb57374802922e162c5ae baseline version: xen eabe1c39d1cd4fcef18ec50571db3c70c055c1fb Last test of basis 101219 2016-09-30 05:27:57 Z0 days Testing same since 101225 2016-09-30 12:13:29 Z0 days1 attempts People who touched revisions under test: Jan BeulichKonrad Rzeszutek Wilk jobs: build-amd64-xsm pass build-armhf-xsm pass build-i386-xsm pass build-amd64-xtf pass build-amd64 pass build-armhf pass build-i386 pass build-amd64-libvirt pass build-armhf-libvirt pass build-i386-libvirt pass build-amd64-oldkern pass
Re: [Xen-devel] [PATCH v2 8/9] tmem/xc_tmem_control: Rename 'arg1' to 'len' and 'arg2' to arg.
On 30/09/16 19:11, Konrad Rzeszutek Wilk wrote: > That is what they are used for. Lets make it more clear. > > Of all the various sub-commands, the only one that needed > semantic change is XEN_SYSCTL_TMEM_OP_SAVE_BEGIN. That in the > past used 'arg1', and now we are moving it to use 'arg'. > Since that code is only used during migration which is tied > to the toolstack it is OK to change it. > > We should increment the XEN_SYSCTL_INTERFACE_VERSION because > of this, and that was fortunatly done in the patch titled: > "tmem/libxc: Squash XEN_SYSCTL_TMEM_OP_[SET|SAVE].." > > While at it, also fix xc_tmem_control_oid to properly handle > the 'buf' and bounce it as appropiate. > > Acked-by: Wei Liu> Signed-off-by: Konrad Rzeszutek Wilk Acked-by: Andrew cooper ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 7/9] tmem: Unify XEN_SYSCTL_TMEM_OP_[[SAVE_[BEGIN|END]|RESTORE_BEGIN]
On 30/09/16 19:11, Konrad Rzeszutek Wilk wrote: > return values. For success they used to be 1 ([SAVE,RESTORE]_BEGIN), > 0 if guest did not have any tmem (but only for SAVE_BEGIN), and > -1 for any type of failure. > > And SAVE_END (which you would think would mirror SAVE_BEGIN) > had 0 for success and -1 if guest did not any tmem enabled for it. > > This is confusing. Now the code will return 0 if the operation was > success. Various XEN_EXX values are returned if tmem is not enabled > or the operation could not performed. > > The xc_tmem.c code only needs one place to check - where we use > SAVE_BEGIN. The place where RESTORE_BEGIN is used will have errno > with the proper error value and return will be -1, so will still > fail properly. > > Acked-by: Wei Liu> Signed-off-by: Konrad Rzeszutek Wilk Acked-by: Andrew Cooper ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 6/9] tmem/libxc: Squash XEN_SYSCTL_TMEM_OP_[SET|SAVE]..
On 30/09/16 19:11, Konrad Rzeszutek Wilk wrote: > diff --git a/xen/common/tmem_control.c b/xen/common/tmem_control.c > index fc20a9f..151d8ef 100644 > --- a/xen/common/tmem_control.c > +++ b/xen/common/tmem_control.c > @@ -258,43 +258,56 @@ static int tmemc_list(domid_t cli_id, > tmem_cli_va_param_t buf, uint32_t len, > return 0; > } > > -static int __tmemc_set_var(struct client *client, uint32_t subop, uint32_t > arg1) > +static int __tmemc_set_client_info(struct client *client, > + XEN_GUEST_HANDLE(xen_tmem_client_t) buf) > { > domid_t cli_id = client->cli_id; > uint32_t old_weight; > +xen_tmem_client_t info = { }; > > -switch (subop) > +ASSERT(client); This ASSERT() is useless after have already dereferenced client to get cli_id. Best to defer the initialisation of the domid and keep the assert like this. > @@ -303,13 +316,35 @@ static int tmemc_set_var(domid_t cli_id, uint32_t > subop, uint32_t arg1) > { > client = tmem_client_from_cli_id(cli_id); > if ( client ) > -ret = __tmemc_set_var(client, subop, arg1); > +ret = __tmemc_set_client_info(client, info); > } > return ret; > } > > -static int tmemc_save_subop(int cli_id, uint32_t pool_id, > -uint32_t subop, tmem_cli_va_param_t buf, uint32_t > arg1) > +static int tmemc_get_client_info(int cli_id, > + XEN_GUEST_HANDLE(xen_tmem_client_t) info) > +{ > +struct client *client = tmem_client_from_cli_id(cli_id); > + > +if ( client ) > +{ > +if ( copy_to_guest(info, >info, 1) ) > +return -EFAULT; > +} > +else > +{ > +static const xen_tmem_client_t generic = { > + .version = > TMEM_SPEC_VERSION, > + .maxpools = > MAX_POOLS_PER_DOMAIN }; There is some weird alignment/tabs going on here. With these fixed, Acked-by: Andrew Cooper___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] Adding new custom devices to Xen via QEMU
On Fri, Sep 30, 2016 at 10:29:20AM -0400, Jason Dickens wrote: > Thanks David, > > This could very well be the issue, but could you please elaborate? > The questions that come up are the following: > What is the physical address range given to RAM? What range of addresses > would work for my device? I am assuming that you implemented the emulation the same way as other devices - that is you picked an MMIO region for your device? > And, if this is the case, how would I unpopulate the RAM? See xen_ram_init. But I would just choose an region that is most definitly in MMIO (or IO) region for your emulation. > > There are reasons for the address chosen, and it works on other hypervisors > (e.g. KVM) so although it might be easiest to change the address I really What qemu call do you use to carve out the ranges for your device? > don't want to unless its the only way to keep from a Xen modification > entirely. > > Jason > > On 9/30/2016 9:53 AM, David Vrabel wrote: > > On 30/09/16 14:35, Jason Dickens wrote: > > > Hi Wei, > > > > > > Thanks for the response. It make sense to me that if the device were on > > > the PCI bus (or other such bus, e.g. USB) that it could be discovered, > > > at least by an OS. Its something to consider. I should mention that our > > > guest VM doesn't actually use an OS. > > > > > > However, the device is not implemented that as PCI it is simply memory > > > mapped. Technically, in QEMU is has type ISA because it was derived as a > > > modification of the TPM device. Is it possible something is lacking in > > > the QEMU model that Xen needs but KVM doesn't? > > > If the answer is that Xen should not need modification for any new > > > devices then this gives me hope. You've also inspired some things to > > > try, like whether or not smaller modifications to the TPM device work. > > > One change that is significant to mention is that the physical address > > > range use is anomalous, by which I mean it not in the normal device range. > > Does device MMIO overlap with guest RAM? If so, you'll need to > > unpopulate the RAM first. > > > > David > > > > ___ > Xen-devel mailing list > Xen-devel@lists.xen.org > https://lists.xen.org/xen-devel ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [xen-unstable-smoke test] 101227: tolerable all pass - PUSHED
flight 101227 xen-unstable-smoke real [real] http://logs.test-lab.xenproject.org/osstest/logs/101227/ Failures :-/ but no regressions. Tests which did not succeed, but are not blocking: test-amd64-amd64-libvirt 12 migrate-support-checkfail never pass test-armhf-armhf-xl 12 migrate-support-checkfail never pass test-armhf-armhf-xl 13 saverestore-support-checkfail never pass version targeted for testing: xen 4b997d7aa8f82b5f7b0757bb6b73546dc98714a3 baseline version: xen 47bd00674201bb1a030d4eb604b074a156b1f4d5 Last test of basis 101226 2016-09-30 14:02:55 Z0 days Testing same since 101227 2016-09-30 17:01:50 Z0 days1 attempts People who touched revisions under test: Dario FaggioliGeorge Dunlap Jan Beulich Wei Liu jobs: build-amd64 pass build-armhf pass build-amd64-libvirt pass test-armhf-armhf-xl pass test-amd64-amd64-xl-qemuu-debianhvm-i386 pass test-amd64-amd64-libvirt pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : + branch=xen-unstable-smoke + revision=4b997d7aa8f82b5f7b0757bb6b73546dc98714a3 + . ./cri-lock-repos ++ . ./cri-common +++ . ./cri-getconfig +++ umask 002 +++ getrepos getconfig Repos perl -e ' use Osstest; readglobalconfig(); print $c{"Repos"} or die $!; ' +++ local repos=/home/osstest/repos +++ '[' -z /home/osstest/repos ']' +++ '[' '!' -d /home/osstest/repos ']' +++ echo /home/osstest/repos ++ repos=/home/osstest/repos ++ repos_lock=/home/osstest/repos/lock ++ '[' x '!=' x/home/osstest/repos/lock ']' ++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock ++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push xen-unstable-smoke 4b997d7aa8f82b5f7b0757bb6b73546dc98714a3 + branch=xen-unstable-smoke + revision=4b997d7aa8f82b5f7b0757bb6b73546dc98714a3 + . ./cri-lock-repos ++ . ./cri-common +++ . ./cri-getconfig +++ umask 002 +++ getrepos getconfig Repos perl -e ' use Osstest; readglobalconfig(); print $c{"Repos"} or die $!; ' +++ local repos=/home/osstest/repos +++ '[' -z /home/osstest/repos ']' +++ '[' '!' -d /home/osstest/repos ']' +++ echo /home/osstest/repos ++ repos=/home/osstest/repos ++ repos_lock=/home/osstest/repos/lock ++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']' + . ./cri-common ++ . ./cri-getconfig ++ umask 002 + select_xenbranch + case "$branch" in + tree=xen + xenbranch=xen-unstable-smoke + qemuubranch=qemu-upstream-unstable + '[' xxen = xlinux ']' + linuxbranch= + '[' xqemu-upstream-unstable = x ']' + select_prevxenbranch ++ ./cri-getprevxenbranch xen-unstable-smoke + prevxenbranch=xen-4.7-testing + '[' x4b997d7aa8f82b5f7b0757bb6b73546dc98714a3 = x ']' + : tested/2.6.39.x + . ./ap-common ++ : osst...@xenbits.xen.org +++ getconfig OsstestUpstream +++ perl -e ' use Osstest; readglobalconfig(); print $c{"OsstestUpstream"} or die $!; ' ++ : ++ : git://xenbits.xen.org/xen.git ++ : osst...@xenbits.xen.org:/home/xen/git/xen.git ++ : git://xenbits.xen.org/qemu-xen-traditional.git ++ : git://git.kernel.org ++ : git://git.kernel.org/pub/scm/linux/kernel/git ++ : git ++ : git://xenbits.xen.org/xtf.git ++ : osst...@xenbits.xen.org:/home/xen/git/xtf.git ++ : git://xenbits.xen.org/xtf.git ++ : git://xenbits.xen.org/libvirt.git ++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git ++ : git://xenbits.xen.org/libvirt.git ++ : git://xenbits.xen.org/osstest/rumprun.git ++ : git ++ : git://xenbits.xen.org/osstest/rumprun.git ++ : osst...@xenbits.xen.org:/home/xen/git/osstest/rumprun.git ++ : git://git.seabios.org/seabios.git ++ : osst...@xenbits.xen.org:/home/xen/git/osstest/seabios.git ++ : git://xenbits.xen.org/osstest/seabios.git ++ : https://github.com/tianocore/edk2.git ++ : osst...@xenbits.xen.org:/home/xen/git/osstest/ovmf.git ++ : git://xenbits.xen.org/osstest/ovmf.git ++ :
Re: [Xen-devel] [PATCH v2 4/9] tmem: Move client weight, frozen, live_migrating, and compress
On 30/09/16 19:11, Konrad Rzeszutek Wilk wrote: > in its own structure. This paves the way to make only > one hypercall to retrieve/set this information instead of multiple > ones. > > Acked-by: Wei Liu> Signed-off-by: Konrad Rzeszutek Wilk Acked-by: Andrew Cooper ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 5/9] tmem/sysctl: Add union in struct xen_sysctl_tmem_op
On 30/09/16 19:11, Konrad Rzeszutek Wilk wrote: > No functional change. We do this to prepare for another > entry to be added in the union. See patch titled: > "tmem/libxc: Squash XEN_SYSCTL_TMEM_OP_[SET|SAVE]" > > Signed-off-by: Konrad Rzeszutek WilkAcked-by: Andrew Cooper ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 3/9] tmem: Delete deduplication (and tze) code.
On 30/09/16 19:11, Konrad Rzeszutek Wilk wrote: > Couple of reasons: > - It can lead to security issues (see row-hammer, KSM and such >attacks). > - Code is quite complex. > - Deduplication is good if the pages themselves are the same >but that is hardly guaranteed. > - We got some gains (if pages are deduped) but at the cost of >making code less maintainable. > - tze depends on deduplication code. > > As such, deleting it. > > Signed-off-by: Konrad Rzeszutek WilkAcked-by: Andrew Cooper > --- > v1: First submission. > v2: Squash " tmem: Wrap tmem dedup code with CONFIG_TMEM_DEDUP" > and "tmem: Wrap tmem tze code with CONFIG_TMEM_TZE" > --- > docs/misc/xen-command-line.markdown | 6 - > xen/common/tmem.c | 310 > +--- > xen/common/tmem_control.c | 7 - > xen/common/tmem_xen.c | 28 > xen/include/xen/tmem_xen.h | 115 - > 5 files changed, 3 insertions(+), 463 deletions(-) <3 this diffstat :) ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 6/9] tmem/libxc: Squash XEN_SYSCTL_TMEM_OP_[SET|SAVE]..
On Fri, Sep 30, 2016 at 02:11:51PM -0400, Konrad Rzeszutek Wilk wrote: > --- > tools/libxc/xc_tmem.c | 73 +++ > tools/libxl/libxl.c | 26 +++--- > tools/python/xen/lowlevel/xc/xc.c | 2 - Acked-by: Wei Liu> xen/common/tmem.c | 2 + > xen/common/tmem_control.c | 102 > ++ > xen/include/public/sysctl.h | 23 ++--- > xen/include/xen/tmem_xen.h| 2 +- > 7 files changed, 126 insertions(+), 104 deletions(-) > > diff --git a/tools/libxc/xc_tmem.c b/tools/libxc/xc_tmem.c > index e1de16e..7c06841 100644 > --- a/tools/libxc/xc_tmem.c > +++ b/tools/libxc/xc_tmem.c > @@ -18,6 +18,7 @@ > */ > > #include "xc_private.h" > +#include > #include > > static int do_tmem_op(xc_interface *xch, tmem_op_t *op) > @@ -67,7 +68,12 @@ int xc_tmem_control(xc_interface *xch, > sysctl.u.tmem_op.oid.oid[1] = 0; > sysctl.u.tmem_op.oid.oid[2] = 0; > > -if ( cmd == XEN_SYSCTL_TMEM_OP_LIST && arg1 != 0 ) > +if ( cmd == XEN_SYSCTL_TMEM_OP_SET_CLIENT_INFO ) > +{ > +HYPERCALL_BOUNCE_SET_DIR(buf, XC_HYPERCALL_BUFFER_BOUNCE_IN); > +} "Braces should be omitted for blocks with a single statement." ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH v2 1/9] libxc/tmem/restore: Remove call to XEN_SYSCTL_TMEM_OP_SAVE_GET_VERSION
The only thing this hypercall returns is TMEM_SPEC_VERSION. The comment around is also misleading - this call does not do any domain operation. Acked-by: Wei LiuSigned-off-by: Konrad Rzeszutek Wilk --- Cc: Ian Jackson Cc: Wei Liu v1: Initial submission. v2: Added Wei's Ack. --- tools/libxc/xc_tmem.c | 4 1 file changed, 4 deletions(-) diff --git a/tools/libxc/xc_tmem.c b/tools/libxc/xc_tmem.c index 4e5c278..31ae3f5 100644 --- a/tools/libxc/xc_tmem.c +++ b/tools/libxc/xc_tmem.c @@ -381,16 +381,12 @@ static int xc_tmem_restore_new_pool( int xc_tmem_restore(xc_interface *xch, int dom, int io_fd) { -uint32_t save_version; uint32_t this_max_pools, this_version; uint32_t pool_id; uint32_t minusone; uint32_t weight, cap, flags; int checksum = 0; -save_version = xc_tmem_control(xch,0,XEN_SYSCTL_TMEM_OP_SAVE_GET_VERSION,dom,0,0,NULL); -if ( save_version == -1 ) -return -1; /* domain doesn't exist */ if ( read_exact(io_fd, _version, sizeof(this_version)) ) return -1; if ( read_exact(io_fd, _max_pools, sizeof(this_max_pools)) ) -- 2.4.11 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH v2 4/9] tmem: Move client weight, frozen, live_migrating, and compress
in its own structure. This paves the way to make only one hypercall to retrieve/set this information instead of multiple ones. Acked-by: Wei LiuSigned-off-by: Konrad Rzeszutek Wilk --- v1: First submission. v2: Rebase on " tmem: Retire XEN_SYSCTL_TMEM_OP_[SET_CAP|SAVE_GET_CLIENT_CAP]" which deleted the ->cap field. s/tmem_client/xen_tmem_client/ Deleted 'version' and 'maxpools' from the structure. Add spaces in the title after the comma. Added Wei's Ack --- xen/common/tmem.c | 38 +++--- xen/common/tmem_control.c | 18 +- xen/include/public/sysctl.h | 12 xen/include/xen/tmem_xen.h | 5 + 4 files changed, 41 insertions(+), 32 deletions(-) diff --git a/xen/common/tmem.c b/xen/common/tmem.c index cf5271c..95b97b8 100644 --- a/xen/common/tmem.c +++ b/xen/common/tmem.c @@ -333,7 +333,7 @@ static void pgp_free(struct tmem_page_descriptor *pgp) atomic_dec(>pgp_count); ASSERT(_atomic_read(pool->pgp_count) >= 0); pgp->size = -1; -if ( is_persistent(pool) && pool->client->live_migrating ) +if ( is_persistent(pool) && pool->client->info.flags.u.migrating ) { pgp->inv_oid = pgp->us.obj->oid; pgp->pool_id = pool->pool_id; @@ -370,7 +370,7 @@ static void pgp_delist_free(struct tmem_page_descriptor *pgp) } else { -if ( client->live_migrating ) +if ( client->info.flags.u.migrating ) { spin_lock(_lists_spinlock); list_add_tail(>client_inv_pages, @@ -791,7 +791,7 @@ static void pool_flush(struct tmem_pool *pool, domid_t cli_id) is_persistent(pool) ? "persistent" : "ephemeral" , is_shared(pool) ? "shared" : "private", tmem_cli_id_str, pool->client->cli_id, pool->pool_id); -if ( pool->client->live_migrating ) +if ( pool->client->info.flags.u.migrating ) { tmem_client_warn("can't destroy pool while %s is live-migrating\n", tmem_client_str); @@ -843,7 +843,7 @@ static struct client *client_create(domid_t cli_id) rcu_unlock_domain(d); client->cli_id = cli_id; -client->compress = tmem_compression_enabled(); +client->info.flags.u.compress = tmem_compression_enabled(); client->shared_auth_required = tmem_shared_auth(); for ( i = 0; i < MAX_GLOBAL_SHARED_POOLS; i++) client->shared_auth_uuid[i][0] = @@ -887,11 +887,11 @@ static bool_t client_over_quota(struct client *client) int total = _atomic_read(tmem_global.client_weight_total); ASSERT(client != NULL); -if ( (total == 0) || (client->weight == 0) || +if ( (total == 0) || (client->info.weight == 0) || (client->eph_count == 0) ) return 0; return ( ((tmem_global.eph_count*100L) / client->eph_count ) > - ((total*100L) / client->weight) ); + ((total*100L) / client->info.weight) ); } / MEMORY REVOCATION ROUTINES ***/ @@ -1067,10 +1067,10 @@ static int do_tmem_dup_put(struct tmem_page_descriptor *pgp, xen_pfn_t cmfn, pool = obj->pool; ASSERT(pool != NULL); client = pool->client; -if ( client->live_migrating ) +if ( client->info.flags.u.migrating ) goto failed_dup; /* No dups allowed when migrating. */ /* Can we successfully manipulate pgp to change out the data? */ -if ( client->compress && pgp->size != 0 ) +if ( client->info.flags.u.compress && pgp->size != 0 ) { ret = do_tmem_put_compress(pgp, cmfn, clibuf); if ( ret == 1 ) @@ -1142,7 +1142,7 @@ static int do_tmem_put(struct tmem_pool *pool, ASSERT(pool != NULL); client = pool->client; ASSERT(client != NULL); -ret = client->frozen ? -EFROZEN : -ENOMEM; +ret = client->info.flags.u.frozen ? -EFROZEN : -ENOMEM; pool->puts++; refind: @@ -1156,14 +1156,14 @@ refind: else { /* No puts allowed into a frozen pool (except dup puts). */ -if ( client->frozen ) +if ( client->info.flags.u.frozen ) goto unlock_obj; } } else { /* No puts allowed into a frozen pool (except dup puts). */ -if ( client->frozen ) +if ( client->info.flags.u.frozen ) return ret; if ( (obj = obj_alloc(pool, oidp)) == NULL ) return -ENOMEM; @@ -1198,7 +1198,7 @@ refind: pgp->index = index; pgp->size = 0; -if ( client->compress ) +if ( client->info.flags.u.compress ) { ASSERT(pgp->pfp == NULL); ret = do_tmem_put_compress(pgp, cmfn, clibuf); @@ -1390,7 +1390,7 @@ static int do_tmem_flush_page(struct tmem_pool *pool, pool->flushs_found++; out: -if ( pool->client->frozen ) +if ( pool->client->info.flags.u.frozen ) return -EFROZEN; else
[Xen-devel] [PATCH v2 2/9] tmem: Retire XEN_SYSCTL_TMEM_OP_[SET_CAP|SAVE_GET_CLIENT_CAP]
It is not used by anything. Its intent was to complement the 'weight' attribute but there hadn't been any request for this. If there is a need to resurface it, it can be integrated back via the XEN_SYSCTL_TMEM_SET_CLIENT_INFO introduced in "tmem/libxc: Squash XEN_SYSCTL_TMEM_OP_[SET|SAVE].." Acked-by: Wei LiuSigned-off-by: Konrad Rzeszutek Wilk --- Cc: Ian Jackson Cc: Wei Liu v1: First submission v2: Added Wei's Ack. Deleted ->cap from 'struct client' --- docs/man/xl.pod.1.in | 4 tools/libxc/xc_tmem.c | 13 +++-- tools/libxl/libxl.c | 4 +--- tools/libxl/xl_cmdtable.c | 1 - tools/python/xen/lowlevel/xc/xc.c | 1 - xen/common/tmem_control.c | 16 ++-- xen/include/public/sysctl.h | 2 -- xen/include/xen/tmem_xen.h| 1 - 8 files changed, 6 insertions(+), 36 deletions(-) diff --git a/docs/man/xl.pod.1.in b/docs/man/xl.pod.1.in index 803c67e..8e2aa5b 100644 --- a/docs/man/xl.pod.1.in +++ b/docs/man/xl.pod.1.in @@ -1589,10 +1589,6 @@ B Weight (int) -=item B<-c> I - -Cap (int) - =item B<-p> I Compress (int) diff --git a/tools/libxc/xc_tmem.c b/tools/libxc/xc_tmem.c index 31ae3f5..24c8b43 100644 --- a/tools/libxc/xc_tmem.c +++ b/tools/libxc/xc_tmem.c @@ -212,7 +212,7 @@ int xc_tmem_save(xc_interface *xch, int marker = field_marker; int i, j; uint32_t max_pools, version; -uint32_t weight, cap, flags; +uint32_t weight, flags; uint32_t pool_id; uint32_t minusone = -1; struct tmem_handle *h; @@ -238,10 +238,7 @@ int xc_tmem_save(xc_interface *xch, weight = xc_tmem_control(xch,0,XEN_SYSCTL_TMEM_OP_SAVE_GET_CLIENT_WEIGHT,dom,0,0,NULL); if ( write_exact(io_fd, , sizeof(weight)) ) return -1; -cap = xc_tmem_control(xch,0,XEN_SYSCTL_TMEM_OP_SAVE_GET_CLIENT_CAP,dom,0,0,NULL); -if ( write_exact(io_fd, , sizeof(cap)) ) -return -1; -if ( flags == -1 || weight == -1 || cap == -1 ) +if ( flags == -1 || weight == -1 ) return -1; if ( write_exact(io_fd, , sizeof(minusone)) ) return -1; @@ -384,7 +381,7 @@ int xc_tmem_restore(xc_interface *xch, int dom, int io_fd) uint32_t this_max_pools, this_version; uint32_t pool_id; uint32_t minusone; -uint32_t weight, cap, flags; +uint32_t weight, flags; int checksum = 0; if ( read_exact(io_fd, _version, sizeof(this_version)) ) @@ -410,10 +407,6 @@ int xc_tmem_restore(xc_interface *xch, int dom, int io_fd) return -1; if ( xc_tmem_control(xch,0,XEN_SYSCTL_TMEM_OP_SET_WEIGHT,dom,0,0,NULL) < 0 ) return -1; -if ( read_exact(io_fd, , sizeof(cap)) ) -return -1; -if ( xc_tmem_control(xch,0,XEN_SYSCTL_TMEM_OP_SET_CAP,dom,0,0,NULL) < 0 ) -return -1; if ( read_exact(io_fd, , sizeof(minusone)) ) return -1; while ( read_exact(io_fd, _id, sizeof(pool_id)) == 0 && pool_id != -1 ) diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c index 9bf1f4c..ab44e3c 100644 --- a/tools/libxl/libxl.c +++ b/tools/libxl/libxl.c @@ -6156,8 +6156,6 @@ static int32_t tmem_setop_from_string(char *set_name) { if (!strcmp(set_name, "weight")) return XEN_SYSCTL_TMEM_OP_SET_WEIGHT; -else if (!strcmp(set_name, "cap")) -return XEN_SYSCTL_TMEM_OP_SET_CAP; else if (!strcmp(set_name, "compress")) return XEN_SYSCTL_TMEM_OP_SET_COMPRESS; else @@ -6171,7 +6169,7 @@ int libxl_tmem_set(libxl_ctx *ctx, uint32_t domid, char* name, uint32_t set) GC_INIT(ctx); if (subop == -1) { -LOGEV(ERROR, -1, "Invalid set, valid sets are "); +LOGEV(ERROR, -1, "Invalid set, valid sets are "); rc = ERROR_INVAL; goto out; } diff --git a/tools/libxl/xl_cmdtable.c b/tools/libxl/xl_cmdtable.c index 5a57342..588d5d9 100644 --- a/tools/libxl/xl_cmdtable.c +++ b/tools/libxl/xl_cmdtable.c @@ -407,7 +407,6 @@ struct cmd_spec cmd_table[] = { "[|-a] [-w[=WEIGHT]|-c[=CAP]|-p[=COMPRESS]]", " -a Operate on all tmem\n" " -w WEIGHT Weight (int)\n" - " -c CAP Cap (int)\n" " -p COMPRESSCompress (int)", }, { "tmem-shared-auth", diff --git a/tools/python/xen/lowlevel/xc/xc.c b/tools/python/xen/lowlevel/xc/xc.c index 8411789..287f590 100644 --- a/tools/python/xen/lowlevel/xc/xc.c +++ b/tools/python/xen/lowlevel/xc/xc.c @@ -1641,7 +1641,6 @@ static PyObject *pyxc_tmem_control(XcObject *self, case XEN_SYSCTL_TMEM_OP_FREEZE: case XEN_SYSCTL_TMEM_OP_DESTROY: case XEN_SYSCTL_TMEM_OP_SET_WEIGHT: -case XEN_SYSCTL_TMEM_OP_SET_CAP: case XEN_SYSCTL_TMEM_OP_SET_COMPRESS: default: break; diff --git a/xen/common/tmem_control.c
[Xen-devel] [PATCH v2 3/9] tmem: Delete deduplication (and tze) code.
Couple of reasons: - It can lead to security issues (see row-hammer, KSM and such attacks). - Code is quite complex. - Deduplication is good if the pages themselves are the same but that is hardly guaranteed. - We got some gains (if pages are deduped) but at the cost of making code less maintainable. - tze depends on deduplication code. As such, deleting it. Signed-off-by: Konrad Rzeszutek Wilk--- v1: First submission. v2: Squash " tmem: Wrap tmem dedup code with CONFIG_TMEM_DEDUP" and "tmem: Wrap tmem tze code with CONFIG_TMEM_TZE" --- docs/misc/xen-command-line.markdown | 6 - xen/common/tmem.c | 310 +--- xen/common/tmem_control.c | 7 - xen/common/tmem_xen.c | 28 xen/include/xen/tmem_xen.h | 115 - 5 files changed, 3 insertions(+), 463 deletions(-) diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown index 8ff57fa..cd9534b 100644 --- a/docs/misc/xen-command-line.markdown +++ b/docs/misc/xen-command-line.markdown @@ -1500,15 +1500,9 @@ pages) must also be specified via the tbuf\_size parameter. ### tmem\_compress > `= ` -### tmem\_dedup -> `= ` - ### tmem\_shared\_auth > `= ` -### tmem\_tze -> `= ` - ### tsc > `= unstable | skewed | stable:socket` diff --git a/xen/common/tmem.c b/xen/common/tmem.c index efaafc4..cf5271c 100644 --- a/xen/common/tmem.c +++ b/xen/common/tmem.c @@ -71,10 +71,7 @@ struct tmem_page_descriptor { pagesize_t size; /* 0 == PAGE_SIZE (pfp), -1 == data invalid, else compressed data (cdata). */ uint32_t index; -/* Must hold pcd_tree_rwlocks[firstbyte] to use pcd pointer/siblings. */ -uint16_t firstbyte; /* NON_SHAREABLE->pfp otherwise->pcd. */ bool_t eviction_attempted; /* CHANGE TO lifetimes? (settable). */ -struct list_head pcd_siblings; union { struct page_info *pfp; /* Page frame pointer. */ char *cdata; /* Compressed data. */ @@ -92,17 +89,11 @@ struct tmem_page_content_descriptor { union { struct page_info *pfp; /* Page frame pointer. */ char *cdata; /* If compression_enabled. */ -char *tze; /* If !compression_enabled, trailing zeroes eliminated. */ }; -struct list_head pgp_list; -struct rb_node pcd_rb_tree_node; -uint32_t pgp_ref_count; pagesize_t size; /* If compression_enabled -> 0 firstbyte; -struct tmem_page_content_descriptor *pcd; -int ret; - -ASSERT(tmem_dedup_enabled()); -read_lock(_tree_rwlocks[firstbyte]); -pcd = pgp->pcd; -if ( pgp->size < PAGE_SIZE && pgp->size != 0 && - pcd->size < PAGE_SIZE && pcd->size != 0 ) -ret = tmem_decompress_to_client(cmfn, pcd->cdata, pcd->size, - tmem_cli_buf_null); -else if ( tmem_tze_enabled() && pcd->size < PAGE_SIZE ) -ret = tmem_copy_tze_to_client(cmfn, pcd->tze, pcd->size); -else -ret = tmem_copy_to_client(cmfn, pcd->pfp, tmem_cli_buf_null); -read_unlock(_tree_rwlocks[firstbyte]); -return ret; -} - -/* - * Ensure pgp no longer points to pcd, nor vice-versa. - * Take pcd rwlock unless have_pcd_rwlock is set, always unlock when done. - */ -static void pcd_disassociate(struct tmem_page_descriptor *pgp, struct tmem_pool *pool, bool_t have_pcd_rwlock) -{ -struct tmem_page_content_descriptor *pcd = pgp->pcd; -struct page_info *pfp = pgp->pcd->pfp; -uint16_t firstbyte = pgp->firstbyte; -char *pcd_tze = pgp->pcd->tze; -pagesize_t pcd_size = pcd->size; -pagesize_t pgp_size = pgp->size; -char *pcd_cdata = pgp->pcd->cdata; -pagesize_t pcd_csize = pgp->pcd->size; - -ASSERT(tmem_dedup_enabled()); -ASSERT(firstbyte != NOT_SHAREABLE); -ASSERT(firstbyte < 256); - -if ( have_pcd_rwlock ) -ASSERT_WRITELOCK(_tree_rwlocks[firstbyte]); -else -write_lock(_tree_rwlocks[firstbyte]); -list_del_init(>pcd_siblings); -pgp->pcd = NULL; -pgp->firstbyte = NOT_SHAREABLE; -pgp->size = -1; -if ( --pcd->pgp_ref_count ) -{ -write_unlock(_tree_rwlocks[firstbyte]); -return; -} - -/* No more references to this pcd, recycle it and the physical page. */ -ASSERT(list_empty(>pgp_list)); -pcd->pfp = NULL; -/* Remove pcd from rbtree. */ -rb_erase(>pcd_rb_tree_node,_tree_roots[firstbyte]); -/* Reinit the struct for safety for now. */ -
[Xen-devel] [PATCH v2 6/9] tmem/libxc: Squash XEN_SYSCTL_TMEM_OP_[SET|SAVE]..
Specifically: XEN_SYSCTL_TMEM_OP_SET_[WEIGHT,COMPRESS] are now done via: XEN_SYSCTL_TMEM_SET_CLIENT_INFO and XEN_SYSCTL_TMEM_OP_SAVE_GET_[VERSION,MAXPOOLS, CLIENT_WEIGHT, CLIENT_FLAGS] can now be retrieved via: XEN_SYSCTL_TMEM_GET_CLIENT_INFO All this information is now in 'struct xen_tmem_client' and that is what we pass around. We also rev up the XEN_SYSCTL_INTERFACE_VERSION as we are re-using the value number of the deleted ones (and henceforth the information is retrieved differently). On the toolstack, prior to this patch, the xc_tmem_control would use the bounce buffer only when arg1 was set and the cmd was to list. With the 'XEN_SYSCTL_TMEM_OP_SET_[WEIGHT|COMPRESS]' that made sense as the 'arg1' would have the value. However for the other ones (say XEN_SYSCTL_TMEM_OP_SAVE_GET_POOL_UUID) the 'arg1' would be the length of the 'buf'. If this confusing don't despair, patch patch titled: tmem/xc_tmem_control: Rename 'arg1' to 'len' and 'arg2' to arg. takes care of that. The acute reader of the toolstack code will discover that we only used the bounce buffer for LIST, not for any other subcommands that used 'buf'!?! Which means that the contents of 'buf' would never be copied back to the calleer 'buf'! The author is not sure how this could possibly work, perhaps Xen 4.1 (when this was introduced) was more relaxed about the bounce buffer being enabled. Anyhow this fixes xc_tmem_control to do it for any subcommand that has 'arg1'. Lastly some of the checks in xc_tmem_[restore|save] are removed as they can't ever be reached (not even sure how they could have been reached in the original submission). One of them is the check for the weight against -1 when in fact the hypervisor would never have provided that value. Now the checks are simple - as the hypercall always returns ->version and ->maxpools (which is mirroring how it was done prior to this patch). But if one wants to check the if a guest has any tmem activity then the patch titled "tmem: Batch and squash XEN_SYSCTL_TMEM_OP_SAVE_GET_POOL_ [FLAGS,NPAGES,UUID] in one sub-call: XEN_SYSCTL_TMEM_OP_GET_POOLS." adds an ->nr_pools to check for that. Also we add the check for ->version and ->maxpools and remove the TODO. Signed-off-by: Konrad Rzeszutek Wilk--- Cc: Ian Jackson Cc: Wei Liu v1: New submission. v2: s/xen_sysctl_tmem_client/xen_tmem_client/ Remove the 'subop' check in __tmemc_set_var as there is only one command doing it now. - Add the 'version' and 'maxpools' in the 'xen_tmem_client_t' - Squash in 'tmem: Check version and maxpools when XEN_SYSCTL_TMEM_SET_CLIENT_INFO' - Squash in 'tmem: Handle 'struct tmem_info' as a seperate field in the' with some of its content moved to 'tmem/sysctl: Add union in struct xen_sysctl_tmem_op' - Redid the commit description to include comments from: 'tmem: Handle 'struct tmem_info' as a sep..' --- tools/libxc/xc_tmem.c | 73 +++ tools/libxl/libxl.c | 26 +++--- tools/python/xen/lowlevel/xc/xc.c | 2 - xen/common/tmem.c | 2 + xen/common/tmem_control.c | 102 ++ xen/include/public/sysctl.h | 23 ++--- xen/include/xen/tmem_xen.h| 2 +- 7 files changed, 126 insertions(+), 104 deletions(-) diff --git a/tools/libxc/xc_tmem.c b/tools/libxc/xc_tmem.c index e1de16e..7c06841 100644 --- a/tools/libxc/xc_tmem.c +++ b/tools/libxc/xc_tmem.c @@ -18,6 +18,7 @@ */ #include "xc_private.h" +#include #include static int do_tmem_op(xc_interface *xch, tmem_op_t *op) @@ -67,7 +68,12 @@ int xc_tmem_control(xc_interface *xch, sysctl.u.tmem_op.oid.oid[1] = 0; sysctl.u.tmem_op.oid.oid[2] = 0; -if ( cmd == XEN_SYSCTL_TMEM_OP_LIST && arg1 != 0 ) +if ( cmd == XEN_SYSCTL_TMEM_OP_SET_CLIENT_INFO ) +{ +HYPERCALL_BOUNCE_SET_DIR(buf, XC_HYPERCALL_BUFFER_BOUNCE_IN); +} + +if ( arg1 ) { if ( buf == NULL ) { @@ -85,8 +91,8 @@ int xc_tmem_control(xc_interface *xch, rc = do_sysctl(xch, ); -if ( cmd == XEN_SYSCTL_TMEM_OP_LIST && arg1 != 0 ) -xc_hypercall_bounce_post(xch, buf); +if ( arg1 ) +xc_hypercall_bounce_post(xch, buf); return rc; } @@ -211,38 +217,29 @@ int xc_tmem_save(xc_interface *xch, { int marker = field_marker; int i, j; -uint32_t max_pools, version; -uint32_t weight, flags; -uint32_t pool_id; +uint32_t flags; uint32_t minusone = -1; +uint32_t pool_id; struct tmem_handle *h; +xen_tmem_client_t info; if ( xc_tmem_control(xch,0,XEN_SYSCTL_TMEM_OP_SAVE_BEGIN,dom,live,0,NULL) <= 0 ) return 0; if ( write_exact(io_fd, , sizeof(marker)) ) return -1; -version = xc_tmem_control(xch,0,XEN_SYSCTL_TMEM_OP_SAVE_GET_VERSION,0,0,0,NULL); -if ( write_exact(io_fd, , sizeof(version)) ) -
[Xen-devel] [PATCH v2 7/9] tmem: Unify XEN_SYSCTL_TMEM_OP_[[SAVE_[BEGIN|END]|RESTORE_BEGIN]
return values. For success they used to be 1 ([SAVE,RESTORE]_BEGIN), 0 if guest did not have any tmem (but only for SAVE_BEGIN), and -1 for any type of failure. And SAVE_END (which you would think would mirror SAVE_BEGIN) had 0 for success and -1 if guest did not any tmem enabled for it. This is confusing. Now the code will return 0 if the operation was success. Various XEN_EXX values are returned if tmem is not enabled or the operation could not performed. The xc_tmem.c code only needs one place to check - where we use SAVE_BEGIN. The place where RESTORE_BEGIN is used will have errno with the proper error value and return will be -1, so will still fail properly. Acked-by: Wei LiuSigned-off-by: Konrad Rzeszutek Wilk --- Cc: Ian Jackson Cc: Wei Liu v1: First submission v2: Rework return value for XEN_SYSCTL_TMEM_OP_RESTORE_BEGIN if couldn't allocate tmem components. Added Wei's Ack. --- tools/libxc/xc_tmem.c | 14 +++--- xen/common/tmem.c | 17 + 2 files changed, 20 insertions(+), 11 deletions(-) diff --git a/tools/libxc/xc_tmem.c b/tools/libxc/xc_tmem.c index 7c06841..6c7ad93 100644 --- a/tools/libxc/xc_tmem.c +++ b/tools/libxc/xc_tmem.c @@ -216,15 +216,23 @@ int xc_tmem_save(xc_interface *xch, int dom, int io_fd, int live, int field_marker) { int marker = field_marker; -int i, j; +int i, j, rc; uint32_t flags; uint32_t minusone = -1; uint32_t pool_id; struct tmem_handle *h; xen_tmem_client_t info; -if ( xc_tmem_control(xch,0,XEN_SYSCTL_TMEM_OP_SAVE_BEGIN,dom,live,0,NULL) <= 0 ) -return 0; +rc = xc_tmem_control(xch, 0, XEN_SYSCTL_TMEM_OP_SAVE_BEGIN, + dom, live, 0, NULL); +if ( rc ) +{ +/* Nothing to save - no tmem enabled. */ +if ( errno == ENOENT ) +return 0; + +return rc; +} if ( write_exact(io_fd, , sizeof(marker)) ) return -1; diff --git a/xen/common/tmem.c b/xen/common/tmem.c index ab354b6..510d11c 100644 --- a/xen/common/tmem.c +++ b/xen/common/tmem.c @@ -1656,30 +1656,31 @@ static int tmemc_save_subop(int cli_id, uint32_t pool_id, struct client *client = tmem_client_from_cli_id(cli_id); uint32_t p; struct tmem_page_descriptor *pgp, *pgp2; -int rc = -1; +int rc = -ENOENT; switch(subop) { case XEN_SYSCTL_TMEM_OP_SAVE_BEGIN: if ( client == NULL ) -return 0; +break; for (p = 0; p < MAX_POOLS_PER_DOMAIN; p++) if ( client->pools[p] != NULL ) break; + if ( p == MAX_POOLS_PER_DOMAIN ) -{ -rc = 0; break; -} + client->was_frozen = client->info.flags.u.frozen; client->info.flags.u.frozen = 1; if ( arg1 != 0 ) client->info.flags.u.migrating = 1; -rc = 1; +rc = 0; break; case XEN_SYSCTL_TMEM_OP_RESTORE_BEGIN: -if ( client == NULL && (client = client_create(cli_id)) != NULL ) -return 1; +if ( client == NULL ) +rc = client_create(cli_id) ? 0 : -ENOMEM; +else +rc = -EEXIST; break; case XEN_SYSCTL_TMEM_OP_SAVE_END: if ( client == NULL ) -- 2.4.11 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH v2 8/9] tmem/xc_tmem_control: Rename 'arg1' to 'len' and 'arg2' to arg.
That is what they are used for. Lets make it more clear. Of all the various sub-commands, the only one that needed semantic change is XEN_SYSCTL_TMEM_OP_SAVE_BEGIN. That in the past used 'arg1', and now we are moving it to use 'arg'. Since that code is only used during migration which is tied to the toolstack it is OK to change it. We should increment the XEN_SYSCTL_INTERFACE_VERSION because of this, and that was fortunatly done in the patch titled: "tmem/libxc: Squash XEN_SYSCTL_TMEM_OP_[SET|SAVE].." While at it, also fix xc_tmem_control_oid to properly handle the 'buf' and bounce it as appropiate. Acked-by: Wei LiuSigned-off-by: Konrad Rzeszutek Wilk --- Cc: Ian Jackson Cc: Wei Liu v1: First submission. v2: Added Wei's Ack. Rebased on "tmem/sysctl: Add union in struct xen_sysctl_tmem_op" Expand the commit description to talk about XEN_SYSCTL_INTERFACE_VERSION. --- tools/libxc/include/xenctrl.h | 4 ++-- tools/libxc/xc_tmem.c | 37 ++--- tools/libxl/libxl.c | 4 ++-- tools/python/xen/lowlevel/xc/xc.c | 16 xen/common/tmem.c | 16 xen/common/tmem_control.c | 8 xen/include/public/sysctl.h | 4 ++-- 7 files changed, 44 insertions(+), 45 deletions(-) diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h index d35f171..2c83544 100644 --- a/tools/libxc/include/xenctrl.h +++ b/tools/libxc/include/xenctrl.h @@ -2067,11 +2067,11 @@ int xc_disable_turbo(xc_interface *xch, int cpuid); */ int xc_tmem_control_oid(xc_interface *xch, int32_t pool_id, uint32_t subop, -uint32_t cli_id, uint32_t arg1, uint32_t arg2, +uint32_t cli_id, uint32_t len, uint32_t arg, struct xen_tmem_oid oid, void *buf); int xc_tmem_control(xc_interface *xch, int32_t pool_id, uint32_t subop, uint32_t cli_id, -uint32_t arg1, uint32_t arg2, void *buf); +uint32_t len, uint32_t arg, void *buf); int xc_tmem_auth(xc_interface *xch, int cli_id, char *uuid_str, int arg1); int xc_tmem_save(xc_interface *xch, int dom, int live, int fd, int field_marker); int xc_tmem_save_extra(xc_interface *xch, int dom, int fd, int field_marker); diff --git a/tools/libxc/xc_tmem.c b/tools/libxc/xc_tmem.c index 6c7ad93..3b66d10 100644 --- a/tools/libxc/xc_tmem.c +++ b/tools/libxc/xc_tmem.c @@ -49,20 +49,20 @@ int xc_tmem_control(xc_interface *xch, int32_t pool_id, uint32_t cmd, uint32_t cli_id, -uint32_t arg1, -uint32_t arg2, +uint32_t len, +uint32_t arg, void *buf) { DECLARE_SYSCTL; -DECLARE_HYPERCALL_BOUNCE(buf, arg1, XC_HYPERCALL_BUFFER_BOUNCE_OUT); +DECLARE_HYPERCALL_BOUNCE(buf, len, XC_HYPERCALL_BUFFER_BOUNCE_OUT); int rc; sysctl.cmd = XEN_SYSCTL_tmem_op; sysctl.u.tmem_op.pool_id = pool_id; sysctl.u.tmem_op.cmd = cmd; sysctl.u.tmem_op.cli_id = cli_id; -sysctl.u.tmem_op.arg1 = arg1; -sysctl.u.tmem_op.arg2 = arg2; +sysctl.u.tmem_op.len = len; +sysctl.u.tmem_op.arg = arg; sysctl.u.tmem_op.pad = 0; sysctl.u.tmem_op.oid.oid[0] = 0; sysctl.u.tmem_op.oid.oid[1] = 0; @@ -72,8 +72,7 @@ int xc_tmem_control(xc_interface *xch, { HYPERCALL_BOUNCE_SET_DIR(buf, XC_HYPERCALL_BUFFER_BOUNCE_IN); } - -if ( arg1 ) +if ( len ) { if ( buf == NULL ) { @@ -91,7 +90,7 @@ int xc_tmem_control(xc_interface *xch, rc = do_sysctl(xch, ); -if ( arg1 ) +if ( len ) xc_hypercall_bounce_post(xch, buf); return rc; @@ -101,25 +100,25 @@ int xc_tmem_control_oid(xc_interface *xch, int32_t pool_id, uint32_t cmd, uint32_t cli_id, -uint32_t arg1, -uint32_t arg2, +uint32_t len, +uint32_t arg, struct xen_tmem_oid oid, void *buf) { DECLARE_SYSCTL; -DECLARE_HYPERCALL_BOUNCE(buf, arg1, XC_HYPERCALL_BUFFER_BOUNCE_OUT); +DECLARE_HYPERCALL_BOUNCE(buf, len, XC_HYPERCALL_BUFFER_BOUNCE_OUT); int rc; sysctl.cmd = XEN_SYSCTL_tmem_op; sysctl.u.tmem_op.pool_id = pool_id; sysctl.u.tmem_op.cmd = cmd; sysctl.u.tmem_op.cli_id = cli_id; -sysctl.u.tmem_op.arg1 = arg1; -sysctl.u.tmem_op.arg2 = arg2; +sysctl.u.tmem_op.len = len; +sysctl.u.tmem_op.arg = arg; sysctl.u.tmem_op.pad = 0; sysctl.u.tmem_op.oid = oid; -if ( cmd == XEN_SYSCTL_TMEM_OP_LIST && arg1 != 0 ) +if ( len ) { if ( buf == NULL )
[Xen-devel] [PATCH v2] Tmem cleanups/improvements for v4.8.
Hey! Since v1: [https://lists.xen.org/archives/html/xen-devel/2016-09/msg03035.html] - Squashes some patches - Added Wei's Acked-by - Reworked some patches. This batch of fixes slowly marches toward ripping out pieces of code in tmem that are hard to maintain and improve on code that was orginacally developed. I had hoped that I would have had the migration support all working, but it took longer than I thought to get to this point. And it may have become a giant series too. Anyhow please take a peek at the patches. The first couple of them should be fairly easy. The rest are more of squashing various subcommands in one. Thanks! The git tree with these patches is: git://xenbits.xen.org/people/konradwilk/xen.git tmem.v4.8.v2 docs/man/xl.pod.1.in| 4 - docs/misc/xen-command-line.markdown | 6 - tools/libxc/include/xenctrl.h | 4 +- tools/libxc/xc_tmem.c | 272 - tools/libxl/libxl.c | 30 ++- tools/libxl/xl_cmdtable.c | 1 - tools/python/xen/lowlevel/xc/xc.c | 19 +- xen/common/tmem.c | 386 xen/common/tmem_control.c | 211 +++- xen/common/tmem_xen.c | 28 --- xen/include/public/sysctl.h | 76 +-- xen/include/xen/tmem_xen.h | 121 +-- 12 files changed, 388 insertions(+), 770 deletions(-) 382 lines of code deleted!!! WOOHOO Konrad Rzeszutek Wilk (9): libxc/tmem/restore: Remove call to XEN_SYSCTL_TMEM_OP_SAVE_GET_VERSION tmem: Retire XEN_SYSCTL_TMEM_OP_[SET_CAP|SAVE_GET_CLIENT_CAP] tmem: Delete deduplication (and tze) code. tmem: Move client weight, frozen, live_migrating, and compress tmem/sysctl: Add union in struct xen_sysctl_tmem_op tmem/libxc: Squash XEN_SYSCTL_TMEM_OP_[SET|SAVE].. tmem: Unify XEN_SYSCTL_TMEM_OP_[[SAVE_[BEGIN|END]|RESTORE_BEGIN] tmem/xc_tmem_control: Rename 'arg1' to 'len' and 'arg2' to arg. tmem: Batch and squash XEN_SYSCTL_TMEM_OP_SAVE_GET_POOL_[FLAGS,NPAGES,UUID] in one sub-call: XEN_SYSCTL_TMEM_OP_GET_POOLS. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [Resend PATCH 2/2] Xen/timer: Process softirq during dumping timer info
On Fri, Sep 30, 2016 at 10:19:06AM +0800, Lan Tianyu wrote: > Dumping timer info may run for a long time on the huge machine with > a lot of physical cpus. To avoid triggering NMI watchdog, add > process_pending_softirqs() in the loop of dumping timer info. Reviewed-by: Konrad Rzeszutek Wilk> > Signed-off-by: Lan Tianyu > --- > xen/common/timer.c |1 + > 1 files changed, 1 insertions(+), 0 deletions(-) > > diff --git a/xen/common/timer.c b/xen/common/timer.c > index 29a60a9..ab6bca0 100644 > --- a/xen/common/timer.c > +++ b/xen/common/timer.c > @@ -530,6 +530,7 @@ static void dump_timerq(unsigned char key) > { > ts = _cpu(timers, i); > > +process_pending_softirqs(); > printk("CPU%02d:\n", i); > spin_lock_irqsave(>lock, flags); > for ( j = 1; j <= GET_HEAP_SIZE(ts->heap); j++ ) > -- > 1.7.1 > ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [Patch] x86emul: simplify prefix handling for VMFUNC
On Wed, Sep 28, 2016 at 01:47:56AM -0600, Jan Beulich wrote: > >>> On 27.09.16 at 19:43,wrote: > > Finally found the vmfunc opcode page in Vol 3 30.3, VMX Instruction > > Reference. > > Agreed, there's no mention of prefixes, "pfx", on this page. It appears > > that the other VMX instructions in this section don't mention prefixes > > either. > > Looking at Table A-6 "Opcode Extensions for One- and Two-Byte Opcodes": > > vmcall, vmlaunch, vmresume, and vmxoff would need similar updating. > > Indeed. > > > I can ask for updating of the VMX instuctions to include mention of > > prefixes. > > Anything else as I gather requirements? > > I don't think so, but of course if would be nice to get confirmed (ahead > of the SDM update going public, which presumably will take some time) > that for all of those presence of said prefixes will cause #UD, as that > goes beyond stating that "they're reserved" (which is all the footnote > says). > > Jan > Got word that there's an effort to clean up the SDM in general, and this section in particular. Also, the word is that it'll be early next year (2017) before something comes out. No expectation was set of when an early draft would be available. -Paul ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [Resend PATCH 1/2] Xen/Keyhandler: Make keyhandler always run in tasklet
On Fri, Sep 30, 2016 at 10:19:05AM +0800, Lan Tianyu wrote: > Keyhandler may run for a long time in a timer handler on the large machine I am bit lost. You say 'timer handler' which will imply that there is some form of 'init_timer' and 'set_timer' that would call the handle_keypress function? But I am not seeing it? Or are you saying that when 'dump_timerq' is invoked? If so please say that. > with a lot of physical cpus(E,G keyhandler for dumping timer info) when serial s/E,G/e.g.g/ > port driver works in the poll mode. When timer interrupt arrives, timer > subsystem s/poll mode/poll mode (via the exception mechanism)/ > runs all timer handlers before programming next timer interrupt. So if timer > handler > runs longer than time for watchdog timeout, the timer handler of watchdog > will be Ah, so this is if a guest has set a timer and we are executing it. Or we have many of them to go through. > blocked to feed watchdog and xen hypervisor panics. This patch is to fix the > issue > via always scheduling a tasklet to run keyhandler to avoid timer handler > running > too long. You say "timer handler" again. But the timer handlers are executed via timer_softirq_action (which is a softirq, aka triggered by IPI). And the tasklet will mean that that it gets to be executed _after_ the do_softirq is done (as softirq.h puts the low numbered ones first, such as the TIMER_SOFTIRQ)? So what I think you are saying is that you do not want the 'timer_softirq_action' to be preempted by the 'dump_timerq' (or any other ones) which will trip the watchdog timeout. If that is the case please put something to that affect in the commit description. That begs one question that should be probably answered in the commit description: Why can't the dump_timerq or any other keyhandler poke the watchdog (expose nmi_timer_fn and call that?) > > Signed-off-by: Lan TianyuOtherwise the mechanical parts of the patch look good. > --- > xen/common/keyhandler.c |8 +--- > 1 files changed, 5 insertions(+), 3 deletions(-) > > diff --git a/xen/common/keyhandler.c b/xen/common/keyhandler.c > index 16de6e8..fce52d2 100644 > --- a/xen/common/keyhandler.c > +++ b/xen/common/keyhandler.c > @@ -75,7 +75,9 @@ static struct keyhandler { > > static void keypress_action(unsigned long unused) > { > -handle_keypress(keypress_key, NULL); > +console_start_log_everything(); > +key_table[keypress_key].fn(keypress_key); > +console_end_log_everything(); > } > > static DECLARE_TASKLET(keypress_tasklet, keypress_action, 0); > @@ -87,10 +89,10 @@ void handle_keypress(unsigned char key, struct > cpu_user_regs *regs) > if ( key >= ARRAY_SIZE(key_table) || !(h = _table[key])->fn ) > return; > > -if ( !in_irq() || h->irq_callback ) > +if ( h->irq_callback ) > { > console_start_log_everything(); > -h->irq_callback ? h->irq_fn(key, regs) : h->fn(key); > +h->irq_fn(key, regs); > console_end_log_everything(); > } > else > -- > 1.7.1 > ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 29/30] xen/x86: allow PVHv2 to perform foreign memory mappings
On 27/09/16 16:57, Roger Pau Monne wrote: > Signed-off-by: Roger Pau MonnéAcked-by: George Dunlap With one note: You might consider changing the summary to "Allow HVM hardware domains (PVHv2 dom0) to perform..." That makes it clear to someone casually scanning that you're really enabling all HVM domains to get to the next step. If we ever get rid of the hardware domain check, this might be important to people using XSM, for instance. -George ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 06/30] x86/paging: introduce paging_set_allocation
On 27/09/16 16:57, Roger Pau Monne wrote: > ... and remove hap_set_alloc_for_pvh_dom0. > > Signed-off-by: Roger Pau Monné> Acked-by: Tim Deegan Acked-by: George Dunlap > --- > Cc: Jan Beulich > Cc: Andrew Cooper > Cc: George Dunlap > Cc: Tim Deegan > --- > Changes since RFC: > - Make paging_set_allocation preemtable. > - Move comments. > --- > xen/arch/x86/domain_build.c | 17 + > xen/arch/x86/mm/hap/hap.c | 14 +- > xen/arch/x86/mm/paging.c| 16 > xen/arch/x86/mm/shadow/common.c | 7 +-- > xen/include/asm-x86/hap.h | 2 +- > xen/include/asm-x86/paging.h| 7 +++ > xen/include/asm-x86/shadow.h| 8 > 7 files changed, 47 insertions(+), 24 deletions(-) > > diff --git a/xen/arch/x86/domain_build.c b/xen/arch/x86/domain_build.c > index 0a02d65..04d6cb0 100644 > --- a/xen/arch/x86/domain_build.c > +++ b/xen/arch/x86/domain_build.c > @@ -35,7 +35,6 @@ > #include > #include /* for bzimage_parse */ > #include > -#include > #include > > #include > @@ -1383,15 +1382,25 @@ int __init construct_dom0( > nr_pages); > } > > -if ( is_pvh_domain(d) ) > -hap_set_alloc_for_pvh_dom0(d, dom0_paging_pages(d, nr_pages)); > - > /* > * We enable paging mode again so guest_physmap_add_page will do the > * right thing for us. > */ > d->arch.paging.mode = save_pvh_pg_mode; > > +if ( is_pvh_domain(d) ) > +{ > +int preempted; > + > +do { > +preempted = 0; > +paging_set_allocation(d, dom0_paging_pages(d, nr_pages), > + ); > +process_pending_softirqs(); > +} while ( preempted ); > +} > + > + > /* Write the phys->machine and machine->phys table entries. */ > for ( pfn = 0; pfn < count; pfn++ ) > { > diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c > index 2dc82f5..4420e4e 100644 > --- a/xen/arch/x86/mm/hap/hap.c > +++ b/xen/arch/x86/mm/hap/hap.c > @@ -334,7 +334,7 @@ hap_get_allocation(struct domain *d) > > /* Set the pool of pages to the required number of pages. > * Returns 0 for success, non-zero for failure. */ > -static int > +int > hap_set_allocation(struct domain *d, unsigned long pages, int *preempted) > { > struct page_info *pg; > @@ -640,18 +640,6 @@ int hap_domctl(struct domain *d, xen_domctl_shadow_op_t > *sc, > } > } > > -void __init hap_set_alloc_for_pvh_dom0(struct domain *d, > - unsigned long hap_pages) > -{ > -int rc; > - > -paging_lock(d); > -rc = hap_set_allocation(d, hap_pages, NULL); > -paging_unlock(d); > - > -BUG_ON(rc); > -} > - > static const struct paging_mode hap_paging_real_mode; > static const struct paging_mode hap_paging_protected_mode; > static const struct paging_mode hap_paging_pae_mode; > diff --git a/xen/arch/x86/mm/paging.c b/xen/arch/x86/mm/paging.c > index cc44682..2717bd3 100644 > --- a/xen/arch/x86/mm/paging.c > +++ b/xen/arch/x86/mm/paging.c > @@ -954,6 +954,22 @@ void paging_write_p2m_entry(struct p2m_domain *p2m, > unsigned long gfn, > safe_write_pte(p, new); > } > > +int paging_set_allocation(struct domain *d, unsigned long pages, int > *preempted) > +{ > +int rc; > + > +ASSERT(paging_mode_enabled(d)); > + > +paging_lock(d); > +if ( hap_enabled(d) ) > +rc = hap_set_allocation(d, pages, preempted); > +else > +rc = sh_set_allocation(d, pages, preempted); > +paging_unlock(d); > + > +return rc; > +} > + > /* > * Local variables: > * mode: C > diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c > index d3cc2cc..53ffe1a 100644 > --- a/xen/arch/x86/mm/shadow/common.c > +++ b/xen/arch/x86/mm/shadow/common.c > @@ -1609,12 +1609,7 @@ shadow_free_p2m_page(struct domain *d, struct > page_info *pg) > paging_unlock(d); > } > > -/* Set the pool of shadow pages to the required number of pages. > - * Input will be rounded up to at least shadow_min_acceptable_pages(), > - * plus space for the p2m table. > - * Returns 0 for success, non-zero for failure. */ > -static int sh_set_allocation(struct domain *d, unsigned long pages, > - int *preempted) > +int sh_set_allocation(struct domain *d, unsigned long pages, int *preempted) > { > struct page_info *sp; > unsigned int lower_bound; > diff --git a/xen/include/asm-x86/hap.h b/xen/include/asm-x86/hap.h > index c613836..9d59430 100644 > --- a/xen/include/asm-x86/hap.h > +++ b/xen/include/asm-x86/hap.h > @@ -46,7 +46,7 @@ int hap_track_dirty_vram(struct domain *d, > XEN_GUEST_HANDLE_64(uint8) dirty_bitmap); > > extern const struct paging_mode
Re: [Xen-devel] [PATCH v2 04/30] xen/x86: allow calling {sh/hap}_set_allocation with the idle domain
On 30/09/16 17:56, George Dunlap wrote: > On 27/09/16 16:56, Roger Pau Monne wrote: >> ... and using the "preempted" parameter. The solution relies on just calling >> softirq_pending if the current domain is the idle domain. >> >> Signed-off-by: Roger Pau Monné> > You probably also want to add something to the effect of: > > "This allows us to call *_set_allocation() when building domain 0." > > Someone doing archeology doesn't want to dig out this series to figure > out how it would be possible to call hap_set_allocation() while idle. :-) Oh, but when the comments have been addressed, you can add: Acked-by: George Dunlap ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 04/30] xen/x86: allow calling {sh/hap}_set_allocation with the idle domain
On 27/09/16 16:56, Roger Pau Monne wrote: > ... and using the "preempted" parameter. The solution relies on just calling > softirq_pending if the current domain is the idle domain. > > Signed-off-by: Roger Pau MonnéYou probably also want to add something to the effect of: "This allows us to call *_set_allocation() when building domain 0." Someone doing archeology doesn't want to dig out this series to figure out how it would be possible to call hap_set_allocation() while idle. :-) -George ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v1 08/12] tmem: Handle 'struct tmem_info' as a seperate field in the
On Fri, Sep 30, 2016 at 08:56:03AM -0600, Jan Beulich wrote: > >>> On 30.09.16 at 16:36,wrote: > > On Wed, Sep 28, 2016 at 06:56:40AM -0600, Jan Beulich wrote: > >> >>> On 28.09.16 at 11:42, wrote: > >> > Note: We still have to do this awkward 'guest_handle_cast' > >> > otherwise it will not compile on ARM - which defines _two_ > >> > of these macros (__guest_handle_64_xen_sysctl_tmem_client_t > >> > and __guest_handle_xen_sysctl_tmem_client_t). And if cast is > >> > not used then a compile error comes up as we use the wrong one. > >> > >> This seems suspicious, but it's hard to judge without knowing what > >> exactly the errors were. > > > > tmem_control.c: In function ‘tmem_control’: > > tmem_control.c:426:9: error: incompatible type for argument 2 of > > ‘tmemc_set_client_info’ > > ret = tmemc_set_client_info(op->cli_id, op->u.client); > > ^ > > tmem_control.c:302:12: note: expected > > ‘__guest_handle_xen_sysctl_tmem_client_t’ but argument is of type > > ‘__guest_handle_64_xen_sysctl_tmem_client_t’ > > static int tmemc_set_client_info(domid_t cli_id, > > ^ > > Looks like you want to pass around a 64-bit handle then? Which I found is XEN_GUEST_HANDLE, not XEN_GUEST_HANDLE_PARAM. .. snip.. > >> > if ( copy_from_guest(, buf, 1) ) > >> > return -EFAULT; > >> > >> The adjustments above look pretty unrelated to the purpose of the > >> patch, but well - you're the maintainer of this code. > > > > > > Don't be that harsh to yourself. :-) > > Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 03/30] xen/x86: fix parameters and return value of *_set_allocation functions
On 27/09/16 16:56, Roger Pau Monne wrote: > Return should be an int, and the number of pages should be an unsigned long. > > Signed-off-by: Roger Pau MonnéNon-shadow mm bits: Acked-by: George Dunlap > --- > Cc: George Dunlap > Cc: Jan Beulich > Cc: Andrew Cooper > Cc: Tim Deegan > --- > xen/arch/x86/mm/hap/hap.c | 6 +++--- > xen/arch/x86/mm/shadow/common.c | 7 +++ > xen/include/asm-x86/domain.h| 12 ++-- > 3 files changed, 12 insertions(+), 13 deletions(-) > > diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c > index 3218fa2..b6d2c61 100644 > --- a/xen/arch/x86/mm/hap/hap.c > +++ b/xen/arch/x86/mm/hap/hap.c > @@ -325,7 +325,7 @@ static void hap_free_p2m_page(struct domain *d, struct > page_info *pg) > static unsigned int > hap_get_allocation(struct domain *d) > { > -unsigned int pg = d->arch.paging.hap.total_pages > +unsigned long pg = d->arch.paging.hap.total_pages > + d->arch.paging.hap.p2m_pages; > > return ((pg >> (20 - PAGE_SHIFT)) > @@ -334,8 +334,8 @@ hap_get_allocation(struct domain *d) > > /* Set the pool of pages to the required number of pages. > * Returns 0 for success, non-zero for failure. */ > -static unsigned int > -hap_set_allocation(struct domain *d, unsigned int pages, int *preempted) > +static int > +hap_set_allocation(struct domain *d, unsigned long pages, int *preempted) > { > struct page_info *pg; > > diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c > index 21607bf..d3cc2cc 100644 > --- a/xen/arch/x86/mm/shadow/common.c > +++ b/xen/arch/x86/mm/shadow/common.c > @@ -1613,9 +1613,8 @@ shadow_free_p2m_page(struct domain *d, struct page_info > *pg) > * Input will be rounded up to at least shadow_min_acceptable_pages(), > * plus space for the p2m table. > * Returns 0 for success, non-zero for failure. */ > -static unsigned int sh_set_allocation(struct domain *d, > - unsigned int pages, > - int *preempted) > +static int sh_set_allocation(struct domain *d, unsigned long pages, > + int *preempted) > { > struct page_info *sp; > unsigned int lower_bound; > @@ -1692,7 +1691,7 @@ static unsigned int sh_set_allocation(struct domain *d, > /* Return the size of the shadow pool, rounded up to the nearest MB */ > static unsigned int shadow_get_allocation(struct domain *d) > { > -unsigned int pg = d->arch.paging.shadow.total_pages > +unsigned long pg = d->arch.paging.shadow.total_pages > + d->arch.paging.shadow.p2m_pages; > return ((pg >> (20 - PAGE_SHIFT)) > + ((pg & ((1 << (20 - PAGE_SHIFT)) - 1)) ? 1 : 0)); > diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h > index 5807a1f..11ac2a5 100644 > --- a/xen/include/asm-x86/domain.h > +++ b/xen/include/asm-x86/domain.h > @@ -93,9 +93,9 @@ struct shadow_domain { > > /* Memory allocation */ > struct page_list_head freelist; > -unsigned int total_pages; /* number of pages allocated */ > -unsigned int free_pages; /* number of pages on freelists */ > -unsigned int p2m_pages;/* number of pages allocates to p2m */ > +unsigned long total_pages; /* number of pages allocated */ > +unsigned long free_pages; /* number of pages on freelists */ > +unsigned long p2m_pages;/* number of pages allocates to p2m */ > > /* 1-to-1 map for use when HVM vcpus have paging disabled */ > pagetable_t unpaged_pagetable; > @@ -155,9 +155,9 @@ struct shadow_vcpu { > // > struct hap_domain { > struct page_list_head freelist; > -unsigned int total_pages; /* number of pages allocated */ > -unsigned int free_pages; /* number of pages on freelists */ > -unsigned int p2m_pages;/* number of pages allocates to p2m */ > +unsigned long total_pages; /* number of pages allocated */ > +unsigned long free_pages; /* number of pages on freelists */ > +unsigned long p2m_pages;/* number of pages allocates to p2m */ > }; > > // > ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [xen-unstable-smoke test] 101226: tolerable all pass - PUSHED
flight 101226 xen-unstable-smoke real [real] http://logs.test-lab.xenproject.org/osstest/logs/101226/ Failures :-/ but no regressions. Tests which did not succeed, but are not blocking: test-amd64-amd64-libvirt 12 migrate-support-checkfail never pass test-armhf-armhf-xl 12 migrate-support-checkfail never pass test-armhf-armhf-xl 13 saverestore-support-checkfail never pass version targeted for testing: xen 47bd00674201bb1a030d4eb604b074a156b1f4d5 baseline version: xen 933acddf74213f77a5adbb2c10ccd3f72634d456 Last test of basis 101224 2016-09-30 12:03:47 Z0 days Testing same since 101226 2016-09-30 14:02:55 Z0 days1 attempts People who touched revisions under test: Dario FaggioliGeorge Dunlap Ian Jackson Jan Beulich Mihai DonÈu Zhi Wang jobs: build-amd64 pass build-armhf pass build-amd64-libvirt pass test-armhf-armhf-xl pass test-amd64-amd64-xl-qemuu-debianhvm-i386 pass test-amd64-amd64-libvirt pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : + branch=xen-unstable-smoke + revision=47bd00674201bb1a030d4eb604b074a156b1f4d5 + . ./cri-lock-repos ++ . ./cri-common +++ . ./cri-getconfig +++ umask 002 +++ getrepos getconfig Repos perl -e ' use Osstest; readglobalconfig(); print $c{"Repos"} or die $!; ' +++ local repos=/home/osstest/repos +++ '[' -z /home/osstest/repos ']' +++ '[' '!' -d /home/osstest/repos ']' +++ echo /home/osstest/repos ++ repos=/home/osstest/repos ++ repos_lock=/home/osstest/repos/lock ++ '[' x '!=' x/home/osstest/repos/lock ']' ++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock ++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push xen-unstable-smoke 47bd00674201bb1a030d4eb604b074a156b1f4d5 + branch=xen-unstable-smoke + revision=47bd00674201bb1a030d4eb604b074a156b1f4d5 + . ./cri-lock-repos ++ . ./cri-common +++ . ./cri-getconfig +++ umask 002 +++ getrepos getconfig Repos perl -e ' use Osstest; readglobalconfig(); print $c{"Repos"} or die $!; ' +++ local repos=/home/osstest/repos +++ '[' -z /home/osstest/repos ']' +++ '[' '!' -d /home/osstest/repos ']' +++ echo /home/osstest/repos ++ repos=/home/osstest/repos ++ repos_lock=/home/osstest/repos/lock ++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']' + . ./cri-common ++ . ./cri-getconfig ++ umask 002 + select_xenbranch + case "$branch" in + tree=xen + xenbranch=xen-unstable-smoke + qemuubranch=qemu-upstream-unstable + '[' xxen = xlinux ']' + linuxbranch= + '[' xqemu-upstream-unstable = x ']' + select_prevxenbranch ++ ./cri-getprevxenbranch xen-unstable-smoke + prevxenbranch=xen-4.7-testing + '[' x47bd00674201bb1a030d4eb604b074a156b1f4d5 = x ']' + : tested/2.6.39.x + . ./ap-common ++ : osst...@xenbits.xen.org +++ getconfig OsstestUpstream +++ perl -e ' use Osstest; readglobalconfig(); print $c{"OsstestUpstream"} or die $!; ' ++ : ++ : git://xenbits.xen.org/xen.git ++ : osst...@xenbits.xen.org:/home/xen/git/xen.git ++ : git://xenbits.xen.org/qemu-xen-traditional.git ++ : git://git.kernel.org ++ : git://git.kernel.org/pub/scm/linux/kernel/git ++ : git ++ : git://xenbits.xen.org/xtf.git ++ : osst...@xenbits.xen.org:/home/xen/git/xtf.git ++ : git://xenbits.xen.org/xtf.git ++ : git://xenbits.xen.org/libvirt.git ++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git ++ : git://xenbits.xen.org/libvirt.git ++ : git://xenbits.xen.org/osstest/rumprun.git ++ : git ++ : git://xenbits.xen.org/osstest/rumprun.git ++ : osst...@xenbits.xen.org:/home/xen/git/osstest/rumprun.git ++ : git://git.seabios.org/seabios.git ++ : osst...@xenbits.xen.org:/home/xen/git/osstest/seabios.git ++ : git://xenbits.xen.org/osstest/seabios.git ++ : https://github.com/tianocore/edk2.git ++ :
Re: [Xen-devel] [PATCH] trace: Fix incorrect number of pages used for trace metadata
On 30/09/16 17:04, Igor Druzhinin wrote: > On 30/09/16 15:46, George Dunlap wrote: >> On 29/09/16 14:53, Igor Druzhinin wrote: >>> As long as t_info_first_offset is calculated in uint32_t offsets we >>> need to >>> multiply it by sizeof(uint32_t) in order to get the right number of >>> pages >>> for trace metadata. Not doing that makes it impossible to read the trace >>> buffer correctly from userspace for some corner cases. >>> >>> Signed-off-by: Igor Druzhinin>> >> Hmm, looks like we actually want to revert this c/s fbf96e6, "xentrace: >> correct formula to calculate t_info_pages". But that one was presumably >> written (and Acked by me) because the variable name there, >> t_info_first_offset, is confusing. >> >> The other mistake in fbf96e6 is that before t_info_words was actually >> denominated in words; but after it's denominated in bytes (which is >> again confusing). >> >> What about something like the attached instead? This should fix your >> problem while making the code clearer. >> >> -George >> >> > > Reviewed-by: Igor Druzhinin Thanks. Any chance I could get a Tested-by as well? :-) -George ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] trace: Fix incorrect number of pages used for trace metadata
On 30/09/16 15:46, George Dunlap wrote: On 29/09/16 14:53, Igor Druzhinin wrote: As long as t_info_first_offset is calculated in uint32_t offsets we need to multiply it by sizeof(uint32_t) in order to get the right number of pages for trace metadata. Not doing that makes it impossible to read the trace buffer correctly from userspace for some corner cases. Signed-off-by: Igor DruzhininHmm, looks like we actually want to revert this c/s fbf96e6, "xentrace: correct formula to calculate t_info_pages". But that one was presumably written (and Acked by me) because the variable name there, t_info_first_offset, is confusing. The other mistake in fbf96e6 is that before t_info_words was actually denominated in words; but after it's denominated in bytes (which is again confusing). What about something like the attached instead? This should fix your problem while making the code clearer. -George Reviewed-by: Igor Druzhinin ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 10/10] xl: allow to set the ratelimit value online for Credit2
Dario Faggioli writes ("Re: [Xen-devel] [PATCH v2 10/10] xl: allow to set the ratelimit value online for Credit2"): > I mean, AFAICT, there's xl calling a libxl function and printing an > error if it fails. The libxl function also logs an error --true-- but > this seems rather common to me, and apart from that, it looks > reasonable as well, isn't it so? It seems to me that we should have a coherent strategy for which code is responsible for making what kind of log messages / error messages under what circumstances. One objective of this would be to try to arrange that when approximately one thing is wrong, the user gets one (accurate and informative) message about it, not half a dozen. But the more important objective would be to arrange that errors do not occur without _something_ providing an explanation. We are quite a way away from having such a coherent strategy in libxl. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 10/10] xl: allow to set the ratelimit value online for Credit2
On Fri, 2016-09-30 at 11:34 +0100, Ian Jackson wrote: > Dario Faggioli writes ("[PATCH v2 10/10] xl: allow to set > theratelimit value online for Credit2"): > > > > +static int sched_credit2_params_set(int poolid, > > +libxl_sched_credit2_params > > *scinfo) > > Each of this and the corresponding _get calls the corresponding libxl > function, prints a message, and adjusts the return value. > > But: AFAICT each of these has only one call site; and each of the > underlying libxl functions logs an error too; and each of the call > sites prints a message too. > > So I question whether these functions are valuable. > So, about this last point (utility of sched_*_params_{get,set}() ), I totally agree. I think the reason why the functions exist is for the most part related to convenience and style. E.g., maybe indentation/nesting level was getting too deep within the various main_sched_*(). As said in another email, I don't especially like those functions, and I have the feeling they're a better way to write them, but I haven't really try. I'll put having a look at that in my todo list, but not at super high priority. :-P > At this stage of the release cycle, and the maturity of this series, > I > guess you probably don't want to start messing with the error paths. > Yeah, indeed, thanks for understanding! :-P Although... > The messages that will come out will be overly verbose rather than > inadequate, so the code as it is fine if not as good as it could be. > ...I'm not entirely sure I get what you're saying here. I mean, AFAICT, there's xl calling a libxl function and printing an error if it fails. The libxl function also logs an error --true-- but this seems rather common to me, and apart from that, it looks reasonable as well, isn't it so? Avoiding double/overly verbose logging from either within xl or within libxl is something I totally understand and agree. But when the 'border' is crossed, is it that terrible that both do log? After all, how can xl be sure that libxl will log something? And even if he could, I see the two as different kind and different level of logging. Or was it something else that you were referring to? > But it was a thing I noticed which I wanted to write down and/or have > your opinion on. > Sure! :-) Thanks and Regards, Dario -- <> (Raistlin Majere) - Dario Faggioli, Ph.D, http://about.me/dario.faggioli Senior Software Engineer, Citrix Systems R Ltd., Cambridge (UK) signature.asc Description: This is a digitally signed message part ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 15/30] xen/x86: populate PVHv2 Dom0 physical memory map
>>> On 27.09.16 at 17:57,wrote: > @@ -43,6 +44,11 @@ static long __initdata dom0_nrpages; > static long __initdata dom0_min_nrpages; > static long __initdata dom0_max_nrpages = LONG_MAX; > > +/* Size of the VM86 TSS for virtual 8086 mode to use. */ > +#define HVM_VM86_TSS_SIZE 128 > + > +static unsigned int __initdata hvm_mem_stats[MAX_ORDER + 1]; This is for your debugging only I suppose? > @@ -336,7 +343,8 @@ static unsigned long __init compute_dom0_nr_pages( > avail -= dom0_paging_pages(d, nr_pages); > } > > -if ( (parms->p2m_base == UNSET_ADDR) && (dom0_nrpages <= 0) && > +if ( is_pv_domain(d) && > + (parms->p2m_base == UNSET_ADDR) && (dom0_nrpages <= 0) && Perhaps better to simply force parms->p2m_base to UNSET_ADDR earlier on? > @@ -579,8 +588,19 @@ static __init void pvh_setup_e820(struct domain *d, > unsigned long nr_pages) > continue; > } > > -*entry_guest = *entry; > -pages = PFN_UP(entry_guest->size); > +/* > + * Make sure the start and length are aligned to PAGE_SIZE, because > + * that's the minimum granularity of the 2nd stage translation. > + */ > +start = ROUNDUP(entry->addr, PAGE_SIZE); > +end = (entry->addr + entry->size) & PAGE_MASK; > +if ( start >= end ) > +continue; > + > +entry_guest->type = E820_RAM; > +entry_guest->addr = start; > +entry_guest->size = end - start; > +pages = PFN_DOWN(entry_guest->size); > if ( (cur_pages + pages) > nr_pages ) > { > /* Truncate region */ > @@ -591,6 +611,8 @@ static __init void pvh_setup_e820(struct domain *d, > unsigned long nr_pages) > { > cur_pages += pages; > } > +ASSERT(IS_ALIGNED(entry_guest->addr, PAGE_SIZE) && > + IS_ALIGNED(entry_guest->size, PAGE_SIZE)); What does this guard against? Your addition arranges for things to be page aligned, and the only adjustment done until we get here is one that obviously also doesn't violate that requirement. I'm all for assertions when they check state which is not obviously right, but here I don't see the need. > @@ -1657,15 +1679,238 @@ out: > return rc; > } > > +/* Populate an HVM memory range using the biggest possible order. */ > +static void __init hvm_populate_memory_range(struct domain *d, uint64_t > start, > + uint64_t size) > +{ > +static unsigned int __initdata memflags = MEMF_no_dma|MEMF_exact_node; > +unsigned int order; > +struct page_info *page; > +int rc; > + > +ASSERT(IS_ALIGNED(size, PAGE_SIZE) && IS_ALIGNED(start, PAGE_SIZE)); > + > +order = MAX_ORDER; > +while ( size != 0 ) > +{ > +order = min(get_order_from_bytes_floor(size), order); > +page = alloc_domheap_pages(d, order, memflags); > +if ( page == NULL ) > +{ > +if ( order == 0 && memflags ) > +{ > +/* Try again without any memflags. */ > +memflags = 0; > +order = MAX_ORDER; > +continue; > +} > +if ( order == 0 ) > +panic("Unable to allocate memory with order 0!\n"); > +order--; > +continue; > +} Is it not possible to utilize alloc_chunk() here? > +hvm_mem_stats[order]++; > +rc = guest_physmap_add_page(d, _gfn(PFN_DOWN(start)), > +_mfn(page_to_mfn(page)), order); > +if ( rc != 0 ) > +panic("Failed to populate memory: [%" PRIx64 " - %" PRIx64 "] > %d\n", [,) please. > + start, start + (((uint64_t)1) << (order + PAGE_SHIFT)), > rc); > +start += ((uint64_t)1) << (order + PAGE_SHIFT); > +size -= ((uint64_t)1) << (order + PAGE_SHIFT); Please prefer 1ULL over (uint64_t)1. > +if ( (size & 0x) == 0 ) > +process_pending_softirqs(); That's 4Gb at a time - isn't that a little too much? > +} > + > +} Stray blank line. > +static int __init hvm_setup_vmx_unrestricted_guest(struct domain *d) > +{ > +struct e820entry *entry; > +p2m_type_t p2mt; > +uint32_t rc, *ident_pt; > +uint8_t *tss; > +mfn_t mfn; > +paddr_t gaddr = 0; > +int i; unsigned int > +/* > + * Stole some space from the last found RAM region. One page will be Steal > + * used for the identify page tables, and the remaining space for the identity > + * VM86 TSS. Note that after this not all e820 regions will be aligned > + * to PAGE_SIZE. > + */ > +for ( i = 1; i <= d->arch.nr_e820; i++ ) > +{ > +entry = >arch.e820[d->arch.nr_e820 - i]; > +if ( entry->type != E820_RAM || > + entry->size < PAGE_SIZE + HVM_VM86_TSS_SIZE ) > +continue; > + > +entry->size -= PAGE_SIZE +
Re: [Xen-devel] [PATCH] svm/emulate: remove duplicated const specifier
On 30/09/16 16:47, Wei Liu wrote: > Clang complains: > > emulate.c:65:3: error: duplicate 'const' declaration specifier > [-Werror,-Wduplicate-decl-specifier] > } const opc_tab[INSTR_MAX_COUNT] = { > ^ > > Remove that const to fix the issue. > > Signed-off-by: Wei LiuReviewed-by: Andrew Cooper > --- > xen/arch/x86/hvm/svm/emulate.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/xen/arch/x86/hvm/svm/emulate.c b/xen/arch/x86/hvm/svm/emulate.c > index 7f6401f..36585b0 100644 > --- a/xen/arch/x86/hvm/svm/emulate.c > +++ b/xen/arch/x86/hvm/svm/emulate.c > @@ -62,7 +62,7 @@ static const struct { > unsigned int mod:2; > #define MODRM(mod, reg, rm) { rm, reg, mod } > } modrm; > -} const opc_tab[INSTR_MAX_COUNT] = { > +} opc_tab[INSTR_MAX_COUNT] = { > [INSTR_PAUSE] = { X86EMUL_OPC_F3(0, 0x90) }, > [INSTR_INT3]= { X86EMUL_OPC( 0, 0xcc) }, > [INSTR_HLT] = { X86EMUL_OPC( 0, 0xf4) }, ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH] svm/emulate: remove duplicated const specifier
Clang complains: emulate.c:65:3: error: duplicate 'const' declaration specifier [-Werror,-Wduplicate-decl-specifier] } const opc_tab[INSTR_MAX_COUNT] = { ^ Remove that const to fix the issue. Signed-off-by: Wei Liu--- xen/arch/x86/hvm/svm/emulate.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/xen/arch/x86/hvm/svm/emulate.c b/xen/arch/x86/hvm/svm/emulate.c index 7f6401f..36585b0 100644 --- a/xen/arch/x86/hvm/svm/emulate.c +++ b/xen/arch/x86/hvm/svm/emulate.c @@ -62,7 +62,7 @@ static const struct { unsigned int mod:2; #define MODRM(mod, reg, rm) { rm, reg, mod } } modrm; -} const opc_tab[INSTR_MAX_COUNT] = { +} opc_tab[INSTR_MAX_COUNT] = { [INSTR_PAUSE] = { X86EMUL_OPC_F3(0, 0x90) }, [INSTR_INT3]= { X86EMUL_OPC( 0, 0xcc) }, [INSTR_HLT] = { X86EMUL_OPC( 0, 0xf4) }, -- 2.1.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 10/30] xen/x86: allow the emulated APICs to be enbled for the hardware domain
On Thu, Sep 29, 2016 at 08:26:04AM -0600, Jan Beulich wrote: > >>> On 27.09.16 at 17:57,wrote: > > +static bool emulation_flags_ok(const struct domain *d, uint32_t emflags) > > +{ > > + > > +if ( is_hvm_domain(d) ) > > +{ > > +if ( is_hardware_domain(d) && > > + emflags != > > (XEN_X86_EMU_PIT|XEN_X86_EMU_LAPIC|XEN_X86_EMU_IOAPIC)) > > +return false; > > +if ( !is_hardware_domain(d) && > > + emflags != XEN_X86_EMU_ALL && emflags != 0 ) > > +return false; > > +} > > +else > > +{ > > +/* PV or classic PVH. */ > > +if ( is_hardware_domain(d) && emflags != XEN_X86_EMU_PIT ) > > +return false; > > +if ( !is_hardware_domain(d) && emflags != 0 ) > > +return false; > > Previous code permitted XEN_X86_EMU_PIT also for the non-hardware > domains afaict. You shouldn't change behavior without saying so and > explaining why. Done. I'm not really sure if the emulated PIT makes much sense for a PV domain, also because I don't think libxl ever enabled the PIT for a PV domain, but alas this should be discussed in a separate patch and not batched in this commit. Roger. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 22/24] xen: credit2: "relax" CSCHED2_MAX_TIMER
On 17/08/16 18:20, Dario Faggioli wrote: > Credit2 is already event based, rather than tick > based. This means, the time at which the (i+1)-eth > scheduling decision needs to happen is computed > during the i-eth scheduling decision, and a timer > is set accordingly. > > If there's nothing imminent (or, the most imminent > event is really really really far away), it is > ok to say "well, let's double-check things in > a little bit anyway", but such 'little bit' does > not need to be too little, as, most likely, it's > just pure overhead. > > The current period, for this "safety catch"-alike > timer is 2ms, which indeed is high, but it can > well be higher. In fact, benchmarks show that > setting it to 10ms --combined with other > optimizations-- does actually improve performance. > > Signed-off-by: Dario FaggioliWe might as well toss this one in: Reviewed-by: George Dunlap ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 0/2] sched: Credit1 and Credit2 improvements... but *NO* soft-affinity for Credit2!
On 30/09/16 15:21, Dario Faggioli wrote: > And here they are, the last two remaining patches... > > Dario > --- > Dario Faggioli (2): > xen: credit2: implement yield() > xen: tracing: add trace records for schedule and rate-limiting. Applied, thanks. -George > > xen/common/sched_credit.c| 32 +++ > xen/common/sched_credit2.c | 92 > +++--- > xen/common/sched_rt.c| 15 +++ > xen/common/schedule.c|2 + > xen/include/xen/perfc_defn.h |1 > 5 files changed, 127 insertions(+), 15 deletions(-) > -- > <> (Raistlin Majere) > - > Dario Faggioli, Ph.D, http://about.me/dario.faggioli > Senior Software Engineer, Citrix Systems R Ltd., Cambridge (UK) > ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 14/30] xen/mm: add a ceil sufix to current page calculation routine
>>> On 27.09.16 at 17:57,wrote: > ... and introduce a floor variant. I dislike this, not the least because of the many places you touch just to tack that odd suffix on. Unless you plan on adding half a dozen or more callers to that floor variant, I think it would be prudent to not introduce such a variant, but instead make those callers simply obtain what they need by calling the existing one. After all gofb_floor(x) = gofb(x + 1) - 1 if I'm not mistaken. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 1/2] xen: credit2: implement yield()
On 30/09/16 15:21, Dario Faggioli wrote: > When a vcpu explicitly yields it is usually giving > us an advice of "let someone else run and come back > to me in a bit." > > Credit2 isn't, so far, doing anything when a vcpu > yields, which means an yield is basically a NOP (well, > actually, it's pure overhead, as it causes the scheduler > kick in, but the result is --at least 99% of the time-- > that the very same vcpu that yielded continues to run). > > With this patch, when a vcpu yields, we go and try > picking the next vcpu on the runqueue that can run on > the pcpu where the yielding vcpu is running. Of course, > if we don't find any other vcpu that wants and can run > there, the yielding vcpu will continue. > > Also, add an yield performance counter, and fix the > style of a couple of comments. > > Signed-off-by: Dario FaggioliThanks! Reviewed-by: George Dunlap > --- > Cc: George Dunlap > Cc: Anshul Makkar > Cc: Jan Beulich > Cc: Andrew Cooper > --- > Changes from v2: > * implement yield as a "switch", rather than as a "knob", as suggested >during review. > > Changes from v1: > * add _us to the parameter name, as suggested during review; > * get rid of the minimum value for the yield bias; > * apply the idle bias via subtraction of credits to the yielding vcpu, rather >than via addition to all the others; > * merge the Credit2 bits of what was patch 7 here, as suggested during > review. > --- > xen/common/sched_credit2.c | 54 > +++--- > xen/common/schedule.c|2 ++ > xen/include/xen/perfc_defn.h |1 + > 3 files changed, 43 insertions(+), 14 deletions(-) > > diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c > index 72e31b5..c0646e9 100644 > --- a/xen/common/sched_credit2.c > +++ b/xen/common/sched_credit2.c > @@ -181,7 +181,13 @@ > */ > #define __CSFLAG_runq_migrate_request 3 > #define CSFLAG_runq_migrate_request (1<<__CSFLAG_runq_migrate_request) > - > +/* > + * CSFLAG_vcpu_yield: this vcpu was running, and has called vcpu_yield(). The > + * scheduler is invoked to see if we can give the cpu to someone else, and > + * get back to the yielding vcpu in a while. > + */ > +#define __CSFLAG_vcpu_yield 4 > +#define CSFLAG_vcpu_yield (1<<__CSFLAG_vcpu_yield) > > static unsigned int __read_mostly opt_migrate_resist = 500; > integer_param("sched_credit2_migrate_resist", opt_migrate_resist); > @@ -1431,6 +1437,14 @@ out: > } > > static void > +csched2_vcpu_yield(const struct scheduler *ops, struct vcpu *v) > +{ > +struct csched2_vcpu * const svc = CSCHED2_VCPU(v); > + > +__set_bit(__CSFLAG_vcpu_yield, >flags); > +} > + > +static void > csched2_context_saved(const struct scheduler *ops, struct vcpu *vc) > { > struct csched2_vcpu * const svc = CSCHED2_VCPU(vc); > @@ -2250,26 +2264,31 @@ runq_candidate(struct csched2_runqueue_data *rqd, > struct list_head *iter; > struct csched2_vcpu *snext = NULL; > struct csched2_private *prv = CSCHED2_PRIV(per_cpu(scheduler, cpu)); > +bool yield = __test_and_clear_bit(__CSFLAG_vcpu_yield, >flags); > > *skipped = 0; > > -/* Default to current if runnable, idle otherwise */ > -if ( vcpu_runnable(scurr->vcpu) ) > -snext = scurr; > -else > -snext = CSCHED2_VCPU(idle_vcpu[cpu]); > - > /* > * Return the current vcpu if it has executed for less than ratelimit. > * Adjuststment for the selected vcpu's credit and decision > * for how long it will run will be taken in csched2_runtime. > + * > + * Note that, if scurr is yielding, we don't let rate limiting kick in. > + * In fact, it may be the case that scurr is about to spin, and there's > + * no point forcing it to do so until rate limiting expires. > */ > -if ( prv->ratelimit_us && !is_idle_vcpu(scurr->vcpu) && > +if ( !yield && prv->ratelimit_us && !is_idle_vcpu(scurr->vcpu) && > vcpu_runnable(scurr->vcpu) && > (now - scurr->vcpu->runstate.state_entry_time) < >MICROSECS(prv->ratelimit_us) ) > return scurr; > > +/* Default to current if runnable, idle otherwise */ > +if ( vcpu_runnable(scurr->vcpu) ) > +snext = scurr; > +else > +snext = CSCHED2_VCPU(idle_vcpu[cpu]); > + > list_for_each( iter, >runq ) > { > struct csched2_vcpu * svc = list_entry(iter, struct csched2_vcpu, > runq_elem); > @@ -2293,8 +2312,10 @@ runq_candidate(struct csched2_runqueue_data *rqd, > continue; > } > > -/* If this is on a different processor, don't pull it unless > - * its credit is at least CSCHED2_MIGRATE_RESIST higher. */ > +/* > + * If this is on a different processor, don't pull it unless > + * its credit is at
Re: [Xen-devel] [PATCH v2 09/30] x86/vtd: fix and simplify mapping RMRR regions
>>> On 30.09.16 at 17:02,wrote: > On Fri, Sep 30, 2016 at 07:21:41AM -0600, Jan Beulich wrote: >> >>> On 30.09.16 at 13:27, wrote: >> > On Thu, Sep 29, 2016 at 08:18:36AM -0600, Jan Beulich wrote: >> >> >>> On 27.09.16 at 17:57, wrote: >> >> > +{ >> >> > +int rc; >> >> > + >> >> > +while ( nr_pages > 0 ) >> >> > +{ >> >> > +rc = (map ? map_mmio_regions : unmap_mmio_regions) >> >> > + (d, _gfn(pfn), nr_pages, _mfn(pfn)); >> >> > +if ( rc == 0 ) >> >> > +break; >> >> > +if ( rc < 0 ) >> >> > +{ >> >> > +printk(XENLOG_ERR >> >> > +"Failed to %smap %#lx - %#lx into domain %d memory >> >> > map: > %d\n", >> >> > + map ? "" : "un", pfn, pfn + nr_pages, d->domain_id, >> >> > rc); >> >> > +return rc; >> >> > +} >> >> > +nr_pages -= rc; >> >> > +pfn += rc; >> >> > +process_pending_softirqs(); >> >> > +} >> >> > + >> >> > +return rc; >> >> >> >> The way this is coded it appears to possibly return non-zero even in >> >> success case. I think this would therefore better be a for ( ; ; ) loop. >> > >> > I don't think this is possible, {un}map_mmio_regions will return < 0 on >> > error, > 0 if there are pending pages to map, and 0 when all the requested >> > pages have been mapped successfully. >> >> Right - hence the "appears to" in my reply; it took me a while to >> figure it's not actually possible, and hence my desire to make this >> more obvious to the reader. > > Ah, OK, I misunderstood you then. What about changing the last return rc > to return 0? This would make it more obvious, because I'm not really sure a > for loop would change much (IMHO the problem is the return semantics used by > {un}map_mmio_regions). Well, my suggestion was not to use "a for loop" but specifically "for ( ; ; )" to clarify the only loop exit condition is the single break statement. That break statement could of course also become a "return 0". I'd rather not see the return at the end of the function become "return 0"; if anything (i.e. if you follow my suggestion) it could be deleted altogether. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 12/30] xen/x86: make print_e820_memory_map global
>>> On 27.09.16 at 17:57,wrote: > So that it can be called from the Dom0 builder. Why would the Dom0 builder need to call it, when it doesn't so far? Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 11/30] xen/x86: split Dom0 build into PV and PVHv2
>>> On 27.09.16 at 17:57,wrote: > --- a/docs/misc/xen-command-line.markdown > +++ b/docs/misc/xen-command-line.markdown > @@ -663,6 +663,13 @@ Pin dom0 vcpus to their respective pcpus > > Flag that makes a 64bit dom0 boot in PVH mode. No 32bit support at present. > > +### dom0hvm > +> `= ` > + > +> Default: `false` > + > +Flag that makes a dom0 boot in PVHv2 mode. Considering sorting aspects this clearly wants to go at least ahead of dom0pvh. > --- a/xen/arch/x86/setup.c > +++ b/xen/arch/x86/setup.c > @@ -75,6 +75,10 @@ unsigned long __read_mostly cr4_pv32_mask; > static bool_t __initdata opt_dom0pvh; > boolean_param("dom0pvh", opt_dom0pvh); > > +/* Boot dom0 in HVM mode */ > +static bool_t __initdata opt_dom0hvm; Plain bool please. > @@ -1495,6 +1499,11 @@ void __init noreturn __start_xen(unsigned long mbi_p) > if ( opt_dom0pvh ) > domcr_flags |= DOMCRF_pvh | DOMCRF_hap; > > +if ( opt_dom0hvm ) { Coding style. > +domcr_flags |= DOMCRF_hvm | (hvm_funcs.hap_supported ? DOMCRF_hap : > 0); So you mean to support PVHv2 on shadow (including for Dom0) right away. Are you also testing that? Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 09/30] x86/vtd: fix and simplify mapping RMRR regions
On Fri, Sep 30, 2016 at 07:21:41AM -0600, Jan Beulich wrote: > >>> On 30.09.16 at 13:27,wrote: > > On Thu, Sep 29, 2016 at 08:18:36AM -0600, Jan Beulich wrote: > >> >>> On 27.09.16 at 17:57, wrote: > >> > +{ > >> > +int rc; > >> > + > >> > +while ( nr_pages > 0 ) > >> > +{ > >> > +rc = (map ? map_mmio_regions : unmap_mmio_regions) > >> > + (d, _gfn(pfn), nr_pages, _mfn(pfn)); > >> > +if ( rc == 0 ) > >> > +break; > >> > +if ( rc < 0 ) > >> > +{ > >> > +printk(XENLOG_ERR > >> > +"Failed to %smap %#lx - %#lx into domain %d memory map: > >> > %d\n", > >> > + map ? "" : "un", pfn, pfn + nr_pages, d->domain_id, > >> > rc); > >> > +return rc; > >> > +} > >> > +nr_pages -= rc; > >> > +pfn += rc; > >> > +process_pending_softirqs(); > >> > +} > >> > + > >> > +return rc; > >> > >> The way this is coded it appears to possibly return non-zero even in > >> success case. I think this would therefore better be a for ( ; ; ) loop. > > > > I don't think this is possible, {un}map_mmio_regions will return < 0 on > > error, > 0 if there are pending pages to map, and 0 when all the requested > > pages have been mapped successfully. > > Right - hence the "appears to" in my reply; it took me a while to > figure it's not actually possible, and hence my desire to make this > more obvious to the reader. Ah, OK, I misunderstood you then. What about changing the last return rc to return 0? This would make it more obvious, because I'm not really sure a for loop would change much (IMHO the problem is the return semantics used by {un}map_mmio_regions). Roger. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3] SVM: use generic instruction decoding
On 09/30/2016 10:56 AM, Jan Beulich wrote: On 30.09.16 at 16:54,wrote: >> On 09/30/2016 10:44 AM, Jan Beulich wrote: > +int > +x86_insn_modrm(const struct x86_emulate_state *state, > + unsigned int *rm, unsigned int *reg) > +{ > +check_state(state); > + > +if ( !(state->desc & ModRM) ) > +return -EINVAL; > + > +if ( rm ) > +*rm = state->modrm_rm; > +if ( reg ) > +*reg = state->modrm_reg; > + > +return state->modrm_mod; > +} Can this return struct modrm (which would then become visible outside of svm.c)? And then x86_emulate_state can include the same struct instead of the three separate fields. >>> I'd prefer not to, to leave it to callers which parts they actually care >>> about. No need for them to put the whole structure on stack when >>> all they want is e.g. mod. >> But isn't the whole struct one byte long so you'd not be increasing >> amount of data on stack? This will also make comparison at least in >> __get_instruction_length_from_list() (and possibly other places) simpler. > See the other reply (as well as Andrew's): We'd be making available > incomplete information if we did it that way. Yes, I saw them after I sent my reply. Reviewed-by: Boris Ostrovsky ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3] SVM: use generic instruction decoding
>>> On 30.09.16 at 16:54,wrote: > On 09/30/2016 10:44 AM, Jan Beulich wrote: >> +int +x86_insn_modrm(const struct x86_emulate_state *state, + unsigned int *rm, unsigned int *reg) +{ +check_state(state); + +if ( !(state->desc & ModRM) ) +return -EINVAL; + +if ( rm ) +*rm = state->modrm_rm; +if ( reg ) +*reg = state->modrm_reg; + +return state->modrm_mod; +} >>> Can this return struct modrm (which would then become visible outside of >>> svm.c)? And then x86_emulate_state can include the same struct instead >>> of the three separate fields. >> I'd prefer not to, to leave it to callers which parts they actually care >> about. No need for them to put the whole structure on stack when >> all they want is e.g. mod. > > But isn't the whole struct one byte long so you'd not be increasing > amount of data on stack? This will also make comparison at least in > __get_instruction_length_from_list() (and possibly other places) simpler. See the other reply (as well as Andrew's): We'd be making available incomplete information if we did it that way. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 2/2] hvmloader, pci: Don't try to relocate memory if 64-bit BAR is bigger than ~2GB
On Thu, Sep 29, 2016 at 11:48:43AM +0100, Wei Liu wrote: > On Thu, Sep 29, 2016 at 03:36:00AM -0600, Jan Beulich wrote: > > >>> On 29.09.16 at 11:23,wrote: > > > On Thu, Sep 29, 2016 at 01:03:02AM -0600, Jan Beulich wrote: > > >> >>> On 29.09.16 at 01:48, wrote: > > >> > @@ -265,11 +266,30 @@ void pci_setup(void) > > >> > bars[i].devfn = devfn; > > >> > bars[i].bar_reg = bar_reg; > > >> > bars[i].bar_sz = bar_sz; > > >> > +bars[i].above_4gb = false; > > >> > > > >> > if ( ((bar_data & PCI_BASE_ADDRESS_SPACE) == > > >> >PCI_BASE_ADDRESS_SPACE_MEMORY) || > > >> > (bar_reg == PCI_ROM_ADDRESS) ) > > >> > -mmio_total += bar_sz; > > >> > +{ > > >> > +/* > > >> > + * If bigger than 2GB minus emulated devices BAR > > >> > space and > > >> > + * APIC space, then don't try to put under 4GB. > > >> > + */ > > >> > +if ( is_64bar && (mmio_total >= GB(2) || bar_sz >= > > >> > + (GB(2) - HVM_BELOW_4G_MMIO_LENGTH - mmio_total)) > > >> > ) > > >> > > >> As mentioned in the reply to your earlier mail already, the > > >> subtraction of mmio_total here is risking wrap through zero (the > > >> >= GB(2) check doesn't fully guard against that). > > > > > > I am still waking up so bear with me, but is the reason the mmio_total > > >>= GB(2) check does not guard is because the compiler may choose > > > to execute _both_ parts of the '||' conditional (or swap them and > > > execute the 'mmio_total >= GB(2)' second)? > > > > No, it's because you subtract more than just mmio_total from GB(2). > > > > >> Furthermore you're now making behavior dependent on the order > > >> devices appear on the bus: The same device appearing early may > > >> get its BAR placed below 4Gb whereas when it appears late, it'll > > >> get placed high. IOW I think this needs further refinement: We > > >> should in a first pass place only 32-bit BARs. In a second pass we > > >> can then see which 64-bit BARs still fit (and I think we then ought > > >> to prefer small ones). Which means we should presumably account > > >> 32- and 64-bit BARs here independent of any other considerations, > > >> deferring the decision which 64-bit ones to place low until after this > > >> first pass. > > > > > > Ok, that is going to require some surgery and movement of code to add > > > some functions in that giant piece of code. Expect more patches next > > > week (or would it be easier if I just sent them out for the next release > > > considering the amount of patches that are floating this week that need > > > review?) > > > > Well, I would view this as a bug fix, so it might still be allowed in. > > Ask Wei if in doubt. > > > > Before RC1, sure. After we cut RCs, anything that changes memory layout > of the guests need to be considered carefully. OK. I will try my best to get it done before then. But if I fail we can always look at this for the next release. (Along with other patches that I need to redo). ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v1 08/12] tmem: Handle 'struct tmem_info' as a seperate field in the
>>> On 30.09.16 at 16:36,wrote: > On Wed, Sep 28, 2016 at 06:56:40AM -0600, Jan Beulich wrote: >> >>> On 28.09.16 at 11:42, wrote: >> > Note: We still have to do this awkward 'guest_handle_cast' >> > otherwise it will not compile on ARM - which defines _two_ >> > of these macros (__guest_handle_64_xen_sysctl_tmem_client_t >> > and __guest_handle_xen_sysctl_tmem_client_t). And if cast is >> > not used then a compile error comes up as we use the wrong one. >> >> This seems suspicious, but it's hard to judge without knowing what >> exactly the errors were. > > tmem_control.c: In function ‘tmem_control’: > tmem_control.c:426:9: error: incompatible type for argument 2 of > ‘tmemc_set_client_info’ > ret = tmemc_set_client_info(op->cli_id, op->u.client); > ^ > tmem_control.c:302:12: note: expected > ‘__guest_handle_xen_sysctl_tmem_client_t’ but argument is of type > ‘__guest_handle_64_xen_sysctl_tmem_client_t’ > static int tmemc_set_client_info(domid_t cli_id, > ^ Looks like you want to pass around a 64-bit handle then? >> > --- a/xen/common/tmem_control.c >> > +++ b/xen/common/tmem_control.c >> > @@ -258,7 +258,7 @@ static int tmemc_list(domid_t cli_id, >> > tmem_cli_va_param_t buf, uint32_t len, >> > return 0; >> > } >> > >> > -static int __tmemc_set_var(struct client *client, uint32_t subop, >> > +static int __tmemc_set_var(struct client *client, >> > >> > XEN_GUEST_HANDLE_PARAM(xen_sysctl_tmem_client_t) buf) >> > { >> > domid_t cli_id = client->cli_id; >> > @@ -267,11 +267,6 @@ static int __tmemc_set_var(struct client *client, >> > uint32_t subop, >> > >> > ASSERT(client); >> > >> > -if ( subop != XEN_SYSCTL_TMEM_OP_SET_CLIENT_INFO ) >> > -{ >> > -tmem_client_warn("tmem: unknown subop %d for tmemc_set_var\n", >> > subop); >> > -return -1; >> > -} >> > if ( copy_from_guest(, buf, 1) ) >> > return -EFAULT; >> >> The adjustments above look pretty unrelated to the purpose of the >> patch, but well - you're the maintainer of this code. > > Don't be that harsh to yourself. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3] SVM: use generic instruction decoding
On 09/30/2016 10:44 AM, Jan Beulich wrote: > >>> +int >>> +x86_insn_modrm(const struct x86_emulate_state *state, >>> + unsigned int *rm, unsigned int *reg) >>> +{ >>> +check_state(state); >>> + >>> +if ( !(state->desc & ModRM) ) >>> +return -EINVAL; >>> + >>> +if ( rm ) >>> +*rm = state->modrm_rm; >>> +if ( reg ) >>> +*reg = state->modrm_reg; >>> + >>> +return state->modrm_mod; >>> +} >> Can this return struct modrm (which would then become visible outside of >> svm.c)? And then x86_emulate_state can include the same struct instead >> of the three separate fields. > I'd prefer not to, to leave it to callers which parts they actually care > about. No need for them to put the whole structure on stack when > all they want is e.g. mod. But isn't the whole struct one byte long so you'd not be increasing amount of data on stack? This will also make comparison at least in __get_instruction_length_from_list() (and possibly other places) simpler. -boris ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3] SVM: use generic instruction decoding
>>> On 30.09.16 at 16:46,wrote: > On 30/09/16 15:44, Jan Beulich wrote: +int +x86_insn_modrm(const struct x86_emulate_state *state, + unsigned int *rm, unsigned int *reg) +{ +check_state(state); + +if ( !(state->desc & ModRM) ) +return -EINVAL; + +if ( rm ) +*rm = state->modrm_rm; +if ( reg ) +*reg = state->modrm_reg; + +return state->modrm_mod; +} >>> Can this return struct modrm (which would then become visible outside of >>> svm.c)? And then x86_emulate_state can include the same struct instead >>> of the three separate fields. >> I'd prefer not to, to leave it to callers which parts they actually care >> about. No need for them to put the whole structure on stack when >> all they want is e.g. mod. > > The structure is fine in principle, being only a single byte, but what > can't be represented with it is additions from REX prefixes. Oh, because of this I didn't even read it as returning an existing structure type, but introducing a new one. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] trace: Fix incorrect number of pages used for trace metadata
On 30/09/16 15:46, George Dunlap wrote: > On 29/09/16 14:53, Igor Druzhinin wrote: >> As long as t_info_first_offset is calculated in uint32_t offsets we need to >> multiply it by sizeof(uint32_t) in order to get the right number of pages >> for trace metadata. Not doing that makes it impossible to read the trace >> buffer correctly from userspace for some corner cases. >> >> Signed-off-by: Igor Druzhinin> > Hmm, looks like we actually want to revert this c/s fbf96e6, "xentrace: > correct formula to calculate t_info_pages". But that one was presumably > written (and Acked by me) because the variable name there, > t_info_first_offset, is confusing. > > The other mistake in fbf96e6 is that before t_info_words was actually > denominated in words; but after it's denominated in bytes (which is > again confusing). > > What about something like the attached instead? This should fix your > problem while making the code clearer. Obviously that comment shouldn't include the second "not bytes". We're not writing folk songs here... -George ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] trace: Fix incorrect number of pages used for trace metadata
On 30/09/16 15:46, George Dunlap wrote: > On 29/09/16 14:53, Igor Druzhinin wrote: >> > As long as t_info_first_offset is calculated in uint32_t offsets we need to >> > multiply it by sizeof(uint32_t) in order to get the right number of pages >> > for trace metadata. Not doing that makes it impossible to read the trace >> > buffer correctly from userspace for some corner cases. >> > >> > Signed-off-by: Igor Druzhinin> Hmm, looks like we actually want to revert this c/s fbf96e6, "xentrace: > correct formula to calculate t_info_pages". But that one was presumably > written (and Acked by me) because the variable name there, > t_info_first_offset, is confusing. > > The other mistake in fbf96e6 is that before t_info_words was actually > denominated in words; but after it's denominated in bytes (which is > again confusing). > > What about something like the attached instead? This should fix your > problem while making the code clearer. > > -George > > > > From b0252cec4a96fea28faa85c47ab0f5d5997444ad Mon Sep 17 00:00:00 2001 > From: George Dunlap > Date: Fri, 30 Sep 2016 15:42:56 +0100 > Subject: [PATCH] xen/trace: Fix trace metadata page count calculation (revert > fbf96e6) > > Changeset fbf96e6, "xentrace: correct formula to calculate > t_info_pages", broke the trace metadata page count calculation, by > mistaking t_info_first_offset as denominated in bytes, when in fact it > is denominated in words (uint32_t). > > Effectively revert that change, and put a comment there to reduce the > chance that someone will make that mistake in the future. > > Spotted-by: Igor Druzhinin > Signed-off-by: George Dunlap > --- > CC: Olaf Hering > CC: Andrew Cooper > CC: Jan Beulich > --- > xen/common/trace.c | 8 ++-- > 1 file changed, 6 insertions(+), 2 deletions(-) > > diff --git a/xen/common/trace.c b/xen/common/trace.c > index f651cf3..2f4ecca 100644 > --- a/xen/common/trace.c > +++ b/xen/common/trace.c > @@ -148,8 +148,12 @@ static int calculate_tbuf_size(unsigned int pages, > uint16_t t_info_first_offset) > pages = max_pages; > } > > -t_info_words = num_online_cpus() * pages * sizeof(uint32_t); > -t_info_pages = PFN_UP(t_info_first_offset + t_info_words); > +/* > + * NB this calculation is correct, because t_info_first_offset is > + * in words, not bytes, not bytes s/, not bytes$/./ ~Andrew > + */ > +t_info_words = num_online_cpus() * pages + t_info_first_offset; > +t_info_pages = PFN_UP(t_info_words * sizeof(uint32_t)); > printk(XENLOG_INFO "xentrace: requesting %u t_info pages " > "for %u trace pages on %u cpus\n", > t_info_pages, pages, num_online_cpus()); > -- 2.1.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 0/2] Convert to new CPU hotplug framework
On 07/09/16 18:18, Boris Ostrovsky wrote: > New CPU hotplug framework was introduced recently. These patches convert Xen > CPU hotplug code to this infrastructure. Applied to for-linus-4.9, thanks. David ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3] SVM: use generic instruction decoding
On 30/09/16 15:44, Jan Beulich wrote: >>> +int >>> +x86_insn_modrm(const struct x86_emulate_state *state, >>> + unsigned int *rm, unsigned int *reg) >>> +{ >>> +check_state(state); >>> + >>> +if ( !(state->desc & ModRM) ) >>> +return -EINVAL; >>> + >>> +if ( rm ) >>> +*rm = state->modrm_rm; >>> +if ( reg ) >>> +*reg = state->modrm_reg; >>> + >>> +return state->modrm_mod; >>> +} >> Can this return struct modrm (which would then become visible outside of >> svm.c)? And then x86_emulate_state can include the same struct instead >> of the three separate fields. > I'd prefer not to, to leave it to callers which parts they actually care > about. No need for them to put the whole structure on stack when > all they want is e.g. mod. The structure is fine in principle, being only a single byte, but what can't be represented with it is additions from REX prefixes. I had considered making structures for ModRM/SIB bytes in the past, but it makes the logic harder rather than easier. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 0/3] xen/pciback: support driver_override
On 22/09/16 09:45, Juergen Gross wrote: > Support the driver_override scheme introduced with commit 782a985d7af2 > ("PCI: Introduce new device binding path using pci_dev.driver_override") Applied to for-linus-4.9, thanks. David ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] trace: Fix incorrect number of pages used for trace metadata
On 29/09/16 14:53, Igor Druzhinin wrote: > As long as t_info_first_offset is calculated in uint32_t offsets we need to > multiply it by sizeof(uint32_t) in order to get the right number of pages > for trace metadata. Not doing that makes it impossible to read the trace > buffer correctly from userspace for some corner cases. > > Signed-off-by: Igor DruzhininHmm, looks like we actually want to revert this c/s fbf96e6, "xentrace: correct formula to calculate t_info_pages". But that one was presumably written (and Acked by me) because the variable name there, t_info_first_offset, is confusing. The other mistake in fbf96e6 is that before t_info_words was actually denominated in words; but after it's denominated in bytes (which is again confusing). What about something like the attached instead? This should fix your problem while making the code clearer. -George From b0252cec4a96fea28faa85c47ab0f5d5997444ad Mon Sep 17 00:00:00 2001 From: George Dunlap Date: Fri, 30 Sep 2016 15:42:56 +0100 Subject: [PATCH] xen/trace: Fix trace metadata page count calculation (revert fbf96e6) Changeset fbf96e6, "xentrace: correct formula to calculate t_info_pages", broke the trace metadata page count calculation, by mistaking t_info_first_offset as denominated in bytes, when in fact it is denominated in words (uint32_t). Effectively revert that change, and put a comment there to reduce the chance that someone will make that mistake in the future. Spotted-by: Igor Druzhinin Signed-off-by: George Dunlap --- CC: Olaf Hering CC: Andrew Cooper CC: Jan Beulich --- xen/common/trace.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/xen/common/trace.c b/xen/common/trace.c index f651cf3..2f4ecca 100644 --- a/xen/common/trace.c +++ b/xen/common/trace.c @@ -148,8 +148,12 @@ static int calculate_tbuf_size(unsigned int pages, uint16_t t_info_first_offset) pages = max_pages; } -t_info_words = num_online_cpus() * pages * sizeof(uint32_t); -t_info_pages = PFN_UP(t_info_first_offset + t_info_words); +/* + * NB this calculation is correct, because t_info_first_offset is + * in words, not bytes, not bytes + */ +t_info_words = num_online_cpus() * pages + t_info_first_offset; +t_info_pages = PFN_UP(t_info_words * sizeof(uint32_t)); printk(XENLOG_INFO "xentrace: requesting %u t_info pages " "for %u trace pages on %u cpus\n", t_info_pages, pages, num_online_cpus()); -- 2.1.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] xen: Remove event channel notification through Xen PCI platform device
On 26/08/16 22:55, KarimAllah Ahmed wrote: > Ever since commit 254d1a3f02eb ("xen/pv-on-hvm kexec: shutdown watches > from old kernel") using the INTx interrupt from Xen PCI platform device for > event channel notification would just lockup the guest during bootup. > postcore_initcall now calls xs_reset_watches which will eventually try to read > a value from XenStore and will get stuck on read_reply at XenBus forever since > the platform driver is not probed yet and its INTx interrupt handler is not > registered yet. That means that the guest can not be notified at this moment > of > any pending event channels and none of the per-event handlers will ever be > invoked (including the XenStore one) and the reply will never be picked up by > the kernel. Applied to for-linus-4.9, thanks. David ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3] SVM: use generic instruction decoding
>>> On 30.09.16 at 16:37,wrote: > On 09/30/2016 05:38 AM, Jan Beulich wrote: >> +if ( opc_tab[instr].opcode == ctxt.ctxt.opcode ) >> { >> -if ( (inst_len + i) >= fetch_len ) >> -{ >> -if ( !fetch(vmcb, buf + fetch_len, fetch_addr + fetch_len, >> -max_len - fetch_len) ) >> -return 0; >> -fetch_len = max_len; >> -} >> +if ( !opc_tab[instr].modrm.mod ) >> +return inst_len; > > Is there a reason why this is not folded into the 'if' below? To keep things readable. >> +int >> +x86_insn_modrm(const struct x86_emulate_state *state, >> + unsigned int *rm, unsigned int *reg) >> +{ >> +check_state(state); >> + >> +if ( !(state->desc & ModRM) ) >> +return -EINVAL; >> + >> +if ( rm ) >> +*rm = state->modrm_rm; >> +if ( reg ) >> +*reg = state->modrm_reg; >> + >> +return state->modrm_mod; >> +} > > Can this return struct modrm (which would then become visible outside of > svm.c)? And then x86_emulate_state can include the same struct instead > of the three separate fields. I'd prefer not to, to leave it to callers which parts they actually care about. No need for them to put the whole structure on stack when all they want is e.g. mod. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3] SVM: use generic instruction decoding
On 09/30/2016 05:38 AM, Jan Beulich wrote: > +if ( opc_tab[instr].opcode == ctxt.ctxt.opcode ) > { > -if ( (inst_len + i) >= fetch_len ) > -{ > -if ( !fetch(vmcb, buf + fetch_len, fetch_addr + fetch_len, > -max_len - fetch_len) ) > -return 0; > -fetch_len = max_len; > -} > +if ( !opc_tab[instr].modrm.mod ) > +return inst_len; Is there a reason why this is not folded into the 'if' below? > > -if ( buf[inst_len+i] != opcode[i+1] ) > -goto mismatch; > +if ( modrm_mod == opc_tab[instr].modrm.mod && > + (modrm_rm & 7) == opc_tab[instr].modrm.rm && > + (modrm_reg & 7) == opc_tab[instr].modrm.reg ) > +return inst_len; > } ... > + > +int > +x86_insn_modrm(const struct x86_emulate_state *state, > + unsigned int *rm, unsigned int *reg) > +{ > +check_state(state); > + > +if ( !(state->desc & ModRM) ) > +return -EINVAL; > + > +if ( rm ) > +*rm = state->modrm_rm; > +if ( reg ) > +*reg = state->modrm_reg; > + > +return state->modrm_mod; > +} Can this return struct modrm (which would then become visible outside of svm.c)? And then x86_emulate_state can include the same struct instead of the three separate fields. -boris ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v1 08/12] tmem: Handle 'struct tmem_info' as a seperate field in the
On Wed, Sep 28, 2016 at 06:56:40AM -0600, Jan Beulich wrote: > >>> On 28.09.16 at 11:42,wrote: > > Note: We still have to do this awkward 'guest_handle_cast' > > otherwise it will not compile on ARM - which defines _two_ > > of these macros (__guest_handle_64_xen_sysctl_tmem_client_t > > and __guest_handle_xen_sysctl_tmem_client_t). And if cast is > > not used then a compile error comes up as we use the wrong one. > > This seems suspicious, but it's hard to judge without knowing what > exactly the errors were. tmem_control.c: In function ‘tmem_control’: tmem_control.c:426:9: error: incompatible type for argument 2 of ‘tmemc_set_client_info’ ret = tmemc_set_client_info(op->cli_id, op->u.client); ^ tmem_control.c:302:12: note: expected ‘__guest_handle_xen_sysctl_tmem_client_t’ but argument is of type ‘__guest_handle_64_xen_sysctl_tmem_client_t’ static int tmemc_set_client_info(domid_t cli_id, ^ tmem_control.c:432:9: error: incompatible type for argument 2 of ‘tmemc_get_client_info’ ret = tmemc_get_client_info(op->cli_id, op->u.client); ^ > > > --- a/xen/common/tmem_control.c > > +++ b/xen/common/tmem_control.c > > @@ -258,7 +258,7 @@ static int tmemc_list(domid_t cli_id, > > tmem_cli_va_param_t buf, uint32_t len, > > return 0; > > } > > > > -static int __tmemc_set_var(struct client *client, uint32_t subop, > > +static int __tmemc_set_var(struct client *client, > > > > XEN_GUEST_HANDLE_PARAM(xen_sysctl_tmem_client_t) buf) > > { > > domid_t cli_id = client->cli_id; > > @@ -267,11 +267,6 @@ static int __tmemc_set_var(struct client *client, > > uint32_t subop, > > > > ASSERT(client); > > > > -if ( subop != XEN_SYSCTL_TMEM_OP_SET_CLIENT_INFO ) > > -{ > > -tmem_client_warn("tmem: unknown subop %d for tmemc_set_var\n", > > subop); > > -return -1; > > -} > > if ( copy_from_guest(, buf, 1) ) > > return -EFAULT; > > The adjustments above look pretty unrelated to the purpose of the > patch, but well - you're the maintainer of this code. That is indeed wrong. This should have been done in the previous patch: "tmem/libxc: Squash XEN_SYSCTL_TMEM_OP_[SET|SAVE].." > > Jan > ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] Adding new custom devices to Xen via QEMU
Thanks David, This could very well be the issue, but could you please elaborate? The questions that come up are the following: What is the physical address range given to RAM? What range of addresses would work for my device? And, if this is the case, how would I unpopulate the RAM? There are reasons for the address chosen, and it works on other hypervisors (e.g. KVM) so although it might be easiest to change the address I really don't want to unless its the only way to keep from a Xen modification entirely. Jason On 9/30/2016 9:53 AM, David Vrabel wrote: On 30/09/16 14:35, Jason Dickens wrote: Hi Wei, Thanks for the response. It make sense to me that if the device were on the PCI bus (or other such bus, e.g. USB) that it could be discovered, at least by an OS. Its something to consider. I should mention that our guest VM doesn't actually use an OS. However, the device is not implemented that as PCI it is simply memory mapped. Technically, in QEMU is has type ISA because it was derived as a modification of the TPM device. Is it possible something is lacking in the QEMU model that Xen needs but KVM doesn't? If the answer is that Xen should not need modification for any new devices then this gives me hope. You've also inspired some things to try, like whether or not smaller modifications to the TPM device work. One change that is significant to mention is that the physical address range use is anomalous, by which I mean it not in the normal device range. Does device MMIO overlap with guest RAM? If so, you'll need to unpopulate the RAM first. David ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH v3 2/2] xen: tracing: add trace records for schedule and rate-limiting.
As far as {csched, csched2, rt}_schedule() are concerned, an "empty" event, would already make it easier to read and understand a trace. But while there, add a few useful information, like if the cpu that is going through the scheduler has been tickled or not, if it is currently idle, etc (they vary, on a per-scheduler basis). For Credit1 and Credit2, add a record about when rate-limiting kicks in too. Signed-off-by: Dario Faggioli--- Cc: George Dunlap Cc: Meng Xu Cc: Anshul Makkar --- Changes from v1: * corrected the schedule record for sched_rt.c, as pointed out during review; * pack the Credit1 records as well, as requested during review. --- xen/common/sched_credit.c | 32 xen/common/sched_credit2.c | 38 +- xen/common/sched_rt.c | 15 +++ 3 files changed, 84 insertions(+), 1 deletion(-) diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c index 5700763..fc3a321 100644 --- a/xen/common/sched_credit.c +++ b/xen/common/sched_credit.c @@ -133,6 +133,8 @@ #define TRC_CSCHED_TICKLETRC_SCHED_CLASS_EVT(CSCHED, 6) #define TRC_CSCHED_BOOST_START TRC_SCHED_CLASS_EVT(CSCHED, 7) #define TRC_CSCHED_BOOST_END TRC_SCHED_CLASS_EVT(CSCHED, 8) +#define TRC_CSCHED_SCHEDULE TRC_SCHED_CLASS_EVT(CSCHED, 9) +#define TRC_CSCHED_RATELIMIT TRC_SCHED_CLASS_EVT(CSCHED, 10) /* @@ -1774,6 +1776,23 @@ csched_schedule( SCHED_STAT_CRANK(schedule); CSCHED_VCPU_CHECK(current); +/* + * Here in Credit1 code, we usually just call TRACE_nD() helpers, and + * don't care about packing. But scheduling happens very often, so it + * actually is important that the record is as small as possible. + */ +if ( unlikely(tb_init_done) ) +{ +struct { +unsigned cpu:16, tasklet:8, idle:8; +} d; +d.cpu = cpu; +d.tasklet = tasklet_work_scheduled; +d.idle = is_idle_vcpu(current); +__trace_var(TRC_CSCHED_SCHEDULE, 1, sizeof(d), +(unsigned char *)); +} + runtime = now - current->runstate.state_entry_time; if ( runtime < 0 ) /* Does this ever happen? */ runtime = 0; @@ -1829,6 +1848,19 @@ csched_schedule( tslice = MICROSECS(prv->ratelimit_us) - runtime; if ( unlikely(runtime < CSCHED_MIN_TIMER) ) tslice = CSCHED_MIN_TIMER; +if ( unlikely(tb_init_done) ) +{ +struct { +unsigned vcpu:16, dom:16; +unsigned runtime; +} d; +d.dom = scurr->vcpu->domain->domain_id; +d.vcpu = scurr->vcpu->vcpu_id; +d.runtime = runtime; +__trace_var(TRC_CSCHED_RATELIMIT, 1, sizeof(d), +(unsigned char *)); +} + ret.migrated = 0; goto out; } diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c index c0646e9..6b98319 100644 --- a/xen/common/sched_credit2.c +++ b/xen/common/sched_credit2.c @@ -55,6 +55,8 @@ #define TRC_CSCHED2_LOAD_BALANCE TRC_SCHED_CLASS_EVT(CSCHED2, 17) #define TRC_CSCHED2_PICKED_CPU TRC_SCHED_CLASS_EVT(CSCHED2, 19) #define TRC_CSCHED2_RUNQ_CANDIDATE TRC_SCHED_CLASS_EVT(CSCHED2, 20) +#define TRC_CSCHED2_SCHEDULE TRC_SCHED_CLASS_EVT(CSCHED2, 21) +#define TRC_CSCHED2_RATELIMITTRC_SCHED_CLASS_EVT(CSCHED2, 22) /* * WARNING: This is still in an experimental phase. Status and work can be found at the @@ -2281,7 +2283,22 @@ runq_candidate(struct csched2_runqueue_data *rqd, vcpu_runnable(scurr->vcpu) && (now - scurr->vcpu->runstate.state_entry_time) < MICROSECS(prv->ratelimit_us) ) +{ +if ( unlikely(tb_init_done) ) +{ +struct { +unsigned vcpu:16, dom:16; +unsigned runtime; +} d; +d.dom = scurr->vcpu->domain->domain_id; +d.vcpu = scurr->vcpu->vcpu_id; +d.runtime = now - scurr->vcpu->runstate.state_entry_time; +__trace_var(TRC_CSCHED2_RATELIMIT, 1, +sizeof(d), +(unsigned char *)); +} return scurr; +} /* Default to current if runnable, idle otherwise */ if ( vcpu_runnable(scurr->vcpu) ) @@ -2371,6 +2388,7 @@ csched2_schedule( struct csched2_vcpu *snext = NULL; unsigned int skipped_vcpus = 0; struct task_slice ret; +bool_t tickled; SCHED_STAT_CRANK(schedule); CSCHED2_VCPU_CHECK(current); @@ -2385,13 +2403,31 @@ csched2_schedule( BUG_ON(!is_idle_vcpu(scurr->vcpu) && scurr->rqd != rqd); /* Clear "tickled" bit now that we've been scheduled */ -if ( cpumask_test_cpu(cpu, >tickled) ) +tickled = cpumask_test_cpu(cpu, >tickled); +if ( tickled ) {
[Xen-devel] [PATCH v3 0/2] sched: Credit1 and Credit2 improvements... but *NO* soft-affinity for Credit2!
And here they are, the last two remaining patches... Dario --- Dario Faggioli (2): xen: credit2: implement yield() xen: tracing: add trace records for schedule and rate-limiting. xen/common/sched_credit.c| 32 +++ xen/common/sched_credit2.c | 92 +++--- xen/common/sched_rt.c| 15 +++ xen/common/schedule.c|2 + xen/include/xen/perfc_defn.h |1 5 files changed, 127 insertions(+), 15 deletions(-) -- <> (Raistlin Majere) - Dario Faggioli, Ph.D, http://about.me/dario.faggioli Senior Software Engineer, Citrix Systems R Ltd., Cambridge (UK) ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH v3 1/2] xen: credit2: implement yield()
When a vcpu explicitly yields it is usually giving us an advice of "let someone else run and come back to me in a bit." Credit2 isn't, so far, doing anything when a vcpu yields, which means an yield is basically a NOP (well, actually, it's pure overhead, as it causes the scheduler kick in, but the result is --at least 99% of the time-- that the very same vcpu that yielded continues to run). With this patch, when a vcpu yields, we go and try picking the next vcpu on the runqueue that can run on the pcpu where the yielding vcpu is running. Of course, if we don't find any other vcpu that wants and can run there, the yielding vcpu will continue. Also, add an yield performance counter, and fix the style of a couple of comments. Signed-off-by: Dario Faggioli--- Cc: George Dunlap Cc: Anshul Makkar Cc: Jan Beulich Cc: Andrew Cooper --- Changes from v2: * implement yield as a "switch", rather than as a "knob", as suggested during review. Changes from v1: * add _us to the parameter name, as suggested during review; * get rid of the minimum value for the yield bias; * apply the idle bias via subtraction of credits to the yielding vcpu, rather than via addition to all the others; * merge the Credit2 bits of what was patch 7 here, as suggested during review. --- xen/common/sched_credit2.c | 54 +++--- xen/common/schedule.c|2 ++ xen/include/xen/perfc_defn.h |1 + 3 files changed, 43 insertions(+), 14 deletions(-) diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c index 72e31b5..c0646e9 100644 --- a/xen/common/sched_credit2.c +++ b/xen/common/sched_credit2.c @@ -181,7 +181,13 @@ */ #define __CSFLAG_runq_migrate_request 3 #define CSFLAG_runq_migrate_request (1<<__CSFLAG_runq_migrate_request) - +/* + * CSFLAG_vcpu_yield: this vcpu was running, and has called vcpu_yield(). The + * scheduler is invoked to see if we can give the cpu to someone else, and + * get back to the yielding vcpu in a while. + */ +#define __CSFLAG_vcpu_yield 4 +#define CSFLAG_vcpu_yield (1<<__CSFLAG_vcpu_yield) static unsigned int __read_mostly opt_migrate_resist = 500; integer_param("sched_credit2_migrate_resist", opt_migrate_resist); @@ -1431,6 +1437,14 @@ out: } static void +csched2_vcpu_yield(const struct scheduler *ops, struct vcpu *v) +{ +struct csched2_vcpu * const svc = CSCHED2_VCPU(v); + +__set_bit(__CSFLAG_vcpu_yield, >flags); +} + +static void csched2_context_saved(const struct scheduler *ops, struct vcpu *vc) { struct csched2_vcpu * const svc = CSCHED2_VCPU(vc); @@ -2250,26 +2264,31 @@ runq_candidate(struct csched2_runqueue_data *rqd, struct list_head *iter; struct csched2_vcpu *snext = NULL; struct csched2_private *prv = CSCHED2_PRIV(per_cpu(scheduler, cpu)); +bool yield = __test_and_clear_bit(__CSFLAG_vcpu_yield, >flags); *skipped = 0; -/* Default to current if runnable, idle otherwise */ -if ( vcpu_runnable(scurr->vcpu) ) -snext = scurr; -else -snext = CSCHED2_VCPU(idle_vcpu[cpu]); - /* * Return the current vcpu if it has executed for less than ratelimit. * Adjuststment for the selected vcpu's credit and decision * for how long it will run will be taken in csched2_runtime. + * + * Note that, if scurr is yielding, we don't let rate limiting kick in. + * In fact, it may be the case that scurr is about to spin, and there's + * no point forcing it to do so until rate limiting expires. */ -if ( prv->ratelimit_us && !is_idle_vcpu(scurr->vcpu) && +if ( !yield && prv->ratelimit_us && !is_idle_vcpu(scurr->vcpu) && vcpu_runnable(scurr->vcpu) && (now - scurr->vcpu->runstate.state_entry_time) < MICROSECS(prv->ratelimit_us) ) return scurr; +/* Default to current if runnable, idle otherwise */ +if ( vcpu_runnable(scurr->vcpu) ) +snext = scurr; +else +snext = CSCHED2_VCPU(idle_vcpu[cpu]); + list_for_each( iter, >runq ) { struct csched2_vcpu * svc = list_entry(iter, struct csched2_vcpu, runq_elem); @@ -2293,8 +2312,10 @@ runq_candidate(struct csched2_runqueue_data *rqd, continue; } -/* If this is on a different processor, don't pull it unless - * its credit is at least CSCHED2_MIGRATE_RESIST higher. */ +/* + * If this is on a different processor, don't pull it unless + * its credit is at least CSCHED2_MIGRATE_RESIST higher. + */ if ( svc->vcpu->processor != cpu && snext->credit + CSCHED2_MIGRATE_RESIST > svc->credit ) { @@ -2303,9 +2324,12 @@ runq_candidate(struct csched2_runqueue_data *rqd, continue; } -/* If the next one on the list has more credit than current - *
Re: [Xen-devel] [PATCH v2] x86/32on64: don't modify guest descriptors without need
On 30/09/16 14:36, Jan Beulich wrote: > System gates with type 0 shouldn't have what might be their DPL altered > - such descriptors can't be used anyway without incurring a #GP, and > hence adjusting its DPL is only risking to confuse the guest. > > Also bail right away for non-present descriptors - no need to write > back anything in that case. > > Signed-off-by: Jan BeulichReviewed-by: Andrew Cooper ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 00/10] sched: Credit1 and Credit2 improvements... but *NO* soft-affinity for Credit2!
On Fri, 2016-09-30 at 15:10 +0100, George Dunlap wrote: > On 30/09/16 15:06, Dario Faggioli wrote: > > Mm... looks to me that > > > > 3/10 xen: credit2: make tickling more deterministic > > > > is also missing? Or should I re-`git pull' (I've just done that, > > though...)? > > I see it there in my tree:: > Yep, as said on IRC, it was indeed me that were on the wrong local branch. Sorry for the noise. :-P Thanks and Regards, Dario -- <> (Raistlin Majere) - Dario Faggioli, Ph.D, http://about.me/dario.faggioli Senior Software Engineer, Citrix Systems R Ltd., Cambridge (UK) signature.asc Description: This is a digitally signed message part ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 3/4] x86emul: support RTM instructions
On 30/09/16 14:17, Jan Beulich wrote: > Minimal emulation: XBEGIN aborts right away, hence > - XABORT is just a no-op, > - XEND always raises #GP, > - XTEST always signals neither RTM nor HLE are active. > > Signed-off-by: Jan Beulich> --- > v2: Explicitly generate #UD for xtest and xend. > > --- a/xen/arch/x86/x86_emulate/x86_emulate.c > +++ b/xen/arch/x86/x86_emulate/x86_emulate.c > @@ -1170,6 +1170,8 @@ static bool_t vcpu_has( > #define vcpu_has_clflush() vcpu_has( 1, EDX, 19, ctxt, ops) > #define vcpu_has_lzcnt() vcpu_has(0x8001, ECX, 5, ctxt, ops) > #define vcpu_has_bmi1() vcpu_has(0x0007, EBX, 3, ctxt, ops) > +#define vcpu_has_hle() vcpu_has(0x0007, EBX, 4, ctxt, ops) > +#define vcpu_has_rtm() vcpu_has(0x0007, EBX, 11, ctxt, ops) > > #define vcpu_must_have(leaf, reg, bit) \ > generate_exception_if(!vcpu_has(leaf, reg, bit, ctxt, ops), EXC_UD, -1) > @@ -2863,7 +2865,18 @@ x86_emulate( > lock_prefix = 1; > break; > > -case 0xc6 ... 0xc7: /* mov (sole member of Grp11) */ > +case 0xc6: /* Grp11: mov / xabort */ > +case 0xc7: /* Grp11: mov / xbegin */ > +if ( modrm == 0xf8 && vcpu_has_rtm() ) > +{ Please could we leave notes as to the current behaviour, as it isn't a full emulation. i.e. /* xbegin unconditionally aborts, xabort is unconditionally a nop. */ With something to this effect, Reviewed-by: Andrew Cooper > +if ( b & 1 ) > +{ > +jmp_rel((int32_t)src.val); > +_regs.eax = 0; > +} > +dst.type = OP_NONE; > +break; > +} > generate_exception_if((modrm_reg & 7) != 0, EXC_UD, -1); > case 0x88 ... 0x8b: /* mov */ > case 0xa0 ... 0xa1: /* mov mem.offs,{%al,%ax,%eax,%rax} */ > @@ -4248,6 +4261,20 @@ x86_emulate( > goto done; > goto no_writeback; > > +case 0xd5: /* xend */ > +generate_exception_if(vex.pfx, EXC_UD, -1); > +generate_exception_if(!vcpu_has_rtm(), EXC_UD, -1); > +generate_exception_if(vcpu_has_rtm(), EXC_GP, 0); > +break; > + > +case 0xd6: /* xtest */ > +generate_exception_if(vex.pfx, EXC_UD, -1); > +generate_exception_if(!vcpu_has_rtm() && !vcpu_has_hle(), > + EXC_UD, -1); > +/* Neither HLE nor RTM can be active when we get here. */ > +_regs.eflags |= EFLG_ZF; > +goto no_writeback; > + > case 0xdf: /* invlpga */ > generate_exception_if(!in_protmode(ctxt, ops), EXC_UD, -1); > generate_exception_if(!mode_ring0(), EXC_GP, 0); > > > ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 00/10] sched: Credit1 and Credit2 improvements... but *NO* soft-affinity for Credit2!
On 30/09/16 15:06, Dario Faggioli wrote: > On Fri, 2016-09-30 at 14:51 +0100, George Dunlap wrote: >> On 30/09/16 03:53, Dario Faggioli wrote: >>> >>> Dario Faggioli (10): >> [snip] >>> >>> xen: credit2: implement yield() >>> xen: tracing: add trace records for schedule and rate- >>> limiting. >> >> I've pushed everything but these two; the first because I had >> questions >> about the algorithm, the second because it didn't apply cleanly >> without >> the first. >> > Mm... looks to me that > > 3/10 xen: credit2: make tickling more deterministic > > is also missing? Or should I re-`git pull' (I've just done that, > though...)? I see it there in my tree:: $ git log -n 8 47bd006 Fri Sep 30 04:54:28 2016 +0200 Dario Faggioli xl: allow to set the ratelimit value online for Credit2 368db83 Fri Sep 30 04:54:21 2016 +0200 Dario Faggioli libxl: allow to set the ratelimit value online for Credit2 9724606 Fri Sep 30 04:54:14 2016 +0200 Dario Faggioli libxl: fix coding style of credit1 parameters related functions 4217c01 Fri Sep 30 04:54:07 2016 +0200 Dario Faggioli tools: tracing: handle more scheduling related events. 5e4b419 Fri Sep 30 04:53:46 2016 +0200 Dario Faggioli xen: credit2: only reset credit on reset condition 069cf39 Fri Sep 30 04:53:39 2016 +0200 Dario Faggioli xen: credit2: make tickling more deterministic 9fdd2f3 Fri Sep 30 04:53:32 2016 +0200 Dario Faggioli xen: credit1: don't rate limit context switches in case of yields 0053127 Fri Sep 30 04:53:25 2016 +0200 Dario Faggioli xen: credit1: return the 'time remaining to the limit' as next timeslice. -George ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [xen-unstable-smoke test] 101224: tolerable all pass - PUSHED
flight 101224 xen-unstable-smoke real [real] http://logs.test-lab.xenproject.org/osstest/logs/101224/ Failures :-/ but no regressions. Tests which did not succeed, but are not blocking: test-amd64-amd64-libvirt 12 migrate-support-checkfail never pass test-armhf-armhf-xl 12 migrate-support-checkfail never pass test-armhf-armhf-xl 13 saverestore-support-checkfail never pass version targeted for testing: xen 933acddf74213f77a5adbb2c10ccd3f72634d456 baseline version: xen 002e3ffb2156a135abbfb57374802922e162c5ae Last test of basis 101221 2016-09-30 09:02:48 Z0 days Testing same since 101224 2016-09-30 12:03:47 Z0 days1 attempts People who touched revisions under test: Andrew CooperJan Beulich Julien Grall Lars Kurth Shannon Zhao Wei Liu jobs: build-amd64 pass build-armhf pass build-amd64-libvirt pass test-armhf-armhf-xl pass test-amd64-amd64-xl-qemuu-debianhvm-i386 pass test-amd64-amd64-libvirt pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : + branch=xen-unstable-smoke + revision=933acddf74213f77a5adbb2c10ccd3f72634d456 + . ./cri-lock-repos ++ . ./cri-common +++ . ./cri-getconfig +++ umask 002 +++ getrepos getconfig Repos perl -e ' use Osstest; readglobalconfig(); print $c{"Repos"} or die $!; ' +++ local repos=/home/osstest/repos +++ '[' -z /home/osstest/repos ']' +++ '[' '!' -d /home/osstest/repos ']' +++ echo /home/osstest/repos ++ repos=/home/osstest/repos ++ repos_lock=/home/osstest/repos/lock ++ '[' x '!=' x/home/osstest/repos/lock ']' ++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock ++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push xen-unstable-smoke 933acddf74213f77a5adbb2c10ccd3f72634d456 + branch=xen-unstable-smoke + revision=933acddf74213f77a5adbb2c10ccd3f72634d456 + . ./cri-lock-repos ++ . ./cri-common +++ . ./cri-getconfig +++ umask 002 +++ getrepos getconfig Repos perl -e ' use Osstest; readglobalconfig(); print $c{"Repos"} or die $!; ' +++ local repos=/home/osstest/repos +++ '[' -z /home/osstest/repos ']' +++ '[' '!' -d /home/osstest/repos ']' +++ echo /home/osstest/repos ++ repos=/home/osstest/repos ++ repos_lock=/home/osstest/repos/lock ++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']' + . ./cri-common ++ . ./cri-getconfig ++ umask 002 + select_xenbranch + case "$branch" in + tree=xen + xenbranch=xen-unstable-smoke + qemuubranch=qemu-upstream-unstable + '[' xxen = xlinux ']' + linuxbranch= + '[' xqemu-upstream-unstable = x ']' + select_prevxenbranch ++ ./cri-getprevxenbranch xen-unstable-smoke + prevxenbranch=xen-4.7-testing + '[' x933acddf74213f77a5adbb2c10ccd3f72634d456 = x ']' + : tested/2.6.39.x + . ./ap-common ++ : osst...@xenbits.xen.org +++ getconfig OsstestUpstream +++ perl -e ' use Osstest; readglobalconfig(); print $c{"OsstestUpstream"} or die $!; ' ++ : ++ : git://xenbits.xen.org/xen.git ++ : osst...@xenbits.xen.org:/home/xen/git/xen.git ++ : git://xenbits.xen.org/qemu-xen-traditional.git ++ : git://git.kernel.org ++ : git://git.kernel.org/pub/scm/linux/kernel/git ++ : git ++ : git://xenbits.xen.org/xtf.git ++ : osst...@xenbits.xen.org:/home/xen/git/xtf.git ++ : git://xenbits.xen.org/xtf.git ++ : git://xenbits.xen.org/libvirt.git ++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git ++ : git://xenbits.xen.org/libvirt.git ++ : git://xenbits.xen.org/osstest/rumprun.git ++ : git ++ : git://xenbits.xen.org/osstest/rumprun.git ++ : osst...@xenbits.xen.org:/home/xen/git/osstest/rumprun.git ++ : git://git.seabios.org/seabios.git ++ : osst...@xenbits.xen.org:/home/xen/git/osstest/seabios.git ++ : git://xenbits.xen.org/osstest/seabios.git ++ : https://github.com/tianocore/edk2.git ++ : osst...@xenbits.xen.org:/home/xen/git/osstest/ovmf.git ++ :
Re: [Xen-devel] [PATCH v2 00/10] sched: Credit1 and Credit2 improvements... but *NO* soft-affinity for Credit2!
On Fri, 2016-09-30 at 14:51 +0100, George Dunlap wrote: > On 30/09/16 03:53, Dario Faggioli wrote: > > > > Dario Faggioli (10): > [snip] > > > > xen: credit2: implement yield() > > xen: tracing: add trace records for schedule and rate- > > limiting. > > I've pushed everything but these two; the first because I had > questions > about the algorithm, the second because it didn't apply cleanly > without > the first. > Mm... looks to me that 3/10 xen: credit2: make tickling more deterministic is also missing? Or should I re-`git pull' (I've just done that, though...)? Dario -- <> (Raistlin Majere) - Dario Faggioli, Ph.D, http://about.me/dario.faggioli Senior Software Engineer, Citrix Systems R Ltd., Cambridge (UK) signature.asc Description: This is a digitally signed message part ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v1 02/12] tmem: Retire XEN_SYSCTL_TMEM_OP_[SET_CAP|SAVE_GET_CLIENT_CAP]
On Wed, Sep 28, 2016 at 06:10:58AM -0600, Jan Beulich wrote: > >>> On 28.09.16 at 11:42,wrote: > > It is not used by anything. > > But that shouldn't be the only aspect. Are they also not useful for > anything? As far as I can see it was meant to complement the 'weight'. But the code just hangs around. I can re-introduce it if there is a need for it and make it visible via the XEN_SYSCTL_TMEM_SET_CLIENT_INFO (introduced in patch #7). > > > --- a/xen/common/tmem_control.c > > +++ b/xen/common/tmem_control.c > > @@ -103,9 +103,9 @@ static int tmemc_list_client(struct client *c, > > tmem_cli_va_param_t buf, > > struct tmem_pool *p; > > bool_t s; > > > > -n = scnprintf(info,BSIZE,"C=CI:%d,ww:%d,ca:%d,co:%d,fr:%d," > > +n = scnprintf(info,BSIZE,"C=CI:%d,ww:%d,co:%d,fr:%d," > > "Tc:%"PRIu64",Ge:%ld,Pp:%ld,Gp:%ld%c", > > -c->cli_id, c->weight, c->cap, c->compress, c->frozen, > > +c->cli_id, c->weight, c->compress, c->frozen, > > c->total_cycles, c->succ_eph_gets, c->succ_pers_puts, > > c->succ_pers_gets, > > use_long ? ',' : '\n'); > > if (use_long) > > @@ -273,11 +273,6 @@ static int __tmemc_set_var(struct client *client, > > uint32_t subop, uint32_t arg1) > > atomic_sub(old_weight,_global.client_weight_total); > > atomic_add(client->weight,_global.client_weight_total); > > break; > > -case XEN_SYSCTL_TMEM_OP_SET_CAP: > > -client->cap = arg1; > > -tmem_client_info("tmem: cap set to %d for %s=%d\n", > > -arg1, tmem_cli_id_str, cli_id); > > -break; > > case XEN_SYSCTL_TMEM_OP_SET_COMPRESS: > > if ( tmem_dedup_enabled() ) > > { > > @@ -341,11 +336,6 @@ static int tmemc_save_subop(int cli_id, uint32_t > > pool_id, > > break; > > rc = client->weight == -1 ? -2 : client->weight; > > break; > > -case XEN_SYSCTL_TMEM_OP_SAVE_GET_CLIENT_CAP: > > -if ( client == NULL ) > > -break; > > -rc = client->cap == -1 ? -2 : client->cap; > > -break; > > case XEN_SYSCTL_TMEM_OP_SAVE_GET_CLIENT_FLAGS: > > if ( client == NULL ) > > break; > > It looks like you're removing all accesses to the cap field. That would > suggest that you now want to also remove the field itself. Yes! The patch titled "06/12] tmem: Move client weight,frozen,live_migrating, and compress" did it, but it should have been done here. Thanks! > > > --- a/xen/include/public/sysctl.h > > +++ b/xen/include/public/sysctl.h > > @@ -757,14 +757,12 @@ DEFINE_XEN_GUEST_HANDLE(xen_sysctl_psr_cat_op_t); > > #define XEN_SYSCTL_TMEM_OP_DESTROY3 > > #define XEN_SYSCTL_TMEM_OP_LIST 4 > > #define XEN_SYSCTL_TMEM_OP_SET_WEIGHT 5 > > -#define XEN_SYSCTL_TMEM_OP_SET_CAP6 > > #define XEN_SYSCTL_TMEM_OP_SET_COMPRESS 7 > > #define XEN_SYSCTL_TMEM_OP_QUERY_FREEABLE_MB 8 > > #define XEN_SYSCTL_TMEM_OP_SAVE_BEGIN 10 > > #define XEN_SYSCTL_TMEM_OP_SAVE_GET_VERSION 11 > > #define XEN_SYSCTL_TMEM_OP_SAVE_GET_MAXPOOLS 12 > > #define XEN_SYSCTL_TMEM_OP_SAVE_GET_CLIENT_WEIGHT 13 > > -#define XEN_SYSCTL_TMEM_OP_SAVE_GET_CLIENT_CAP14 > > #define XEN_SYSCTL_TMEM_OP_SAVE_GET_CLIENT_FLAGS 15 > > #define XEN_SYSCTL_TMEM_OP_SAVE_GET_POOL_FLAGS16 > > #define XEN_SYSCTL_TMEM_OP_SAVE_GET_POOL_NPAGES 17 > > I think such removals should be accompanied by bumping > XEN_SYSCTL_INTERFACE_VERSION, albeit it's obviously not as > relevant as it would be when changing some structure's layout. The patch series actually does not alter the layout per say. I think it would be most justified in "11/12] tmem/xc_tmem_control: Rename 'arg1' to 'len' and 'arg2' to arg." as that: "Of all the various sub-commands, the only one that needed semantic change is XEN_SYSCTL_TMEM_OP_SAVE_BEGIN. That in the past used 'arg1', and now we are moving it to use 'arg'." ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 05/10] xen: credit2: implement yield()
On Fri, 2016-09-30 at 13:52 +0100, George Dunlap wrote: > On 30/09/16 03:53, Dario Faggioli wrote: > > > > When a vcpu explicitly yields it is usually giving > > us an advice of "let someone else run and come back > > to me in a bit." > > > > Credit2 isn't, so far, doing anything when a vcpu > > yields, which means an yield is basically a NOP (well, > > actually, it's pure overhead, as it causes the scheduler > > kick in, but the result is --at least 99% of the time-- > > that the very same vcpu that yielded continues to run). > > > > Implement a "preempt bias", to be applied to yielding > > vcpus. Basically when evaluating what vcpu to run next, > > if a vcpu that has just yielded is encountered, we give > > it a credit penalty, and check whether there is anyone > > else that would better take over the cpu (of course, > > if there isn't the yielding vcpu will continue). > > > > The value of this bias can be configured with a boot > > time parameter, and the default is set to 1 ms. > > > > Also, add an yield performance counter, and fix the > > style of a couple of comments. > > > > Signed-off-by: Dario Faggioli> > Hmm, I'm sorry for not asking this earlier -- but what's the > rationale > for having a "yield bias", rather than just choosing the next > runnable > guy in the queue on yield, regardless of what his credits are? > Flexibility, I'd say. Flexibility of deciding 'how strong' an yield would be and, e.g., have two different values for two cpupools (not possible with just this patch, but not a big deal to do in future). Sure, if we only think at the spinlock case, that's not really important (if good at all). OTOH, if you think at yielding as a way of saying: < >. Well, this implementation gives you a way of quantifying that "while". But of course, more flexibility often means more complexity. And in this case, rather than complexity in the code, what would be hard is to come up with a good value for a specific workload. IOW, right now we have no yield. Instead of adding a "yield switch", it's implemented as a "yield knob", which has its up and down sides. I personally like knobs a lot more than switches... But I see the risk of people starting to turn the knob, expecting wonders, and being disappointed (and complaining!) if things don't improve for them! :-P > If snext has 9ms of credit and the top runnable guy on the runqueue > has > 7.8ms of credit, doesn't it make sense to run him instead of making > snext run again and burn his credit? > Again, in the one use case for which yield is most popular (the spinlock one) this that you say totally makes sense. Which makes me think that, even if we were to keep (or go back to) using the bias, I'd probably go with a default value higher than 1ms worth of credits. *ANYWAY*, you asked for a rationale, this is mine. But all this being said, though, I honestly think the simple solution you're hinting at is better, at least for now. Also, I've just tried to implement it, and yes, it works by doing as you suggest here, and the code is simpler. Therefore, I'm going for that. :-) I've just seen you have applied most of the series already. I'll send a v3 consisting only of the remaining patches, with this one modified as suggested. Thanks and Regards, Dario -- <> (Raistlin Majere) - Dario Faggioli, Ph.D, http://about.me/dario.faggioli Senior Software Engineer, Citrix Systems R Ltd., Cambridge (UK) signature.asc Description: This is a digitally signed message part ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] Adding new custom devices to Xen via QEMU
On 30/09/16 14:35, Jason Dickens wrote: > Hi Wei, > > Thanks for the response. It make sense to me that if the device were on > the PCI bus (or other such bus, e.g. USB) that it could be discovered, > at least by an OS. Its something to consider. I should mention that our > guest VM doesn't actually use an OS. > > However, the device is not implemented that as PCI it is simply memory > mapped. Technically, in QEMU is has type ISA because it was derived as a > modification of the TPM device. Is it possible something is lacking in > the QEMU model that Xen needs but KVM doesn't? > If the answer is that Xen should not need modification for any new > devices then this gives me hope. You've also inspired some things to > try, like whether or not smaller modifications to the TPM device work. > One change that is significant to mention is that the physical address > range use is anomalous, by which I mean it not in the normal device range. Does device MMIO overlap with guest RAM? If so, you'll need to unpopulate the RAM first. David ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 00/10] sched: Credit1 and Credit2 improvements... but *NO* soft-affinity for Credit2!
On 30/09/16 03:53, Dario Faggioli wrote: > Hey, > > This is v2 of my Credit1 and Credit2 improvements series. First posting is > here: > > https://lists.xen.org/archives/html/xen-devel/2016-08/msg02183.html > > Now, couple of things: > - some of the patches have been applied already out of v1; > - I've reshuffled the remaining patches a bit, mostly upon reviewers' >requests to do so; > - I'm not including the 'soft affinity for Credit2 patches'. > > In fact, most of the soft affinity work still needs to be reviewed. > > OTOH, the patches that I've put together here, have been reviewed already, and > I think I've addressed all the review comments, which means that --if people > (i.e., mostly George!:-P) manage to have a quick look at them-- they can even > go in for 4.8. > > And while there's really no point in rushing soft-affinity for Credit2 in at > this stage, these patches brings some nice (and moderateely simple) > improvements for both the schedulers, which I think should make it in the > release. > > There's a branch available here: > git://xenbits.xen.org/people/dariof/xen.git > rel/sched/misc-credit1-credit2-plus-credit2-softaff-v2 > > http://xenbits.xen.org/gitweb/?p=people/dariof/xen.git;a=shortlog;h=refs/heads/rel/sched/misc-credit1-credit2-plus-credit2-softaff-v2 > https://travis-ci.org/fdario/xen/builds/163898322 > > Thanks and Regards, > Dario > --- > Dario Faggioli (10): [snip] > xen: credit2: implement yield() > xen: tracing: add trace records for schedule and rate-limiting. I've pushed everything but these two; the first because I had questions about the algorithm, the second because it didn't apply cleanly without the first. -George ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH v2] x86/32on64: don't modify guest descriptors without need
System gates with type 0 shouldn't have what might be their DPL altered - such descriptors can't be used anyway without incurring a #GP, and hence adjusting its DPL is only risking to confuse the guest. Also bail right away for non-present descriptors - no need to write back anything in that case. Signed-off-by: Jan Beulich--- v2: Broken out from a larger patch. --- a/xen/arch/x86/x86_64/mm.c +++ b/xen/arch/x86/x86_64/mm.c @@ -1088,7 +1088,7 @@ int check_descriptor(const struct domain /* A not-present descriptor will always fault, so is safe. */ if ( !(b & _SEGMENT_P) ) -goto good; +return 1; /* Check and fix up the DPL. */ dpl = (b >> 13) & 3; @@ -1130,7 +1130,7 @@ int check_descriptor(const struct domain /* Invalid type 0 is harmless. It is used for 2nd half of a call gate. */ if ( (b & _SEGMENT_TYPE) == 0x000 ) -goto good; +return 1; /* Everything but a call gate is discarded here. */ if ( (b & _SEGMENT_TYPE) != 0xc00 ) ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] Adding new custom devices to Xen via QEMU
Hi Wei, Thanks for the response. It make sense to me that if the device were on the PCI bus (or other such bus, e.g. USB) that it could be discovered, at least by an OS. Its something to consider. I should mention that our guest VM doesn't actually use an OS. However, the device is not implemented that as PCI it is simply memory mapped. Technically, in QEMU is has type ISA because it was derived as a modification of the TPM device. Is it possible something is lacking in the QEMU model that Xen needs but KVM doesn't? If the answer is that Xen should not need modification for any new devices then this gives me hope. You've also inspired some things to try, like whether or not smaller modifications to the TPM device work. One change that is significant to mention is that the physical address range use is anomalous, by which I mean it not in the normal device range. Any references you could give how Xen actually discovers when to use QEMU to service MMIO reads/writes would be useful too. My current understanding is that "something" would have to be used to configure the EPT to trigger QEMU to use on VMEXITs caused by such accesses. Thanks, Jason P.S. Regarding device_model_args= I'm already using this and QEMU seems quite satisfied as the device is realized as expected it simply doesn't receive verified MMIO accesses for the device. On 9/30/2016 7:33 AM, Wei Liu wrote: Hello On Thu, Sep 29, 2016 at 09:35:21AM -0400, Jason Dickens wrote: Hello, My name is Jason Dickens and I'm a Research Scientist here at GrammaTech. Some of our research involves securing hypervisors and we have needed to add to and/or modify Xen. I have been successful in modifying the source for various purposes, but my question now is about devices. We have a custom device model implemented in QEMU which works great with QEMU (on Intel) standalone and with KVM, however, we now want access to it in Xen using the same modified QEMU build. The only problem I seem to be having is getting Xen to send the MMIO R/W's to QEMU. The device is being realized, but guest access to the physical address range I expect to reference the device seem to go no place. I see in the source calls such as "register_io_handler" that other devices use to effect the EPT mapping. Is this what I need? My main question is whether or not it is truly necessary to change Xen itself in order to introduce new devices in Xen using QEMU, or is there just a configuration setting? And what is the simplest way to have a range of It is most likely achievable by just configuring your PCI device (?) in QEMU (provided it has a reasonable BAR size or whatnot). Check out various device model options in xl.cfg manpage, especially device_model_args= option allows you to add arbitrary options to QEMU. Wei. physical addresses access a custom QEMU device? Thanks, Jason ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 08/10] libxl: fix coding style of credit1 parameters related functions
On 30/09/16 13:04, Dario Faggioli wrote: > On Fri, 2016-09-30 at 11:24 +0100, Ian Jackson wrote: >> Dario Faggioli writes ("[PATCH v2 08/10] libxl: fix coding style of >> credit1 parameters related functions"): >>> int libxl_sched_credit_params_set(libxl_ctx *ctx, uint32_t poolid, >>>libxl_sched_credit_params >>> *scinfo) >>> { >>> struct xen_sysctl_credit_schedule sparam; >>> -int rc=0; >>> +int r, rc; >> ... >>> >>> >>> scinfo->tslice_ms = sparam.tslice_ms; >>> scinfo->ratelimit_us = sparam.ratelimit_us; >>> >>> + out: >>> GC_FREE; >>> -return 0; >>> +return rc; >> >> I think this is missing an assignment >> >> rc = 0; >> >> on the successful exit path, just before out. Am I wrong ? >> > Indeed it's missing. It was not necessary in v1 of this patch, so I > must have failed to notice that it was, when splitting that in two. > > Sorry. > > Not sure how to proceed, so I'm attaching an updated version of the > patch to this email. I've fixed it up in my tree, and added your Acked-by, Ian. -George ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 09/30] x86/vtd: fix and simplify mapping RMRR regions
>>> On 30.09.16 at 13:27,wrote: > On Thu, Sep 29, 2016 at 08:18:36AM -0600, Jan Beulich wrote: >> >>> On 27.09.16 at 17:57, wrote: >> > +{ >> > +int rc; >> > + >> > +while ( nr_pages > 0 ) >> > +{ >> > +rc = (map ? map_mmio_regions : unmap_mmio_regions) >> > + (d, _gfn(pfn), nr_pages, _mfn(pfn)); >> > +if ( rc == 0 ) >> > +break; >> > +if ( rc < 0 ) >> > +{ >> > +printk(XENLOG_ERR >> > +"Failed to %smap %#lx - %#lx into domain %d memory map: >> > %d\n", >> > + map ? "" : "un", pfn, pfn + nr_pages, d->domain_id, >> > rc); >> > +return rc; >> > +} >> > +nr_pages -= rc; >> > +pfn += rc; >> > +process_pending_softirqs(); >> > +} >> > + >> > +return rc; >> >> The way this is coded it appears to possibly return non-zero even in >> success case. I think this would therefore better be a for ( ; ; ) loop. > > I don't think this is possible, {un}map_mmio_regions will return < 0 on > error, > 0 if there are pending pages to map, and 0 when all the requested > pages have been mapped successfully. Right - hence the "appears to" in my reply; it took me a while to figure it's not actually possible, and hence my desire to make this more obvious to the reader. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 2/4] x86emul: consolidate segment register handling
On 30/09/16 14:16, Jan Beulich wrote: > Use a single set of variables throughout the huge switch() statement, > allowing to funnel SLDT/STR into the mov-from-sreg code path. > > Signed-off-by: Jan Beulich> Reviewed-by: Andrew Cooper Also looking better. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 1/4] x86emul: support UMIP
On 30/09/16 14:16, Jan Beulich wrote: > To make this complete, also add support for SLDT and STR. Note that by > just looking at the guest CR4 bit, this is independent of actually > making available the UMIP feature to guests. > > Signed-off-by: Jan BeulichMuch clearer. Reviewed-by: Andrew Cooper ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH v2 4/4] x86emul: conditionally clear BNDn for branches
Considering that we surface MPX to HVM guests, instructions we emulate should also correctly deal with MPX state. While for now BND* instructions don't get emulated, the effect of branches (which we do emulate) without BND prefix should be taken care of. No need to alter XABORT behavior: While not mentioned in the SDM so far, this restores BNDn as they were at the XBEGIN, and since we make XBEGIN abort right away, XABORT in the emulator is only a no-op. Signed-off-by: Jan Beulich--- v2: Re-base. Address all RFC reasons based on feedback from Intel. Re-work the actual clearing of BNDn. --- a/tools/tests/x86_emulator/x86_emulate.c +++ b/tools/tests/x86_emulator/x86_emulate.c @@ -22,6 +22,8 @@ typedef bool bool_t; #define cpu_has_amd_erratum(nr) 0 #define mark_regs_dirty(r) ((void)(r)) +#define read_bndcfgu() 0 +#define xstate_set_init(what) #define likely(x) __builtin_expect(!!(x), true) #define unlikely(x) __builtin_expect(!!(x), false) --- a/xen/arch/x86/domain.c +++ b/xen/arch/x86/domain.c @@ -421,6 +421,8 @@ int vcpu_initialise(struct vcpu *v) vmce_init_vcpu(v); } +else if ( (rc = xstate_alloc_save_area(v)) != 0 ) +return rc; spin_lock_init(>arch.vpmu.vpmu_lock); --- a/xen/arch/x86/x86_emulate/x86_emulate.c +++ b/xen/arch/x86/x86_emulate/x86_emulate.c @@ -409,6 +409,9 @@ typedef union { #define MSR_SYSENTER_CS 0x0174 #define MSR_SYSENTER_ESP 0x0175 #define MSR_SYSENTER_EIP 0x0176 +#define MSR_BNDCFGS 0x0d90 +#define BNDCFG_ENABLE(1 << 0) +#define BNDCFG_PRESERVE (1 << 1) #define MSR_EFER 0xc080 #define MSR_STAR 0xc081 #define MSR_LSTAR0xc082 @@ -1172,6 +1175,7 @@ static bool_t vcpu_has( #define vcpu_has_bmi1() vcpu_has(0x0007, EBX, 3, ctxt, ops) #define vcpu_has_hle() vcpu_has(0x0007, EBX, 4, ctxt, ops) #define vcpu_has_rtm() vcpu_has(0x0007, EBX, 11, ctxt, ops) +#define vcpu_has_mpx() vcpu_has(0x0007, EBX, 14, ctxt, ops) #define vcpu_must_have(leaf, reg, bit) \ generate_exception_if(!vcpu_has(leaf, reg, bit, ctxt, ops), EXC_UD, -1) @@ -1616,6 +1620,34 @@ static int inject_swint(enum x86_swint_t return ops->inject_hw_exception(fault_type, error_code, ctxt); } +static void clear_bnd(struct x86_emulate_ctxt *ctxt, + const struct x86_emulate_ops *ops, enum vex_pfx pfx) +{ +uint64_t bndcfg; +int rc; + +if ( pfx == vex_f2 || !vcpu_has_mpx() ) +return; + +if ( !mode_ring0() ) +bndcfg = read_bndcfgu(); +else if ( !ops->read_msr || + ops->read_msr(MSR_BNDCFGS, , ctxt) != X86EMUL_OKAY ) +return; +if ( (bndcfg & BNDCFG_ENABLE) && !(bndcfg & BNDCFG_PRESERVE) ) +{ +/* + * Using BNDMK or any other MPX instruction here is pointless, as + * we run with MPX disabled ourselves, and hence they're all no-ops. + * Therefore we have two ways to clear BNDn: Enable MPX temporarily + * (in which case executing any suitable non-prefixed branch + * instruction would do), or use XRSTOR. + */ +xstate_set_init(XSTATE_BNDREGS); +} + done:; +} + int x86emul_unhandleable_rw( enum x86_segment seg, unsigned long offset, @@ -2835,6 +2867,7 @@ x86_emulate( case 0x70 ... 0x7f: /* jcc (short) */ if ( test_cc(b, _regs.eflags) ) jmp_rel((int32_t)src.val); +clear_bnd(ctxt, ops, vex.pfx); break; case 0x82: /* Grp1 (x86/32 only) */ @@ -3184,6 +3217,7 @@ x86_emulate( (rc = ops->insn_fetch(x86_seg_cs, dst.val, NULL, 0, ctxt)) ) goto done; _regs.eip = dst.val; +clear_bnd(ctxt, ops, vex.pfx); break; case 0xc4: /* les */ { @@ -3910,12 +3944,15 @@ x86_emulate( op_bytes = ((op_bytes == 4) && mode_64bit()) ? 8 : op_bytes; src.val = _regs.eip; jmp_rel(rel); +clear_bnd(ctxt, ops, vex.pfx); goto push; } case 0xe9: /* jmp (near) */ case 0xeb: /* jmp (short) */ jmp_rel((int32_t)src.val); +if ( !(b & 2) ) +clear_bnd(ctxt, ops, vex.pfx); break; case 0xea: /* jmp (far, absolute) */ @@ -4175,12 +4212,14 @@ x86_emulate( goto done; _regs.eip = src.val; src.val = dst.val; +clear_bnd(ctxt, ops, vex.pfx); goto push; case 4: /* jmp (near) */ if ( (rc = ops->insn_fetch(x86_seg_cs, src.val, NULL, 0, ctxt)) ) goto done; _regs.eip = src.val; dst.type = OP_NONE; +clear_bnd(ctxt, ops, vex.pfx); break; case 3: /* call (far, absolute indirect) */ case 5: /* jmp (far, absolute indirect) */ { @@ -4893,6 +4932,7 @@ x86_emulate( case X86EMUL_OPC(0x0f, 0x80) ... X86EMUL_OPC(0x0f, 0x8f): /* jcc (near) */ if ( test_cc(b, _regs.eflags) )
[Xen-devel] [PATCH v2 3/4] x86emul: support RTM instructions
Minimal emulation: XBEGIN aborts right away, hence - XABORT is just a no-op, - XEND always raises #GP, - XTEST always signals neither RTM nor HLE are active. Signed-off-by: Jan Beulich--- v2: Explicitly generate #UD for xtest and xend. --- a/xen/arch/x86/x86_emulate/x86_emulate.c +++ b/xen/arch/x86/x86_emulate/x86_emulate.c @@ -1170,6 +1170,8 @@ static bool_t vcpu_has( #define vcpu_has_clflush() vcpu_has( 1, EDX, 19, ctxt, ops) #define vcpu_has_lzcnt() vcpu_has(0x8001, ECX, 5, ctxt, ops) #define vcpu_has_bmi1() vcpu_has(0x0007, EBX, 3, ctxt, ops) +#define vcpu_has_hle() vcpu_has(0x0007, EBX, 4, ctxt, ops) +#define vcpu_has_rtm() vcpu_has(0x0007, EBX, 11, ctxt, ops) #define vcpu_must_have(leaf, reg, bit) \ generate_exception_if(!vcpu_has(leaf, reg, bit, ctxt, ops), EXC_UD, -1) @@ -2863,7 +2865,18 @@ x86_emulate( lock_prefix = 1; break; -case 0xc6 ... 0xc7: /* mov (sole member of Grp11) */ +case 0xc6: /* Grp11: mov / xabort */ +case 0xc7: /* Grp11: mov / xbegin */ +if ( modrm == 0xf8 && vcpu_has_rtm() ) +{ +if ( b & 1 ) +{ +jmp_rel((int32_t)src.val); +_regs.eax = 0; +} +dst.type = OP_NONE; +break; +} generate_exception_if((modrm_reg & 7) != 0, EXC_UD, -1); case 0x88 ... 0x8b: /* mov */ case 0xa0 ... 0xa1: /* mov mem.offs,{%al,%ax,%eax,%rax} */ @@ -4248,6 +4261,20 @@ x86_emulate( goto done; goto no_writeback; +case 0xd5: /* xend */ +generate_exception_if(vex.pfx, EXC_UD, -1); +generate_exception_if(!vcpu_has_rtm(), EXC_UD, -1); +generate_exception_if(vcpu_has_rtm(), EXC_GP, 0); +break; + +case 0xd6: /* xtest */ +generate_exception_if(vex.pfx, EXC_UD, -1); +generate_exception_if(!vcpu_has_rtm() && !vcpu_has_hle(), + EXC_UD, -1); +/* Neither HLE nor RTM can be active when we get here. */ +_regs.eflags |= EFLG_ZF; +goto no_writeback; + case 0xdf: /* invlpga */ generate_exception_if(!in_protmode(ctxt, ops), EXC_UD, -1); generate_exception_if(!mode_ring0(), EXC_GP, 0); x86emul: support RTM instructions Minimal emulation: XBEGIN aborts right away, hence - XABORT is just a no-op, - XEND always raises #GP, - XTEST always signals neither RTM nor HLE are active. Signed-off-by: Jan Beulich --- v2: Explicitly generate #UD for xtest and xend. --- a/xen/arch/x86/x86_emulate/x86_emulate.c +++ b/xen/arch/x86/x86_emulate/x86_emulate.c @@ -1170,6 +1170,8 @@ static bool_t vcpu_has( #define vcpu_has_clflush() vcpu_has( 1, EDX, 19, ctxt, ops) #define vcpu_has_lzcnt() vcpu_has(0x8001, ECX, 5, ctxt, ops) #define vcpu_has_bmi1() vcpu_has(0x0007, EBX, 3, ctxt, ops) +#define vcpu_has_hle() vcpu_has(0x0007, EBX, 4, ctxt, ops) +#define vcpu_has_rtm() vcpu_has(0x0007, EBX, 11, ctxt, ops) #define vcpu_must_have(leaf, reg, bit) \ generate_exception_if(!vcpu_has(leaf, reg, bit, ctxt, ops), EXC_UD, -1) @@ -2863,7 +2865,18 @@ x86_emulate( lock_prefix = 1; break; -case 0xc6 ... 0xc7: /* mov (sole member of Grp11) */ +case 0xc6: /* Grp11: mov / xabort */ +case 0xc7: /* Grp11: mov / xbegin */ +if ( modrm == 0xf8 && vcpu_has_rtm() ) +{ +if ( b & 1 ) +{ +jmp_rel((int32_t)src.val); +_regs.eax = 0; +} +dst.type = OP_NONE; +break; +} generate_exception_if((modrm_reg & 7) != 0, EXC_UD, -1); case 0x88 ... 0x8b: /* mov */ case 0xa0 ... 0xa1: /* mov mem.offs,{%al,%ax,%eax,%rax} */ @@ -4248,6 +4261,20 @@ x86_emulate( goto done; goto no_writeback; +case 0xd5: /* xend */ +generate_exception_if(vex.pfx, EXC_UD, -1); +generate_exception_if(!vcpu_has_rtm(), EXC_UD, -1); +generate_exception_if(vcpu_has_rtm(), EXC_GP, 0); +break; + +case 0xd6: /* xtest */ +generate_exception_if(vex.pfx, EXC_UD, -1); +generate_exception_if(!vcpu_has_rtm() && !vcpu_has_hle(), + EXC_UD, -1); +/* Neither HLE nor RTM can be active when we get here. */ +_regs.eflags |= EFLG_ZF; +goto no_writeback; + case 0xdf: /* invlpga */ generate_exception_if(!in_protmode(ctxt, ops), EXC_UD, -1); generate_exception_if(!mode_ring0(), EXC_GP, 0); ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 06/10] xen: tracing: add trace records for schedule and rate-limiting.
On 30/09/16 03:54, Dario Faggioli wrote: > As far as {csched, csched2, rt}_schedule() are concerned, > an "empty" event, would already make it easier to read and > understand a trace. > > But while there, add a few useful information, like > if the cpu that is going through the scheduler has > been tickled or not, if it is currently idle, etc > (they vary, on a per-scheduler basis). > > For Credit1 and Credit2, add a record about when > rate-limiting kicks in too. > > Signed-off-by: Dario FaggioliAcked-by: George Dunlap > --- > Cc: George Dunlap > Cc: Meng Xu > Cc: Anshul Makkar > --- > Changes from v1: > * corrected the schedule record for sched_rt.c, as pointed out during review; > * pack the Credit1 records as well, as requested during review. > --- > xen/common/sched_credit.c | 32 > xen/common/sched_credit2.c | 40 +++- > xen/common/sched_rt.c | 15 +++ > 3 files changed, 86 insertions(+), 1 deletion(-) > > diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c > index 5700763..fc3a321 100644 > --- a/xen/common/sched_credit.c > +++ b/xen/common/sched_credit.c > @@ -133,6 +133,8 @@ > #define TRC_CSCHED_TICKLETRC_SCHED_CLASS_EVT(CSCHED, 6) > #define TRC_CSCHED_BOOST_START TRC_SCHED_CLASS_EVT(CSCHED, 7) > #define TRC_CSCHED_BOOST_END TRC_SCHED_CLASS_EVT(CSCHED, 8) > +#define TRC_CSCHED_SCHEDULE TRC_SCHED_CLASS_EVT(CSCHED, 9) > +#define TRC_CSCHED_RATELIMIT TRC_SCHED_CLASS_EVT(CSCHED, 10) > > > /* > @@ -1774,6 +1776,23 @@ csched_schedule( > SCHED_STAT_CRANK(schedule); > CSCHED_VCPU_CHECK(current); > > +/* > + * Here in Credit1 code, we usually just call TRACE_nD() helpers, and > + * don't care about packing. But scheduling happens very often, so it > + * actually is important that the record is as small as possible. > + */ > +if ( unlikely(tb_init_done) ) > +{ > +struct { > +unsigned cpu:16, tasklet:8, idle:8; > +} d; > +d.cpu = cpu; > +d.tasklet = tasklet_work_scheduled; > +d.idle = is_idle_vcpu(current); > +__trace_var(TRC_CSCHED_SCHEDULE, 1, sizeof(d), > +(unsigned char *)); > +} > + > runtime = now - current->runstate.state_entry_time; > if ( runtime < 0 ) /* Does this ever happen? */ > runtime = 0; > @@ -1829,6 +1848,19 @@ csched_schedule( > tslice = MICROSECS(prv->ratelimit_us) - runtime; > if ( unlikely(runtime < CSCHED_MIN_TIMER) ) > tslice = CSCHED_MIN_TIMER; > +if ( unlikely(tb_init_done) ) > +{ > +struct { > +unsigned vcpu:16, dom:16; > +unsigned runtime; > +} d; > +d.dom = scurr->vcpu->domain->domain_id; > +d.vcpu = scurr->vcpu->vcpu_id; > +d.runtime = runtime; > +__trace_var(TRC_CSCHED_RATELIMIT, 1, sizeof(d), > +(unsigned char *)); > +} > + > ret.migrated = 0; > goto out; > } > diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c > index fde61ef..5cf3f16 100644 > --- a/xen/common/sched_credit2.c > +++ b/xen/common/sched_credit2.c > @@ -55,6 +55,8 @@ > #define TRC_CSCHED2_LOAD_BALANCE TRC_SCHED_CLASS_EVT(CSCHED2, 17) > #define TRC_CSCHED2_PICKED_CPU TRC_SCHED_CLASS_EVT(CSCHED2, 19) > #define TRC_CSCHED2_RUNQ_CANDIDATE TRC_SCHED_CLASS_EVT(CSCHED2, 20) > +#define TRC_CSCHED2_SCHEDULE TRC_SCHED_CLASS_EVT(CSCHED2, 21) > +#define TRC_CSCHED2_RATELIMITTRC_SCHED_CLASS_EVT(CSCHED2, 22) > > /* > * WARNING: This is still in an experimental phase. Status and work can be > found at the > @@ -2288,12 +2290,29 @@ runq_candidate(struct csched2_runqueue_data *rqd, > * no point forcing it to do so until rate limiting expires. > */ > if ( __test_and_clear_bit(__CSFLAG_vcpu_yield, >flags) ) > +{ > yield_bias = CSCHED2_YIELD_BIAS; > +} > else if ( prv->ratelimit_us && !is_idle_vcpu(scurr->vcpu) && >vcpu_runnable(scurr->vcpu) && >(now - scurr->vcpu->runstate.state_entry_time) < > MICROSECS(prv->ratelimit_us) ) > +{ > +if ( unlikely(tb_init_done) ) > +{ > +struct { > +unsigned vcpu:16, dom:16; > +unsigned runtime; > +} d; > +d.dom = scurr->vcpu->domain->domain_id; > +d.vcpu = scurr->vcpu->vcpu_id; > +d.runtime = now - scurr->vcpu->runstate.state_entry_time; > +__trace_var(TRC_CSCHED2_RATELIMIT, 1, > +sizeof(d), > +(unsigned char *)); > +} > return scurr; > +} > >
[Xen-devel] [PATCH v2 2/4] x86emul: consolidate segment register handling
Use a single set of variables throughout the huge switch() statement, allowing to funnel SLDT/STR into the mov-from-sreg code path. Signed-off-by: Jan BeulichReviewed-by: Andrew Cooper --- v2: s/store_seg/store_selector/g. Re-base. --- a/xen/arch/x86/x86_emulate/x86_emulate.c +++ b/xen/arch/x86/x86_emulate/x86_emulate.c @@ -2505,7 +2505,8 @@ x86_emulate( switch ( ctxt->opcode ) { -struct segment_register cs; +enum x86_segment seg; +struct segment_register cs, sreg; case 0x00 ... 0x05: add: /* add */ emulate_2op_SrcV("add", src, dst, _regs.eflags); @@ -2541,22 +2542,20 @@ x86_emulate( dst.type = OP_NONE; break; -case 0x06: /* push %%es */ { -struct segment_register reg; +case 0x06: /* push %%es */ src.val = x86_seg_es; push_seg: generate_exception_if(mode_64bit() && !ext, EXC_UD, -1); fail_if(ops->read_segment == NULL); -if ( (rc = ops->read_segment(src.val, , ctxt)) != 0 ) +if ( (rc = ops->read_segment(src.val, , ctxt)) != 0 ) goto done; /* 64-bit mode: PUSH defaults to a 64-bit operand. */ if ( mode_64bit() && (op_bytes == 4) ) op_bytes = 8; if ( (rc = ops->write(x86_seg_ss, sp_pre_dec(op_bytes), - , op_bytes, ctxt)) != 0 ) + , op_bytes, ctxt)) != 0 ) goto done; break; -} case 0x07: /* pop %%es */ src.val = x86_seg_es; @@ -2872,21 +2871,20 @@ x86_emulate( dst.val = src.val; break; -case 0x8c: /* mov Sreg,r/m */ { -struct segment_register reg; -enum x86_segment seg = decode_segment(modrm_reg); +case 0x8c: /* mov Sreg,r/m */ +seg = decode_segment(modrm_reg); generate_exception_if(seg == decode_segment_failed, EXC_UD, -1); +store_selector: fail_if(ops->read_segment == NULL); -if ( (rc = ops->read_segment(seg, , ctxt)) != 0 ) +if ( (rc = ops->read_segment(seg, , ctxt)) != 0 ) goto done; -dst.val = reg.sel; +dst.val = sreg.sel; if ( dst.type == OP_MEM ) dst.bytes = 2; break; -} -case 0x8e: /* mov r/m,Sreg */ { -enum x86_segment seg = decode_segment(modrm_reg); +case 0x8e: /* mov r/m,Sreg */ +seg = decode_segment(modrm_reg); generate_exception_if(seg == decode_segment_failed, EXC_UD, -1); generate_exception_if(seg == x86_seg_cs, EXC_UD, -1); if ( (rc = load_seg(seg, src.val, 0, NULL, ctxt, ops)) != 0 ) @@ -2895,7 +2893,6 @@ x86_emulate( ctxt->retire.flags.mov_ss = 1; dst.type = OP_NONE; break; -} case 0x8d: /* lea */ generate_exception_if(ea.type != OP_MEM, EXC_UD, -1); @@ -2952,17 +2949,15 @@ x86_emulate( } break; -case 0x9a: /* call (far, absolute) */ { -struct segment_register reg; - +case 0x9a: /* call (far, absolute) */ ASSERT(!mode_64bit()); fail_if(ops->read_segment == NULL); -if ( (rc = ops->read_segment(x86_seg_cs, , ctxt)) || +if ( (rc = ops->read_segment(x86_seg_cs, , ctxt)) || (rc = load_seg(x86_seg_cs, imm2, 0, , ctxt, ops)) || (validate_far_branch(, imm1), rc = ops->write(x86_seg_ss, sp_pre_dec(op_bytes), - , op_bytes, ctxt)) || + , op_bytes, ctxt)) || (rc = ops->write(x86_seg_ss, sp_pre_dec(op_bytes), &_regs.eip, op_bytes, ctxt)) || (rc = ops->write_segment(x86_seg_cs, , ctxt)) ) @@ -2970,7 +2965,6 @@ x86_emulate( _regs.eip = imm1; break; -} case 0x9b: /* wait/fwait */ emulate_fpu_insn("fwait"); @@ -4179,13 +4173,12 @@ x86_emulate( if ( (modrm_reg & 7) == 3 ) /* call */ { -struct segment_register reg; fail_if(ops->read_segment == NULL); -if ( (rc = ops->read_segment(x86_seg_cs, , ctxt)) || +if ( (rc = ops->read_segment(x86_seg_cs, , ctxt)) || (rc = load_seg(x86_seg_cs, sel, 0, , ctxt, ops)) || (validate_far_branch(, src.val), rc = ops->write(x86_seg_ss, sp_pre_dec(op_bytes), - , op_bytes, ctxt)) || + , op_bytes, ctxt)) || (rc = ops->write(x86_seg_ss, sp_pre_dec(op_bytes), &_regs.eip, op_bytes, ctxt)) || (rc = ops->write_segment(x86_seg_cs, , ctxt)) ) @@ -4207,23 +4200,13 @@ x86_emulate( break; case X86EMUL_OPC(0x0f, 0x00): /* Grp6 */ -{ -enum x86_segment seg = (modrm_reg & 1) ? x86_seg_tr :
[Xen-devel] [PATCH v2 1/4] x86emul: support UMIP
To make this complete, also add support for SLDT and STR. Note that by just looking at the guest CR4 bit, this is independent of actually making available the UMIP feature to guests. Signed-off-by: Jan Beulich--- v2: s/is_umip/umip_active/g. Fix a coding style issue. --- a/xen/arch/x86/x86_emulate/x86_emulate.c +++ b/xen/arch/x86/x86_emulate/x86_emulate.c @@ -182,7 +182,7 @@ static const opcode_desc_t opcode_table[ static const opcode_desc_t twobyte_table[256] = { /* 0x00 - 0x07 */ -SrcMem16|ModRM, ImplicitOps|ModRM, ModRM, ModRM, +ModRM, ImplicitOps|ModRM, ModRM, ModRM, 0, ImplicitOps, ImplicitOps, ImplicitOps, /* 0x08 - 0x0F */ ImplicitOps, ImplicitOps, 0, ImplicitOps, @@ -419,6 +419,7 @@ typedef union { /* Control register flags. */ #define CR0_PE(1<<0) #define CR4_TSD (1<<2) +#define CR4_UMIP (1<<11) /* EFLAGS bit definitions. */ #define EFLG_VIP (1<<20) @@ -1493,6 +1494,17 @@ static bool is_aligned(enum x86_segment return !((reg.base + offs) & (size - 1)); } +static bool umip_active(struct x86_emulate_ctxt *ctxt, +const struct x86_emulate_ops *ops) +{ +unsigned long cr4; + +/* Intentionally not using mode_ring0() here to avoid its fail_if(). */ +return get_cpl(ctxt, ops) > 0 && + ops->read_cr && ops->read_cr(4, , ctxt) == X86EMUL_OKAY && + (cr4 & CR4_UMIP); +} + /* Inject a software interrupt/exception, emulating if needed. */ static int inject_swint(enum x86_swint_type type, uint8_t vector, uint8_t insn_len, @@ -2068,10 +2080,20 @@ x86_decode( break; case ext_0f: -case ext_0f3a: -case ext_8f08: -case ext_8f09: -case ext_8f0a: +switch ( b ) +{ +case 0x00: /* Grp6 */ +switch ( modrm_reg & 6 ) +{ +case 0: +d |= DstMem | SrcImplicit | Mov; +break; +case 2: case 4: +d |= SrcMem16; +break; +} +break; +} break; case ext_0f38: @@ -2087,6 +2109,12 @@ x86_decode( } break; +case ext_0f3a: +case ext_8f08: +case ext_8f09: +case ext_8f0a: +break; + default: ASSERT_UNREACHABLE(); } @@ -4179,13 +4207,34 @@ x86_emulate( break; case X86EMUL_OPC(0x0f, 0x00): /* Grp6 */ -fail_if((modrm_reg & 6) != 2); +{ +enum x86_segment seg = (modrm_reg & 1) ? x86_seg_tr : x86_seg_ldtr; + generate_exception_if(!in_protmode(ctxt, ops), EXC_UD, -1); -generate_exception_if(!mode_ring0(), EXC_GP, 0); -if ( (rc = load_seg((modrm_reg & 1) ? x86_seg_tr : x86_seg_ldtr, -src.val, 0, NULL, ctxt, ops)) != 0 ) -goto done; +switch ( modrm_reg & 6 ) +{ +struct segment_register sreg; + +case 0: /* sldt / str */ +generate_exception_if(umip_active(ctxt, ops), EXC_GP, 0); +fail_if(!ops->read_segment); +if ( (rc = ops->read_segment(seg, , ctxt)) != 0 ) +goto done; +dst.val = sreg.sel; +if ( dst.type == OP_MEM ) +dst.bytes = 2; +break; +case 2: /* lldt / ltr */ +generate_exception_if(!mode_ring0(), EXC_GP, 0); +if ( (rc = load_seg(seg, src.val, 0, NULL, ctxt, ops)) != 0 ) +goto done; +break; +default: +generate_exception_if(true, EXC_UD, -1); +break; +} break; +} case X86EMUL_OPC(0x0f, 0x01): /* Grp7 */ { struct segment_register reg; @@ -4282,6 +4331,7 @@ x86_emulate( case 0: /* sgdt */ case 1: /* sidt */ generate_exception_if(ea.type != OP_MEM, EXC_UD, -1); +generate_exception_if(umip_active(ctxt, ops), EXC_GP, 0); fail_if(ops->read_segment == NULL); if ( (rc = ops->read_segment((modrm_reg & 1) ? x86_seg_idtr : x86_seg_gdtr, @@ -4316,6 +4366,7 @@ x86_emulate( goto done; break; case 4: /* smsw */ +generate_exception_if(umip_active(ctxt, ops), EXC_GP, 0); ea.bytes = (ea.type == OP_MEM) ? 2 : op_bytes; dst = ea; fail_if(ops->read_cr == NULL); x86emul: support UMIP To make this complete, also add support for SLDT and STR. Note that by just looking at the guest CR4 bit, this is independent of actually making available the UMIP feature to guests. Signed-off-by: Jan Beulich --- v2: s/is_umip/umip_active/g. Fix a coding style issue. --- a/xen/arch/x86/x86_emulate/x86_emulate.c +++
[Xen-devel] [PATCH v2 0/4] x86: further insn emulator improvements
1: support UMIP 2: consolidate segment register handling 3: support RTM instructions 4: conditionally clear BNDn for branches Signed-off-by: Jan Beulich___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel