[Xen-devel] [xen-unstable test] 101215: tolerable FAIL - PUSHED
flight 101215 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/101215/

Failures :-/ but no regressions.

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-xl-qemut-win7-amd64   16 guest-stop      fail like 101209
 test-amd64-i386-xl-qemuu-win7-amd64   16 guest-stop      fail like 101209
 test-amd64-amd64-xl-qemut-win7-amd64  16 guest-stop      fail like 101209
 test-amd64-amd64-xl-qemuu-win7-amd64  16 guest-stop      fail like 101209
 test-amd64-amd64-xl-rtds               9 debian-install  fail like 101209

Tests which did not succeed, but are not blocking:
 test-amd64-i386-rumprun-i386    1 build-check(1)  blocked n/a
 test-amd64-amd64-rumprun-amd64  1 build-check(1)  blocked n/a
 build-amd64-rumprun             5 rumprun-build   fail never pass
 test-armhf-armhf-libvirt-xsm   12 migrate-support-check  fail never pass
 test-armhf-armhf-libvirt-xsm   14 guest-saverestore      fail never pass
 build-i386-rumprun              5 rumprun-build   fail never pass
 test-armhf-armhf-libvirt       12 migrate-support-check  fail never pass
 test-armhf-armhf-libvirt       14 guest-saverestore      fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-armhf-armhf-libvirt-qcow2 11 migrate-support-check  fail never pass
 test-armhf-armhf-libvirt-qcow2 13 guest-saverestore      fail never pass
 test-armhf-armhf-xl            12 migrate-support-check      fail never pass
 test-armhf-armhf-xl            13 saverestore-support-check  fail never pass
 test-armhf-armhf-xl-credit2    12 migrate-support-check      fail never pass
 test-armhf-armhf-xl-credit2    13 saverestore-support-check  fail never pass
 test-armhf-armhf-xl-vhd        11 migrate-support-check      fail never pass
 test-armhf-armhf-xl-vhd        12 saverestore-support-check  fail never pass
 test-armhf-armhf-xl-arndale    12 migrate-support-check      fail never pass
 test-armhf-armhf-xl-arndale    13 saverestore-support-check  fail never pass
 test-armhf-armhf-libvirt-raw   11 migrate-support-check  fail never pass
 test-armhf-armhf-libvirt-raw   13 guest-saverestore      fail never pass
 test-armhf-armhf-xl-rtds       12 migrate-support-check      fail never pass
 test-armhf-armhf-xl-rtds       13 saverestore-support-check  fail never pass
 test-armhf-armhf-xl-multivcpu  12 migrate-support-check      fail never pass
 test-armhf-armhf-xl-multivcpu  13 saverestore-support-check  fail never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-check      fail never pass
 test-armhf-armhf-xl-cubietruck 13 saverestore-support-check  fail never pass
 test-armhf-armhf-xl-xsm        12 migrate-support-check      fail never pass
 test-armhf-armhf-xl-xsm        13 saverestore-support-check  fail never pass
 test-amd64-amd64-libvirt-vhd   11 migrate-support-check  fail never pass
 test-amd64-amd64-libvirt-xsm   12 migrate-support-check  fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-amd64-libvirt       12 migrate-support-check  fail never pass
 test-amd64-amd64-xl-pvh-amd    11 guest-start            fail never pass
 test-amd64-i386-libvirt-xsm    12 migrate-support-check  fail never pass
 test-amd64-i386-libvirt        12 migrate-support-check  fail never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2 fail never pass
 test-amd64-amd64-xl-pvh-intel  11 guest-start            fail never pass

version targeted for testing:
 xen                  eabe1c39d1cd4fcef18ec50571db3c70c055c1fb
baseline version:
 xen                  fb8be95ca0b5fc816fd2234925f95c3f82ead824

Last test of basis   101209  2016-09-29 15:14:48 Z    0 days
Testing same since   101215  2016-09-29 22:44:04 Z    0 days    1 attempts

People who touched revisions under test:
  Razvan Cojocaru
  Tamas K Lengyel
  Wei Liu

jobs:
 build-amd64-xsm       pass
 build-armhf-xsm       pass
 build-i386-xsm        pass
 build-amd64-xtf       pass
 build-amd64           pass
 build-armhf           pass
 build-i386            pass
 build-amd64-libvirt   pass
 build-armhf-libvirt   pass
 build-i386-libvirt    pass
 build-amd64-oldkern
[Xen-devel] [ovmf test] 101217: all pass - PUSHED
flight 101217 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/101217/

Perfect :-)
All tests in this flight passed as required

version targeted for testing:
 ovmf                 dab62c5ec8a88def3ee99c04d644720cb201de08
baseline version:
 ovmf                 84bc72fb7ddaa26105bfe5bf36115099da1e60b1

Last test of basis   101206  2016-09-29 10:45:18 Z    0 days
Testing same since   101217  2016-09-30 01:46:31 Z    0 days    1 attempts

People who touched revisions under test:
  Laszlo Ersek
  Qin Long

jobs:
 build-amd64-xsm                       pass
 build-i386-xsm                        pass
 build-amd64                           pass
 build-i386                            pass
 build-amd64-libvirt                   pass
 build-i386-libvirt                    pass
 build-amd64-pvops                     pass
 build-i386-pvops                      pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64  pass
 test-amd64-i386-xl-qemuu-ovmf-amd64   pass

sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
    http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
    http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
    http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
    http://xenbits.xen.org/gitweb?p=osstest.git;a=summary

Pushing revision :

+ branch=ovmf
+ revision=dab62c5ec8a88def3ee99c04d644720cb201de08
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos getconfig Repos perl -e ' use Osstest; readglobalconfig(); print $c{"Repos"} or die $!; '
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x '!=' x/home/osstest/repos/lock ']'
++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock
++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push ovmf dab62c5ec8a88def3ee99c04d644720cb201de08
+ branch=ovmf
+ revision=dab62c5ec8a88def3ee99c04d644720cb201de08
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos getconfig Repos perl -e ' use Osstest; readglobalconfig(); print $c{"Repos"} or die $!; '
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']'
+ . ./cri-common
++ . ./cri-getconfig
++ umask 002
+ select_xenbranch
+ case "$branch" in
+ tree=ovmf
+ xenbranch=xen-unstable
+ '[' xovmf = xlinux ']'
+ linuxbranch=
+ '[' x = x ']'
+ qemuubranch=qemu-upstream-unstable
+ select_prevxenbranch
++ ./cri-getprevxenbranch xen-unstable
+ prevxenbranch=xen-4.7-testing
+ '[' xdab62c5ec8a88def3ee99c04d644720cb201de08 = x ']'
+ : tested/2.6.39.x
+ . ./ap-common
++ : osst...@xenbits.xen.org
+++ getconfig OsstestUpstream
+++ perl -e ' use Osstest; readglobalconfig(); print $c{"OsstestUpstream"} or die $!; '
++ :
++ : git://xenbits.xen.org/xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/xen.git
++ : git://xenbits.xen.org/qemu-xen-traditional.git
++ : git://git.kernel.org
++ : git://git.kernel.org/pub/scm/linux/kernel/git
++ : git
++ : git://xenbits.xen.org/xtf.git
++ : osst...@xenbits.xen.org:/home/xen/git/xtf.git
++ : git://xenbits.xen.org/xtf.git
++ : git://xenbits.xen.org/libvirt.git
++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git
++ : git://xenbits.xen.org/libvirt.git
++ : git://xenbits.xen.org/osstest/rumprun.git
++ : git
++ : git://xenbits.xen.org/osstest/rumprun.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/rumprun.git
++ : git://git.seabios.org/seabios.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/seabios.git
++ : git://xenbits.xen.org/osstest/seabios.git
++ : https://github.com/tianocore/edk2.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/ovmf.git
++ : git://xenbits.xen.org/osstest/ovmf.git
++ : git://xenbits.xen.org/osstest/linux-firmware.git
++ : osst...@xenbits.xen.org:/home/osstest/ext/linux-firmware.git
++ : git://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git
++
[Xen-devel] [PATCH v2 10/10] xl: allow to set the ratelimit value online for Credit2
Last part of the wiring necessary to allow changing the value of the
ratelimit_us parameter online for Credit2 (as is already possible for
Credit1).

Signed-off-by: Dario Faggioli
Reviewed-by: George Dunlap
---
Cc: Ian Jackson
Cc: Wei Liu
---
 docs/man/xl.pod.1.in      |  9
 tools/libxl/xl_cmdimpl.c  | 91 +
 tools/libxl/xl_cmdtable.c |  2 +
 3 files changed, 86 insertions(+), 16 deletions(-)

diff --git a/docs/man/xl.pod.1.in b/docs/man/xl.pod.1.in
index a2be541..803c67e 100644
--- a/docs/man/xl.pod.1.in
+++ b/docs/man/xl.pod.1.in
@@ -1089,6 +1089,15 @@ to 65535 and the default is 256.
 
 Restrict output to domains in the specified cpupool.
 
+=item B<-s>, B<--schedparam>
+
+Specify to list or set pool-wide scheduler parameters.
+
+=item B<-r RLIMIT>, B<--ratelimit_us=RLIMIT>
+
+Attempts to limit the rate of context switching. It is basically the same
+as B<--ratelimit_us> in B
+
 =back
 
 =item B [I]
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index cb43c00..b317dde 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -6457,8 +6457,29 @@ static int sched_credit_pool_output(uint32_t poolid)
     return 0;
 }
 
-static int sched_credit2_domain_output(
-    int domid)
+static int sched_credit2_params_set(int poolid,
+                                    libxl_sched_credit2_params *scinfo)
+{
+    if (libxl_sched_credit2_params_set(ctx, poolid, scinfo)) {
+        fprintf(stderr, "libxl_sched_credit2_params_set failed.\n");
+        return 1;
+    }
+
+    return 0;
+}
+
+static int sched_credit2_params_get(int poolid,
+                                    libxl_sched_credit2_params *scinfo)
+{
+    if (libxl_sched_credit2_params_get(ctx, poolid, scinfo)) {
+        fprintf(stderr, "libxl_sched_credit2_params_get failed.\n");
+        return 1;
+    }
+
+    return 0;
+}
+
+static int sched_credit2_domain_output(int domid)
 {
     char *domname;
     libxl_domain_sched_params scinfo;
@@ -6483,6 +6504,22 @@ static int sched_credit2_domain_output(
     return 0;
 }
 
+static int sched_credit2_pool_output(uint32_t poolid)
+{
+    libxl_sched_credit2_params scparam;
+    char *poolname = libxl_cpupoolid_to_name(ctx, poolid);
+
+    if (sched_credit2_params_get(poolid, &scparam))
+        printf("Cpupool %s: [sched params unavailable]\n", poolname);
+    else
+        printf("Cpupool %s: ratelimit=%dus\n",
+               poolname, scparam.ratelimit_us);
+
+    free(poolname);
+
+    return 0;
+}
+
 static int sched_rtds_domain_output(
     int domid)
 {
@@ -6582,17 +6619,6 @@ static int sched_rtds_pool_output(uint32_t poolid)
     return 0;
 }
 
-static int sched_default_pool_output(uint32_t poolid)
-{
-    char *poolname;
-
-    poolname = libxl_cpupoolid_to_name(ctx, poolid);
-    printf("Cpupool %s:\n",
-           poolname);
-    free(poolname);
-    return 0;
-}
-
 static int sched_domain_output(libxl_scheduler sched, int (*output)(int),
                                int (*pooloutput)(uint32_t), const char *cpupool)
 {
@@ -6838,17 +6864,22 @@ int main_sched_credit2(int argc, char **argv)
 {
     const char *dom = NULL;
     const char *cpupool = NULL;
+    int ratelimit = 0;
     int weight = 256;
+    bool opt_s = false;
+    bool opt_r = false;
     bool opt_w = false;
     int opt, rc;
     static struct option opts[] = {
         {"domain", 1, 0, 'd'},
         {"weight", 1, 0, 'w'},
+        {"schedparam", 0, 0, 's'},
+        {"ratelimit_us", 1, 0, 'r'},
         {"cpupool", 1, 0, 'p'},
         COMMON_LONG_OPTS
     };
 
-    SWITCH_FOREACH_OPT(opt, "d:w:p:", opts, "sched-credit2", 0) {
+    SWITCH_FOREACH_OPT(opt, "d:w:p:r:s", opts, "sched-credit2", 0) {
     case 'd':
         dom = optarg;
         break;
@@ -6856,6 +6887,13 @@ int main_sched_credit2(int argc, char **argv)
         weight = strtol(optarg, NULL, 10);
         opt_w = true;
         break;
+    case 's':
+        opt_s = true;
+        break;
+    case 'r':
+        ratelimit = strtol(optarg, NULL, 10);
+        opt_r = true;
+        break;
     case 'p':
         cpupool = optarg;
         break;
@@ -6871,10 +6909,31 @@ int main_sched_credit2(int argc, char **argv)
         return EXIT_FAILURE;
     }
 
-    if (!dom) { /* list all domain's credit scheduler info */
+    if (opt_s) {
+        libxl_sched_credit2_params scparam;
+        uint32_t poolid = 0;
+
+        if (cpupool) {
+            if (libxl_cpupool_qualifier_to_cpupoolid(ctx, cpupool,
+                                                     &poolid, NULL) ||
+                !libxl_cpupoolid_is_valid(ctx, poolid)) {
+                fprintf(stderr, "unknown cpupool \'%s\'\n", cpupool);
+                return EXIT_FAILURE;
+            }
+        }
+
+        if (!opt_r) { /* Output scheduling parameters */
+            if (sched_credit2_pool_output(poolid))
+                return
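
For readers who do not have xl's SWITCH_FOREACH_OPT macro at hand, the option handling in the patch above can be approximated with plain getopt_long(). This is only a sketch under stated assumptions: struct credit2_opts and parse_credit2_opts() are hypothetical names introduced here, and the real xl code also handles its common options through COMMON_LONG_OPTS, which is left out.

```c
#include <getopt.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical container for the parsed options; not a type from xl. */
struct credit2_opts {
    const char *dom;
    const char *cpupool;
    int weight;
    int ratelimit;
    bool opt_w, opt_s, opt_r;
};

/* Parse argv with getopt_long(); returns 0 on success, -1 on a bad option. */
static int parse_credit2_opts(int argc, char **argv, struct credit2_opts *o)
{
    static const struct option long_opts[] = {
        {"domain",       required_argument, NULL, 'd'},
        {"weight",       required_argument, NULL, 'w'},
        {"schedparam",   no_argument,       NULL, 's'},
        {"ratelimit_us", required_argument, NULL, 'r'},
        {"cpupool",      required_argument, NULL, 'p'},
        {NULL, 0, NULL, 0}
    };
    int opt;

    /* Same defaults as in the patch: weight 256, everything else unset. */
    *o = (struct credit2_opts){ .weight = 256 };
    optind = 1;   /* restart scanning so the function can be called again */
    opterr = 0;   /* keep getopt quiet; the caller reports errors */

    while ((opt = getopt_long(argc, argv, "d:w:p:r:s", long_opts, NULL)) != -1) {
        switch (opt) {
        case 'd': o->dom = optarg; break;
        case 'w': o->weight = (int)strtol(optarg, NULL, 10); o->opt_w = true; break;
        case 's': o->opt_s = true; break;
        case 'r': o->ratelimit = (int)strtol(optarg, NULL, 10); o->opt_r = true; break;
        case 'p': o->cpupool = optarg; break;
        default:  return -1;
        }
    }
    return 0;
}
```

With this shape, "-s" alone would list the pool-wide parameters, while "-s -r RLIMIT" would set the rate limit, optionally restricted to one pool with "-p".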
[Xen-devel] [PATCH v2 09/10] libxl: allow to set the ratelimit value online for Credit2
This is the remaining part of the plumbing (the libxl one) necessary to
be able to change the value of the ratelimit_us parameter online for
Credit2 (as is already possible for Credit1).

Note that, so far, we were rejecting (for Credit1) a new value of zero,
even though it is a pretty nice way to ask for rate limiting to be
disabled, and the hypervisor is already capable of dealing with it in
that way.

Therefore, we change things so that it is possible to do so, both for
Credit1 and Credit2.

Signed-off-by: Dario Faggioli
---
Cc: Ian Jackson
Cc: Wei Liu
Cc: George Dunlap
---
Changes from v1:
 * added the appropriate LIBXL_HAVE_, as requested during review.
 * coding style fixes put in previous patch, as requested during review.
---
 tools/libxl/libxl.c         | 71 ---
 tools/libxl/libxl.h         | 11 +++
 tools/libxl/libxl_types.idl |  4 ++
 3 files changed, 81 insertions(+), 5 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 606d71a..5a70564 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -5229,6 +5229,19 @@ static int sched_credit_domain_set(libxl__gc *gc, uint32_t domid,
     return 0;
 }
 
+static int sched_ratelimit_check(libxl__gc *gc, int ratelimit)
+{
+    if (ratelimit != 0 &&
+        (ratelimit < XEN_SYSCTL_SCHED_RATELIMIT_MIN ||
+         ratelimit > XEN_SYSCTL_SCHED_RATELIMIT_MAX)) {
+        LOG(ERROR, "Ratelimit out of range, valid range is from %d to %d",
+            XEN_SYSCTL_SCHED_RATELIMIT_MIN, XEN_SYSCTL_SCHED_RATELIMIT_MAX);
+        return ERROR_INVAL;
+    }
+
+    return 0;
+}
+
 int libxl_sched_credit_params_get(libxl_ctx *ctx, uint32_t poolid,
                                   libxl_sched_credit_params *scinfo)
 {
@@ -5266,11 +5279,8 @@ int libxl_sched_credit_params_set(libxl_ctx *ctx, uint32_t poolid,
         rc = ERROR_INVAL;
         goto out;
     }
-    if (scinfo->ratelimit_us < XEN_SYSCTL_SCHED_RATELIMIT_MIN
-        || scinfo->ratelimit_us > XEN_SYSCTL_SCHED_RATELIMIT_MAX) {
-        LOG(ERROR, "Ratelimit out of range, valid range is from %d to %d",
-            XEN_SYSCTL_SCHED_RATELIMIT_MIN, XEN_SYSCTL_SCHED_RATELIMIT_MAX);
-        rc = ERROR_INVAL;
+    rc = sched_ratelimit_check(gc, scinfo->ratelimit_us);
+    if (rc) {
         goto out;
     }
     if (scinfo->ratelimit_us > scinfo->tslice_ms*1000) {
@@ -5297,6 +5307,57 @@ int libxl_sched_credit_params_set(libxl_ctx *ctx, uint32_t poolid,
     return rc;
 }
 
+int libxl_sched_credit2_params_get(libxl_ctx *ctx, uint32_t poolid,
+                                   libxl_sched_credit2_params *scinfo)
+{
+    struct xen_sysctl_credit2_schedule sparam;
+    int r, rc;
+    GC_INIT(ctx);
+
+    r = xc_sched_credit2_params_get(ctx->xch, poolid, &sparam);
+    if (r < 0) {
+        LOGE(ERROR, "getting Credit2 scheduler parameters");
+        rc = ERROR_FAIL;
+        goto out;
+    }
+
+    scinfo->ratelimit_us = sparam.ratelimit_us;
+
+    rc = 0;
+ out:
+    GC_FREE;
+    return rc;
+}
+
+
+int libxl_sched_credit2_params_set(libxl_ctx *ctx, uint32_t poolid,
+                                   libxl_sched_credit2_params *scinfo)
+{
+    struct xen_sysctl_credit2_schedule sparam;
+    int r, rc;
+    GC_INIT(ctx);
+
+    rc = sched_ratelimit_check(gc, scinfo->ratelimit_us);
+    if (rc) {
+        goto out;
+    }
+
+    sparam.ratelimit_us = scinfo->ratelimit_us;
+
+    r = xc_sched_credit2_params_set(ctx->xch, poolid, &sparam);
+    if ( r < 0 ) {
+        LOGE(ERROR, "Setting Credit2 scheduler parameters");
+        rc = ERROR_FAIL;
+        goto out;
+    }
+
+    scinfo->ratelimit_us = sparam.ratelimit_us;
+
+ out:
+    GC_FREE;
+    return rc;
+}
+
 static int sched_credit2_domain_get(libxl__gc *gc, uint32_t domid,
                                     libxl_domain_sched_params *scinfo)
 {
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 7cfa540..969a089 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -281,6 +281,13 @@
 #define LIBXL_HAVE_QEMU_MONITOR_COMMAND 1
 
 /*
+ * LIBXL_HAVE_SCHED_CREDIT2_PARAMS indicates the existance of a
+ * libxl_sched_credit2_params structure, containing Credit2 scheduler
+ * wide parameters (i.e., the ratelimiting value).
+ */
+#define LIBXL_HAVE_SCHED_CREDIT2_PARAMS 1
+
+/*
  * libxl ABI compatibility
  *
  * The only guarantee which libxl makes regarding ABI compatibility
@@ -1992,6 +1999,10 @@ int libxl_sched_credit_params_get(libxl_ctx *ctx, uint32_t poolid,
                                   libxl_sched_credit_params *scinfo);
 int libxl_sched_credit_params_set(libxl_ctx *ctx, uint32_t poolid,
                                   libxl_sched_credit_params *scinfo);
+int libxl_sched_credit2_params_get(libxl_ctx *ctx, uint32_t poolid,
+                                   libxl_sched_credit2_params *scinfo);
+int libxl_sched_credit2_params_set(libxl_ctx *ctx, uint32_t
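
The behavioural change described in the message above (zero now means "rate limiting disabled" rather than "out of range") can be tried in isolation with a small standalone sketch. The two bounds below are assumed stand-ins for XEN_SYSCTL_SCHED_RATELIMIT_MIN and XEN_SYSCTL_SCHED_RATELIMIT_MAX, whose values do not appear in the patch, and ratelimit_check() is a hypothetical name; it reports through stderr instead of libxl's LOG().

```c
#include <stdio.h>

/* Assumed bounds, in microseconds; the real values live in Xen's sysctl.h. */
#define RATELIMIT_MIN_US 100
#define RATELIMIT_MAX_US 500000

/*
 * Returns 0 if the value is acceptable, -1 otherwise.
 * Zero is accepted explicitly: it asks for rate limiting to be disabled.
 */
static int ratelimit_check(int ratelimit_us)
{
    if (ratelimit_us != 0 &&
        (ratelimit_us < RATELIMIT_MIN_US || ratelimit_us > RATELIMIT_MAX_US)) {
        fprintf(stderr, "Ratelimit out of range, valid range is from %d to %d\n",
                RATELIMIT_MIN_US, RATELIMIT_MAX_US);
        return -1;
    }
    return 0;
}
```

Keeping the check in one helper is what lets both the Credit1 and the Credit2 setters share the "zero or within range" rule.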
[Xen-devel] [PATCH v2 05/10] xen: credit2: implement yield()
When a vcpu explicitly yields, it is usually giving us a hint along the
lines of "let someone else run and come back to me in a bit."

Credit2 isn't, so far, doing anything when a vcpu yields, which means
a yield is basically a NOP (well, actually, it's pure overhead, as it
causes the scheduler to kick in, but the result is --at least 99% of
the time-- that the very same vcpu that yielded continues to run).

Implement a "preempt bias", to be applied to yielding vcpus.
Basically, when evaluating which vcpu to run next, if a vcpu that has
just yielded is encountered, we give it a credit penalty, and check
whether there is anyone else that would better take over the cpu (of
course, if there isn't, the yielding vcpu will continue).

The value of this bias can be configured with a boot time parameter,
and the default is set to 1 ms.

Also, add a yield performance counter, and fix the style of a couple
of comments.

Signed-off-by: Dario Faggioli
---
Cc: George Dunlap
Cc: Anshul Makkar
Cc: Jan Beulich
Cc: Andrew Cooper
---
Changes from v1:
 * add _us to the parameter name, as suggested during review;
 * get rid of the minimum value for the yield bias;
 * apply the idle bias via subtraction of credits to the yielding vcpu,
   rather than via addition to all the others;
 * merge the Credit2 bits of what was patch 7 here, as suggested during
   review.
---
 docs/misc/xen-command-line.markdown | 10 +
 xen/common/sched_credit2.c          | 76 +++
 xen/common/schedule.c               |  2 +
 xen/include/xen/perfc_defn.h        |  1
 4 files changed, 71 insertions(+), 18 deletions(-)

diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
index 8ff57fa..4fd3460 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -1395,6 +1395,16 @@ Choose the default scheduler.
 ### sched\_credit2\_migrate\_resist
 > `= `
 
+### sched\_credit2\_yield\_bias\_us
+> `= `
+
+> Default: `1000`
+
+Set how much a yielding vcpu will be penalized, in order to actually
+give a chance to run to some other vcpu. This is basically a bias, in
+favour of the non-yielding vcpus, expressed in microseconds (default
+is 1ms).
+
 ### sched\_credit\_tslice\_ms
 > `= `
 
diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index 72e31b5..fde61ef 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -144,6 +144,8 @@
 #define CSCHED2_MIGRATE_RESIST ((opt_migrate_resist)*MICROSECS(1))
 /* How much to "compensate" a vcpu for L2 migration */
 #define CSCHED2_MIGRATE_COMPENSATION MICROSECS(50)
+/* How big of a bias we should have against a yielding vcpu */
+#define CSCHED2_YIELD_BIAS ((opt_yield_bias)*MICROSECS(1))
 /* Reset: Value below which credit will be reset. */
 #define CSCHED2_CREDIT_RESET 0
 /* Max timer: Maximum time a guest can be run for. */
@@ -181,11 +183,20 @@
  */
 #define __CSFLAG_runq_migrate_request 3
 #define CSFLAG_runq_migrate_request (1<<__CSFLAG_runq_migrate_request)
-
+/*
+ * CSFLAG_vcpu_yield: this vcpu was running, and has called vcpu_yield(). The
+ * scheduler is invoked to see if we can give the cpu to someone else, and
+ * get back to the yielding vcpu in a while.
+ */
+#define __CSFLAG_vcpu_yield 4
+#define CSFLAG_vcpu_yield (1<<__CSFLAG_vcpu_yield)
 
 static unsigned int __read_mostly opt_migrate_resist = 500;
 integer_param("sched_credit2_migrate_resist", opt_migrate_resist);
 
+static unsigned int __read_mostly opt_yield_bias = 1000;
+integer_param("sched_credit2_yield_bias_us", opt_yield_bias);
+
 /*
  * Useful macros
  */
@@ -1431,6 +1442,14 @@ out:
 }
 
 static void
+csched2_vcpu_yield(const struct scheduler *ops, struct vcpu *v)
+{
+    struct csched2_vcpu * const svc = CSCHED2_VCPU(v);
+
+    __set_bit(__CSFLAG_vcpu_yield, &svc->flags);
+}
+
+static void
 csched2_context_saved(const struct scheduler *ops, struct vcpu *vc)
 {
     struct csched2_vcpu * const svc = CSCHED2_VCPU(vc);
@@ -2250,26 +2269,39 @@ runq_candidate(struct csched2_runqueue_data *rqd,
     struct list_head *iter;
     struct csched2_vcpu *snext = NULL;
     struct csched2_private *prv = CSCHED2_PRIV(per_cpu(scheduler, cpu));
+    /*
+     * If scurr is yielding, temporarily subtract CSCHED2_YIELD_BIAS
+     * credits to it (where "temporarily" means "for the sake of just
+     * this scheduling decision).
+     */
+    int yield_bias = 0, snext_credit;
 
     *skipped = 0;
 
-    /* Default to current if runnable, idle otherwise */
-    if ( vcpu_runnable(scurr->vcpu) )
-        snext = scurr;
-    else
-        snext = CSCHED2_VCPU(idle_vcpu[cpu]);
-
     /*
      * Return the current vcpu if it has executed for less than ratelimit.
      * Adjuststment for the selected vcpu's credit and decision
      * for how long it will run will be taken in csched2_runtime.
+     *
+
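
The effect of the yield bias described in the message above can be illustrated with a toy model. This is a sketch under stated assumptions, not the Credit2 implementation: struct toy_vcpu and pick_next() are hypothetical, credits are plain integers in arbitrary units, and the runqueue is scanned for the highest credit rather than walked in sorted order with affinity and migration checks.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy vcpu: just an id, a credit balance and a "has just yielded" flag. */
struct toy_vcpu {
    int id;
    int credit;
    bool yielding;
};

/*
 * Pick who runs next. The current vcpu keeps the cpu unless a queued vcpu
 * has more credit than the current one's credit, which is lowered by
 * yield_bias for this one decision only if the current vcpu has yielded.
 */
static const struct toy_vcpu *
pick_next(const struct toy_vcpu *curr, const struct toy_vcpu *runq,
          size_t n, int yield_bias)
{
    const struct toy_vcpu *next = curr;
    int next_credit = curr->credit - (curr->yielding ? yield_bias : 0);

    for (size_t i = 0; i < n; i++) {
        if (runq[i].credit > next_credit) {
            next = &runq[i];
            next_credit = runq[i].credit;
        }
    }
    return next;
}
```

Note that the penalty is never written back to the vcpu: as in the patch's comment, it is subtracted only "for the sake of just this scheduling decision", so a yielding vcpu with nobody to hand over to simply keeps running.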
[Xen-devel] [PATCH v2 08/10] libxl: fix coding style of credit1 parameters related functions
More specifically, the error handling path is made compliant with
libxl's coding style.

No functional change intended.

Signed-off-by: Dario Faggioli
---
Cc: Ian Jackson
Cc: Wei Liu
Cc: George Dunlap
---
Changes from v1:
 * new patch, containing only the coding style changes from what was
   patch 14 in v1.
---
 tools/libxl/libxl.c | 43 +++
 1 file changed, 23 insertions(+), 20 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 997d94c..606d71a 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -5233,65 +5233,68 @@ int libxl_sched_credit_params_get(libxl_ctx *ctx, uint32_t poolid,
                                   libxl_sched_credit_params *scinfo)
 {
     struct xen_sysctl_credit_schedule sparam;
-    int rc;
+    int r, rc;
     GC_INIT(ctx);
 
-    rc = xc_sched_credit_params_get(ctx->xch, poolid, &sparam);
-    if (rc != 0) {
-        LOGE(ERROR, "getting sched credit param");
-        GC_FREE;
-        return ERROR_FAIL;
+    r = xc_sched_credit_params_get(ctx->xch, poolid, &sparam);
+    if (r < 0) {
+        LOGE(ERROR, "getting Credit scheduler parameters");
+        rc = ERROR_FAIL;
+        goto out;
     }
 
     scinfo->tslice_ms = sparam.tslice_ms;
     scinfo->ratelimit_us = sparam.ratelimit_us;
 
+    rc = 0;
+ out:
     GC_FREE;
-    return 0;
+    return rc;
 }
 
 int libxl_sched_credit_params_set(libxl_ctx *ctx, uint32_t poolid,
                                   libxl_sched_credit_params *scinfo)
 {
     struct xen_sysctl_credit_schedule sparam;
-    int rc=0;
+    int r, rc;
     GC_INIT(ctx);
 
     if (scinfo->tslice_ms < XEN_SYSCTL_CSCHED_TSLICE_MIN
         || scinfo->tslice_ms > XEN_SYSCTL_CSCHED_TSLICE_MAX) {
         LOG(ERROR, "Time slice out of range, valid range is from %d to %d",
             XEN_SYSCTL_CSCHED_TSLICE_MIN, XEN_SYSCTL_CSCHED_TSLICE_MAX);
-        GC_FREE;
-        return ERROR_INVAL;
+        rc = ERROR_INVAL;
+        goto out;
     }
     if (scinfo->ratelimit_us < XEN_SYSCTL_SCHED_RATELIMIT_MIN
         || scinfo->ratelimit_us > XEN_SYSCTL_SCHED_RATELIMIT_MAX) {
         LOG(ERROR, "Ratelimit out of range, valid range is from %d to %d",
             XEN_SYSCTL_SCHED_RATELIMIT_MIN, XEN_SYSCTL_SCHED_RATELIMIT_MAX);
-        GC_FREE;
-        return ERROR_INVAL;
+        rc = ERROR_INVAL;
+        goto out;
     }
     if (scinfo->ratelimit_us > scinfo->tslice_ms*1000) {
         LOG(ERROR, "Ratelimit cannot be greater than timeslice");
-        GC_FREE;
-        return ERROR_INVAL;
+        rc = ERROR_INVAL;
+        goto out;
     }
 
     sparam.tslice_ms = scinfo->tslice_ms;
     sparam.ratelimit_us = scinfo->ratelimit_us;
 
-    rc = xc_sched_credit_params_set(ctx->xch, poolid, &sparam);
-    if ( rc < 0 ) {
-        LOGE(ERROR, "setting sched credit param");
-        GC_FREE;
-        return ERROR_FAIL;
+    r = xc_sched_credit_params_set(ctx->xch, poolid, &sparam);
+    if ( r < 0 ) {
+        LOGE(ERROR, "Setting Credit scheduler parameters");
+        rc = ERROR_FAIL;
+        goto out;
     }
 
     scinfo->tslice_ms = sparam.tslice_ms;
     scinfo->ratelimit_us = sparam.ratelimit_us;
 
+ out:
     GC_FREE;
-    return 0;
+    return rc;
 }
 
 static int sched_credit2_domain_get(libxl__gc *gc, uint32_t domid,

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
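
The style this patch moves to (one variable, r, for the low-level return value, another, rc, for the function's own error code, and a single "out" label that owns the cleanup) can be shown in a self-contained sketch. Everything here is hypothetical: toy_backend_set() stands in for an xc_* call, the malloc/free pair stands in for GC_INIT/GC_FREE, and the TOY_* codes are not libxl's error values.

```c
#include <stdlib.h>

enum { TOY_OK = 0, TOY_ERROR_INVAL = -1, TOY_ERROR_FAIL = -2 };

/* Hypothetical low-level call: negative on failure, like an xc_* function. */
static int toy_backend_set(int value)
{
    return value == 42 ? -1 : 0;
}

static int live_buffers;   /* outstanding allocations, for demonstration */

static int toy_params_set(int value)
{
    int r, rc;
    char *scratch = malloc(16);      /* stands in for GC_INIT() */

    if (!scratch)
        return TOY_ERROR_FAIL;       /* nothing to clean up yet */
    live_buffers++;

    if (value < 0) {
        rc = TOY_ERROR_INVAL;
        goto out;
    }

    r = toy_backend_set(value);
    if (r < 0) {
        rc = TOY_ERROR_FAIL;
        goto out;
    }

    rc = TOY_OK;
 out:
    free(scratch);                   /* stands in for GC_FREE */
    live_buffers--;
    return rc;
}
```

The point of the pattern is that every path after the allocation leaves through the same label, so a new early exit cannot forget the cleanup; the sketch sets rc on every path, including success, before reaching it.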
[Xen-devel] [PATCH v2 06/10] xen: tracing: add trace records for schedule and rate-limiting.
As far as {csched, csched2, rt}_schedule() are concerned, an "empty"
event would already make it easier to read and understand a trace.

But while there, add some useful information, such as whether the cpu
that is going through the scheduler has been tickled or not, whether
it is currently idle, etc. (these vary on a per-scheduler basis).

For Credit1 and Credit2, also add a record for when rate-limiting
kicks in.

Signed-off-by: Dario Faggioli
---
Cc: George Dunlap
Cc: Meng Xu
Cc: Anshul Makkar
---
Changes from v1:
 * corrected the schedule record for sched_rt.c, as pointed out during
   review;
 * pack the Credit1 records as well, as requested during review.
---
 xen/common/sched_credit.c  | 32
 xen/common/sched_credit2.c | 40 +++-
 xen/common/sched_rt.c      | 15 +++
 3 files changed, 86 insertions(+), 1 deletion(-)

diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
index 5700763..fc3a321 100644
--- a/xen/common/sched_credit.c
+++ b/xen/common/sched_credit.c
@@ -133,6 +133,8 @@
 #define TRC_CSCHED_TICKLE TRC_SCHED_CLASS_EVT(CSCHED, 6)
 #define TRC_CSCHED_BOOST_START TRC_SCHED_CLASS_EVT(CSCHED, 7)
 #define TRC_CSCHED_BOOST_END TRC_SCHED_CLASS_EVT(CSCHED, 8)
+#define TRC_CSCHED_SCHEDULE TRC_SCHED_CLASS_EVT(CSCHED, 9)
+#define TRC_CSCHED_RATELIMIT TRC_SCHED_CLASS_EVT(CSCHED, 10)
 
 
 /*
@@ -1774,6 +1776,23 @@ csched_schedule(
     SCHED_STAT_CRANK(schedule);
     CSCHED_VCPU_CHECK(current);
 
+    /*
+     * Here in Credit1 code, we usually just call TRACE_nD() helpers, and
+     * don't care about packing. But scheduling happens very often, so it
+     * actually is important that the record is as small as possible.
+     */
+    if ( unlikely(tb_init_done) )
+    {
+        struct {
+            unsigned cpu:16, tasklet:8, idle:8;
+        } d;
+        d.cpu = cpu;
+        d.tasklet = tasklet_work_scheduled;
+        d.idle = is_idle_vcpu(current);
+        __trace_var(TRC_CSCHED_SCHEDULE, 1, sizeof(d),
+                    (unsigned char *)&d);
+    }
+
     runtime = now - current->runstate.state_entry_time;
     if ( runtime < 0 ) /* Does this ever happen? */
         runtime = 0;
@@ -1829,6 +1848,19 @@ csched_schedule(
         tslice = MICROSECS(prv->ratelimit_us) - runtime;
         if ( unlikely(runtime < CSCHED_MIN_TIMER) )
             tslice = CSCHED_MIN_TIMER;
+        if ( unlikely(tb_init_done) )
+        {
+            struct {
+                unsigned vcpu:16, dom:16;
+                unsigned runtime;
+            } d;
+            d.dom = scurr->vcpu->domain->domain_id;
+            d.vcpu = scurr->vcpu->vcpu_id;
+            d.runtime = runtime;
+            __trace_var(TRC_CSCHED_RATELIMIT, 1, sizeof(d),
+                        (unsigned char *)&d);
+        }
+
         ret.migrated = 0;
         goto out;
     }
diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index fde61ef..5cf3f16 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -55,6 +55,8 @@
 #define TRC_CSCHED2_LOAD_BALANCE TRC_SCHED_CLASS_EVT(CSCHED2, 17)
 #define TRC_CSCHED2_PICKED_CPU TRC_SCHED_CLASS_EVT(CSCHED2, 19)
 #define TRC_CSCHED2_RUNQ_CANDIDATE TRC_SCHED_CLASS_EVT(CSCHED2, 20)
+#define TRC_CSCHED2_SCHEDULE TRC_SCHED_CLASS_EVT(CSCHED2, 21)
+#define TRC_CSCHED2_RATELIMIT TRC_SCHED_CLASS_EVT(CSCHED2, 22)
 
 /*
  * WARNING: This is still in an experimental phase. Status and work can be found at the
@@ -2288,12 +2290,29 @@ runq_candidate(struct csched2_runqueue_data *rqd,
      * no point forcing it to do so until rate limiting expires.
      */
     if ( __test_and_clear_bit(__CSFLAG_vcpu_yield, &scurr->flags) )
+    {
         yield_bias = CSCHED2_YIELD_BIAS;
+    }
     else if ( prv->ratelimit_us && !is_idle_vcpu(scurr->vcpu) &&
               vcpu_runnable(scurr->vcpu) &&
               (now - scurr->vcpu->runstate.state_entry_time) <
                MICROSECS(prv->ratelimit_us) )
+    {
+        if ( unlikely(tb_init_done) )
+        {
+            struct {
+                unsigned vcpu:16, dom:16;
+                unsigned runtime;
+            } d;
+            d.dom = scurr->vcpu->domain->domain_id;
+            d.vcpu = scurr->vcpu->vcpu_id;
+            d.runtime = now - scurr->vcpu->runstate.state_entry_time;
+            __trace_var(TRC_CSCHED2_RATELIMIT, 1,
+                        sizeof(d),
+                        (unsigned char *)&d);
+        }
         return scurr;
+    }
 
     /* Default to current if runnable, idle otherwise */
     if ( vcpu_runnable(scurr->vcpu) )
@@ -2383,6 +2402,7 @@ csched2_schedule(
     struct csched2_vcpu *snext = NULL;
     unsigned int skipped_vcpus = 0;
     struct task_slice ret;
+    bool_t tickled;
 
     SCHED_STAT_CRANK(schedule);
     CSCHED2_VCPU_CHECK(current);
@@ -2397,13
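
The comment in the Credit1 hunk above stresses that the schedule record should be as small as possible. A minimal sketch of what that packing buys: three values that would take three 32-bit words when traced unpacked fit in one. The helper name pack_schedule_record() is hypothetical, and the bit positions are an assumption (cpu in the low 16 bits, as GCC lays out such bitfields on little-endian x86); explicit shifts are used here instead of bitfields so the layout does not depend on the compiler.

```c
#include <stdint.h>

/*
 * Pack cpu[16]:tasklet[8]:idle[8] into a single 32-bit trace word.
 * Assumed layout: cpu in bits 0-15, tasklet in bits 16-23, idle in 24-31.
 * Values wider than their field are truncated, as with a bitfield.
 */
static uint32_t pack_schedule_record(unsigned cpu, unsigned tasklet,
                                     unsigned idle)
{
    return (uint32_t)(cpu & 0xffffu)
         | ((uint32_t)(tasklet & 0xffu) << 16)
         | ((uint32_t)(idle & 0xffu) << 24);
}
```

Since the scheduler runs very frequently, shrinking each record from 12 bytes to 4 keeps the trace buffer from filling up with schedule events.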
[Xen-devel] [PATCH v2 07/10] tools: tracing: handle more scheduling related events.
There are some scheduling-related trace records that are not being taken care of (and hence are only dumped as raw records). Some of them are being introduced in this series, while others were just neglected by previous patches.

Add support for them.

Signed-off-by: Dario Faggioli
Acked-by: George Dunlap
---
Cc: Ian Jackson
Cc: Wei Liu
---
Changes from v1:
 * only the one made necessary by the packing done to Credit1 records. Those were requested by George himself, and the effect on this patch is small, and purely mechanical, so I decided to keep his Ack.
---
 tools/xentrace/formats    |   8
 tools/xentrace/xenalyze.c | 101 +
 2 files changed, 109 insertions(+)

diff --git a/tools/xentrace/formats b/tools/xentrace/formats
index 0de7990..db89f92 100644
--- a/tools/xentrace/formats
+++ b/tools/xentrace/formats
@@ -42,6 +42,10 @@
 0x00022004 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) csched:stolen_vcpu [ dom:vcpu = 0x%(2)04x%(3)04x, from = %(1)d ]
 0x00022005 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) csched:picked_cpu [ dom:vcpu = 0x%(1)04x%(2)04x, cpu = %(3)d ]
 0x00022006 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) csched:tickle [ cpu = %(1)d ]
+0x00022007 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) csched:boost [ dom:vcpu = 0x%(1)04x%(2)04x ]
+0x00022008 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) csched:unboost [ dom:vcpu = 0x%(1)04x%(2)04x ]
+0x00022009 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) csched:schedule [ cpu[16]:tasklet[8]:idle[8] = %(1)08x ]
+0x0002200A CPU%(cpu)d %(tsc)d (+%(reltsc)8d) csched:ratelimit [ dom:vcpu = 0x%(1)08x, runtime = %(2)d ]
 
 0x00022201 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) csched2:tick
 0x00022202 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) csched2:runq_pos [ dom:vcpu = 0x%(1)08x, pos = %(2)d]
@@ -61,12 +65,16 @@
 0x00022210 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) csched2:load_check [ lrq_id[16]:orq_id[16] = 0x%(1)08x, delta = %(2)d ]
 0x00022211 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) csched2:load_balance [ l_bavgload = 0x%(2)08x%(1)08x, o_bavgload = 0x%(4)08x%(3)08x, lrq_id[16]:orq_id[16] = 0x%(5)08x ]
 0x00022212 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) csched2:pick_cpu [ b_avgload = 0x%(2)08x%(1)08x, dom:vcpu = 0x%(3)08x, rq_id[16]:new_cpu[16] = %(4)d ]
+0x00022213 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) csched2:runq_candidate [ dom:vcpu = 0x%(1)08x, skipped_vcpus = %(2)d tickled_cpu = %(3)d ]
+0x00022214 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) csched2:schedule [ rq:cpu = 0x%(1)08x, tasklet[8]:idle[8]:smt_idle[8]:tickled[8] = %(2)08x ]
+0x00022215 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) csched2:ratelimit [ dom:vcpu = 0x%(1)08x, runtime = %(2)d ]
 
 0x00022801 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:tickle [ cpu = %(1)d ]
 0x00022802 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:runq_pick [ dom:vcpu = 0x%(1)08x, cur_deadline = 0x%(3)08x%(2)08x, cur_budget = 0x%(5)08x%(4)08x ]
 0x00022803 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:burn_budget [ dom:vcpu = 0x%(1)08x, cur_budget = 0x%(3)08x%(2)08x, delta = %(4)d ]
 0x00022804 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:repl_budget [ dom:vcpu = 0x%(1)08x, cur_deadline = 0x%(3)08x%(2)08x, cur_budget = 0x%(5)08x%(4)08x ]
 0x00022805 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:sched_tasklet
+0x00022806 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) rtds:schedule [ cpu[16]:tasklet[8]:idle[4]:tickled[4] = %(1)08x ]
 
 0x00041001 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) domain_create [ dom = 0x%(1)08x ]
 0x00041002 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) domain_destroy [ dom = 0x%(1)08x ]
diff --git a/tools/xentrace/xenalyze.c b/tools/xentrace/xenalyze.c
index 0b697d0..f006804 100644
--- a/tools/xentrace/xenalyze.c
+++ b/tools/xentrace/xenalyze.c
@@ -7590,6 +7590,50 @@ void sched_process(struct pcpu_info *p)
                        ri->dump_header, r->cpu);
             }
             break;
+        case TRC_SCHED_CLASS_EVT(CSCHED, 7): /* BOOST_START */
+            if(opt.dump_all) {
+                struct {
+                    unsigned int domid, vcpuid;
+                } *r = (typeof(r))ri->d;
+
+                printf(" %s csched: d%uv%u boosted\n",
+                       ri->dump_header, r->domid, r->vcpuid);
+            }
+            break;
+        case TRC_SCHED_CLASS_EVT(CSCHED, 8): /* BOOST_END */
+            if(opt.dump_all) {
+                struct {
+                    unsigned int domid, vcpuid;
+                } *r = (typeof(r))ri->d;
+
+                printf(" %s csched: d%uv%u unboosted\n",
+                       ri->dump_header, r->domid, r->vcpuid);
+            }
+            break;
+        case TRC_SCHED_CLASS_EVT(CSCHED, 9): /* SCHEDULE */
+            if(opt.dump_all) {
+                struct {
+                    unsigned int cpu:16, tasklet:8, idle:8;
+                } *r = (typeof(r))ri->d;
+
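The csched SCHEDULE record above packs three fields into a single 32-bit word (cpu[16]:tasklet[8]:idle[8]). As a minimal sketch of how such a word could be unpacked with explicit shifts and masks instead of bit-fields: this assumes cpu sits in the low 16 bits, which is what the bit-field declaration order gives on the little-endian x86 ABIs xenalyze is normally built for. The struct and function names are hypothetical and are not part of xenalyze.

```c
#include <stdint.h>

/* Hypothetical decoded form of one csched "schedule" trace word. */
struct sched_rec {
    unsigned int cpu;      /* bits 0..15 (assumed low half)  */
    unsigned int tasklet;  /* bits 16..23                    */
    unsigned int idle;     /* bits 24..31                    */
};

/* Unpack the word without relying on compiler bit-field layout. */
static struct sched_rec decode_sched_word(uint32_t w)
{
    struct sched_rec r;

    r.cpu     = w & 0xffffu;
    r.tasklet = (w >> 16) & 0xffu;
    r.idle    = (w >> 24) & 0xffu;
    return r;
}
```

Explicit masks make the layout assumption visible in one place, at the cost of having to keep it in sync with the hypervisor-side struct by hand.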
[Xen-devel] [PATCH v2 04/10] xen: credit2: only reset credit on reset condition
The condition for a Credit2 scheduling epoch coming to an end is that the vcpu at the front of the runqueue has negative credits. However, it is possible that runq_candidate() does not actually return to the scheduler the first vcpu in the runqueue (e.g., because such vcpu can't run on the cpu that is going through the scheduler, because of hard-affinity).

If that happens, we should not trigger a credit reset, or we risk altering the length of a scheduler epoch, wrt what the original idea of the algorithm was.

Signed-off-by: Dario Faggioli
---
Cc: George Dunlap
Cc: Anshul Makkar
---
Changes from v1:
 * new patch, containing part of what was in patch 5;
 * (wrt v1 patch 5) 'pos' parameter to runq_candidate renamed 'skipped', as requested during review.
---
 xen/common/sched_credit2.c | 33 -
 1 file changed, 28 insertions(+), 5 deletions(-)

diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index 3986441..72e31b5 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -2244,12 +2244,15 @@ void __dump_execstate(void *unused);
 static struct csched2_vcpu *
 runq_candidate(struct csched2_runqueue_data *rqd,
                struct csched2_vcpu *scurr,
-               int cpu, s_time_t now)
+               int cpu, s_time_t now,
+               unsigned int *skipped)
 {
     struct list_head *iter;
     struct csched2_vcpu *snext = NULL;
     struct csched2_private *prv = CSCHED2_PRIV(per_cpu(scheduler, cpu));
 
+    *skipped = 0;
+
     /* Default to current if runnable, idle otherwise */
     if ( vcpu_runnable(scurr->vcpu) )
         snext = scurr;
@@ -2273,7 +2276,10 @@ runq_candidate(struct csched2_runqueue_data *rqd,
 
         /* Only consider vcpus that are allowed to run on this processor. */
         if ( !cpumask_test_cpu(cpu, svc->vcpu->cpu_hard_affinity) )
+        {
+            (*skipped)++;
             continue;
+        }
 
         /*
          * If a vcpu is meant to be picked up by another processor, and such
@@ -2282,6 +2288,7 @@ runq_candidate(struct csched2_runqueue_data *rqd,
         if ( svc->tickled_cpu != -1 && svc->tickled_cpu != cpu &&
              cpumask_test_cpu(svc->tickled_cpu, &rqd->tickled) )
         {
+            (*skipped)++;
             SCHED_STAT_CRANK(deferred_to_tickled_cpu);
             continue;
         }
@@ -2291,6 +2298,7 @@ runq_candidate(struct csched2_runqueue_data *rqd,
         if ( svc->vcpu->processor != cpu
              && snext->credit + CSCHED2_MIGRATE_RESIST > svc->credit )
         {
+            (*skipped)++;
             SCHED_STAT_CRANK(migrate_resisted);
             continue;
         }
@@ -2308,11 +2316,12 @@ runq_candidate(struct csched2_runqueue_data *rqd,
     {
         struct {
             unsigned vcpu:16, dom:16;
-            unsigned tickled_cpu;
+            unsigned tickled_cpu, skipped;
         } d;
         d.dom = snext->vcpu->domain->domain_id;
         d.vcpu = snext->vcpu->vcpu_id;
         d.tickled_cpu = snext->tickled_cpu;
+        d.skipped = *skipped;
         __trace_var(TRC_CSCHED2_RUNQ_CANDIDATE, 1,
                     sizeof(d),
                     (unsigned char *)&d);
@@ -2336,6 +2345,7 @@ csched2_schedule(
     struct csched2_runqueue_data *rqd;
     struct csched2_vcpu * const scurr = CSCHED2_VCPU(current);
     struct csched2_vcpu *snext = NULL;
+    unsigned int skipped_vcpus = 0;
     struct task_slice ret;
 
     SCHED_STAT_CRANK(schedule);
@@ -2385,7 +2395,7 @@ csched2_schedule(
             snext = CSCHED2_VCPU(idle_vcpu[cpu]);
     }
     else
-        snext = runq_candidate(rqd, scurr, cpu, now);
+        snext = runq_candidate(rqd, scurr, cpu, now, &skipped_vcpus);
 
     /* If switching from a non-idle runnable vcpu, put it
      * back on the runqueue. */
@@ -2409,8 +2419,21 @@ csched2_schedule(
         __set_bit(__CSFLAG_scheduled, &snext->flags);
     }
 
-    /* Check for the reset condition */
-    if ( snext->credit <= CSCHED2_CREDIT_RESET )
+    /*
+     * The reset condition is "has a scheduler epoch come to an end?".
+     * The way this is enforced is checking whether the vcpu at the top
+     * of the runqueue has negative credits. This means the epochs have
+     * variable length, as in one epoch expires when:
+     *  1) the vcpu at the top of the runqueue has executed for
+     *     around 10 ms (with default parameters);
+     *  2) no other vcpu with higher credits wants to run.
+     *
+     * Here, where we want to check for reset, we need to make sure the
+     * proper vcpu is being used. In fact, runqueue_candidate() may have
+     * not returned the first vcpu in the runqueue, for various reasons
+     * (e.g., affinity). Only trigger a reset when it does.
+     */
+    if ( skipped_vcpus == 0 && snext->credit <= CSCHED2_CREDIT_RESET )
     {
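To make the intent of the new reset check easier to follow outside the hypervisor, here is a small standalone model of it. It is only a sketch under simplifying assumptions: the runqueue is a plain array ordered by credit, all skip reasons (hard affinity, deferral to a tickled cpu, migration resistance) are folded into one boolean, and the "default to the current vcpu" step of runq_candidate() is left out. All names, and the value of CREDIT_RESET, are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define CREDIT_RESET 0  /* hypothetical stand-in for CSCHED2_CREDIT_RESET */

/* Simplified model of a runqueue entry. */
struct vcpu_model {
    int  credit;
    bool can_run_here;  /* false models any of the "skip" reasons */
};

/*
 * Return the index of the first entry that can run on this cpu, or -1 if
 * there is none. *skipped counts the entries passed over before it.
 */
static int pick_candidate(const struct vcpu_model *runq, size_t n,
                          unsigned int *skipped)
{
    *skipped = 0;
    for (size_t i = 0; i < n; i++) {
        if (!runq[i].can_run_here) {
            (*skipped)++;
            continue;
        }
        return (int)i;
    }
    return -1;
}

/* Reset only when the candidate really is the head of the runqueue. */
static bool should_reset(const struct vcpu_model *runq, size_t n)
{
    unsigned int skipped;
    int idx = pick_candidate(runq, n, &skipped);

    return idx >= 0 && skipped == 0 && runq[idx].credit <= CREDIT_RESET;
}
```

With a head entry that cannot run on this cpu, a later entry with negative credit no longer triggers a reset, which is the behavioural change the commit message describes.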
[Xen-devel] [PATCH v2 02/10] xen: credit1: don't rate limit context switches in case of yields
Rate limiting was primarily introduced to avoid an excessive context switch rate due to interrupts and, in general, asynchronous events.

If a vcpu "voluntarily" yields, we really should let it give up the cpu for a while. In fact, it may be that it is yielding because it's about to start spinning, and there's little point in forcing a vcpu to spin for (potentially) the entire rate-limiting period.

Signed-off-by: Dario Faggioli
---
Cc: George Dunlap
Cc: Anshul Makkar
---
Changes from v1:
 * move this patch up in the series, and remove the Credit2 bits, as suggested during review;
---
 xen/common/sched_credit.c | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
index 4d84b5f..5700763 100644
--- a/xen/common/sched_credit.c
+++ b/xen/common/sched_credit.c
@@ -1802,9 +1802,16 @@ csched_schedule(
      * cpu and steal it.
      */
 
-    /* If we have schedule rate limiting enabled, check to see
-     * how long we've run for. */
-    if ( !tasklet_work_scheduled
+    /*
+     * If we have schedule rate limiting enabled, check to see
+     * how long we've run for.
+     *
+     * If scurr is yielding, however, we don't let rate limiting kick in.
+     * In fact, it may be the case that scurr is about to spin, and there's
+     * no point forcing it to do so until rate limiting expires.
+     */
+    if ( !test_bit(CSCHED_FLAG_VCPU_YIELD, &scurr->flags)
+         && !tasklet_work_scheduled
          && prv->ratelimit_us
          && vcpu_runnable(current)
          && !is_idle_vcpu(current)
___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
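The condition touched by this hunk can be read as a single predicate: "should rate limiting keep the current vcpu on the cpu?". A minimal standalone sketch of that predicate follows. The field names are hypothetical, and the final runtime comparison is an assumption about the part of the condition that lies outside the quoted hunk.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of the state the check looks at. */
struct rl_state {
    bool     yielding;      /* models CSCHED_FLAG_VCPU_YIELD being set   */
    bool     tasklet_work;  /* models tasklet_work_scheduled             */
    bool     runnable;      /* models vcpu_runnable(current)             */
    bool     is_idle;       /* models is_idle_vcpu(current)              */
    uint64_t ratelimit_us;  /* 0 means rate limiting is disabled         */
    uint64_t runtime_us;    /* how long the current vcpu has run so far  */
};

/* True when rate limiting should keep the current vcpu running. */
static bool ratelimit_keeps_current(const struct rl_state *s)
{
    return !s->yielding            /* a yielding vcpu is never held back */
        && !s->tasklet_work
        && s->ratelimit_us != 0
        && s->runnable
        && !s->is_idle
        && s->runtime_us < s->ratelimit_us;  /* assumed, not in the hunk */
}
```

Putting the yield test first mirrors the patch: once the vcpu has asked to give up the cpu, none of the other terms matter.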
[Xen-devel] [PATCH v2 01/10] xen: credit1: return the 'time remaining to the limit' as next timeslice.
If vcpu x has run for 200us, and sched_ratelimit_us is 1000us, continue running x _but_ return 1000us-200us as the next time slice. This way, the next scheduling point will happen in 800us, i.e., exactly at the point when x crosses the threshold, and can be descheduled (if appropriate).

Right now (without this patch), we're always returning sched_ratelimit_us (1000us, in the example above), which means we're (potentially) allowing x to run more than it should have been able to.

Note that, however, in order to avoid setting timers to very short intervals, which is part of the purpose of rate limiting, we never use a time slice smaller than a well defined threshold. Such threshold (CSCHED_MIN_TIMER, defined in this patch) is, in general, independent of rate limiting, but it looks like a good idea to set it to the minimum possible ratelimiting value.

Signed-off-by: Dario Faggioli
---
Cc: George Dunlap
---
Changes from v1:
 * introduce CSCHED_MIN_TIMER, as agreed during review.
---
 xen/common/sched_credit.c | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
index b63efac..4d84b5f 100644
--- a/xen/common/sched_credit.c
+++ b/xen/common/sched_credit.c
@@ -51,6 +51,8 @@
 /* Default timeslice: 30ms */
 #define CSCHED_DEFAULT_TSLICE_MS    30
 #define CSCHED_CREDITS_PER_MSEC     10
+/* Never set a timer shorter than this value. */
+#define CSCHED_MIN_TIMER            XEN_SYSCTL_SCHED_RATELIMIT_MIN
 
 
 /*
@@ -1811,7 +1813,15 @@ csched_schedule(
         snext = scurr;
         snext->start_time += now;
         perfc_incr(delay_ms);
-        tslice = MICROSECS(prv->ratelimit_us);
+        /*
+         * Next timeslice must last just until we'll have executed for
+         * ratelimit_us. However, to avoid setting a really short timer, which
+         * will most likely be inaccurate and counterproductive, we never go
+         * below CSCHED_MIN_TIMER.
+         */
+        tslice = MICROSECS(prv->ratelimit_us) - runtime;
+        if ( unlikely(runtime < CSCHED_MIN_TIMER) )
+            tslice = CSCHED_MIN_TIMER;
         ret.migrated = 0;
         goto out;
     }
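The arithmetic described in the commit message (remaining time to the limit, but never less than a minimum timer) can be sketched in isolation as below. Note the assumptions: everything is in microseconds, MIN_TIMER_US is a hypothetical stand-in for CSCHED_MIN_TIMER with an assumed value of 100, and the sketch clamps the computed slice itself, as the commit message and the code comment describe, whereas the hunk as posted tests `runtime` in its `if`.

```c
#include <stdint.h>

#define MIN_TIMER_US 100  /* hypothetical stand-in for CSCHED_MIN_TIMER */

/*
 * Time slice to hand back when rate limiting keeps the current vcpu
 * running: the time left until it reaches ratelimit_us, but never
 * shorter than MIN_TIMER_US.
 */
static uint64_t next_tslice_us(uint64_t ratelimit_us, uint64_t runtime_us)
{
    uint64_t remaining =
        runtime_us < ratelimit_us ? ratelimit_us - runtime_us : 0;

    return remaining < MIN_TIMER_US ? MIN_TIMER_US : remaining;
}
```

For the example in the commit message, next_tslice_us(1000, 200) gives 800, so the next scheduling point lands exactly where x crosses the rate-limiting threshold.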
[Xen-devel] [PATCH v2 03/10] xen: credit2: make tickling more deterministic
Right now, the following scenario can occur:
 - upon vcpu v wakeup, v itself is put in the runqueue, and pcpu X is tickled;
 - pcpu Y schedules (for whatever reason), sees v in the runqueue and picks it up.

This may seem ok (or even a good thing), but it's not. In fact, if runq_tickle() decided X is where v should run, it did it for a reason (load distribution, SMT support, cache hotness, affinity, etc), and we really should try as hard as possible to stick to that.

Of course, we can't be too strict, or we risk leaving vcpus in the runqueue while there is available CPU capacity. So, we only leave v in the runqueue --for X to pick it up-- if we see that X has been tickled and has not scheduled yet, i.e., it will have a real chance of actually selecting and scheduling v. If that is not the case, we schedule it on Y (or, at least, we consider that), as running somewhere non-ideal is better than not running at all.

The commit also adds performance counters for each of the possible situations.

Signed-off-by: Dario Faggioli
---
Cc: George Dunlap
Cc: Anshul Makkar
Cc: Jan Beulich
Cc: Andrew Cooper
---
Changes from v1:
 * always initialize tickled_cpu to -1, also for idle vcpus (in which case it just won't ever change to anything other than that), for improved readability and understandability;
 * logic for reporting back to csched_schedule() whether any vcpu was skipped, within runq_candidate(), and to only reset the credits if that did not happen, moved out from here to another patch.
---
 xen/common/sched_credit2.c   | 37 -
 xen/include/xen/perfc_defn.h |  3 +++
 2 files changed, 39 insertions(+), 1 deletion(-)

diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index 5c7d0dc..3986441 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -54,6 +54,7 @@
 #define TRC_CSCHED2_LOAD_CHECK       TRC_SCHED_CLASS_EVT(CSCHED2, 16)
 #define TRC_CSCHED2_LOAD_BALANCE     TRC_SCHED_CLASS_EVT(CSCHED2, 17)
 #define TRC_CSCHED2_PICKED_CPU       TRC_SCHED_CLASS_EVT(CSCHED2, 19)
+#define TRC_CSCHED2_RUNQ_CANDIDATE   TRC_SCHED_CLASS_EVT(CSCHED2, 20)
 
 /*
  * WARNING: This is still in an experimental phase. Status and work can be found at the
@@ -398,6 +399,7 @@ struct csched2_vcpu {
     int credit;
     s_time_t start_time; /* When we were scheduled (used for credit) */
     unsigned flags;      /* 16 bits doesn't seem to play well with clear_bit() */
+    int tickled_cpu;     /* cpu tickled for picking us up (-1 if none) */
 
     /* Individual contribution to load */
     s_time_t load_last_update;  /* Last time average was updated */
@@ -1049,6 +1051,10 @@ runq_tickle(const struct scheduler *ops, struct csched2_vcpu *new, s_time_t now)
     __cpumask_set_cpu(ipid, &rqd->tickled);
     smt_idle_mask_clear(ipid, &rqd->smt_idle);
     cpu_raise_softirq(ipid, SCHEDULE_SOFTIRQ);
+
+    if ( unlikely(new->tickled_cpu != -1) )
+        SCHED_STAT_CRANK(tickled_cpu_overwritten);
+    new->tickled_cpu = ipid;
 }
 
 /*
@@ -1276,6 +1282,7 @@ csched2_alloc_vdata(const struct scheduler *ops, struct vcpu *vc, void *dd)
         svc->credit = CSCHED2_IDLE_CREDIT;
         svc->weight = 0;
     }
+    svc->tickled_cpu = -1;
 
     SCHED_STAT_CRANK(vcpu_alloc);
 
@@ -2268,6 +2275,17 @@ runq_candidate(struct csched2_runqueue_data *rqd,
         if ( !cpumask_test_cpu(cpu, svc->vcpu->cpu_hard_affinity) )
             continue;
 
+        /*
+         * If a vcpu is meant to be picked up by another processor, and such
+         * processor has not scheduled yet, leave it in the runqueue for him.
+         */
+        if ( svc->tickled_cpu != -1 && svc->tickled_cpu != cpu &&
+             cpumask_test_cpu(svc->tickled_cpu, &rqd->tickled) )
+        {
+            SCHED_STAT_CRANK(deferred_to_tickled_cpu);
+            continue;
+        }
+
         /* If this is on a different processor, don't pull it unless
          * its credit is at least CSCHED2_MIGRATE_RESIST higher. */
         if ( svc->vcpu->processor != cpu
@@ -2284,9 +2302,25 @@ runq_candidate(struct csched2_runqueue_data *rqd,
 
         /* In any case, if we got this far, break. */
         break;
+    }
 
+    if ( unlikely(tb_init_done) )
+    {
+        struct {
+            unsigned vcpu:16, dom:16;
+            unsigned tickled_cpu;
+        } d;
+        d.dom = snext->vcpu->domain->domain_id;
+        d.vcpu = snext->vcpu->vcpu_id;
+        d.tickled_cpu = snext->tickled_cpu;
+        __trace_var(TRC_CSCHED2_RUNQ_CANDIDATE, 1,
+                    sizeof(d),
+                    (unsigned char *)&d);
     }
 
+    if ( unlikely(snext->tickled_cpu != -1 && snext->tickled_cpu != cpu) )
+        SCHED_STAT_CRANK(tickled_cpu_overridden);
+
     return snext;
 }
 
@@ -2351,7 +2385,7 @@ csched2_schedule(
             snext = CSCHED2_VCPU(idle_vcpu[cpu]);
     }
     else
-
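The deferral rule added to runq_candidate() boils down to a three-part test. The following standalone sketch models it with a plain boolean array in place of the hypervisor's cpumask. MAX_CPUS, the struct and the function name are hypothetical, chosen only for illustration.

```c
#include <stdbool.h>

#define MAX_CPUS 64  /* hypothetical bound for the model's cpu mask */

/* Minimal model of the per-vcpu state the check needs. */
struct tickle_model {
    int tickled_cpu;  /* cpu tickled to pick this vcpu up, -1 if none */
};

/*
 * True when the vcpu should be left in the runqueue for the cpu that was
 * tickled for it: a cpu was tickled, it is not the one scheduling now,
 * and it is still marked as tickled (i.e. it has not scheduled yet).
 */
static bool defer_to_tickled_cpu(const struct tickle_model *v, int this_cpu,
                                 const bool tickled_mask[MAX_CPUS])
{
    return v->tickled_cpu != -1
        && v->tickled_cpu < MAX_CPUS
        && v->tickled_cpu != this_cpu
        && tickled_mask[v->tickled_cpu];
}
```

The third condition is what keeps the rule from being too strict: once the tickled cpu has gone through the scheduler and cleared its bit, any other cpu is free to pick the vcpu up.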
[Xen-devel] [PATCH v2 00/10] sched: Credit1 and Credit2 improvements... but *NO* soft-affinity for Credit2!
Hey,

This is v2 of my Credit1 and Credit2 improvements series. First posting is here:
 https://lists.xen.org/archives/html/xen-devel/2016-08/msg02183.html

Now, a couple of things:
 - some of the patches have been applied already out of v1;
 - I've reshuffled the remaining patches a bit, mostly upon reviewers' requests to do so;
 - I'm not including the 'soft affinity for Credit2' patches.

In fact, most of the soft affinity work still needs to be reviewed. OTOH, the patches that I've put together here have been reviewed already, and I think I've addressed all the review comments, which means that --if people (i.e., mostly George!:-P) manage to have a quick look at them-- they can even go in for 4.8.

And while there's really no point in rushing soft-affinity for Credit2 in at this stage, these patches bring some nice (and moderately simple) improvements for both the schedulers, which I think should make it in the release.

There's a branch available here:
 git://xenbits.xen.org/people/dariof/xen.git rel/sched/misc-credit1-credit2-plus-credit2-softaff-v2
 http://xenbits.xen.org/gitweb/?p=people/dariof/xen.git;a=shortlog;h=refs/heads/rel/sched/misc-credit1-credit2-plus-credit2-softaff-v2
 https://travis-ci.org/fdario/xen/builds/163898322

Thanks and Regards,
Dario
---
Dario Faggioli (10):
      xen: credit1: return the 'time remaining to the limit' as next timeslice.
      xen: credit1: don't rate limit context switches in case of yields
      xen: credit2: make tickling more deterministic
      xen: credit2: only reset credit on reset condition
      xen: credit2: implement yield()
      xen: tracing: add trace records for schedule and rate-limiting.
      tools: tracing: handle more scheduling related events.
      libxl: fix coding style of credit1 parameters related functions
      libxl: allow to set the ratelimit value online for Credit2
      xl: allow to set the ratelimit value online for Credit2

 docs/man/xl.pod.1.in                |    9 ++
 docs/misc/xen-command-line.markdown |   10 ++
 tools/libxl/libxl.c                 |  112 +-
 tools/libxl/libxl.h                 |   11 ++
 tools/libxl/libxl_types.idl         |    4 +
 tools/libxl/xl_cmdimpl.c            |   91 +++---
 tools/libxl/xl_cmdtable.c           |    2
 tools/xentrace/formats              |    8 ++
 tools/xentrace/xenalyze.c           |  101
 xen/common/sched_credit.c           |   57 ++-
 xen/common/sched_credit2.c          |  180 +++
 xen/common/sched_rt.c               |   15 +++
 xen/common/schedule.c               |    2
 xen/include/xen/perfc_defn.h        |    4 +
 14 files changed, 540 insertions(+), 66 deletions(-)
--
<> (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
Re: [Xen-devel] [PATCH 05/24] xen: credit2: make tickling more deterministic
On Tue, 2016-09-13 at 12:28 +0100, George Dunlap wrote:
> On 17/08/16 18:18, Dario Faggioli wrote:
> > diff --git a/xen/common/sched_credit2.c
> > @@ -2233,7 +2241,8 @@ void __dump_execstate(void *unused);
> >  static struct csched2_vcpu *
> >  runq_candidate(struct csched2_runqueue_data *rqd,
> >                 struct csched2_vcpu *scurr,
> > -               int cpu, s_time_t now)
> > +               int cpu, s_time_t now,
> > +               unsigned int *pos)
>
> I think I'd prefer if this were called "skipped" or something like that
> -- to indicate how many vcpus in the runqueue had been skipped before
> coming to this one.
>
Done.

> > @@ -2298,6 +2340,7 @@ csched2_schedule(
> >      struct csched2_runqueue_data *rqd;
> >      struct csched2_vcpu * const scurr = CSCHED2_VCPU(current);
> >      struct csched2_vcpu *snext = NULL;
> > +    unsigned int snext_pos = 0;
> >      struct task_slice ret;
> >
> >      SCHED_STAT_CRANK(schedule);
> > @@ -2347,7 +2390,7 @@ csched2_schedule(
> >              snext = CSCHED2_VCPU(idle_vcpu[cpu]);
> >      }
> >      else
> > -        snext = runq_candidate(rqd, scurr, cpu, now);
> > +        snext = runq_candidate(rqd, scurr, cpu, now, &snext_pos);
> >
> >      /* If switching from a non-idle runnable vcpu, put it
> >       * back on the runqueue. */
> > @@ -2371,8 +2414,21 @@ csched2_schedule(
> >          __set_bit(__CSFLAG_scheduled, &snext->flags);
> >      }
> >
> > -    /* Check for the reset condition */
> > -    if ( snext->credit <= CSCHED2_CREDIT_RESET )
> > +    /*
> > +     * The reset condition is "has a scheduler epoch come to an end?".
> > +     * The way this is enforced is checking whether the vcpu at the top
> > +     * of the runqueue has negative credits. This means the epochs have
> > +     * variable length, as in one epoch expires when:
> > +     *  1) the vcpu at the top of the runqueue has executed for
> > +     *     around 10 ms (with default parameters);
> > +     *  2) no other vcpu with higher credits wants to run.
> > +     *
> > +     * Here, where we want to check for reset, we need to make sure the
> > +     * proper vcpu is being used. In fact, runqueue_candidate() may have
> > +     * not returned the first vcpu in the runqueue, for various reasons
> > +     * (e.g., affinity). Only trigger a reset when it does.
> > +     */
> > +    if ( snext_pos == 0 && snext->credit <= CSCHED2_CREDIT_RESET )
>
> This bit wasn't mentioned in the description. :-)
>
You're right. Actually, I think this change deserves to be in its own patch, so in v2 I'm splitting this patch in two.

> There's a certain amount of sense to the idea here, but it's the kind of
> thing that may have strange side effects. Did you look at traces before
> and after this change? And does the behavior seem more rational?
>
I have. It's not like it was happening a lot of times that we were resetting upon the wrong vcpus, but I indeed have caught a couple of examples. And yes, the trace looked more 'regular' with this patch. Or, IOW, without this patch, some of the reset events were suspiciously close to each other.

TBH, in the vast majority of the cases, even when a "spurious reset" was involved, the difference was rather hard to tell, but please consider that the combination of hard-affinity, this patch and soft-affinity will potentially make things much worse (and in fact, I saw the most severe occurrences when using hard-affinity).

It's also rather hard to measure the effect, but I think what is implemented here is the right thing to do. And even if it may be hard to measure the performance impact, I claim that this is a 'correctness' issue, or at least a matter of adhering as much as possible to the algorithm theory and idea.

> If so, I'm happy to trust your judgement -- just want to check to make
> sure. :-)
>
Ah, thanks. :-)

Regards,
Dario
--
<> (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
[Xen-devel] [Resend PATCH 2/2] Xen/timer: Process softirq during dumping timer info
Dumping timer info may run for a long time on a huge machine with a lot of physical cpus. To avoid triggering the NMI watchdog, add process_pending_softirqs() in the loop that dumps timer info.

Signed-off-by: Lan Tianyu
---
 xen/common/timer.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/xen/common/timer.c b/xen/common/timer.c
index 29a60a9..ab6bca0 100644
--- a/xen/common/timer.c
+++ b/xen/common/timer.c
@@ -530,6 +530,7 @@ static void dump_timerq(unsigned char key)
     {
         ts = &per_cpu(timers, i);
 
+        process_pending_softirqs();
         printk("CPU%02d:\n", i);
         spin_lock_irqsave(&ts->lock, flags);
         for ( j = 1; j <= GET_HEAP_SIZE(ts->heap); j++ )
--
1.7.1
[Xen-devel] [Resend PATCH 0/2] Xen: Fix Xen hypervisor panic during dumping timer info on huge machine.
Resend because the patchset seems to have missed the xen-devel mailing list.

This patchset fixes the NMI watchdog being triggered while dumping timer info on a huge machine with a large number of physical cpus. For details, please see the changelog of patch 1.

Previous discussion:
https://patchwork.kernel.org/patch/9328449/

Lan Tianyu (2):
  Xen/Keyhandler: Make keyhandler always run in tasklet
  Xen/timer: Process softirq during dumping timer info

 xen/common/keyhandler.c |    8 +---
 xen/common/timer.c      |    1 +
 2 files changed, 6 insertions(+), 3 deletions(-)
[Xen-devel] [Resend PATCH 1/2] Xen/Keyhandler: Make keyhandler always run in tasklet
A keyhandler may run for a long time in a timer handler on a large machine with a lot of physical cpus (e.g. the keyhandler for dumping timer info) when the serial port driver works in poll mode. When a timer interrupt arrives, the timer subsystem runs all timer handlers before programming the next timer interrupt. So if a timer handler runs for longer than the watchdog timeout, the watchdog's own timer handler is blocked from feeding the watchdog and the Xen hypervisor panics. This patch fixes the issue by always scheduling a tasklet to run the keyhandler, so that no timer handler runs for too long.

Signed-off-by: Lan Tianyu
---
 xen/common/keyhandler.c |    8 +---
 1 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/xen/common/keyhandler.c b/xen/common/keyhandler.c
index 16de6e8..fce52d2 100644
--- a/xen/common/keyhandler.c
+++ b/xen/common/keyhandler.c
@@ -75,7 +75,9 @@ static struct keyhandler {
 
 static void keypress_action(unsigned long unused)
 {
-    handle_keypress(keypress_key, NULL);
+    console_start_log_everything();
+    key_table[keypress_key].fn(keypress_key);
+    console_end_log_everything();
 }
 
 static DECLARE_TASKLET(keypress_tasklet, keypress_action, 0);
@@ -87,10 +89,10 @@ void handle_keypress(unsigned char key, struct cpu_user_regs *regs)
     if ( key >= ARRAY_SIZE(key_table) || !(h = &key_table[key])->fn )
         return;
 
-    if ( !in_irq() || h->irq_callback )
+    if ( h->irq_callback )
     {
         console_start_log_everything();
-        h->irq_callback ? h->irq_fn(key, regs) : h->fn(key);
+        h->irq_fn(key, regs);
         console_end_log_everything();
     }
     else
--
1.7.1
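The dispatch rule this patch ends up with (run IRQ-safe handlers inline, defer everything else) can be modelled in a few lines of standalone C. This is only a sketch: the "tasklet" is reduced to a single pending slot that the caller drains explicitly, both handler kinds share one function pointer type for brevity, and every name below is hypothetical rather than taken from keyhandler.c.

```c
#include <stdbool.h>
#include <stddef.h>

typedef void (*key_fn)(unsigned char key);

/* Simplified key table entry. */
struct key_entry {
    key_fn fn;            /* NULL means no handler is registered         */
    bool   irq_callback;  /* true: short handler, safe to run inline     */
};

enum dispatch { DISPATCH_NONE, DISPATCH_INLINE, DISPATCH_DEFERRED };

/* One-slot stand-in for the tasklet that runs long handlers later. */
static unsigned char pending_key;
static bool pending;

/* Demo handler that just counts invocations. */
static int calls;
static void count_handler(unsigned char key) { (void)key; calls++; }

/* Run IRQ-safe handlers now; queue all others for deferred execution. */
static enum dispatch dispatch_key(const struct key_entry *table, size_t n,
                                  unsigned char key)
{
    if (key >= n || table[key].fn == NULL)
        return DISPATCH_NONE;
    if (table[key].irq_callback) {
        table[key].fn(key);
        return DISPATCH_INLINE;
    }
    pending_key = key;
    pending = true;
    return DISPATCH_DEFERRED;
}

/* Models the tasklet body: run the queued handler, if there is one. */
static void run_deferred(const struct key_entry *table)
{
    if (pending) {
        pending = false;
        table[pending_key].fn(pending_key);
    }
}
```

The point of the split is that the caller of dispatch_key() returns quickly for any handler not marked IRQ-safe, so a slow handler can no longer hold up whatever context delivered the keypress.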
[Xen-devel] [qemu-mainline test] 101213: tolerable FAIL - PUSHED
flight 101213 qemu-mainline real [real] http://logs.test-lab.xenproject.org/osstest/logs/101213/ Failures :-/ but no regressions. Regressions which are regarded as allowable (not blocking): test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail like 101203 test-amd64-amd64-xl-rtds 9 debian-install fail like 101203 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stopfail like 101203 Tests which did not succeed, but are not blocking: test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt-xsm 14 guest-saverestorefail never pass test-armhf-armhf-libvirt-raw 11 migrate-support-checkfail never pass test-armhf-armhf-libvirt-raw 13 guest-saverestorefail never pass test-armhf-armhf-xl-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-xl-xsm 13 saverestore-support-checkfail never pass test-armhf-armhf-libvirt 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt 14 guest-saverestorefail never pass test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-vhd 11 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 12 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 12 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 13 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-qcow2 11 migrate-support-checkfail never pass test-armhf-armhf-libvirt-qcow2 13 guest-saverestorefail never pass test-armhf-armhf-xl-rtds 12 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 13 saverestore-support-checkfail never pass test-amd64-i386-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-amd64-i386-libvirt 12 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 13 
saverestore-support-checkfail never pass test-armhf-armhf-xl-arndale 12 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 13 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-armhf-armhf-xl 12 migrate-support-checkfail never pass test-armhf-armhf-xl 13 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail never pass test-amd64-amd64-libvirt 12 migrate-support-checkfail never pass test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-amd64-xl-pvh-amd 11 guest-start fail never pass test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2 fail never pass test-amd64-amd64-xl-pvh-intel 11 guest-start fail never pass version targeted for testing: qemuucc9a366d3b161d255fcf25aad30e0c8fcc766013 baseline version: qemuuc640f2849ee8775fe1bbd7a2772610aa77816f9f Last test of basis 101203 2016-09-29 06:01:40 Z0 days Testing same since 101213 2016-09-29 19:12:02 Z0 days1 attempts People who touched revisions under test: Daniel P. BerrangeLluÃs Vilanova Peter Maydell Stefan Hajnoczi jobs: build-amd64-xsm pass build-armhf-xsm pass build-i386-xsm pass build-amd64 pass build-armhf pass build-i386 pass build-amd64-libvirt pass build-armhf-libvirt pass build-i386-libvirt pass build-amd64-pvopspass build-armhf-pvopspass build-i386-pvops pass test-amd64-amd64-xl pass test-armhf-armhf-xl pass test-amd64-i386-xl pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm pass
Re: [Xen-devel] [PATCH 14/24] libxl: allow to set the ratelimit value online for Credit2
On Mon, 2016-08-22 at 10:28 +0100, Ian Jackson wrote:
> Dario Faggioli writes ("[PATCH 14/24] libxl: allow to set the
> ratelimit value online for Credit2"):
> ...
> >
> > -    rc = xc_sched_credit_params_set(ctx->xch, poolid, );
> > -    if ( rc < 0 ) {
> > -        LOGE(ERROR, "setting sched credit param");
> > -        GC_FREE;
> > -        return ERROR_FAIL;
> > +    r = xc_sched_credit_params_set(ctx->xch, poolid, );
> > +    if ( r < 0 ) {
> > +        LOGE(ERROR, "Setting Credit scheduler parameters");
> > +        rc = ERROR_FAIL;
> > +        goto out;
>
> I had to read this three times to figure out what the change was.
>
> It is good that you are fixing the coding style but can you please put
> it in a separate patch ?
>
Done in v2.

> But I wonder whether there will still be lots of rather formulaic code
> that could profitably be generalised somehow. I'd appreciate your
> views on whether that would be possible, and whether it would be a
> good idea..
>
I've checked, as promised.

TBH, as George said already, I don't see much more room for factoring or generalizing. Certainly not in libxl. In xl (that would be the next patch, though), I especially dislike those main_sched_credit(), main_sched_credit2(), etc., but I think that, given what the interface looks like, there is again little that we can do (there's some level of generalization and indirection already, actually).

So, again, apart from splitting coding style and functional changes, I don't see other ways of improving this patch. If you have more concrete ideas, please share them. :-)

Thanks and Regards,
Dario
--
<> (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
[Xen-devel] [xen-unstable test] 101209: tolerable FAIL - PUSHED
flight 101209 xen-unstable real [real] http://logs.test-lab.xenproject.org/osstest/logs/101209/ Failures :-/ but no regressions. Regressions which are regarded as allowable (not blocking): test-amd64-amd64-xl-rtds 9 debian-install fail blocked in 101182 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop fail like 101182 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail like 101182 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stopfail like 101182 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stopfail like 101182 Tests which did not succeed, but are not blocking: test-amd64-i386-rumprun-i386 1 build-check(1) blocked n/a test-amd64-amd64-rumprun-amd64 1 build-check(1) blocked n/a build-amd64-rumprun 5 rumprun-buildfail never pass test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt-xsm 14 guest-saverestorefail never pass build-i386-rumprun5 rumprun-buildfail never pass test-armhf-armhf-libvirt 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt 14 guest-saverestorefail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-armhf-armhf-libvirt-qcow2 11 migrate-support-checkfail never pass test-armhf-armhf-libvirt-qcow2 13 guest-saverestorefail never pass test-armhf-armhf-xl 12 migrate-support-checkfail never pass test-armhf-armhf-xl 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 12 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-vhd 11 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 12 saverestore-support-checkfail never pass test-armhf-armhf-xl-arndale 12 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 13 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-raw 11 migrate-support-checkfail never pass test-armhf-armhf-libvirt-raw 13 guest-saverestorefail never pass test-armhf-armhf-xl-rtds 12 
migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-xl-xsm 13 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail never pass test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-amd64-amd64-libvirt 12 migrate-support-checkfail never pass test-amd64-amd64-xl-pvh-amd 11 guest-start fail never pass test-amd64-i386-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-i386-libvirt 12 migrate-support-checkfail never pass test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2 fail never pass test-amd64-amd64-xl-pvh-intel 11 guest-start fail never pass version targeted for testing: xen fb8be95ca0b5fc816fd2234925f95c3f82ead824 baseline version: xen 1e75ed8b64bc1a9b47e540e6f100f17ec6d97f1b Last test of basis 101182 2016-09-28 09:43:21 Z1 days Failing since101190 2016-09-28 16:15:31 Z1 days3 attempts Testing same since 101209 2016-09-29 15:14:48 Z0 days1 attempts People who touched revisions under test: Boris OstrovskyDaniel Kiper Dario Faggioli Ian Jackson Jan Beulich Jan Beulich [for non-ARM parts] Jan Beulich [non-arm parts] Juergen Gross Julien Grall Keir Fraser Kevin Tian Konrad Rzeszutek Wilk Konrad Rzeszutek Wilk [for Oracle, VirtualIron and Sun contributions] Kouya Shimura Lars Kurth Paul Lai Simon Horman Stefan Berger Stefano Stabellini
[Xen-devel] [PATCH v9 10/13] x86/setup: use XEN_IMG_OFFSET instead of...
..calculating its value during runtime.

Signed-off-by: Daniel Kiper
Acked-by: Jan Beulich
---
 xen/arch/x86/setup.c |    4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index eb674c8..f42bf7b 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -899,7 +899,6 @@ void __init noreturn __start_xen(unsigned long mbi_p)
     l4_pgentry_t *pl4e;
     l3_pgentry_t *pl3e;
     l2_pgentry_t *pl2e;
-    uint64_t load_start;
     int i, j, k;
 
     /* Select relocation address. */
@@ -913,9 +912,8 @@ void __init noreturn __start_xen(unsigned long mbi_p)
              * with a barrier(). After this we must *not* modify static/global
              * data until after we have switched to the relocated pagetables!
              */
-            load_start = (unsigned long)_start - XEN_VIRT_START;
             barrier();
-            move_memory(e + load_start, load_start, _end - _start, 1);
+            move_memory(e + XEN_IMG_OFFSET, XEN_IMG_OFFSET, _end - _start, 1);
 
             /* Walk initial pagetables, relocating page directory entries. */
             pl4e = __va(__pa(idle_pg_table));
-- 
1.7.10.4

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
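For readers following the reasoning: the removed load_start was computed as _start - XEN_VIRT_START, and the linker script change elsewhere in this series places _start at __XEN_VIRT_START + XEN_IMG_OFFSET, so the difference is a build-time constant. Below is a minimal C sketch of that identity. The numeric value of XEN_VIRT_START and the fake_start variable (a stand-in for the linker-provided _start symbol) are assumptions made for illustration; they are not taken from this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative values only; the real ones come from Xen's headers and Rules.mk. */
#define XEN_VIRT_START 0xffff82d080000000ULL  /* assumed virtual base */
#define XEN_IMG_OFFSET 0x200000ULL            /* 2 MiB default load offset */

/* Hypothetical stand-in for the address of the linker symbol _start. */
static const uint64_t fake_start = XEN_VIRT_START + XEN_IMG_OFFSET;

/* What the removed code computed at run time. */
static uint64_t runtime_load_start(uint64_t start_addr)
{
    return start_addr - XEN_VIRT_START;
}

/* What the patched code passes straight to move_memory(). */
static uint64_t constant_load_start(void)
{
    /* The sketch assumes the offset is 2 MiB aligned, as in this series. */
    assert(XEN_IMG_OFFSET % 0x200000ULL == 0);
    return XEN_IMG_OFFSET;
}
```

Both functions return the same value for any image linked at the default offset, which is why the local variable could be dropped without a functional change.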
[Xen-devel] [PATCH v9 07/13] x86: add multiboot2 protocol support for EFI platforms
This way Xen can be loaded on EFI platforms using GRUB2 and
other boot loaders which support multiboot2 protocol.

Signed-off-by: Daniel Kiper
---
v9 - suggestions/fixes:
   - use .L labels instead of numeric ones in multiboot2 data scanning loops
     (suggested by Jan Beulich).

v8 - suggestions/fixes:
   - use __bss_start(%rip)/__bss_end(%rip) instead of
     .startof.(.bss)(%rip)/$.sizeof.(.bss) because the latter is not tested
     extensively in different build environments yet
     (suggested by Andrew Cooper),
   - fix multiboot2 data scanning loop in x86_32 code
     (suggested by Jan Beulich),
   - add check for extra mem for mbi data if Xen is loaded
     via multiboot2 protocol on EFI platform
     (suggested by Jan Beulich),
   - improve comments
     (suggested by Jan Beulich).

v7 - suggestions/fixes:
   - do not allocate memory for trampoline twice if we were
     loaded via multiboot2 protocol on EFI platform,
   - wrap long line
     (suggested by Jan Beulich),
   - improve comments
     (suggested by Jan Beulich).

v6 - suggestions/fixes:
   - improve label names in assembly error printing code
     (suggested by Jan Beulich),
   - improve comments
     (suggested by Jan Beulich),
   - various minor cleanups and fixes
     (suggested by Jan Beulich).

v4 - suggestions/fixes:
   - remove redundant BSS alignment,
   - update BSS alignment check,
   - use __set_bit() instead of set_bit() if possible
     (suggested by Jan Beulich),
   - call efi_arch_cpu() from efi_multiboot2()
     even if the same work is done later in other place right now
     (suggested by Jan Beulich),
   - xen/arch/x86/efi/stub.c:efi_multiboot2()
     fail properly on EFI platforms,
   - do not read data beyond the end of multiboot2
     information in xen/arch/x86/boot/head.S
     (suggested by Jan Beulich),
   - use 32-bit registers in x86_64 code if possible
     (suggested by Jan Beulich),
   - multiboot2 information address is 64-bit
     in x86_64 code, so, treat it as is
     (suggested by Jan Beulich),
   - use cmovcc if possible,
   - leave only one space between rep and stosq
     (suggested by Jan Beulich),
   - improve error handling,
   - improve early error messages
     (suggested by Jan Beulich),
   - improve early error messages printing code,
   - improve label names
     (suggested by Jan Beulich),
   - improve comments
     (suggested by Jan Beulich),
   - various minor cleanups.

v3 - suggestions/fixes:
   - take into account alignment when skipping multiboot2 fixed part
     (suggested by Konrad Rzeszutek Wilk),
   - improve segment registers initialization
     (suggested by Jan Beulich),
   - improve comments
     (suggested by Jan Beulich and Konrad Rzeszutek Wilk),
   - improve commit message
     (suggested by Jan Beulich).

v2 - suggestions/fixes:
   - generate multiboot2 header using macros
     (suggested by Jan Beulich),
   - switch CPU to x86_32 mode before
     jumping to 32-bit code
     (suggested by Andrew Cooper),
   - reduce code changes to increase patch readability
     (suggested by Jan Beulich),
   - improve comments
     (suggested by Jan Beulich),
   - ignore MULTIBOOT2_TAG_TYPE_BASIC_MEMINFO tag on EFI platform
     and find on my own multiboot2.mem_lower value,
   - stop execution if EFI platform is detected
     in legacy BIOS path.
--- xen/arch/x86/boot/head.S | 260 ++--- xen/arch/x86/efi/efi-boot.h | 54 +++- xen/arch/x86/efi/stub.c | 38 ++ xen/arch/x86/x86_64/asm-offsets.c |2 + xen/arch/x86/xen.lds.S|4 +- xen/common/efi/boot.c | 11 ++ 6 files changed, 346 insertions(+), 23 deletions(-) diff --git a/xen/arch/x86/boot/head.S b/xen/arch/x86/boot/head.S index d423fd8..0155b32 100644 --- a/xen/arch/x86/boot/head.S +++ b/xen/arch/x86/boot/head.S @@ -89,6 +89,13 @@ multiboot2_header_start: 0, /* Number of the lines - no preference. */ \ 0 /* Number of bits per pixel - no preference. */ +/* Inhibit bootloader from calling ExitBootServices(). */ +mb2ht_init MB2_HT(EFI_BS), MB2_HT(OPTIONAL) + +/* EFI64 entry point. */ +mb2ht_init MB2_HT(ENTRY_ADDRESS_EFI64), MB2_HT(OPTIONAL), \ + sym_phys(__efi64_start) + /* Multiboot2 header end tag. */ mb2ht_init MB2_HT(END), MB2_HT(REQUIRED) .Lmultiboot2_header_end: @@ -100,20 +107,49 @@ multiboot2_header_start: gdt_boot_descr: .word 6*8-1 .long sym_phys(trampoline_gdt) +.long 0 /* Needed for 64-bit lgdt */ + +cs32_switch_addr: +.long sym_phys(cs32_switch) +.word BOOT_CS32 + +.align 4 +vga_text_buffer: +.long 0xb8000 .Lbad_cpu_msg: .asciz "ERR: Not a 64-bit CPU!" .Lbad_ldr_msg: .asciz "ERR: Not a Multiboot bootloader!" +.Lbad_ldr_nbs: .asciz "ERR: Bootloader shutdown EFI x64 boot services!"
[Xen-devel] [PATCH v9 08/13] x86/boot: implement early command line parser in C
Current early command line parser implementation in assembler
is very difficult to change to relocatable stuff using segment
registers. This requires a lot of changes in very weird and
fragile code. So, reimplement this functionality in C. This
way code will be relocatable out of the box (without playing
with segment registers) and much easier to maintain.

Additionally, put all common cmdline.c and reloc.c definitions
into defs.h header. This way we do not duplicate needlessly
some stuff.

And finally remove unused xen/include/asm-x86/config.h
header from reloc.c dependencies.

Suggested-by: Andrew Cooper
Signed-off-by: Daniel Kiper
Acked-by: Jan Beulich
---
v7 - suggestions/fixes:
   - add min() macro
     (suggested by Jan Beulich),
   - add padding to early_boot_opts_t
     in more standard way
     (suggested by Jan Beulich),
   - simplify defs.h dependencies
     (suggested by Jan Beulich).

v6 - suggestions/fixes:
   - put common cmdline.c and reloc.c
     definitions into defs.h header
     (suggested by Jan Beulich),
   - use xen/include/xen/stdbool.h and bool type from it
     instead of own defined bool_t
     (suggested by Jan Beulich),
   - define delim_chars as constant
     (suggested by Jan Beulich),
   - properly align trampoline.S:early_boot_opts struct
     (suggested by Jan Beulich),
   - fix overflow check in strtoui()
     (suggested by Jan Beulich),
   - remove unused xen/include/asm-x86/config.h
     header from reloc.c dependencies,
   - improve commit message.

v4 - suggestions/fixes:
   - move to stdcall calling convention
     (suggested by Jan Beulich),
   - define bool_t and use it properly
     (suggested by Jan Beulich),
   - put list of delimiter chars into
     static const char[]
     (suggested by Jan Beulich),
   - use strlen() instead of strlen_opt()
     (suggested by Jan Beulich),
   - change strtoi() to strtoui() and
     optimize it a bit
     (suggested by Jan Beulich),
   - define strchr() and use it in strtoui()
     (suggested by Jan Beulich),
   - optimize vga_parse()
     (suggested by Jan Beulich),
   - move !cmdline check from assembly to C
     (suggested by Jan Beulich),
   - remove my name from copyright (Oracle requirement)
     (suggested by Konrad Rzeszutek Wilk).

v3 - suggestions/fixes:
   - optimize some code
     (suggested by Jan Beulich),
   - put VESA data into early_boot_opts_t members
     (suggested by Jan Beulich),
   - rename some functions and variables
     (suggested by Jan Beulich),
   - move around video.h include in xen/arch/x86/boot/trampoline.S
     (suggested by Jan Beulich),
   - fix coding style
     (suggested by Jan Beulich),
   - fix build with older GCC
     (suggested by Konrad Rzeszutek Wilk),
   - remove redundant comments
     (suggested by Jan Beulich),
   - add some comments,
   - improve commit message
     (suggested by Jan Beulich).
--- .gitignore |5 +- xen/arch/x86/Makefile |2 +- xen/arch/x86/boot/Makefile | 11 +- xen/arch/x86/boot/build32.mk |2 + xen/arch/x86/boot/cmdline.S| 367 xen/arch/x86/boot/cmdline.c| 340 + xen/arch/x86/boot/defs.h | 58 +++ xen/arch/x86/boot/edd.S|3 - xen/arch/x86/boot/head.S |8 + xen/arch/x86/boot/reloc.c | 13 +- xen/arch/x86/boot/trampoline.S | 15 ++ xen/arch/x86/boot/video.S |7 - 12 files changed, 437 insertions(+), 394 deletions(-) delete mode 100644 xen/arch/x86/boot/cmdline.S create mode 100644 xen/arch/x86/boot/cmdline.c create mode 100644 xen/arch/x86/boot/defs.h diff --git a/.gitignore b/.gitignore index eeabe0b..1cd886f 100644 --- a/.gitignore +++ b/.gitignore @@ -247,9 +247,10 @@ xen/arch/arm/xen.lds xen/arch/x86/asm-offsets.s xen/arch/x86/boot/mkelf32 xen/arch/x86/xen.lds +xen/arch/x86/boot/cmdline.S xen/arch/x86/boot/reloc.S -xen/arch/x86/boot/reloc.bin -xen/arch/x86/boot/reloc.lnk +xen/arch/x86/boot/*.bin +xen/arch/x86/boot/*.lnk xen/arch/x86/efi.lds xen/arch/x86/efi/check.efi xen/arch/x86/efi/disabled diff --git a/xen/arch/x86/Makefile b/xen/arch/x86/Makefile index 2a0781a..e74fe62 100644 --- a/xen/arch/x86/Makefile +++ b/xen/arch/x86/Makefile @@ -220,5 +220,5 @@ clean:: rm -f asm-offsets.s *.lds boot/*.o boot/*~ boot/core boot/mkelf32 rm -f $(BASEDIR)/.xen-syms.[0-9]* boot/.*.d rm -f $(BASEDIR)/.xen.efi.[0-9]* efi/*.efi efi/disabled efi/mkreloc - rm -f boot/reloc.S boot/reloc.lnk boot/reloc.bin + rm -f boot/cmdline.S boot/reloc.S boot/*.lnk boot/*.bin rm -f note.o diff --git a/xen/arch/x86/boot/Makefile b/xen/arch/x86/boot/Makefile index 06893d8..c6246c8 100644 --- a/xen/arch/x86/boot/Makefile +++ b/xen/arch/x86/boot/Makefile @@ -1,9 +1,16 @@ obj-bin-y += head.o -RELOC_DEPS = $(BASEDIR)/include/asm-x86/config.h
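The changelog above mentions a delimiter list kept in a constant, a locally defined strchr(), and an overflow check in strtoui(). The body of cmdline.c is not visible in this excerpt, so the following is only a freestanding sketch of that kind of parser, not the patch's code: the names parse_uint and find_char, the delimiter set, and the return convention are all hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical delimiter set; the patch's actual list is not shown here. */
static const char delim_chars[] = " \n\r\t,";

/* strchr-like helper, since early boot code cannot rely on libc. */
static const char *find_char(const char *s, char c)
{
    for ( ; *s != '\0'; ++s )
        if ( *s == c )
            return s;
    return NULL;
}

/*
 * Parse a decimal unsigned int from s, stopping at the first delimiter or
 * NUL. Returns 0 on success, -1 on an empty field, a non-digit or overflow.
 */
static int parse_uint(const char *s, unsigned int *out)
{
    unsigned int val = 0;
    int seen_digit = 0;

    assert(s != NULL && out != NULL);

    for ( ; *s != '\0' && find_char(delim_chars, *s) == NULL; ++s )
    {
        unsigned int digit;

        if ( *s < '0' || *s > '9' )
            return -1;
        digit = (unsigned int)(*s - '0');

        /* Check before multiplying, so val itself never wraps around. */
        if ( val > (UINT_MAX - digit) / 10 )
            return -1;

        val = val * 10 + digit;
        seen_digit = 1;
    }

    if ( !seen_digit )
        return -1;

    *out = val;
    return 0;
}
```

Doing the bound check before the multiplication is the design point: testing for wrap-around after the fact is easy to get wrong for unsigned arithmetic.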
[Xen-devel] [PATCH v9 13/13] x86: add multiboot2 protocol support for relocatable images
Add multiboot2 protocol support for relocatable images. Only GRUB2 with
"multiboot2: Add support for relocatable images" patch understands
that feature. Older multiboot protocol (regardless of version)
compatible loaders ignore it and everything works as usual.

Signed-off-by: Daniel Kiper
---
v9 - suggestions/fixes:
   - use .L labels instead of numeric ones in multiboot2 data scanning loop
     (suggested by Jan Beulich).

v4 - suggestions/fixes:
   - do not get Xen image load base address
     from multiboot2 information in x86_64 code
     (suggested by Jan Beulich),
   - improve label names
     (suggested by Jan Beulich),
   - improve comments
     (suggested by Jan Beulich).

v3 - suggestions/fixes:
   - use %esi and %r15d instead of %ebp to store
     Xen image load base address,
   - rename some types and constants,
   - reformat xen/include/xen/multiboot2.h
     (suggested by Konrad Rzeszutek Wilk),
   - improve comments,
   - improve commit message
     (suggested by Konrad Rzeszutek Wilk).
---
 xen/arch/x86/boot/head.S          |   16 ++++++++++++++++
 xen/arch/x86/x86_64/asm-offsets.c |    1 +
 xen/include/xen/multiboot2.h      |   13 +++++++++++++
 3 files changed, 30 insertions(+)

diff --git a/xen/arch/x86/boot/head.S b/xen/arch/x86/boot/head.S
index 26f48da..71c0273 100644
--- a/xen/arch/x86/boot/head.S
+++ b/xen/arch/x86/boot/head.S
@@ -82,6 +82,13 @@ multiboot2_header_start:
         /* Align modules at page boundry. */
         mb2ht_init MB2_HT(MODULE_ALIGN), MB2_HT(REQUIRED)
 
+        /* Load address preference. */
+        mb2ht_init MB2_HT(RELOCATABLE), MB2_HT(OPTIONAL), \
+                   sym_offs(start), /* Min load address. */ \
+                   0xffffffff, /* The end of image max load address (4 GiB - 1). */ \
+                   0x200000, /* Load address alignment (2 MiB). */ \
+                   MULTIBOOT2_LOAD_PREFERENCE_HIGH
+
         /* Console flags tag. */
         mb2ht_init MB2_HT(CONSOLE_FLAGS), MB2_HT(OPTIONAL), \
                    MULTIBOOT2_CONSOLE_FLAGS_EGA_TEXT_SUPPORTED
@@ -384,6 +391,15 @@ __start:
         cmp     %edi,MB2_fixed_total_size(%ebx)
         jbe     trampoline_bios_setup
 
+        /* Get Xen image load base address from Multiboot2 information. */
+        cmpl    $MULTIBOOT2_TAG_TYPE_LOAD_BASE_ADDR,MB2_tag_type(%ecx)
+        jne     .Lmb2_mem_lower
+
+        mov     MB2_load_base_addr(%ecx),%esi
+        sub     $XEN_IMG_OFFSET,%esi
+        jmp     .Lmb2_next_tag
+
+.Lmb2_mem_lower:
         /* Get mem_lower from Multiboot2 information. */
         cmpl    $MULTIBOOT2_TAG_TYPE_BASIC_MEMINFO,MB2_tag_type(%ecx)
         cmove   MB2_mem_lower(%ecx),%edx
diff --git a/xen/arch/x86/x86_64/asm-offsets.c b/xen/arch/x86/x86_64/asm-offsets.c
index 5d7a8e5..beac5ca 100644
--- a/xen/arch/x86/x86_64/asm-offsets.c
+++ b/xen/arch/x86/x86_64/asm-offsets.c
@@ -174,6 +174,7 @@ void __dummy__(void)
     OFFSET(MB2_fixed_total_size, multiboot2_fixed_t, total_size);
     OFFSET(MB2_tag_type, multiboot2_tag_t, type);
     OFFSET(MB2_tag_size, multiboot2_tag_t, size);
+    OFFSET(MB2_load_base_addr, multiboot2_tag_load_base_addr_t, load_base_addr);
     OFFSET(MB2_mem_lower, multiboot2_tag_basic_meminfo_t, mem_lower);
     OFFSET(MB2_efi64_st, multiboot2_tag_efi64_t, pointer);
     OFFSET(MB2_efi64_ih, multiboot2_tag_efi64_ih_t, pointer);
diff --git a/xen/include/xen/multiboot2.h b/xen/include/xen/multiboot2.h
index 8dd5800..feb4297 100644
--- a/xen/include/xen/multiboot2.h
+++ b/xen/include/xen/multiboot2.h
@@ -59,11 +59,17 @@
 #define MULTIBOOT2_HEADER_TAG_EFI_BS 7
 #define MULTIBOOT2_HEADER_TAG_ENTRY_ADDRESS_EFI32 8
 #define MULTIBOOT2_HEADER_TAG_ENTRY_ADDRESS_EFI64 9
+#define MULTIBOOT2_HEADER_TAG_RELOCATABLE 10
 
 /* Header tag flags. */
 #define MULTIBOOT2_HEADER_TAG_REQUIRED 0
 #define MULTIBOOT2_HEADER_TAG_OPTIONAL 1
 
+/* Where image should be loaded (suggestion not requirement). */
+#define MULTIBOOT2_LOAD_PREFERENCE_NONE 0
+#define MULTIBOOT2_LOAD_PREFERENCE_LOW 1
+#define MULTIBOOT2_LOAD_PREFERENCE_HIGH 2
+
 /* Header console tag console_flags. */
 #define MULTIBOOT2_CONSOLE_FLAGS_CONSOLE_REQUIRED 1
 #define MULTIBOOT2_CONSOLE_FLAGS_EGA_TEXT_SUPPORTED 2
@@ -90,6 +96,7 @@
 #define MULTIBOOT2_TAG_TYPE_EFI_BS 18
 #define MULTIBOOT2_TAG_TYPE_EFI32_IH 19
 #define MULTIBOOT2_TAG_TYPE_EFI64_IH 20
+#define MULTIBOOT2_TAG_TYPE_LOAD_BASE_ADDR 21
 
 /* Multiboot 2 tag alignment. */
 #define MULTIBOOT2_TAG_ALIGN 8
@@ -120,6 +127,12 @@ typedef struct {
 typedef struct {
     u32 type;
     u32 size;
+    u32 load_base_addr;
+} multiboot2_tag_load_base_addr_t;
+
+typedef struct {
+    u32 type;
+    u32 size;
     char string[];
 } multiboot2_tag_string_t;
 
-- 
1.7.10.4
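The head.S hunk above scans the multiboot2 information for the load base address tag in assembly. The same walk can be sketched in C, which may be easier to follow. This is an illustration under stated assumptions rather than Xen code: the mb2_* type names and find_load_base() are hypothetical, the end tag type (0) and the 8-byte fixed part are taken from the Multiboot2 specification, and tag type 21 and the 8-byte tag alignment are the values defined in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MB2_TAG_ALIGN               8   /* MULTIBOOT2_TAG_ALIGN in the patch */
#define MB2_TAG_TYPE_END            0   /* assumed from the Multiboot2 spec */
#define MB2_TAG_TYPE_LOAD_BASE_ADDR 21  /* value added by the patch */

typedef struct { uint32_t total_size; uint32_t reserved; } mb2_fixed_t;
typedef struct { uint32_t type; uint32_t size; } mb2_tag_t;
typedef struct {
    uint32_t type;
    uint32_t size;
    uint32_t load_base_addr;
} mb2_tag_load_base_addr_t;

/*
 * Walk the tags that follow the fixed part of the multiboot2 information.
 * Returns 1 and stores the address if a load base address tag is present,
 * 0 otherwise. Never reads beyond total_size.
 */
static int find_load_base(const uint8_t *mbi, uint32_t *base)
{
    mb2_fixed_t fixed;
    size_t off = sizeof(fixed);

    assert(mbi != NULL && base != NULL);
    memcpy(&fixed, mbi, sizeof(fixed));

    while ( off + sizeof(mb2_tag_t) <= fixed.total_size )
    {
        mb2_tag_t tag;

        memcpy(&tag, mbi + off, sizeof(tag));
        if ( tag.type == MB2_TAG_TYPE_END || tag.size < sizeof(tag) )
            break;

        if ( tag.type == MB2_TAG_TYPE_LOAD_BASE_ADDR &&
             tag.size >= sizeof(mb2_tag_load_base_addr_t) &&
             off + sizeof(mb2_tag_load_base_addr_t) <= fixed.total_size )
        {
            mb2_tag_load_base_addr_t lb;

            memcpy(&lb, mbi + off, sizeof(lb));
            *base = lb.load_base_addr;
            return 1;
        }

        /* Each tag starts on an 8-byte boundary: round the size up. */
        off += (tag.size + MB2_TAG_ALIGN - 1) & ~(size_t)(MB2_TAG_ALIGN - 1);
    }

    return 0;
}
```

In the patch itself, XEN_IMG_OFFSET is then subtracted from the reported address before it is kept in %esi, so that the register holds the base of the image mapping rather than the address of the first loaded byte.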
[Xen-devel] [PATCH v9 12/13] x86/boot: rename sym_phys() to sym_offs()
This way macro name better describes its function. Currently it is used to calculate symbol offset in relation to the beginning of Xen image mapping. However, value returned by sym_offs() for a given symbol is not always equal its physical address. There is no functional change. Suggested-by: Jan BeulichSigned-off-by: Daniel Kiper Acked-by: Jan Beulich --- v8 - suggestions/fixes: - improve commit message (suggested by Jan Beulich). --- xen/arch/x86/boot/head.S | 40 xen/arch/x86/boot/trampoline.S |2 +- xen/arch/x86/boot/wakeup.S |4 ++-- xen/arch/x86/boot/x86_64.S | 18 +- 4 files changed, 32 insertions(+), 32 deletions(-) diff --git a/xen/arch/x86/boot/head.S b/xen/arch/x86/boot/head.S index 4425436..26f48da 100644 --- a/xen/arch/x86/boot/head.S +++ b/xen/arch/x86/boot/head.S @@ -12,9 +12,9 @@ .text .code32 -#define sym_phys(sym) ((sym) - __XEN_VIRT_START) -#define sym_esi(sym) sym_phys(sym)(%esi) -#define sym_fs(sym) %fs:sym_phys(sym) +#define sym_offs(sym) ((sym) - __XEN_VIRT_START) +#define sym_esi(sym) sym_offs(sym)(%esi) +#define sym_fs(sym) %fs:sym_offs(sym) #define BOOT_CS320x0008 #define BOOT_CS640x0010 @@ -97,7 +97,7 @@ multiboot2_header_start: /* EFI64 entry point. */ mb2ht_init MB2_HT(ENTRY_ADDRESS_EFI64), MB2_HT(OPTIONAL), \ - sym_phys(__efi64_start) + sym_offs(__efi64_start) /* Multiboot2 header end tag. 
*/ mb2ht_init MB2_HT(END), MB2_HT(REQUIRED) @@ -110,11 +110,11 @@ multiboot2_header_start: gdt_boot_descr: .word 7*8-1 gdt_boot_base: -.long sym_phys(trampoline_gdt) +.long sym_offs(trampoline_gdt) .long 0 /* Needed for 64-bit lgdt */ cs32_switch_addr: -.long sym_phys(cs32_switch) +.long sym_offs(cs32_switch) .word BOOT_CS32 .align 4 @@ -131,23 +131,23 @@ vga_text_buffer: .section .init.text, "ax", @progbits bad_cpu: -add $sym_phys(.Lbad_cpu_msg),%esi # Error message +add $sym_offs(.Lbad_cpu_msg),%esi # Error message jmp .Lget_vtb not_multiboot: -add $sym_phys(.Lbad_ldr_msg),%esi # Error message +add $sym_offs(.Lbad_ldr_msg),%esi # Error message jmp .Lget_vtb .Lmb2_no_st: -add $sym_phys(.Lbad_ldr_nst),%esi # Error message +add $sym_offs(.Lbad_ldr_nst),%esi # Error message jmp .Lget_vtb .Lmb2_no_ih: -add $sym_phys(.Lbad_ldr_nih),%esi # Error message +add $sym_offs(.Lbad_ldr_nih),%esi # Error message jmp .Lget_vtb .Lmb2_no_bs: -add $sym_phys(.Lbad_ldr_nbs),%esi # Error message +add $sym_offs(.Lbad_ldr_nbs),%esi # Error message xor %edi,%edi # No VGA text buffer jmp .Lsend_chr .Lmb2_efi_ia_32: -add $sym_phys(.Lbad_efi_msg),%esi # Error message +add $sym_offs(.Lbad_efi_msg),%esi # Error message xor %edi,%edi # No VGA text buffer jmp .Lsend_chr .Lget_vtb: @@ -352,7 +352,7 @@ __start: cli /* Load default Xen image load base address. */ -mov $sym_phys(__image_base__),%esi +mov $sym_offs(__image_base__),%esi /* Bootloaders may set multiboot{1,2}.mem_lower to a nonzero value. */ xor %edx,%edx @@ -503,8 +503,8 @@ trampoline_setup: jnz 1f /* Initialize BSS (no nasty surprises!). */ -mov $sym_phys(__bss_start),%edi -mov $sym_phys(__bss_end),%ecx +mov $sym_offs(__bss_start),%edi +mov $sym_offs(__bss_end),%ecx push%fs pop %es sub %edi,%ecx @@ -577,22 +577,22 @@ trampoline_setup: /* Apply relocations to bootstrap trampoline. 
*/ mov sym_fs(trampoline_phys),%edx -mov $sym_phys(__trampoline_rel_start),%edi +mov $sym_offs(__trampoline_rel_start),%edi 1: mov %fs:(%edi),%eax add %edx,%fs:(%edi,%eax) add $4,%edi -cmp $sym_phys(__trampoline_rel_stop),%edi +cmp $sym_offs(__trampoline_rel_stop),%edi jb 1b /* Patch in the trampoline segment. */ shr $4,%edx -mov $sym_phys(__trampoline_seg_start),%edi +mov $sym_offs(__trampoline_seg_start),%edi 1: mov %fs:(%edi),%eax mov %dx,%fs:(%edi,%eax) add $4,%edi -cmp $sym_phys(__trampoline_seg_stop),%edi +cmp $sym_offs(__trampoline_seg_stop),%edi jb 1b /* Do not parse command line on EFI platform here. */ @@ -618,7 +618,7 @@
[Xen-devel] [PATCH v9 09/13] x86: change default load address from 1 MiB to 2 MiB
Subsequent patches introducing relocatable early boot code play with
page tables using 2 MiB huge pages. If load address is not aligned at
2 MiB then code touching such page tables must have special cases for
start and end of Xen image memory region. So, let's make life easier
and move default load address from 1 MiB to 2 MiB. This way page table
code will be nice and easy. Hence, there is a chance that it will be
less error prone too... :-)))

Additionally, drop first 2 MiB mapping from Xen image mapping.
It is no longer needed.

Signed-off-by: Daniel Kiper
---
v8 - suggestions/fixes:
   - drop first 2 MiB mapping from Xen image mapping
     (suggested by Jan Beulich),
   - improve commit message.

v7 - suggestions/fixes:
   - minor cleanups
     (suggested by Jan Beulich).
---
 xen/arch/x86/Makefile      |    2 +-
 xen/arch/x86/Rules.mk      |    3 +++
 xen/arch/x86/boot/head.S   |    8 --------
 xen/arch/x86/boot/x86_64.S |    5 +++--
 xen/arch/x86/setup.c       |    3 ++-
 xen/arch/x86/xen.lds.S     |    2 +-
 6 files changed, 10 insertions(+), 13 deletions(-)

diff --git a/xen/arch/x86/Makefile b/xen/arch/x86/Makefile
index e74fe62..d5d0651 100644
--- a/xen/arch/x86/Makefile
+++ b/xen/arch/x86/Makefile
@@ -90,7 +90,7 @@ all_symbols =
 endif
 
 $(TARGET): $(TARGET)-syms $(efi-y) boot/mkelf32
-	./boot/mkelf32 $(notes_phdrs) $(TARGET)-syms $(TARGET) 0x100000 \
+	./boot/mkelf32 $(notes_phdrs) $(TARGET)-syms $(TARGET) $(XEN_IMG_OFFSET) \
 	               `$(NM) $(TARGET)-syms | sed -ne 's/^\([^ ]*\) . __2M_rwdata_end$$/0x\1/p'`
 
 ALL_OBJS := $(BASEDIR)/arch/x86/boot/built_in.o $(BASEDIR)/arch/x86/efi/built_in.o $(ALL_OBJS)
diff --git a/xen/arch/x86/Rules.mk b/xen/arch/x86/Rules.mk
index 42be4bc..36e6386 100644
--- a/xen/arch/x86/Rules.mk
+++ b/xen/arch/x86/Rules.mk
@@ -1,9 +1,12 @@
 ########################################
 # x86-specific definitions
 
+XEN_IMG_OFFSET := 0x200000
+
 CFLAGS += -I$(BASEDIR)/include
 CFLAGS += -I$(BASEDIR)/include/asm-x86/mach-generic
 CFLAGS += -I$(BASEDIR)/include/asm-x86/mach-default
+CFLAGS += -DXEN_IMG_OFFSET=$(XEN_IMG_OFFSET)
 CFLAGS += '-D__OBJECT_LABEL__=$(subst /,$$,$(subst -,_,$(subst $(BASEDIR)/,,$(CURDIR))/$@))'
 
 # Prevent floating-point variables from creeping into Xen.
diff --git a/xen/arch/x86/boot/head.S b/xen/arch/x86/boot/head.S
index eba35d9..902ff50 100644
--- a/xen/arch/x86/boot/head.S
+++ b/xen/arch/x86/boot/head.S
@@ -479,14 +479,6 @@ trampoline_setup:
         mov     %eax,sym_phys(boot_tsc_stamp)
         mov     %edx,sym_phys(boot_tsc_stamp+4)
 
-        /*
-         * During boot, hook 4kB mappings of first 2MB of memory into L2.
-         * This avoids mixing cachability for the legacy VGA region, and is
-         * corrected when Xen relocates itself.
-         */
-        mov     $sym_phys(l1_identmap)+__PAGE_HYPERVISOR,%edi
-        mov     %edi,sym_phys(l2_xenmap)
-
         /* Apply relocations to bootstrap trampoline. */
         mov     sym_phys(trampoline_phys),%edx
         mov     $sym_phys(__trampoline_rel_start),%edi
diff --git a/xen/arch/x86/boot/x86_64.S b/xen/arch/x86/boot/x86_64.S
index 139b2ca..7890374 100644
--- a/xen/arch/x86/boot/x86_64.S
+++ b/xen/arch/x86/boot/x86_64.S
@@ -121,8 +121,9 @@ GLOBAL(l2_identmap)
  * page.
  */
 GLOBAL(l2_xenmap)
-        idx = 0
-        .rept 8
+        .quad 0
+        idx = 1
+        .rept 7
         .quad sym_phys(__image_base__) + (idx << L2_PAGETABLE_SHIFT) + (PAGE_HYPERVISOR | _PAGE_PSE)
         idx = idx + 1
         .endr
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index 9f50eb0..eb674c8 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -955,7 +955,8 @@ void __init noreturn __start_xen(unsigned long mbi_p)
              * Undo the temporary-hooking of the l1_identmap.  __2M_text_start
              * is contained in this PTE.
              */
-            BUG_ON(l2_table_offset((unsigned long)_erodata) ==
+            BUG_ON(using_2M_mapping() &&
+                   l2_table_offset((unsigned long)_erodata) ==
                    l2_table_offset((unsigned long)_stext));
             *pl2e++ = l2e_from_pfn(xen_phys_start >> PAGE_SHIFT,
                                    PAGE_HYPERVISOR_RX | _PAGE_PSE);
diff --git a/xen/arch/x86/xen.lds.S b/xen/arch/x86/xen.lds.S
index e0e2529..45251fd 100644
--- a/xen/arch/x86/xen.lds.S
+++ b/xen/arch/x86/xen.lds.S
@@ -55,7 +55,7 @@ SECTIONS
   __2M_text_start = .;         /* Start of 2M superpages, mapped RX. */
 #endif
 
-  . = __XEN_VIRT_START + MB(1);
+  . = __XEN_VIRT_START + XEN_IMG_OFFSET;
   _start = .;
   .text : {
         _stext = .;            /* Text and read-only data */
-- 
1.7.10.4
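The alignment argument in the commit message can be made concrete with a little arithmetic on 2 MiB slots. The sketch below uses hypothetical helper names and a slot index relative to the start of the image mapping (unlike Xen's l2_table_offset(), it does not mask to 9 bits). It shows that an image at offset 1 MiB starts in the middle of slot 0, while an image at offset 2 MiB starts exactly at slot 1, which matches the l2_xenmap change above (slot 0 becomes an empty entry and idx starts at 1).

```c
#include <assert.h>
#include <stdint.h>

#define L2_PAGETABLE_SHIFT 21                         /* 2 MiB superpages */
#define SUPERPAGE_SIZE     (1ULL << L2_PAGETABLE_SHIFT)

/* Index of the 2 MiB slot, relative to the mapping base, holding offset. */
static unsigned int l2_slot(uint64_t offset)
{
    return (unsigned int)(offset >> L2_PAGETABLE_SHIFT);
}

/* Non-zero if an image placed at offset begins on a 2 MiB boundary. */
static int starts_on_superpage(uint64_t offset)
{
    return (offset & (SUPERPAGE_SIZE - 1)) == 0;
}

/* Number of 2 MiB slots touched by the byte range [offset, offset + size). */
static unsigned int slots_spanned(uint64_t offset, uint64_t size)
{
    assert(size > 0);
    return l2_slot(offset + size - 1) - l2_slot(offset) + 1;
}
```

With the old 1 MiB offset, even a 2 MiB image would straddle two slots and share the first one with unrelated low memory; with the new offset every slot the image touches belongs to the image from its first byte.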
[Xen-devel] [PATCH v9 11/13] x86: make Xen early boot code relocatable
Every multiboot protocol (regardless of version) compatible image must
specify its load address (in ELF or multiboot header). Multiboot protocol
compatible loaders have to load the image at the specified address. However,
there is no guarantee that the requested memory region (in case of Xen it
starts at 2 MiB and ends at ~5 MiB) where the image should be loaded
initially is RAM and is free (legacy BIOS platforms are merciful to Xen,
but I found at least one EFI platform on which the Xen load address
conflicts with EFI boot services; it is a Dell PowerEdge R820 with the
latest firmware). To cope with that problem we must make Xen early boot
code relocatable and help the boot loader relocate the image properly by
suggesting allowed address ranges, not requesting specific load addresses
as it is right now. This patch does the former. It does not add the
multiboot2 protocol interface, which is done in the "x86: add multiboot2
protocol support for relocatable images" patch.

This patch changes the following things:
  - %esi and %r15d registers are used as a storage for Xen image load
    base address (%r15d shortly because %rsi is used for EFI SystemTable
    address in 64-bit code); both registers are (%esi is mostly) unused
    in early boot code and preserved during C functions calls (%esi in
    32-bit code and %r15d in 64-bit code),
  - %fs is used as base for Xen data relative addressing in 32-bit code
    if it is possible; %esi is used for that thing during error printing
    because it is not always possible to properly and efficiently
    initialize %fs.

Signed-off-by: Daniel Kiper
---
v8 - suggestions/fixes:
   - use shld instead of mov and shr in BOOT_FS segment
     descriptor base address initialization
     (suggested by Jan Beulich),
   - simplify code updating frame addresses in page tables
     (suggested by Jan Beulich),
   - print Xen image base addresses using "%#lx" format
     (suggested by Jan Beulich),
   - improve comments
     (suggested by Jan Beulich).
v6 - suggestions/fixes: - leave static mapping of first 16 MiB in l2_identmap as is (suggested by Jan Beulich), - use xen_phys_start instead of xen_img_load_base_addr (suggested by Daniel Kiper and Jan Beulich), - simplify BOOT_FS segment descriptor base address initialization (suggested by Jan Beulich), - fix BOOT_FS segment limit (suggested by Jan Beulich), - do not rename sym_phys in this patch (suggested by Jan Beulich), - rename esi_offset/fs_offset to sym_esi/sym_fs respectively (suggested by Jan Beulich), - use add instead of lea in assembly error printing code (suggested by Jan Beulich), - improve comments (suggested by Jan Beulich), - improve commit message (suggested by Jan Beulich), - various minor cleanups and fixes (suggested by Jan Beulich). v4 - suggestions/fixes: - do not relocate Xen image if boot loader did work for us (suggested by Andrew Cooper and Jan Beulich), - initialize xen_img_load_base_addr in EFI boot code too, - properly initialize trampoline_xen_phys_start, - calculate Xen image load base address in x86_64 code ourselves, (suggested by Jan Beulich), - change how and when Xen image base address is printed, - use %fs instead of %esi for relative addressing (suggested by Andrew Cooper and Jan Beulich), - create esi_offset and fs_offset() macros in assembly, - calculate mkelf32 argument automatically, - optimize and cleanup code, - improve comments, - improve commit message. v3 - suggestions/fixes: - improve segment registers initialization (suggested by Jan Beulich), - simplify Xen image load base address calculation (suggested by Jan Beulich), - use %esi and %r15d instead of %ebp to store Xen image load base address, - use %esi instead of %fs for relative addressing; this way we get shorter and simpler code, - rename some variables and constants (suggested by Jan Beulich), - improve comments (suggested by Konrad Rzeszutek Wilk), - improve commit message (suggested by Jan Beulich). 
--- xen/arch/x86/boot/head.S | 161 + xen/arch/x86/boot/trampoline.S|5 ++ xen/arch/x86/boot/x86_64.S| 21 +++-- xen/arch/x86/setup.c | 14 ++-- xen/arch/x86/x86_64/asm-offsets.c |3 + xen/include/asm-x86/page.h|2 +- 6 files changed, 156 insertions(+), 50 deletions(-) diff --git a/xen/arch/x86/boot/head.S b/xen/arch/x86/boot/head.S index 902ff50..4425436 100644 --- a/xen/arch/x86/boot/head.S +++ b/xen/arch/x86/boot/head.S @@ -13,12 +13,15 @@ .code32 #define sym_phys(sym) ((sym) - __XEN_VIRT_START) +#define sym_esi(sym) sym_phys(sym)(%esi) +#define sym_fs(sym) %fs:sym_phys(sym) #define BOOT_CS320x0008 #define BOOT_CS640x0010 #define BOOT_DS 0x0018 #define BOOT_PSEUDORM_CS
[Xen-devel] [PATCH v9 03/13] x86: allow EFI reboot method neither on EFI platforms...
..nor EFI platforms with runtime services enabled.

Suggested-by: Jan Beulich
Signed-off-by: Daniel Kiper
Acked-by: Jan Beulich
---
v6 - suggestions/fixes:
   - move this commit behind "efi: create efi_enabled()" commit
     (suggested by Jan Beulich).

v5 - suggestions/fixes:
   - fix build error
     (suggested by Jan Beulich),
   - improve commit message.
---
 xen/arch/x86/shutdown.c |    3 +++
 1 file changed, 3 insertions(+)

diff --git a/xen/arch/x86/shutdown.c b/xen/arch/x86/shutdown.c
index 54c2c79..b429fd0 100644
--- a/xen/arch/x86/shutdown.c
+++ b/xen/arch/x86/shutdown.c
@@ -80,6 +80,9 @@ static void __init set_reboot_type(char *str)
             break;
         str++;
     }
+
+    if ( reboot_type == BOOT_EFI && !efi_enabled(EFI_RS) )
+        reboot_type = BOOT_INVALID;
 }
 custom_param("reboot", set_reboot_type);
 
-- 
1.7.10.4
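The check added above relies on efi_enabled() from the "efi: create efi_enabled()" patch, which tests a feature in efi_flags. As a rough model of how the two fit together, here is a small C sketch. The feature names come from the series; the bit numbers, the efi_set_feature() helper, the trimmed reboot_type enum and sanitize_reboot_type() are assumptions made for illustration and are not Xen's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Feature names from the series; bit numbers assumed for this sketch. */
#define EFI_BOOT          0
#define EFI_LOADER        1
#define EFI_RS            2
#define EFI_FEATURE_COUNT 3

/* Illustrative subset of reboot methods. */
enum reboot_type { BOOT_INVALID, BOOT_ACPI, BOOT_EFI };

static unsigned int efi_flags;

/* Hypothetical helper standing in for __set_bit() on efi_flags. */
static void efi_set_feature(unsigned int feature)
{
    assert(feature < EFI_FEATURE_COUNT);
    efi_flags |= 1u << feature;
}

static bool efi_enabled(unsigned int feature)
{
    return feature < EFI_FEATURE_COUNT && ((efi_flags >> feature) & 1u);
}

/* Mirrors the added check: EFI reboot is only valid with runtime services. */
static enum reboot_type sanitize_reboot_type(enum reboot_type requested)
{
    if ( requested == BOOT_EFI && !efi_enabled(EFI_RS) )
        return BOOT_INVALID;
    return requested;
}
```

Keeping EFI_BOOT and EFI_RS as separate bits is what lets the reboot code reject "reboot=efi" on a machine that booted via EFI but has runtime services turned off.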
[Xen-devel] [PATCH v9 04/13] x86: properly calculate xen ELF end of image address
This patch is a prerequisite for the "efi: build xen.gz with EFI code"
patch, which adds, among others, xen/arch/x86/efi/relocs-dummy.S to the
xen.gz output. Below is a description of why it is needed.

Currently the xen ELF end of image address is calculated using the first
line of "nm -nr xen/xen-syms" output. However, that line may contain a
symbol address not related to the end of image in any way. It can happen
if a symbol is introduced with an address larger than the _end symbol
address. I encountered such a situation when I linked the xen ELF binary
with xen/arch/x86/efi/relocs-dummy.S. Then the first line of
"nm -nr xen/xen-syms" contained "ffff82d0c0000000 A ALT_START" and the
xen ELF image memory size was silently set to 1023 MiB. This issue
happened because there is no check of which symbol address is used to
calculate the end of image address.

So, let's fix it and take the ELF end of image address by reading the
__2M_rwdata_end symbol address from nm output. This way the xen ELF image
build process is not sensitive to changes in the order of nm output.

Signed-off-by: Daniel Kiper
---
v9 - suggestions/fixes:
   - use __2M_rwdata_end symbol instead of _end symbol
     (suggested by Jan Beulich),
   - really fix indentation
     (suggested by Jan Beulich),
   - improve commit message
     (suggested by Jan Beulich).

v8 - suggestions/fixes:
   - use spaces instead of tab in indentation
     (suggested by Jan Beulich and Konrad Rzeszutek Wilk),
   - improve commit message
     (suggested by Jan Beulich).

v7 - suggestions/fixes:
   - use sed instead of awk
     (suggested by Jan Beulich),
   - improve commit message
     (suggested by Jan Beulich).
---
 xen/arch/x86/Makefile |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/arch/x86/Makefile b/xen/arch/x86/Makefile
index 931917d..e40897f 100644
--- a/xen/arch/x86/Makefile
+++ b/xen/arch/x86/Makefile
@@ -91,7 +91,7 @@ endif
 
 $(TARGET): $(TARGET)-syms $(efi-y) boot/mkelf32
 	./boot/mkelf32 $(notes_phdrs) $(TARGET)-syms $(TARGET) 0x100000 \
-	`$(NM) -nr $(TARGET)-syms | head -n 1 | sed -e 's/^\([^ ]*\).*/0x\1/'`
+	               `$(NM) $(TARGET)-syms | sed -ne 's/^\([^ ]*\) . __2M_rwdata_end$$/0x\1/p'`
 
 ALL_OBJS := $(BASEDIR)/arch/x86/boot/built_in.o $(BASEDIR)/arch/x86/efi/built_in.o $(ALL_OBJS)
 
-- 
1.7.10.4
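The difference between "take the highest address" and "look up one named symbol" can be illustrated outside of make and sed. The C sketch below is not part of the build; nm_lookup() is a hypothetical helper that scans nm-style text for a given symbol name, and the sample addresses used with it (other than ALT_START from the commit message) are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Scan nm-style text ("<hex address> <type char> <name>" per line) and
 * store the address of the symbol called name. Returns 1 if found, else 0.
 */
static int nm_lookup(const char *nm_output, const char *name,
                     unsigned long long *addr)
{
    const char *line = nm_output;

    assert(nm_output != NULL && name != NULL && addr != NULL);

    while ( *line != '\0' )
    {
        const char *eol = strchr(line, '\n');
        size_t len = eol ? (size_t)(eol - line) : strlen(line);
        char buf[256];
        char sym[128];
        char type;
        unsigned long long a;

        /* Copy one line so sscanf() cannot run on into the next one. */
        if ( len >= sizeof(buf) )
            len = sizeof(buf) - 1;
        memcpy(buf, line, len);
        buf[len] = '\0';

        if ( sscanf(buf, "%llx %c %127s", &a, &type, sym) == 3 &&
             strcmp(sym, name) == 0 )
        {
            *addr = a;
            return 1;
        }

        if ( eol == NULL )
            break;
        line = eol + 1;
    }

    return 0;
}
```

Given input that contains both __2M_rwdata_end and a higher absolute symbol such as ALT_START, a lookup by name returns the former regardless of line order, which is the property the sed expression in the patch provides.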
[Xen-devel] [PATCH v9 02/13] efi: create efi_enabled()
First of all we need to differentiate between legacy BIOS and EFI
platforms at runtime, not at build time, because one image will contain
both legacy and EFI code and can be executed on both platforms.
Additionally, we need more fine-grained knowledge about the EFI
environment and must check for EFI platform and EFI loader separately to
properly support the multiboot2 protocol. In general, Xen loaded by this
protocol uses memory mappings and loaded modules in a similar way to Xen
loaded by the multiboot (v1) protocol. Hence, create efi_enabled() which
checks available features in efi_flags. This patch defines EFI_BOOT,
EFI_LOADER and EFI_RS features. EFI_BOOT is equal to old
efi_enabled == 1. EFI_RS eases control of runtime services usage.
EFI_LOADER tells that Xen was loaded directly from EFI as a PE
executable.

Suggested-by: Jan Beulich
Signed-off-by: Daniel Kiper
Reviewed-by: Jan Beulich
---
v7 - suggestions/fixes:
   - remove efi_enabled(EFI_RS) check from mapcache_current_vcpu()
     (suggested by Daniel Kiper and Jan Beulich),
   - remove BUG() from xen/arch/x86/efi/stub.c:efi_rs_using_pgtables()
     (suggested by Daniel Kiper and Jan Beulich).

v6 - suggestions/fixes:
   - define efi_enabled() as "bool efi_enabled(unsigned int feature)"
     instead of "bool_t efi_enabled(int feature)"
     (suggested by Jan Beulich),
   - define efi_flags as unsigned int
     (suggested by Jan Beulich),
   - various minor cleanups and fixes
     (suggested by Jan Beulich).

v5 - suggestions/fixes:
   - squash three patches into one
     (suggested by Jan Beulich),
   - introduce all features at once
     (suggested by Jan Beulich),
   - efi_enabled() returns bool_t
     instead of unsigned int
     (suggested by Jan Beulich),
   - update commit message.
v4 - suggestions/fixes: - rename EFI_PLATFORM to EFI_BOOT (suggested by Jan Beulich), - move EFI_BOOT definition to efi struct definition (suggested by Jan Beulich), - remove unneeded efi.flags initialization (suggested by Jan Beulich), - use __set_bit() instead of set_bit() if possible (suggested by Jan Beulich), - do efi_enabled() cleanup (suggested by Jan Beulich), - improve comments (suggested by Jan Beulich), - improve commit message. v3 - suggestions/fixes: - define efi struct in xen/arch/x86/efi/stub.c in earlier patch (suggested by Jan Beulich), - improve comments (suggested by Jan Beulich), - improve commit message (suggested by Jan Beulich). --- xen/arch/x86/dmi_scan.c|4 ++-- xen/arch/x86/domain_page.c |2 +- xen/arch/x86/efi/stub.c|8 xen/arch/x86/mpparse.c |4 ++-- xen/arch/x86/setup.c | 12 +++- xen/arch/x86/shutdown.c|2 +- xen/arch/x86/time.c|2 +- xen/common/efi/boot.c | 19 +++ xen/common/efi/runtime.c | 14 -- xen/common/version.c |2 +- xen/drivers/acpi/osl.c |2 +- xen/include/xen/efi.h |8 ++-- 12 files changed, 49 insertions(+), 30 deletions(-) diff --git a/xen/arch/x86/dmi_scan.c b/xen/arch/x86/dmi_scan.c index b049e31..8dcb640 100644 --- a/xen/arch/x86/dmi_scan.c +++ b/xen/arch/x86/dmi_scan.c @@ -238,7 +238,7 @@ const char *__init dmi_get_table(paddr_t *base, u32 *len) { static unsigned int __initdata instance; - if (efi_enabled) { + if (efi_enabled(EFI_BOOT)) { if (efi_smbios3_size && !(instance & 1)) { *base = efi_smbios3_address; *len = efi_smbios3_size; @@ -696,7 +696,7 @@ static void __init dmi_decode(struct dmi_header *dm) void __init dmi_scan_machine(void) { - if ((!efi_enabled ? dmi_iterate(dmi_decode) : + if ((!efi_enabled(EFI_BOOT) ? 
dmi_iterate(dmi_decode) : dmi_efi_iterate(dmi_decode)) == 0) dmi_check_system(dmi_blacklist); else diff --git a/xen/arch/x86/domain_page.c b/xen/arch/x86/domain_page.c index d86f8fe..a58ef8e 100644 --- a/xen/arch/x86/domain_page.c +++ b/xen/arch/x86/domain_page.c @@ -36,7 +36,7 @@ static inline struct vcpu *mapcache_current_vcpu(void) * domain's page tables but current may point at another domain's VCPU. * Return NULL as though current is not properly set up yet. */ -if ( efi_enabled && efi_rs_using_pgtables() ) +if ( efi_rs_using_pgtables() ) return NULL; /* diff --git a/xen/arch/x86/efi/stub.c b/xen/arch/x86/efi/stub.c index 07c2bd0..4158124 100644 --- a/xen/arch/x86/efi/stub.c +++ b/xen/arch/x86/efi/stub.c @@ -4,9 +4,10 @@ #include #include -#ifndef efi_enabled -const bool_t efi_enabled = 0; -#endif +bool efi_enabled(unsigned int feature) +{ +return false; +} void __init efi_init_memory(void) { } @@ -14,7 +15,6 @@ void efi_update_l4_pgtable(unsigned int l4idx, l4_pgentry_t l4e) { } bool_t efi_rs_using_pgtables(void) { -
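The feature check described in the commit message above can be sketched as a plain bit test on efi_flags. This is only an illustration under stated assumptions: the feature names come from the patch description, but the bit numbers and the efi_set_feature() helper are hypothetical (the changelog says the real code sets bits with __set_bit()).

```c
#include <stdbool.h>

/*
 * Feature bit numbers. The names come from the patch description; the
 * numeric values are assumptions made for this sketch only.
 */
#define EFI_BOOT   0 /* Running on an EFI platform. */
#define EFI_LOADER 1 /* Loaded directly from EFI as a PE executable. */
#define EFI_RS     2 /* EFI runtime services may be used. */

/* One bit per feature; the v6 changelog makes this an unsigned int. */
static unsigned int efi_flags;

/* Hypothetical helper for the sketch; not part of the patch. */
static void efi_set_feature(unsigned int feature)
{
    efi_flags |= 1u << feature;
}

/* Return true if the given feature bit is set in efi_flags. */
bool efi_enabled(unsigned int feature)
{
    return (efi_flags >> feature) & 1u;
}
```

With this shape, a build without EFI support can keep the stub from the diff (efi_enabled() always returning false), and callers such as dmi_get_table() need no #ifdef.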
[Xen-devel] [PATCH v9 00/13] x86: multiboot2 protocol support
Hi,

I am sending the ninth version of multiboot2 protocol support for legacy BIOS and EFI platforms. This release of the patch series contains fixes for all known issues.

The final goal is a xen.efi binary which can be loaded by the EFI loader, the multiboot (v1) protocol (only on legacy BIOS platforms) and the multiboot2 protocol. This way we will have:
  - a smaller Xen code base,
  - one code base for xen.gz and xen.efi,
  - one build method for xen.gz and xen.efi; xen.efi will be extracted from the xen(-syms) file using objcopy or a special custom tool,
  - a xen.efi build that depends less strongly on a given GCC and binutils version.

Here is a short list of changes since v8:
  - changed patches: 01, 04, 06, 07, 13.

Julien Grall raised some concerns regarding the EFI boot allocator implementation on ARM. After some discussion with him and Jan I made a few changes requested by them. Hence, I am asking the ARM guys in particular to review patch #06 and check that everything is OK.

I hope that (at least some) features provided by this patch series will be included in the Xen 4.8 release in one way or another.

If you are not interested in this patch series at all, please drop me a line and I will remove you from the distribution list.

Daniel

 .gitignore                        |    5 +-
 xen/arch/x86/Makefile             |    8 +-
 xen/arch/x86/Rules.mk             |    3 +
 xen/arch/x86/boot/Makefile        |   12 +-
 xen/arch/x86/boot/build32.mk      |    2 +
 xen/arch/x86/boot/cmdline.S       |  367
 xen/arch/x86/boot/cmdline.c       |  340
 xen/arch/x86/boot/defs.h          |   58 +
 xen/arch/x86/boot/edd.S           |    3 -
 xen/arch/x86/boot/head.S          |  538 ++
 xen/arch/x86/boot/reloc.c         |  151 +--
 xen/arch/x86/boot/trampoline.S    |   22 +++-
 xen/arch/x86/boot/video.S         |    7 --
 xen/arch/x86/boot/wakeup.S        |    4 +-
 xen/arch/x86/boot/x86_64.S        |   44 +++
 xen/arch/x86/dmi_scan.c           |    4 +-
 xen/arch/x86/domain_page.c        |    2 +-
 xen/arch/x86/efi/Makefile         |   12 +-
 xen/arch/x86/efi/efi-boot.h       |   65 +++---
 xen/arch/x86/efi/stub.c           |   46 ++-
 xen/arch/x86/mpparse.c            |    4 +-
 xen/arch/x86/setup.c              |   34 +++---
 xen/arch/x86/shutdown.c           |    5 +-
 xen/arch/x86/time.c               |    2 +-
 xen/arch/x86/x86_64/asm-offsets.c |   15 +++
 xen/arch/x86/xen.lds.S            |   10 +-
 xen/common/efi/boot.c             |   85 -
 xen/common/efi/runtime.c          |   23 +++-
 xen/common/version.c              |    2 +-
 xen/drivers/acpi/osl.c            |    2 +-
 xen/include/asm-x86/page.h        |    2 +-
 xen/include/xen/efi.h             |    8 +-
 xen/include/xen/multiboot2.h      |  182
 33 files changed, 1528 insertions(+), 539 deletions(-)

Daniel Kiper (13):
      x86: add multiboot2 protocol support
      efi: create efi_enabled()
      x86: allow EFI reboot method neither on EFI platforms...
      x86: properly calculate xen ELF end of image address
      efi: build xen.gz with EFI code
      efi: create new early memory allocator
      x86: add multiboot2 protocol support for EFI platforms
      x86/boot: implement early command line parser in C
      x86: change default load address from 1 MiB to 2 MiB
      x86/setup: use XEN_IMG_OFFSET instead of...
      x86: make Xen early boot code relocatable
      x86/boot: rename sym_phys() to sym_offs()
      x86: add multiboot2 protocol support for relocatable images
[Xen-devel] [PATCH v9 05/13] efi: build xen.gz with EFI code
Build xen.gz with EFI code. We need this to support the multiboot2 protocol on EFI platforms.

If we wish to load a non-ELF file using multiboot (v1) or multiboot2, then it must contain a "linear" (or "flat") representation of code and data. This is a requirement of both boot protocols. Currently, the PE file contains many sections which are not "linear" (one after another without any holes) or which have no representation in the file at all (e.g. BSS). From the EFI point of view everything is OK and works. However, this file layout cannot be properly interpreted by the multiboot protocol family. In theory there is a chance that we could build a proper PE file (from the multiboot protocols' POV) using the current build system. However, that would mean xen.efi diverging further from the Xen ELF file (in terms of contents and build method). ELF, on the other hand, has all the needed properties, so it is a good starting point for further development. Additionally, I think it is also a good starting point for further xen.efi code and build optimizations. It looks like there is a chance that we can eventually generate xen.efi directly from the Xen ELF file using just objcopy or another tool. This way we will have one Xen binary which can be loaded by three boot protocols: EFI native loader, multiboot (v1) and multiboot2.

Signed-off-by: Daniel Kiper
Acked-by: Jan Beulich
---
v6 - suggestions/fixes:
   - improve efi_enabled() checks in efi_runtime_call() (suggested by Jan Beulich).

v5 - suggestions/fixes:
   - properly calculate efi symbol address in xen/arch/x86/xen.lds.S (I hope that this change does not invalidate Jan's ACK).

v4 - suggestions/fixes:
   - functions should return -ENOSYS instead of -EOPNOTSUPP if EFI runtime services are not available (suggested by Jan Beulich),
   - remove stale bits from xen/arch/x86/Makefile (suggested by Jan Beulich).
v3 - suggestions/fixes: - check for EFI platform in EFI code (suggested by Jan Beulich), - fix Makefiles (suggested by Jan Beulich), - improve commit message (suggested by Jan Beulich). v2 - suggestions/fixes: - build EFI code only if it is supported in a given build environment (suggested by Jan Beulich). --- xen/arch/x86/Makefile |2 +- xen/arch/x86/efi/Makefile | 12 xen/arch/x86/xen.lds.S|4 ++-- xen/common/efi/boot.c |3 +++ xen/common/efi/runtime.c |9 + 5 files changed, 19 insertions(+), 11 deletions(-) diff --git a/xen/arch/x86/Makefile b/xen/arch/x86/Makefile index e40897f..2a0781a 100644 --- a/xen/arch/x86/Makefile +++ b/xen/arch/x86/Makefile @@ -219,6 +219,6 @@ efi/mkreloc: efi/mkreloc.c clean:: rm -f asm-offsets.s *.lds boot/*.o boot/*~ boot/core boot/mkelf32 rm -f $(BASEDIR)/.xen-syms.[0-9]* boot/.*.d - rm -f $(BASEDIR)/.xen.efi.[0-9]* efi/*.o efi/.*.d efi/*.efi efi/disabled efi/mkreloc + rm -f $(BASEDIR)/.xen.efi.[0-9]* efi/*.efi efi/disabled efi/mkreloc rm -f boot/reloc.S boot/reloc.lnk boot/reloc.bin rm -f note.o diff --git a/xen/arch/x86/efi/Makefile b/xen/arch/x86/efi/Makefile index ad3fdf7..442f3fc 100644 --- a/xen/arch/x86/efi/Makefile +++ b/xen/arch/x86/efi/Makefile @@ -1,18 +1,14 @@ CFLAGS += -fshort-wchar -obj-y += stub.o - -create = test -e $(1) || touch -t 19990101 $(1) - efi := y$(shell rm -f disabled) efi := $(if $(efi),$(shell $(CC) $(filter-out $(CFLAGS-y) .%.d,$(CFLAGS)) -c check.c 2>disabled && echo y)) efi := $(if $(efi),$(shell $(LD) -mi386pep --subsystem=10 -o check.efi check.o 2>disabled && echo y)) -efi := $(if $(efi),$(shell rm disabled)y,$(shell $(call create,boot.init.o); $(call create,runtime.o))) - -extra-$(efi) += boot.init.o relocs-dummy.o runtime.o compat.o buildid.o +efi := $(if $(efi),$(shell rm disabled)y) %.o: %.ihex $(OBJCOPY) -I ihex -O binary $< $@ -stub.o: $(extra-y) +obj-y := stub.o +obj-$(efi) := boot.init.o compat.o relocs-dummy.o runtime.o +extra-$(efi) += buildid.o nogcov-$(efi) += stub.o diff --git 
a/xen/arch/x86/xen.lds.S b/xen/arch/x86/xen.lds.S index 7676de9..b0b1c9b 100644 --- a/xen/arch/x86/xen.lds.S +++ b/xen/arch/x86/xen.lds.S @@ -270,10 +270,10 @@ SECTIONS .pad : { . = ALIGN(MB(16)); } :text -#else - efi = .; #endif + efi = DEFINED(efi) ? efi : .; + /* Sections to be discarded */ /DISCARD/ : { *(.exit.text) diff --git a/xen/common/efi/boot.c b/xen/common/efi/boot.c index 56544dc..1ef5d0b 100644 --- a/xen/common/efi/boot.c +++ b/xen/common/efi/boot.c @@ -1251,6 +1251,9 @@ void __init efi_init_memory(void) } *extra, *extra_head = NULL; #endif +if ( !efi_enabled(EFI_BOOT) ) +return; + printk(XENLOG_INFO "EFI memory map:%s\n", map_bs ? " (mapping BootServices)" : ""); for ( i = 0; i < efi_memmap_size; i += efi_mdesc_size ) diff --git a/xen/common/efi/runtime.c b/xen/common/efi/runtime.c index 5f2de80..41f49f7 100644 ---
[Xen-devel] [PATCH v9 06/13] efi: create new early memory allocator
There is a problem with place_string(), which is used as the early memory allocator. It gets memory chunks starting from the start symbol and goes down. Sadly this does not work when Xen is loaded using the multiboot2 protocol, because then start lives at the 1 MiB address and we should not allocate memory below it. So, I tried to use the mem_lower address calculated by GRUB2. However, this solution works only on some machines. There are machines in the wild (e.g. Dell PowerEdge R820) which use the first ~640 KiB for boot services code or data... :-(((

Hence, we need a new memory allocator for the Xen EFI boot code which is quite simple and generic and can be used by place_string() and efi_arch_allocate_mmap_buffer(). I considered the following solutions:

1) We could use native EFI allocation functions (e.g. AllocatePool() or AllocatePages()) to get a memory chunk. However, later (somewhere in __start_xen()) we must copy its contents to a safe place, or reserve it in the e820 memory map and map it into the Xen virtual address space. This means that the code referring to the Xen command line, loaded modules and EFI memory map, mostly in __start_xen(), will be further complicated and diverge from the legacy BIOS cases. Additionally, both former things have to be placed below 4 GiB because their addresses are stored in the multiboot_info_t structure, which has 32-bit relevant members.

2) We may statically allocate a memory area somewhere in the Xen code which could be used as a memory pool for early dynamic allocations. This looks quite simple. Additionally, it would not depend on EFI at all and could be used on legacy BIOS platforms if we need it. However, we must carefully choose the size of this pool. We do not want to increase the Xen binary size too much or waste too much memory, but we must also fit at least the memory map on x86 EFI platforms. As I saw on a small machine, e.g. an IBM System x3550 M2 with 8 GiB RAM, the memory map may contain more than 200 entries. Every entry on the x86-64 platform is 40 bytes in size, so we need more than 8 KiB for the EFI memory map alone. Additionally, if we use this memory pool for Xen and module command line storage (it would be used when xen.efi is executed as an EFI application) then we should add, I think, about 1 KiB. In this case, to be on the safe side, we should assume at least a 64 KiB pool for early memory allocations, which is about 4 times our earlier calculations. However, during the discussion on Xen-devel Jan Beulich suggested that, just in case, we should use a 1 MiB memory pool as in the original place_string() implementation. So, let's use 1 MiB as proposed. If we think that we should not waste unallocated memory in the pool on a running system, then we can mark this region as __initdata and move all required data to dynamically allocated places somewhere in __start_xen().

2a) We could put the memory pool into the .bss.page_aligned section, then allocate memory chunks starting from the lowest address. After the init phase we can free the unused portion of the memory pool, as is done for the .init.text or .init.data sections. This way we do not need to allocate any space in the image file, and freeing the unused area of the memory pool is very simple.

Solution #2a is what is implemented now, because it is quite simple and requires a limited number of changes, especially in __start_xen().

The new allocator is quite generic and can be used on ARM platforms too, though it is not enabled on ARM yet due to some missing prerequisites. A list of them is placed before the ebmalloc code.

Signed-off-by: Daniel Kiper
---
v9 - suggestions/fixes:
   - call free_ebmalloc_unused_mem() from efi_init_memory() instead of xen/arch/arm/setup.c:init_done() (suggested by Jan Beulich),
   - improve comments.
v8 - suggestions/fixes: - disable whole ebmalloc machinery on ARM platforms, - add comment saying what should be done before enabling ebmalloc on ARM, (suggested by Julien Grall), - move ebmalloc code before efi-boot.h inclusion and remove unneeded forward declaration (suggested by Jan Beulich), - remove free_ebmalloc_unused_mem() call from xen/arch/arm/setup.c:init_done() (suggested by Julien Grall), - improve commit message. v7 - suggestions/fixes: - enable most of ebmalloc machinery on ARM platforms (suggested by Jan Beulich), - remove unneeded cast (suggested by Jan Beulich), - wrap long line (suggested by Jan Beulich), - improve commit message. v6 - suggestions/fixes: - optimize ebmalloc allocator, - move ebmalloc machinery to xen/common/efi/boot.c (suggested by Jan Beulich), - enforce PAGE_SIZE ebmalloc_mem alignment (suggested by Jan Beulich), - ebmalloc() must allocate properly aligned memory regions (suggested by Jan Beulich), - printk() should use XENLOG_INFO (suggested by Jan Beulich). v4 - suggestions/fixes: - move from #2 solution
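Solution #2a from the commit message can be sketched as a bump allocator over a static, page-aligned pool. This is a minimal sketch under stated assumptions, not the code from the patch: the 1 MiB pool size and the names ebmalloc(), ebmalloc_mem and ebmalloc_allocated are taken from the message and changelog, while the NULL return on exhaustion and the ebmalloc_first_unused_page() helper are made up for illustration.

```c
#include <stddef.h>

#define PAGE_SIZE     4096u
#define EBMALLOC_SIZE (1u << 20)   /* 1 MiB pool, as the message proposes. */

/* Round x up to a multiple of a; a must be a power of two. */
#define ROUNDUP(x, a) (((x) + ((a) - 1)) & ~((size_t)(a) - 1))

/* Zero-initialised static pool, so it takes no space in the image file. */
static _Alignas(PAGE_SIZE) unsigned char ebmalloc_mem[EBMALLOC_SIZE];
static size_t ebmalloc_allocated;  /* Bytes handed out so far. */

/*
 * Allocate size bytes starting from the lowest free address, aligned for
 * pointer-sized objects. Returning NULL when the pool is exhausted is an
 * assumption of this sketch.
 */
void *ebmalloc(size_t size)
{
    size_t start = ROUNDUP(ebmalloc_allocated, sizeof(void *));

    if ( size > EBMALLOC_SIZE - start )
        return NULL;

    ebmalloc_allocated = start + size;
    return ebmalloc_mem + start;
}

/*
 * Hypothetical helper: offset of the first page that was never allocated
 * from. Everything from here to the end of the pool could be freed after
 * the init phase.
 */
size_t ebmalloc_first_unused_page(void)
{
    return ROUNDUP(ebmalloc_allocated, PAGE_SIZE);
}
```

As a size check against the figures in the message: a memory map of 200 entries at 40 bytes each needs 8000 bytes, so two pages of the pool stay in use and the rest can be released.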
[Xen-devel] [PATCH v9 01/13] x86: add multiboot2 protocol support
Add multiboot2 protocol support. Alter min memory limit handling, as we now may not find it from either multiboot (v1) or multiboot2.

This way we are laying the foundation for EFI + GRUB2 + Xen development.

Signed-off-by: Daniel Kiper
Reviewed-by: Jan Beulich
---
v9 - suggestions/fixes:
   - use .L label instead of numeric one in multiboot2 data scanning loop; I hope that this change does not invalidate Jan's Reviewed-by (suggested by Jan Beulich).

v8 - suggestions/fixes:
   - use sizeof(/) instead of sizeof() if it is possible (suggested by Jan Beulich).

v7 - suggestions/fixes:
   - rename mbi_mbi/mbi2_mbi to mbi_reloc/mbi2_reloc respectively (suggested by Jan Beulich),
   - initialize mbi_out->flags using "|=" instead of "=" (suggested by Jan Beulich),
   - use sizeof(*mmap_dst) instead of sizeof(memory_map_t) if it makes sense (suggested by Jan Beulich).

v6 - suggestions/fixes:
   - properly index multiboot2_tag_mmap_t.entries[] (suggested by Jan Beulich),
   - do not index mbi_out_mods[] beyond its end (suggested by Andrew Cooper),
   - reduce number of casts (suggested by Andrew Cooper and Jan Beulich),
   - add braces to increase code readability (suggested by Andrew Cooper).

v5 - suggestions/fixes:
   - check multiboot2_tag_mmap_t.entry_size before multiboot2_tag_mmap_t.entries[] use (suggested by Jan Beulich),
   - properly index multiboot2_tag_mmap_t.entries[] (suggested by Jan Beulich),
   - use "type name[]" instead of "type name[0]" in xen/include/xen/multiboot2.h (suggested by Jan Beulich),
   - remove unneeded comment (suggested by Jan Beulich).
v4 - suggestions/fixes: - avoid assembly usage in xen/arch/x86/boot/reloc.c, - fix boundary check issue and optimize for() loops in mbi2_mbi(), - move to stdcall calling convention, - remove unneeded typeof() from ALIGN_UP() macro (suggested by Jan Beulich), - add and use NULL definition in xen/arch/x86/boot/reloc.c (suggested by Jan Beulich), - do not read data beyond the end of multiboot2 information in xen/arch/x86/boot/head.S (suggested by Jan Beulich), - add :req to some .macro arguments (suggested by Jan Beulich), - use cmovcc if possible, - add .L to multiboot2_header_end label (suggested by Jan Beulich), - add .L to multiboot2_proto label (suggested by Jan Beulich), - improve label names (suggested by Jan Beulich). v3 - suggestions/fixes: - reorder reloc() arguments (suggested by Jan Beulich), - remove .L from multiboot2 header labels (suggested by Andrew Cooper, Jan Beulich and Konrad Rzeszutek Wilk), - take into account alignment when skipping multiboot2 fixed part (suggested by Konrad Rzeszutek Wilk), - create modules data if modules count != 0 (suggested by Jan Beulich), - improve macros (suggested by Jan Beulich), - reduce number of casts (suggested by Jan Beulich), - use const if possible (suggested by Jan Beulich), - drop static and __used__ attribute from reloc() (suggested by Jan Beulich), - remove isolated/stray __packed attribute from multiboot2_memory_map_t type definition (suggested by Jan Beulich), - reformat xen/include/xen/multiboot2.h (suggested by Konrad Rzeszutek Wilk), - improve comments (suggested by Konrad Rzeszutek Wilk), - remove hard tabs (suggested by Jan Beulich and Konrad Rzeszutek Wilk). 
v2 - suggestions/fixes: - generate multiboot2 header using macros (suggested by Jan Beulich), - improve comments (suggested by Jan Beulich), - simplify assembly in xen/arch/x86/boot/head.S (suggested by Jan Beulich), - do not include include/xen/compiler.h in xen/arch/x86/boot/reloc.c (suggested by Jan Beulich), - do not read data beyond the end of multiboot2 information (suggested by Jan Beulich). v2 - not fixed yet: - dynamic dependency generation for xen/arch/x86/boot/reloc.S; this requires more work; I am not sure that it pays because potential patch requires more changes than addition of just multiboot2.h to Makefile (suggested by Jan Beulich), - isolated/stray __packed attribute usage for multiboot2_memory_map_t (suggested by Jan Beulich). --- xen/arch/x86/boot/Makefile|3 +- xen/arch/x86/boot/head.S | 107 ++- xen/arch/x86/boot/reloc.c | 148 ++-- xen/arch/x86/x86_64/asm-offsets.c |9 ++ xen/include/xen/multiboot2.h | 169 + 5 files changed, 426 insertions(+), 10 deletions(-) create mode 100644 xen/include/xen/multiboot2.h diff --git a/xen/arch/x86/boot/Makefile b/xen/arch/x86/boot/Makefile index 5fdb5ae..06893d8 100644 --- a/xen/arch/x86/boot/Makefile +++ b/xen/arch/x86/boot/Makefile @@ -1,6 +1,7 @@
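Several changelog items above concern scanning the multiboot2 information safely: skipping the fixed part with alignment taken into account, and not reading beyond the end of the data. A minimal sketch of such a scan follows. The layout it relies on (a total_size/reserved header, then tags carrying type and size, each starting on an 8-byte boundary, terminated by a tag of type 0) is from the multiboot2 specification; the struct and function names are hypothetical and are not the ones in xen/include/xen/multiboot2.h.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MB2_TAG_END   0u /* Terminating tag type per the multiboot2 spec. */
#define MB2_TAG_ALIGN 8u /* Every tag starts on an 8-byte boundary. */

/* Hypothetical names for this sketch only. */
struct mb2_fixed { uint32_t total_size; uint32_t reserved; };
struct mb2_tag   { uint32_t type; uint32_t size; };

#define ALIGN_UP(x, a) (((x) + ((a) - 1)) & ~((a) - 1))

/*
 * Return the byte offset of the first tag of the wanted type, or 0 if it
 * is absent or the data is malformed. Offset 0 is never a valid tag
 * position because the fixed part lives there.
 */
uint32_t mb2_find_tag(const void *info, uint32_t wanted)
{
    struct mb2_fixed fixed;
    struct mb2_tag tag;
    uint32_t off;

    memcpy(&fixed, info, sizeof(fixed));

    /* Skip the fixed part, taking tag alignment into account. */
    off = ALIGN_UP((uint32_t)sizeof(fixed), MB2_TAG_ALIGN);

    /* Only read a tag header that lies fully inside total_size. */
    while ( off <= fixed.total_size &&
            fixed.total_size - off >= sizeof(tag) )
    {
        memcpy(&tag, (const unsigned char *)info + off, sizeof(tag));

        if ( tag.type == wanted )
            return off;

        /* Stop at the end tag or at a tag with an impossible size. */
        if ( tag.type == MB2_TAG_END || tag.size < sizeof(tag) ||
             tag.size > fixed.total_size - off )
            break;

        off += ALIGN_UP(tag.size, MB2_TAG_ALIGN);
    }

    return 0;
}
```

The patch performs this kind of walk in head.S and reloc.c; the C form here is only meant to make the bounds checks easy to see.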
Re: [Xen-devel] [PATCH v2 15/16] x86/PV: use generic emulator for privileged instruction handling
On 28/09/16 09:18, Jan Beulich wrote:
> There's a new emulator return code being added to allow bypassing
> certain operations (see the code comment). Its handling in the epilogue
> code involves moving the raising of the single step trap until after
> registers were updated. This should probably have been that way from
> the beginning, to allow the inject_hw_exception() hook to see updated
> register state (in case it cares) - it's a trap, after all.

I agree. (However, given the complexity of this patch, it really would be better to split changes like the #DB handling out into a separate patch).

>
> The other small tweak to the emulator is to single iteration handling
> of INS and OUTS: Since we don't want to handle any other memory access
> instructions, we want these to be handled by the rep_ins() / rep_outs()
> hooks here too. The read() / write() hook pointers get checked for that
> purpose.

Moving the non-rep INS/OUTS instructions into rep_ins/outs() (perhaps with dropping the rep_ prefix from the callback names) seems sensible.

However, making this implicit on a check against the read/write hooks doesn't seem sensible. Anyone looking at the code is going to get thoroughly confused.

Can't we make the ins/outs hook deal properly with a rep of 1, and have x86_emulate() know not to update %ecx in this case?

>
> And finally handling of exceptions gets changed for REP INS / REP OUTS:
> If the hook return X86EMUL_EXCEPTION, register state will still get
> updated if some iterations have been performed (but the rIP update will
> get suppressed if not all of them did get handled).

Isn't this what happens on real hardware anyway?

> While on the HVM side
> the VA -> LA -> PA translation process clips the number of repetitions,
> doing so would unduly complicate the PV side code being added here.
>
> Signed-off-by: Jan Beulich
> ---
> One thing to be considered is that despite avoiding the handling of
> memory reads and writes (other than for INS and OUTS) the set of insns
> now getting potentially handled by the emulator is much larger than
> before. A possible solution to this would be a new hook to be called
> between decode and execution stages, allowing further restrictions to
> be enforced. Of course this could easily be a follow-up patch, as the
> one here is quite big already.

I think this would be a very sensible precaution. I would suggest even that this patch doesn't get committed without being adjacent to such a patch.

>
> Another thing to consider is to the extend the X86EMUL_EXCEPTION
> handling change mentioned above to other string instructions. In that
> case this should probably be broken out into a prereq patch.

Yes.

~Andrew
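The REP INS exception behaviour discussed above (registers reflect the iterations that completed, while rIP only advances once all of them were handled) can be illustrated with a small sketch. The structure, the function name and the forward-direction assumption are all hypothetical; this is not code from the patch.

```c
/* Hypothetical register subset for the sketch. */
struct regs_sketch {
    unsigned long ip; /* Instruction pointer. */
    unsigned long cx; /* Remaining repeat count. */
    unsigned long di; /* Destination offset. */
};

/*
 * Commit the visible effects of a REP INS for which the hook handled
 * "done" out of "requested" iterations. Assumes the direction flag is
 * clear, so the destination offset grows.
 */
void commit_rep_ins(struct regs_sketch *r, unsigned long requested,
                    unsigned long done, unsigned int bytes_per_op,
                    unsigned int insn_len)
{
    /* Completed iterations are always reflected in register state. */
    r->cx -= done;
    r->di += done * bytes_per_op;

    /*
     * Only step past the instruction once every iteration was handled;
     * otherwise it is restarted with the updated count.
     */
    if ( done == requested )
        r->ip += insn_len;
}
```

Leaving rIP on the instruction after a partial run means re-executing it continues from the updated count, which is the restart behaviour Andrew's question about real hardware refers to.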
Re: [Xen-devel] [PATCH v2 16/16] x86emul: don't assume a memory operand
On 28/09/16 09:19, Jan Beulich wrote:
> Especially for x86_insn_operand_ea() to return dependable segment
> information even when the caller didn't consider applicability, we
> shouldn't have ea.type start out as OP_MEM. Make it OP_NONE instead,
> and set it to OP_MEM when we actually encounter memory like operands.
>
> This requires to eliminate the XSA-123 fix, which has been no longer
> necessary since the elimination of the union in commit dd766684e7. That
> in turn allows restricting the scope of override_seg to x86_decode().
> At this occasion also make it have a proper type, instead of plain int.
>
> Signed-off-by: Jan Beulich
>
> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
> @@ -1647,7 +1647,6 @@ struct x86_emulate_state {
>      opcode_desc_t desc;
>      union vex vex;
>      union evex evex;
> -    int override_seg;
>
>      /*
>       * Data operand effective address (usually computed from ModRM).
> @@ -1683,7 +1682,6 @@ struct x86_emulate_state {
>  #define lock_prefix (state->lock_prefix)
>  #define vex (state->vex)
>  #define evex (state->evex)
> -#define override_seg (state->override_seg)
>  #define ea (state->ea)
>
>  static int
> @@ -1712,6 +1710,7 @@ x86_decode_onebyte(
>      case 0xa0: case 0xa1: /* mov mem.offs,{%al,%ax,%eax,%rax} */
>      case 0xa2: case 0xa3: /* mov {%al,%ax,%eax,%rax},mem.offs */
>          /* Source EA is not encoded via ModRM. */
> +        ea.type = OP_MEM;
>          ea.mem.off = insn_fetch_bytes(ad_bytes);
>          break;
>
> @@ -1802,11 +1801,11 @@ x86_decode(
>  {
>      uint8_t b, d, sib, sib_index, sib_base;
>      unsigned int def_op_bytes, def_ad_bytes, opcode;
> +    enum x86_segment override_seg = x86_seg_none;
>      int rc = X86EMUL_OKAY;
>
>      memset(state, 0, sizeof(*state));
> -    override_seg = -1;
> -    ea.type = OP_MEM;
> +    ea.type = OP_NONE;
>      ea.mem.seg = x86_seg_ds;
>      ea.reg = PTR_POISON;
>      state->regs = ctxt->regs;
> @@ -2102,6 +2101,7 @@ x86_decode(
>          else if ( ad_bytes == 2 )
>          {
>              /* 16-bit ModR/M decode. */
> +            ea.type = OP_MEM;
>              switch ( modrm_rm )
>              {
>              case 0:
> @@ -2152,6 +2152,7 @@ x86_decode(
>          else
>          {
>              /* 32/64-bit ModR/M decode. */
> +            ea.type = OP_MEM;
>              if ( modrm_rm == 4 )
>              {
>                  sib = insn_fetch_type(uint8_t);
> @@ -2216,7 +2217,7 @@ x86_decode(
>          }
>      }
>
> -    if ( override_seg != -1 && ea.type == OP_MEM )
> +    if ( override_seg != x86_seg_none )

I don't see why the "ea.type == OP_MEM" should be dropped at this point. We have already set ea.type appropriately for memory instructions by this point, and it does open up the case where instructions which would have triggered XSA-123 get incorrect information reported if queried with x86_insn_operand_ea().

~Andrew

>          ea.mem.seg = override_seg;
>
>      /* Fetch the immediate operand, if present. */
> @@ -4253,13 +4254,11 @@ x86_emulate(
>              generate_exception_if(limit < sizeof(long) ||
>                                    (limit & (limit - 1)), EXC_UD, -1);
>              base &= ~(limit - 1);
> -            if ( override_seg == -1 )
> -                override_seg = x86_seg_ds;
>              if ( ops->rep_stos )
>              {
>                  unsigned long nr_reps = limit / sizeof(zero);
>
> -                rc = ops->rep_stos(&zero, override_seg, base, sizeof(zero),
> +                rc = ops->rep_stos(&zero, ea.mem.seg, base, sizeof(zero),
>                                     &nr_reps, ctxt);
>                  if ( rc == X86EMUL_OKAY )
>                  {
> @@ -4271,7 +4270,7 @@ x86_emulate(
>              }
>              while ( limit )
>              {
> -                rc = ops->write(override_seg, base, &zero, sizeof(zero), ctxt);
> +                rc = ops->write(ea.mem.seg, base, &zero, sizeof(zero), ctxt);
>                  if ( rc != X86EMUL_OKAY )
>                      goto done;
>                  base += sizeof(zero);
> @@ -5257,7 +5256,6 @@ x86_emulate(
>  #undef rex_prefix
>  #undef lock_prefix
>  #undef vex
> -#undef override_seg
>  #undef ea
>
>  #ifdef __XEN__
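Andrew's point about keeping the "ea.type == OP_MEM" condition can be sketched in isolation. The enums, struct and function below are hypothetical stand-ins for the emulator's types (in particular, the numeric value used for "no segment" is an assumption); the sketch shows the guarded variant he argues for, not the code as posted.

```c
/* Hypothetical stand-ins for the emulator's segment and operand types. */
enum seg_sketch { SEG_NONE = -1, SEG_ES, SEG_CS, SEG_SS, SEG_DS, SEG_FS, SEG_GS };
enum op_type_sketch { OP_NONE, OP_REG, OP_MEM };

struct ea_sketch {
    enum op_type_sketch type; /* Starts out as OP_NONE after decode setup. */
    enum seg_sketch seg;      /* Default segment, e.g. SEG_DS. */
};

/*
 * Apply a segment override prefix only when the decoded instruction has a
 * memory operand, so a stray prefix on a register-only instruction leaves
 * the reported segment untouched.
 */
void apply_override(struct ea_sketch *ea, enum seg_sketch override_seg)
{
    if ( override_seg != SEG_NONE && ea->type == OP_MEM )
        ea->seg = override_seg;
}
```

With the guard in place, a caller querying operand information for a register-only instruction sees the default segment rather than whatever prefix happened to precede it.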
Re: [Xen-devel] [PATCH 10/24] xen: tracing: improve Credit2's tickle_check and burn_credits records
On Thu, 2016-09-29 at 18:28 +0100, George Dunlap wrote:
> On 29/09/16 18:23, Dario Faggioli wrote:
> > In that case, knowing where a certain vcpu that we're asking to burn
> > its credit is running, may mean going quite a bit up in the trace, to
> > find its last context switch/runstate change, which is not always the
> > easiest thing to do.
> >
> > It indeed can be scripted, but when _looking_ at a trace, trying to
> > figure out why you're observing this or that weird behavior, I think
> > knowing v->processor is a useful information.
>
> But if you're using xenalyze, xenalyze will know where the vcpu is
> running / was last run; couldn't you have xenalyze report that
> information when dumping the burn_credits record?
>
Yes, this is indeed a possibility.

Xenalyze is not doing anything like or similar to that for now (at least, not in dump mode), that's probably why it did not occur to me that this could be done.

But yeah, we've already discussed that it can become more intelligent and do more complex things and more refined reports, and this can well fit into that.

> Again, I'm just pushing back to make sure the additional trace volume is
> actually necessary. :-)
>
Sure, that's fine!

Ok, let's drop this patch for now then. :-)

Regards,
Dario
--
<> (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
[Xen-devel] [xen-unstable-smoke test] 101212: tolerable all pass - PUSHED
flight 101212 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/101212/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt     12 migrate-support-check        fail   never pass
 test-armhf-armhf-xl          12 migrate-support-check        fail   never pass
 test-armhf-armhf-xl          13 saverestore-support-check    fail   never pass

version targeted for testing:
 xen                  eabe1c39d1cd4fcef18ec50571db3c70c055c1fb
baseline version:
 xen                  fb8be95ca0b5fc816fd2234925f95c3f82ead824

Last test of basis   101207  2016-09-29 11:03:56 Z    0 days
Testing same since   101212  2016-09-29 19:01:00 Z    0 days    1 attempts

People who touched revisions under test:
  Razvan Cojocaru
  Tamas K Lengyel
  Wei Liu

jobs:
 build-amd64                                  pass
 build-armhf                                  pass
 build-amd64-libvirt                          pass
 test-armhf-armhf-xl                          pass
 test-amd64-amd64-xl-qemuu-debianhvm-i386     pass
 test-amd64-amd64-libvirt                     pass

sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
    http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
    http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
    http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
    http://xenbits.xen.org/gitweb?p=osstest.git;a=summary

Pushing revision : + branch=xen-unstable-smoke + revision=eabe1c39d1cd4fcef18ec50571db3c70c055c1fb + . ./cri-lock-repos ++ . ./cri-common +++ . ./cri-getconfig +++ umask 002 +++ getrepos getconfig Repos perl -e ' use Osstest; readglobalconfig(); print $c{"Repos"} or die $!; ' +++ local repos=/home/osstest/repos +++ '[' -z /home/osstest/repos ']' +++ '[' '!'
-d /home/osstest/repos ']' +++ echo /home/osstest/repos ++ repos=/home/osstest/repos ++ repos_lock=/home/osstest/repos/lock ++ '[' x '!=' x/home/osstest/repos/lock ']' ++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock ++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push xen-unstable-smoke eabe1c39d1cd4fcef18ec50571db3c70c055c1fb + branch=xen-unstable-smoke + revision=eabe1c39d1cd4fcef18ec50571db3c70c055c1fb + . ./cri-lock-repos ++ . ./cri-common +++ . ./cri-getconfig +++ umask 002 +++ getrepos getconfig Repos perl -e ' use Osstest; readglobalconfig(); print $c{"Repos"} or die $!; ' +++ local repos=/home/osstest/repos +++ '[' -z /home/osstest/repos ']' +++ '[' '!' -d /home/osstest/repos ']' +++ echo /home/osstest/repos ++ repos=/home/osstest/repos ++ repos_lock=/home/osstest/repos/lock ++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']' + . ./cri-common ++ . ./cri-getconfig ++ umask 002 + select_xenbranch + case "$branch" in + tree=xen + xenbranch=xen-unstable-smoke + qemuubranch=qemu-upstream-unstable + '[' xxen = xlinux ']' + linuxbranch= + '[' xqemu-upstream-unstable = x ']' + select_prevxenbranch ++ ./cri-getprevxenbranch xen-unstable-smoke + prevxenbranch=xen-4.7-testing + '[' xeabe1c39d1cd4fcef18ec50571db3c70c055c1fb = x ']' + : tested/2.6.39.x + . 
./ap-common ++ : osst...@xenbits.xen.org +++ getconfig OsstestUpstream +++ perl -e ' use Osstest; readglobalconfig(); print $c{"OsstestUpstream"} or die $!; ' ++ : ++ : git://xenbits.xen.org/xen.git ++ : osst...@xenbits.xen.org:/home/xen/git/xen.git ++ : git://xenbits.xen.org/qemu-xen-traditional.git ++ : git://git.kernel.org ++ : git://git.kernel.org/pub/scm/linux/kernel/git ++ : git ++ : git://xenbits.xen.org/xtf.git ++ : osst...@xenbits.xen.org:/home/xen/git/xtf.git ++ : git://xenbits.xen.org/xtf.git ++ : git://xenbits.xen.org/libvirt.git ++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git ++ : git://xenbits.xen.org/libvirt.git ++ : git://xenbits.xen.org/osstest/rumprun.git ++ : git ++ : git://xenbits.xen.org/osstest/rumprun.git ++ : osst...@xenbits.xen.org:/home/xen/git/osstest/rumprun.git ++ : git://git.seabios.org/seabios.git ++ : osst...@xenbits.xen.org:/home/xen/git/osstest/seabios.git ++ : git://xenbits.xen.org/osstest/seabios.git ++ : https://github.com/tianocore/edk2.git ++ : osst...@xenbits.xen.org:/home/xen/git/osstest/ovmf.git ++ : git://xenbits.xen.org/osstest/ovmf.git ++ : git://xenbits.xen.org/osstest/linux-firmware.git ++ :
Re: [Xen-devel] [PATCH v2 14/16] x86emul: sort opcode 0f01 special case switch() statement
On 28/09/16 09:18, Jan Beulich wrote:
> Sort the special case opcode 0f01 entries numerically, insert blank
> lines between each of the cases, and properly place opening braces.
>
> No functional change.
>
> Signed-off-by: Jan Beulich

Reviewed-by: Andrew Cooper
___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 13/16] x86emul: support XSETBV
On 28/09/16 09:17, Jan Beulich wrote:
> This is a prereq for switching PV privileged op emulation to the
> generic instruction emulator. Since handle_xsetbv() is already capable
> of dealing with all guest kinds, avoid introducing another hook here.
>
> Signed-off-by: Jan Beulich

Reviewed-by: Andrew Cooper
Re: [Xen-devel] [PATCH v2 12/16] x86/PV: split out dealing with MSRs from privileged instruction handling
On 28/09/16 09:16, Jan Beulich wrote:
> This is in preparation for using the generic emulator here.
>
> Signed-off-by: Jan Beulich

This looks like only code motion, so Reviewed-by: Andrew Cooper

There is some rather unhelpful behaviour with the cases where we silently discard access to MSRs such as MSR_FAM10H_MMIO_CONF_BASE, but that is definitely not something to fix now.
Re: [Xen-devel] [PATCH v2 11/16] x86/PV: split out dealing with DRn from privileged instruction handling
On 28/09/16 09:15, Jan Beulich wrote:
> This is in preparation for using the generic emulator here.
>
> Some care is needed temporarily to not unduly alter guest register
> state: The local variable "res" can only go away once this code got
> fully switched over to using x86_emulate().
>
> Also switch to IS_ERR_VALUE() instead of (incorrectly) open coding it.

It isn't actually an ERR_PTR(). That bit of code pre-dates the introduction of ERR_PTR() by some margin. The return code of do_get_debugreg() is broken and needs fixing, along with the ABI of the debugreg hypercalls.

This change does cause an ABI change for PV guests, as they now can't read a debug register whose value is in the top 4k of linear address space (rather than the top 8th of a page previously), but given that the ABI is already known broken, I am not sure I care too much.

Either way, keeping it like this, or switching back to the previous opencoding, Reviewed-by: Andrew Cooper
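To make the ABI remark above concrete, here is a minimal sketch of the two range checks. It assumes MAX_ERRNO is 4095, as in the Linux-style err.h helpers, and that the earlier open-coded test rejected only the top 512 values (an eighth of a 4k page, taken from the wording of the review). The function names is_err_value() and old_open_coded_check() are illustrative and do not appear in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: mirrors the Linux-style definition, MAX_ERRNO == 4095. */
#define MAX_ERRNO 4095UL

/*
 * True when val lies in the top MAX_ERRNO values of the address space,
 * i.e. when it would be mistaken for a negative errno.
 */
static bool is_err_value(unsigned long val)
{
    return val >= (unsigned long)-MAX_ERRNO;
}

/*
 * Hypothetical stand-in for the previous open-coded test: only the top
 * 512 values (an eighth of a 4k page) were treated as errors.
 */
static bool old_open_coded_check(unsigned long val)
{
    return val >= (unsigned long)-512L;
}
```

A debug register value 1000 bytes below the top of the address space is accepted by the narrower check but rejected by the wider one, which is the guest-visible difference being discussed.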
Re: [Xen-devel] [PATCH v2 10/16] x86/PV: split out dealing with CRn from privileged instruction handling
On 28/09/16 09:14, Jan Beulich wrote: > This is in preparation for using the generic emulator here. > > Signed-off-by: Jan Beulich> > --- a/xen/arch/x86/traps.c > +++ b/xen/arch/x86/traps.c > @@ -2255,6 +2255,107 @@ unsigned long guest_to_host_gpr_switch(u > > void (*pv_post_outb_hook)(unsigned int port, u8 value); > > +static int priv_op_read_cr(unsigned int reg, unsigned long *val, > + struct x86_emulate_ctxt *ctxt) > +{ > +const struct vcpu *curr = current; > + > +switch ( reg ) > +{ > +case 0: /* Read CR0 */ > +*val = (read_cr0() & ~X86_CR0_TS) | curr->arch.pv_vcpu.ctrlreg[0]; > +return X86EMUL_OKAY; > + > +case 2: /* Read CR2 */ > +case 4: /* Read CR4 */ > +*val = curr->arch.pv_vcpu.ctrlreg[reg]; > +return X86EMUL_OKAY; > + > +case 3: /* Read CR3 */ > +{ > +const struct domain *currd = curr->domain; > +unsigned long mfn; Any chance of switching this to mfn_t while you are moving it? > + > +if ( !is_pv_32bit_domain(currd) ) > +{ > +mfn = pagetable_get_pfn(curr->arch.guest_table); > +*val = xen_pfn_to_cr3(mfn_to_gmfn(currd, mfn)); > +} > +else > +{ > +l4_pgentry_t *pl4e = > + > map_domain_page(_mfn(pagetable_get_pfn(curr->arch.guest_table))); > + > +mfn = l4e_get_pfn(*pl4e); > +unmap_domain_page(pl4e); > +*val = compat_pfn_to_cr3(mfn_to_gmfn(currd, mfn)); > +} > +/* PTs should not be shared */ > +BUG_ON(page_get_owner(mfn_to_page(mfn)) == dom_cow); > +return X86EMUL_OKAY; > +} > +} > + > +return X86EMUL_UNHANDLEABLE; > +} > + > +static int priv_op_write_cr(unsigned int reg, unsigned long val, > +struct x86_emulate_ctxt *ctxt) > +{ > +struct vcpu *curr = current; > + > +switch ( reg ) > +{ > +case 0: /* Write CR0 */ > +if ( (val ^ read_cr0()) & ~X86_CR0_TS ) > +{ > +gdprintk(XENLOG_WARNING, > +"Attempt to change unmodifiable CR0 flags\n"); > +break; > +} > +do_fpu_taskswitch(!!(val & X86_CR0_TS)); > +return X86EMUL_OKAY; > + > +case 2: /* Write CR2 */ > +curr->arch.pv_vcpu.ctrlreg[2] = val; > +arch_set_cr2(curr, val); > +return X86EMUL_OKAY; > + > +case 3: /* 
Write CR3 */ > +{ > +struct domain *currd = curr->domain; > +unsigned long gfn; Similarly, gfn_t ? Reviewed-by: Andrew Cooper ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
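The mfn_t / gfn_t suggestion is about wrapping raw frame numbers in distinct struct types so that mixing them up becomes a compile error. The following is only a sketch of that idea: the field names and the toy_mfn_to_gfn() helper (with its fixed offset) are assumptions for illustration, not the definitions used in the Xen tree.

```c
#include <assert.h>

/* Illustrative wrappers: one distinct struct type per kind of frame number. */
typedef struct { unsigned long m; } mfn_t;
typedef struct { unsigned long g; } gfn_t;

static inline mfn_t _mfn(unsigned long m) { return (mfn_t){ m }; }
static inline unsigned long mfn_x(mfn_t mfn) { return mfn.m; }
static inline gfn_t _gfn(unsigned long g) { return (gfn_t){ g }; }
static inline unsigned long gfn_x(gfn_t gfn) { return gfn.g; }

/*
 * Hypothetical translation helper: a fixed offset stands in for the
 * real machine-to-guest lookup.
 */
static inline gfn_t toy_mfn_to_gfn(mfn_t mfn)
{
    return _gfn(mfn_x(mfn) - 0x1000UL);
}
```

With plain unsigned long, passing a gfn where an mfn is expected compiles silently. With the wrappers, calling toy_mfn_to_gfn() with a gfn_t argument is rejected by the compiler.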
Re: [Xen-devel] [PATCH v2 09/16] x86/32on64: use generic instruction decoding for call gate emulation
On 28/09/16 09:13, Jan Beulich wrote:
> @@ -3204,179 +3285,59 @@ static void emulate_gate_op(struct cpu_u
> return;
> }
>
> -op_bytes = op_default = ar & _SEGMENT_DB ? 4 : 2;
> -ad_default = ad_bytes = op_default;
> -opnd_sel = opnd_off = 0;
> -jump = -1;
> -for ( eip = regs->eip; eip - regs->_eip < 10; )
> +ctxt.ctxt.addr_size = ar & _SEGMENT_DB ? 32 : 16;
> +/* Leave zero in ctxt.ctxt.sp_size, as it's not needed for decoding. */

Are you sure this is safe? What if the instruction is substituted under our feet? Currently, the only issues I can spot would be a load of "& 0" in truncate_word() and friends, but my gut feeling is that this is not a safe or sensible thing to rely on.

Everything else looks fine though.

~Andrew
Re: [Xen-devel] [Qemu-devel] [PATCH 2/2] xen: add qemu device for each pvusb backend
Gerd Hoffmann writes:
> Hi,
>
>> > Hmm, I think the xen core needs better QOM support ...
>> >
>> > struct XenDevice should have a DeviceState element, so it can be used as
>> > device object directly instead of attaching a device object like
>> > this ...
>>
>> Hmm, interesting idea. The device object could even be added in
>> Xen common code if the backend is indicating the need for it via a
>> special flag/field. I'll have a try.
>
> No, not optional. Just turn *all* xen devices into QOM objects.

Yes, please.

> XenDevice should probably be a subclass of the base device object
> (DeviceState), and all Xen backends (block, net, fb, pvusb, ...)
> should be subclasses of XenDevice.
>
> The latter is probably how things are modeled already, just the QOM
> object stuff is missing (register classes, macros to cast objects, ...)
> because qdev (the QOM predecessor) didn't have that.
>
> Once this is in place you can simply use DEVICE(xendevice) to get the
> DeviceState pointer.

Related thread: qdevification of xen_disk
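The subclassing scheme described above can be pictured in plain C as struct embedding, with the parent object placed first in each child. This is a toy sketch only: the struct layouts, field names and the unchecked DEVICE() macro below are simplified stand-ins, whereas QEMU's real QOM macros perform checked casts against registered type information.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins; QEMU's real DeviceState and QOM macros are richer. */
typedef struct DeviceState { const char *id; } DeviceState;

typedef struct XenDevice {
    DeviceState qdev;      /* parent object embedded as first member */
    int dom;
} XenDevice;

typedef struct XenUsbDevice {
    XenDevice xendev;      /* a backend subclasses XenDevice the same way */
    int ports;
} XenUsbDevice;

/* Simplified, unchecked upcast: valid only because the parent comes first. */
#define DEVICE(obj) ((DeviceState *)(obj))
```

Because every level embeds its parent at offset zero, a pointer to any backend can be handed to code that expects a DeviceState pointer, which is what would let the block layer take a DeviceState* instead of a void*.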
Re: [Xen-devel] [PATCH] pub-headers: reduce C99 dependencies
On Thu, Sep 29, 2016 at 12:22:37PM -0700, Julien Grall wrote: > > > On 29/09/2016 12:11, Julien Grall wrote: > >Hi Jan, > > > >On 28/09/2016 23:04, Jan Beulich wrote: > >On 28.09.16 at 21:42,wrote: > >>>Hi Jan, > >>> > >>>On 28/09/2016 05:00, Jan Beulich wrote: > For consumers not using (fully) C99-aware compilers, limit the number > of places where tweaking of the headers would be necessary: Introduce > and use xen_mk_ullong(), allowing its helper macro to be overridden at > once. > > For now don't touch public/io/, which also has a few offenders. > > The need to include xen.h in hvm/e820.h demonstrates that it is a bad > idea to include public headers first thing - arch/x86/hvm/mtrr.c needs > adjustment just because of this. > > Signed-off-by: Jan Beulich > --- > I wonder why all those ARM constants carry the ULL suffix despite only > two of them actually exceeding 32 significant bits. > >>> > >>>I am not the author of the code, but I think it was to declare all the > >>>constants of a given set uniformed. > >>> > >>>For instance all the GUEST_* constants are used to define the layout of > >>>the guest. This may be shuffle in this future (this is not part of ABI) > >> > >>Oh, they're in a Xen/tools only section (the #endif could really be > >>annotated to help spot this) - in that case I could as well leave them > >>alone. Any preference? > > > >Correct. I can send a patch to annotate the #endif. > > It looks like that I did not reply to your question. Do we expect the > toolstack to always C99 standard? If not, then we should keep the > xen_mk_ullong. > > I am fine either way. > We have -std=gnu99 in top-level Config.mk -- both xen and tools will be built with that. Wei. > Cheers, > > -- > Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
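For readers who have not seen the patch, the xen_mk_ullong() idea discussed in this thread can be sketched as a two-level macro whose inner helper a consumer may pre-define. The helper name __xen_mk_ullong and the EXAMPLE_BASE constant are assumptions made for this sketch; the thread itself only names xen_mk_ullong() and "its helper macro".

```c
#include <assert.h>

/*
 * A consumer with a pre-C99 compiler could define the helper before
 * including the header, e.g. to use a vendor-specific 64-bit suffix.
 */
#ifndef __xen_mk_ullong
# define __xen_mk_ullong(x) x ## ULL
#endif

/* The extra level lets a macro argument expand before token pasting. */
#define xen_mk_ullong(x) __xen_mk_ullong(x)

/* Hypothetical constant, for illustration only. */
#define EXAMPLE_BASE xen_mk_ullong(0x100000000)
```

With this arrangement the suffix is chosen in exactly one place, so a consumer overrides one macro rather than editing each constant.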
Re: [Xen-devel] [PATCH v2 08/16] SVM: use generic instruction decoding
On 28/09/16 09:13, Jan Beulich wrote: > ... instead of custom handling. To facilitate this break out init code > from _hvm_emulate_one() into the new hvm_emulate_init(), and make > hvmemul_insn_fetch( globally available. ) > int __get_instruction_length_from_list(struct vcpu *v, > const enum instruction_index *list, unsigned int list_count) > { > struct vmcb_struct *vmcb = v->arch.hvm_svm.vmcb; > -unsigned int i, j, inst_len = 0; > -enum instruction_index instr = 0; > -u8 buf[MAX_INST_LEN]; > -const u8 *opcode = NULL; > -unsigned long fetch_addr, fetch_limit; > -unsigned int fetch_len, max_len; > +struct hvm_emulate_ctxt ctxt; > +struct x86_emulate_state *state; > +unsigned int inst_len, j, modrm_rm, modrm_reg; > +int modrm_mod; > Despite knowing how this works, it is still confusing to read. Do you mind putting in a comment such as: /* In a debug build, always use x86_decode_insn() and compare with hardware. */ > +#ifdef NDEBUG > if ( (inst_len = svm_nextrip_insn_length(v)) != 0 ) > return inst_len; > > if ( vmcb->exitcode == VMEXIT_IOIO ) > return vmcb->exitinfo2 - vmcb->rip; > +#endif > > -/* Fetch up to the next page break; we'll fetch from the next page > - * later if we have to. 
*/ > -fetch_addr = svm_rip2pointer(v, _limit); > -if ( vmcb->rip > fetch_limit ) > -return 0; > -max_len = min(fetch_limit - vmcb->rip + 1, MAX_INST_LEN + 0UL); > -fetch_len = min_t(unsigned int, max_len, > - PAGE_SIZE - (fetch_addr & ~PAGE_MASK)); > -if ( !fetch(vmcb, buf, fetch_addr, fetch_len) ) > +ASSERT(v == current); > +hvm_emulate_prepare(, guest_cpu_user_regs()); > +hvm_emulate_init(, NULL, 0); > +state = x86_decode_insn(, hvmemul_insn_fetch); > +if ( IS_ERR_OR_NULL(state) ) > return 0; > > -while ( (inst_len < max_len) && is_prefix(buf[inst_len]) ) > -{ > -inst_len++; > -if ( inst_len >= fetch_len ) > -{ > -if ( !fetch(vmcb, buf + fetch_len, fetch_addr + fetch_len, > -max_len - fetch_len) ) > -return 0; > -fetch_len = max_len; > -} > +inst_len = x86_insn_length(state, ); > +modrm_mod = x86_insn_modrm(state, _rm, _reg); > +x86_emulate_free_state(state); > +#ifndef NDEBUG > +if ( vmcb->exitcode == VMEXIT_IOIO ) > +j = vmcb->exitinfo2 - vmcb->rip; > +else > +j = svm_nextrip_insn_length(v); > +if ( j && j != inst_len ) > +{ > +gprintk(XENLOG_WARNING, "insn-len[%02x]=%u (exp %u)\n", > +ctxt.ctxt.opcode, inst_len, j); > +return j; > } > +#endif > > for ( j = 0; j < list_count; j++ ) > { > -instr = list[j]; > -opcode = opc_bytes[instr]; > +enum instruction_index instr = list[j]; > > -for ( i = 0; (i < opcode[0]) && ((inst_len + i) < max_len); i++ ) > +ASSERT(instr >= 0 && instr < ARRAY_SIZE(opc_tab)); This is another ASSERT() used as a bounds check, and will suffer a build failure on clang. You need to use s/enum instruction_index/unsigned int/ to fix the build issue. Can I also request the use of if ( instr >= ARRAY_SIZE(opc_tab) ) { ASSERT_UNREACHABLE(); return 0; } instead? 
> --- a/xen/arch/x86/x86_emulate/x86_emulate.c > +++ b/xen/arch/x86/x86_emulate/x86_emulate.c > @@ -5200,3 +5214,89 @@ x86_emulate( > #undef vex > #undef override_seg > #undef ea > + > +#ifdef __XEN__ > + > +#include > + > +struct x86_emulate_state * > +x86_decode_insn( > +struct x86_emulate_ctxt *ctxt, > +int (*insn_fetch)( > +enum x86_segment seg, unsigned long offset, > +void *p_data, unsigned int bytes, > +struct x86_emulate_ctxt *ctxt)) > +{ > +static DEFINE_PER_CPU(struct x86_emulate_state, state); > +struct x86_emulate_state *state = _cpu(state); > +const struct x86_emulate_ops ops = { This can be static, to avoid having it reconstructed on the stack each function call. Otherwise, Reviewed-by: Andrew Cooper___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
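The bounds-check shape requested in this review can be sketched as follows: the index becomes an unsigned int, so no comparison against zero is needed, and an out-of-range value takes a safe early return in release builds. The table contents, the lookup_opcode() wrapper and the no-op ASSERT_UNREACHABLE() are stand-ins invented for this example.

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/*
 * Stand-in for Xen's ASSERT_UNREACHABLE(): a no-op here, so that the
 * release-build fallback path can be exercised.
 */
#define ASSERT_UNREACHABLE() ((void)0)

/* Hypothetical table; the real opc_tab holds opcode descriptors. */
static const unsigned int opc_tab[] = { 0x0f08, 0x0f09, 0x0f30 };

/* Returns the table entry, or 0 when the index is out of range. */
static unsigned int lookup_opcode(unsigned int instr)
{
    if ( instr >= ARRAY_SIZE(opc_tab) )
    {
        ASSERT_UNREACHABLE();
        return 0;
    }

    return opc_tab[instr];
}
```

A debug build would still trip on the bad index, while a release build returns 0 instead of reading past the end of the table.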
Re: [Xen-devel] [PATCH] pub-headers: reduce C99 dependencies
On 29/09/2016 12:11, Julien Grall wrote: Hi Jan, On 28/09/2016 23:04, Jan Beulich wrote: On 28.09.16 at 21:42,wrote: Hi Jan, On 28/09/2016 05:00, Jan Beulich wrote: For consumers not using (fully) C99-aware compilers, limit the number of places where tweaking of the headers would be necessary: Introduce and use xen_mk_ullong(), allowing its helper macro to be overridden at once. For now don't touch public/io/, which also has a few offenders. The need to include xen.h in hvm/e820.h demonstrates that it is a bad idea to include public headers first thing - arch/x86/hvm/mtrr.c needs adjustment just because of this. Signed-off-by: Jan Beulich --- I wonder why all those ARM constants carry the ULL suffix despite only two of them actually exceeding 32 significant bits. I am not the author of the code, but I think it was to declare all the constants of a given set uniformed. For instance all the GUEST_* constants are used to define the layout of the guest. This may be shuffle in this future (this is not part of ABI) Oh, they're in a Xen/tools only section (the #endif could really be annotated to help spot this) - in that case I could as well leave them alone. Any preference? Correct. I can send a patch to annotate the #endif. It looks like that I did not reply to your question. Do we expect the toolstack to always C99 standard? If not, then we should keep the xen_mk_ullong. I am fine either way. Cheers, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [xtf test] 101211: all pass - PUSHED
flight 101211 xtf real [real] http://logs.test-lab.xenproject.org/osstest/logs/101211/ Perfect :-) All tests in this flight passed as required version targeted for testing: xtf 50b7556924ad173285f4116dc9e0937600bf5bee baseline version: xtf 962dd4b225c17ae48d36a978602b245dbf77bec5 Last test of basis 101185 2016-09-28 12:14:21 Z1 days Testing same since 101211 2016-09-29 18:15:50 Z0 days1 attempts People who touched revisions under test: Andrew Cooperjobs: build-amd64-xtf pass build-amd64 pass build-amd64-pvopspass test-xtf-amd64-amd64-1 pass test-xtf-amd64-amd64-2 pass test-xtf-amd64-amd64-3 pass test-xtf-amd64-amd64-4 pass test-xtf-amd64-amd64-5 pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : + branch=xtf + revision=50b7556924ad173285f4116dc9e0937600bf5bee + . ./cri-lock-repos ++ . ./cri-common +++ . ./cri-getconfig +++ umask 002 +++ getrepos getconfig Repos perl -e ' use Osstest; readglobalconfig(); print $c{"Repos"} or die $!; ' +++ local repos=/home/osstest/repos +++ '[' -z /home/osstest/repos ']' +++ '[' '!' -d /home/osstest/repos ']' +++ echo /home/osstest/repos ++ repos=/home/osstest/repos ++ repos_lock=/home/osstest/repos/lock ++ '[' x '!=' x/home/osstest/repos/lock ']' ++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock ++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push xtf 50b7556924ad173285f4116dc9e0937600bf5bee + branch=xtf + revision=50b7556924ad173285f4116dc9e0937600bf5bee + . ./cri-lock-repos ++ . ./cri-common +++ . 
./cri-getconfig +++ umask 002 +++ getrepos getconfig Repos perl -e ' use Osstest; readglobalconfig(); print $c{"Repos"} or die $!; ' +++ local repos=/home/osstest/repos +++ '[' -z /home/osstest/repos ']' +++ '[' '!' -d /home/osstest/repos ']' +++ echo /home/osstest/repos ++ repos=/home/osstest/repos ++ repos_lock=/home/osstest/repos/lock ++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']' + . ./cri-common ++ . ./cri-getconfig ++ umask 002 + select_xenbranch + case "$branch" in + tree=xtf + xenbranch=xen-unstable + '[' xxtf = xlinux ']' + linuxbranch= + '[' x = x ']' + qemuubranch=qemu-upstream-unstable + select_prevxenbranch ++ ./cri-getprevxenbranch xen-unstable + prevxenbranch=xen-4.7-testing + '[' x50b7556924ad173285f4116dc9e0937600bf5bee = x ']' + : tested/2.6.39.x + . ./ap-common ++ : osst...@xenbits.xen.org +++ getconfig OsstestUpstream +++ perl -e ' use Osstest; readglobalconfig(); print $c{"OsstestUpstream"} or die $!; ' ++ : ++ : git://xenbits.xen.org/xen.git ++ : osst...@xenbits.xen.org:/home/xen/git/xen.git ++ : git://xenbits.xen.org/qemu-xen-traditional.git ++ : git://git.kernel.org ++ : git://git.kernel.org/pub/scm/linux/kernel/git ++ : git ++ : git://xenbits.xen.org/xtf.git ++ : osst...@xenbits.xen.org:/home/xen/git/xtf.git ++ : git://xenbits.xen.org/xtf.git ++ : git://xenbits.xen.org/libvirt.git ++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git ++ : git://xenbits.xen.org/libvirt.git ++ : git://xenbits.xen.org/osstest/rumprun.git ++ : git ++ : git://xenbits.xen.org/osstest/rumprun.git ++ : osst...@xenbits.xen.org:/home/xen/git/osstest/rumprun.git ++ : git://git.seabios.org/seabios.git ++ : osst...@xenbits.xen.org:/home/xen/git/osstest/seabios.git ++ : git://xenbits.xen.org/osstest/seabios.git ++ : https://github.com/tianocore/edk2.git ++ : osst...@xenbits.xen.org:/home/xen/git/osstest/ovmf.git ++ : git://xenbits.xen.org/osstest/ovmf.git ++ : git://xenbits.xen.org/osstest/linux-firmware.git ++ : 
osst...@xenbits.xen.org:/home/osstest/ext/linux-firmware.git ++ : git://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git ++ : osst...@xenbits.xen.org:/home/xen/git/linux-pvops.git ++ : git://xenbits.xen.org/linux-pvops.git ++ : tested/linux-3.14 ++ : tested/linux-arm-xen ++ '['
Re: [Xen-devel] [PATCH] pub-headers: reduce C99 dependencies
Hi Jan,

On 28/09/2016 05:00, Jan Beulich wrote:
For consumers not using (fully) C99-aware compilers, limit the number of places where tweaking of the headers would be necessary: Introduce and use xen_mk_ullong(), allowing its helper macro to be overridden at once. For now don't touch public/io/, which also has a few offenders. The need to include xen.h in hvm/e820.h demonstrates that it is a bad idea to include public headers first thing - arch/x86/hvm/mtrr.c needs adjustment just because of this.

Signed-off-by: Jan Beulich

For the ARM parts:

Reviewed-by: Julien Grall

Regards,

--
Julien Grall
Re: [Xen-devel] [PATCH] pub-headers: reduce C99 dependencies
Hi Jan, On 28/09/2016 23:04, Jan Beulich wrote: On 28.09.16 at 21:42,wrote: Hi Jan, On 28/09/2016 05:00, Jan Beulich wrote: For consumers not using (fully) C99-aware compilers, limit the number of places where tweaking of the headers would be necessary: Introduce and use xen_mk_ullong(), allowing its helper macro to be overridden at once. For now don't touch public/io/, which also has a few offenders. The need to include xen.h in hvm/e820.h demonstrates that it is a bad idea to include public headers first thing - arch/x86/hvm/mtrr.c needs adjustment just because of this. Signed-off-by: Jan Beulich --- I wonder why all those ARM constants carry the ULL suffix despite only two of them actually exceeding 32 significant bits. I am not the author of the code, but I think it was to declare all the constants of a given set uniformed. For instance all the GUEST_* constants are used to define the layout of the guest. This may be shuffle in this future (this is not part of ABI) Oh, they're in a Xen/tools only section (the #endif could really be annotated to help spot this) - in that case I could as well leave them alone. Any preference? Correct. I can send a patch to annotate the #endif. Regards, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] x86emul: fix {,i}mul and {,i}div
On Thu, Sep 29, 2016 at 07:08:10AM -0600, Jan Beulich wrote:
> Commit a3db233ede ("x86emul: use DstEax also for {,I}{MUL,DIV}") went
> a little too far: DstEax and SrcEax weren't really meant to be used
> together with ModRM - they assume modrm_reg remains zero by the time
> the destination / source register pointer gets calculated. Don't fully
> undo that commit though, but instead just correct the register pointer,
> and don't use dst.val as input for mul and imul (div and idiv did avoid
> that already).
>
> Reported-by: Konrad Rzeszutek Wilk

and Tested-by: Konrad Rzeszutek Wilk
Re: [Xen-devel] [PATCH v5] vm_event: Implement ARM SMC events
On Wed, 28 Sep 2016, Tamas K Lengyel wrote: > The ARM SMC instructions are already configured to trap to Xen by default. In > this patch we allow a user-space process in a privileged domain to receive > notification of when such event happens through the vm_event subsystem by > introducing the PRIVILEGED_CALL type. > > The intended use-case for this feature is for a monitor application to be able > insert tap-points into the domU kernel-code. For this task only unconditional > SMC instruction should be used. > > Signed-off-by: Tamas K Lengyel> Acked-by: Wei Liu > Acked-by: Razvan Cojocaru Reviewed-by: Stefano Stabellini > Cc: Ian Jackson > Cc: Stefano Stabellini > Cc: Julien Grall > > v5: Only check if monitor.privileged_call_enabled is set once > --- > tools/libxc/include/xenctrl.h | 2 + > tools/libxc/xc_monitor.c| 14 +++ > tools/tests/xen-access/xen-access.c | 30 +++ > xen/arch/arm/Makefile | 1 + > xen/arch/arm/monitor.c | 76 > + > xen/arch/arm/traps.c| 16 +++- > xen/include/asm-arm/domain.h| 5 +++ > xen/include/asm-arm/monitor.h | 18 +++-- > xen/include/public/domctl.h | 1 + > xen/include/public/vm_event.h | 7 > 10 files changed, 155 insertions(+), 15 deletions(-) > create mode 100644 xen/arch/arm/monitor.c > > diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h > index 560ce7b..eb53172 100644 > --- a/tools/libxc/include/xenctrl.h > +++ b/tools/libxc/include/xenctrl.h > @@ -2168,6 +2168,8 @@ int xc_monitor_guest_request(xc_interface *xch, domid_t > domain_id, > int xc_monitor_debug_exceptions(xc_interface *xch, domid_t domain_id, > bool enable, bool sync); > int xc_monitor_cpuid(xc_interface *xch, domid_t domain_id, bool enable); > +int xc_monitor_privileged_call(xc_interface *xch, domid_t domain_id, > + bool enable); > /** > * This function enables / disables emulation for each REP for a > * REP-compatible instruction. 
> diff --git a/tools/libxc/xc_monitor.c b/tools/libxc/xc_monitor.c > index 4298813..15a7c32 100644 > --- a/tools/libxc/xc_monitor.c > +++ b/tools/libxc/xc_monitor.c > @@ -185,6 +185,20 @@ int xc_monitor_cpuid(xc_interface *xch, domid_t > domain_id, bool enable) > return do_domctl(xch, ); > } > > +int xc_monitor_privileged_call(xc_interface *xch, domid_t domain_id, > + bool enable) > +{ > +DECLARE_DOMCTL; > + > +domctl.cmd = XEN_DOMCTL_monitor_op; > +domctl.domain = domain_id; > +domctl.u.monitor_op.op = enable ? XEN_DOMCTL_MONITOR_OP_ENABLE > +: XEN_DOMCTL_MONITOR_OP_DISABLE; > +domctl.u.monitor_op.event = XEN_DOMCTL_MONITOR_EVENT_PRIVILEGED_CALL; > + > +return do_domctl(xch, ); > +} > + > /* > * Local variables: > * mode: C > diff --git a/tools/tests/xen-access/xen-access.c > b/tools/tests/xen-access/xen-access.c > index ed18c71..9d4f957 100644 > --- a/tools/tests/xen-access/xen-access.c > +++ b/tools/tests/xen-access/xen-access.c > @@ -338,6 +338,8 @@ void usage(char* progname) > fprintf(stderr, "Usage: %s [-m] write|exec", progname); > #if defined(__i386__) || defined(__x86_64__) > fprintf(stderr, > "|breakpoint|altp2m_write|altp2m_exec|debug|cpuid"); > +#elif defined(__arm__) || defined(__aarch64__) > +fprintf(stderr, "|privcall"); > #endif > fprintf(stderr, > "\n" > @@ -362,6 +364,7 @@ int main(int argc, char *argv[]) > int required = 0; > int breakpoint = 0; > int shutting_down = 0; > +int privcall = 0; > int altp2m = 0; > int debug = 0; > int cpuid = 0; > @@ -431,6 +434,11 @@ int main(int argc, char *argv[]) > { > cpuid = 1; > } > +#elif defined(__arm__) || defined(__aarch64__) > +else if ( !strcmp(argv[0], "privcall") ) > +{ > +privcall = 1; > +} > #endif > else > { > @@ -563,6 +571,16 @@ int main(int argc, char *argv[]) > } > } > > +if ( privcall ) > +{ > +rc = xc_monitor_privileged_call(xch, domain_id, 1); > +if ( rc < 0 ) > +{ > +ERROR("Error %d setting privileged call trapping with > vm_event\n", rc); > +goto exit; > +} > +} > + > /* Wait for access */ 
> for (;;) > { > @@ -578,6 +596,9 @@ int main(int argc, char *argv[]) > if ( cpuid ) > rc = xc_monitor_cpuid(xch, domain_id, 0); > > +if ( privcall ) > +rc = xc_monitor_privileged_call(xch, domain_id, 0); > + > if ( altp2m ) > { > rc =
Re: [Xen-devel] [PATCH] x86emul: fix {,i}mul and {,i}div
On 29/09/16 14:08, Jan Beulich wrote:
> Commit a3db233ede ("x86emul: use DstEax also for {,I}{MUL,DIV}") went
> a little too far: DstEax and SrcEax weren't really meant to be used
> together with ModRM - they assume modrm_reg remains zero by the time
> the destination / source register pointer gets calculated. Don't fully
> undo that commit though, but instead just correct the register pointer,
> and don't use dst.val as input for mul and imul (div and idiv did avoid
> that already).
>
> Reported-by: Konrad Rzeszutek Wilk
> Signed-off-by: Jan Beulich

Reviewed-by: Andrew Cooper
Re: [Xen-devel] qdevification of xen_disk
Hi Kevin,

I agree with you, and if you would be so kind to send the patches, even untested, they would be much appreciated. Anthony or I will make sure to test them appropriately and fix them, if they turn out to be incomplete or partially broken. Would that be OK?

Cheers,

Stefano

P.S. FYI Xen works on top of QEMU with PV guests. So theoretically you could install Xen inside QEMU and use Xen to launch a PV guest which uses xen_block as Xen backend. Fun!! :-D

On Thu, 29 Sep 2016, Kevin Wolf wrote:
> Hi Stefano and all,
>
> while working on some part of the QEMU block layer infrastructure that
> requires going from a BlockBackend to the qdev DeviceState, I noticed
> that xen_disk is still not qdevified after all the years. It's the last
> device, and has been for a while, that is blocking the necessary changes
> in the block layer.
>
> Specifically, I'm talking about the blk_attach_dev_nofail() and
> blk_detach_dev() functions (and related ones, which don't seem to be
> used by xen_disk though), which must be converted from accepting a void*
> for the device to DeviceState*.
>
> I think in theory it shouldn't be too hard to build a minimal qdev
> device between xen_disk and the block layer. If you want me to, I can
> try to do that - however, I don't have a Xen setup to actually test the
> result, so if things break, you get to keep the pieces. If someone else
> would like to look into this in the next few days, that might be the
> better option.
>
> In any case, we need to do something to make xen_disk compatible with
> the modern (well, not _that_ modern any more) infrastructure that all
> other devices use.
>
> Kevin
Re: [Xen-devel] [PATCH] x86emul: simplify LEAVE handling
On 29/09/16 14:08, Jan Beulich wrote:
> There's no 1-byte operand size case to take care of here, and there's
> no point doing the first writeback using dst fields - we can read rBP
> and write rSP directly.
>
> Signed-off-by: Jan Beulich

Reviewed-by: Andrew Cooper
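As background for the commit message above, LEAVE copies rBP into rSP and then pops rBP. The sketch below models only that architectural behaviour on a toy register file. It is not the emulator code from the patch, all names are invented, and it simplifies by updating the full 64-bit registers whatever the operand size.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy CPU state; a flat byte array stands in for guest stack memory. */
struct toy_regs {
    uint64_t rsp, rbp;
};

/*
 * LEAVE: rSP <- rBP, then pop rBP. op_bytes is 2, 4 or 8, never 1.
 * Assumes a little-endian host when op_bytes is smaller than 8.
 */
static void toy_leave(struct toy_regs *r, const uint8_t *mem,
                      unsigned int op_bytes)
{
    uint64_t val = 0;

    r->rsp = r->rbp;                       /* write rSP directly from rBP */
    memcpy(&val, mem + r->rsp, op_bytes);  /* read the saved frame pointer */
    r->rbp = val;
    r->rsp += op_bytes;                    /* complete the pop */
}
```

The first step is a plain register-to-register copy, which is why it needs no intermediate destination operand.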
Re: [Xen-devel] [PATCH 10/24] xen: tracing: improve Credit2's tickle_check and burn_credits records
On 29/09/16 18:23, Dario Faggioli wrote: > On Tue, 2016-09-20 at 15:35 +0100, George Dunlap wrote: >> On 17/08/16 18:18, Dario Faggioli wrote: >>> >>> In both Credit2's trace records relative to checking >>> whether we want to preempt a vcpu (in runq_tickle()), >>> and to credits being burn, make it explicit on which >>> pcpu the vcpu being considered is running. >>> >>> Such information isn't currently available, not even >>> by looking at on which pcpu the events happen, as we >>> do both the above operation from a certain pcpu on >>> vcpus running on different pcpus. >> >> But you should be able to tell where a given vcpu is currently >> running >> from the runstate changes, right? Obviously xentrace_format couldn't >> tell you that, but xenalyze should be able to, unless there were lost >> trace records on the vcpu in question. >> > Well, yes and no. For instance, burn_credits() is not only called from > csched_schedule(), where indeed we have the information in close > records. It's also called from inside runq_tickle() itself (as we want > to update the credits of the various vcpus we are considering > preempting), which in turns can be called during vcpu wakeup. > > In that case, knowing where a certain vcpu that we're asking to burn > its credit is running, may mean going quite a bit up in the trace, to > find its last context switch/runstate change, which is not always the > easiest thing to do. > > It indeed can be scripted, but when _looking_ at a trace, trying to > figure out why you're observing this or that weird behavior, I think > knowing v->processor is a useful information. But if you're using xenalyze, xenalyze will know where the vcpu is running / was last run; couldn't you have xenalyze report that information when dumping the burn_credits record? Again, I'm just pushing back to make sure the additional trace volume is actually necessary. :-) -George ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 10/24] xen: tracing: improve Credit2's tickle_check and burn_credits records
On Tue, 2016-09-20 at 15:35 +0100, George Dunlap wrote:
> On 17/08/16 18:18, Dario Faggioli wrote:
> >
> > In both Credit2's trace records relative to checking
> > whether we want to preempt a vcpu (in runq_tickle()),
> > and to credits being burn, make it explicit on which
> > pcpu the vcpu being considered is running.
> >
> > Such information isn't currently available, not even
> > by looking at on which pcpu the events happen, as we
> > do both the above operation from a certain pcpu on
> > vcpus running on different pcpus.
>
> But you should be able to tell where a given vcpu is currently
> running
> from the runstate changes, right? Obviously xentrace_format couldn't
> tell you that, but xenalyze should be able to, unless there were lost
> trace records on the vcpu in question.
>
Well, yes and no. For instance, burn_credits() is not only called from csched_schedule(), where indeed we have the information in close records. It's also called from inside runq_tickle() itself (as we want to update the credits of the various vcpus we are considering preempting), which in turn can be called during vcpu wakeup.

In that case, knowing where a certain vcpu that we're asking to burn its credit is running may mean going quite a bit up in the trace, to find its last context switch/runstate change, which is not always the easiest thing to do.

It indeed can be scripted, but when _looking_ at a trace, trying to figure out why you're observing this or that weird behavior, I think knowing v->processor is useful information.

> My modus operandi has been "try to keep trace volume from growing"
> rather than "wait until trace volume is noticeably an issue and reduce
> it". Presumably you've been doing a lot of tracing -- do you think I
> should change my approach?
>
No, I think the approach is a good one. It's just that, in this case, I think this is useful information, so I'll keep the patch in v2. But if you're not sure, just ignore it, and we can sort this at another time.
:-) Thanks and Regards, Dario -- <> (Raistlin Majere) - Dario Faggioli, Ph.D, http://about.me/dario.faggioli Senior Software Engineer, Citrix Systems R Ltd., Cambridge (UK) signature.asc Description: This is a digitally signed message part ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
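To make the point about carrying v->processor in the record concrete, here is a minimal sketch of a burn_credits-style trace payload with an explicit pcpu field. The struct layout, field widths and the toy_* names are assumptions made up for illustration; they are not the record format used by the patch or by sched_credit2.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical trace payload: names and widths are illustrative only. */
struct toy_burn_credits_rec {
    uint16_t vcpu;    /* vcpu id within its domain */
    uint16_t dom;     /* domain id */
    uint32_t cpu;     /* pcpu the vcpu is on (v->processor in the thread) */
    int32_t  credit;  /* credit left after burning */
    int32_t  delta;   /* amount burnt */
};

/* Serialise the record into 32-bit words, the way a trace buffer might. */
static size_t toy_pack_burn_credits(const struct toy_burn_credits_rec *r,
                                    uint32_t out[4])
{
    assert(r != NULL && out != NULL);
    out[0] = ((uint32_t)r->dom << 16) | r->vcpu;
    out[1] = r->cpu;
    out[2] = (uint32_t)r->credit;
    out[3] = (uint32_t)r->delta;
    return 4;
}

/* A reader can get the pcpu from the record itself, with no back-tracking. */
static uint32_t toy_rec_cpu(const uint32_t words[4])
{
    return words[1];
}
```

With the pcpu stored in the record, someone reading the trace does not have to walk back to the vcpu's last runstate change, at the cost of one extra word per record.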
Re: [Xen-devel] [PATCH v2 06/30] x86/paging: introduce paging_set_allocation
On Thu, Sep 29, 2016 at 10:12:12AM -0600, Jan Beulich wrote:
> >>> On 29.09.16 at 16:51, wrote:
> > On Thu, Sep 29, 2016 at 04:51:42AM -0600, Jan Beulich wrote:
> >> >>> On 27.09.16 at 17:57, wrote:
> >> > @@ -1383,15 +1382,25 @@ int __init construct_dom0(
> >> >                         nr_pages);
> >> >      }
> >> >
> >> > -    if ( is_pvh_domain(d) )
> >> > -        hap_set_alloc_for_pvh_dom0(d, dom0_paging_pages(d, nr_pages));
> >> > -
> >> >      /*
> >> >       * We enable paging mode again so guest_physmap_add_page will do the
> >> >       * right thing for us.
> >> >       */
> >>
> >> I'm afraid you render this comment stale - please adjust it accordingly.
> >
> > Not AFAICT, this comment is referring to the next line, which is:
> >
> > d->arch.paging.mode = save_pvh_pg_mode;
> >
> > The classic PVH domain builder contains quite a lot of craziness and
> > disables paging modes at certain points by playing with d->arch.paging.mode.
>
> Right, but your addition below that line now also depends on that
> restore, afaict.

Yes, that's completely right, sorry for overlooking it. I've expanded the comment to also reference paging_set_allocation (or else we would hit an assert).

Roger.
Re: [Xen-devel] [PATCH 07/24] xen: sched: don't rate limit context switches in case of yields
On Tue, 2016-09-20 at 14:32 +0100, George Dunlap wrote:
> On 17/08/16 18:18, Dario Faggioli wrote:
> > diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
> > @@ -1771,9 +1771,18 @@ csched_schedule(
> >       * cpu and steal it.
> >       */
> >
> > -    /* If we have schedule rate limiting enabled, check to see
> > -     * how long we've run for. */
> > -    if ( !tasklet_work_scheduled
> > +    /*
> > +     * If we have schedule rate limiting enabled, check to see
> > +     * how long we've run for.
> > +     *
> > +     * If scurr is yielding, however, we don't let rate limiting kick in.
> > +     * In fact, it may be the case that scurr is about to spin, and there's
> > +     * no point forcing it to do so until rate limiting expires.
> > +     *
> > +     * While there, take the chance for clearing the yield flag at once.
> > +     */
> > +    if ( !test_and_clear_bit(CSCHED_FLAG_VCPU_YIELD, >flags)
>
> It looks like YIELD is implemented by putting it lower in the runqueue
> in __runqueue_insert(). But here you're clearing the flag before the
> insert happens -- won't this effectively disable yield() for credit1?
>
Yes, I think you're right... I'm not sure how I thought it would work. :-O

Thanks for pointing this out, will fix in v2.

> > diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
> > index 569174b..c8e0ee7 100644
> > --- a/xen/common/sched_credit2.c
> > +++ b/xen/common/sched_credit2.c
> > @@ -2267,36 +2267,40 @@ runq_candidate(struct csched2_runqueue_data *rqd,
> > [...]
>
> This looks good, but the code re-organization probably goes better in
> the previous patch. Since you're re-sending anyway, would you mind
> moving it there?
>
> I'm not sure the credit2 yield-ratelimit needs to be in a separate
> patch; since you're implementing yield in credit2 from scratch you could
> just implement it all in one go. But since you have a patch for credit1
> anyway, I think whichever way is fine.
>
Ok, yes, I'm moving all the Credit2 bits from this patch to the one that actually implements yield in Credit2 from scratch, and leaving this as the patch that makes Credit1 stop ratelimiting upon yield.

Thanks and Regards,
Dario
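The ordering problem George points out can be shown with a toy runqueue. This is only a sketch under stated assumptions: the toy_* types, the TOY_FLAG_YIELD bit and the insertion policy (a yielding vcpu is placed behind one lower-priority entry) are invented for illustration and are not the Credit1 code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_FLAG_YIELD 0x1u   /* stand-in for CSCHED_FLAG_VCPU_YIELD */
#define TOY_RUNQ_MAX   8

struct toy_vcpu {
    int id;
    int pri;              /* higher value runs first */
    unsigned int flags;
};

struct toy_runq {
    struct toy_vcpu *v[TOY_RUNQ_MAX];
    size_t n;
};

/* Non-atomic stand-in for test_and_clear_bit(). */
static bool toy_test_and_clear(struct toy_vcpu *v, unsigned int flag)
{
    bool was_set = (v->flags & flag) != 0;

    v->flags &= ~flag;
    return was_set;
}

/*
 * Insert at the tail of the vcpu's priority class. If the vcpu is yielding,
 * additionally step behind one lower-priority entry, if there is one.
 * The yield flag is consumed here, so it must still be set on entry.
 */
static void toy_runq_insert(struct toy_runq *q, struct toy_vcpu *v)
{
    bool yielding = toy_test_and_clear(v, TOY_FLAG_YIELD);
    size_t pos = 0, i;

    assert(q->n < TOY_RUNQ_MAX);

    while ( pos < q->n && q->v[pos]->pri >= v->pri )
        pos++;
    if ( yielding && pos < q->n )
        pos++;

    for ( i = q->n; i > pos; i-- )
        q->v[i] = q->v[i - 1];
    q->v[pos] = v;
    q->n++;
}
```

If the scheduler clears the flag before calling the insert (as the ratelimit check in the quoted hunk would), the insert sees a non-yielding vcpu and places it ahead of the lower-priority entry, so the yield has no effect on queue position.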
[Xen-devel] qdevification of xen_disk
Hi Stefano and all,

while working on some part of the QEMU block layer infrastructure that requires going from a BlockBackend to the qdev DeviceState, I noticed that xen_disk is still not qdevified after all these years. It is the last device that is blocking the necessary changes in the block layer, and has been for a while.

Specifically, I'm talking about the blk_attach_dev_nofail() and blk_detach_dev() functions (and related ones, which don't seem to be used by xen_disk though), which must be converted from accepting a void* for the device to DeviceState*.

I think in theory it shouldn't be too hard to build a minimal qdev device between xen_disk and the block layer. If you want me to, I can try to do that - however, I don't have a Xen setup to actually test the result, so if things break, you get to keep the pieces. If someone else would like to look into this in the next few days, that might be the better option.

In any case, we need to do something to make xen_disk compatible with the modern (well, not _that_ modern any more) infrastructure that all other devices use.

Kevin
Re: [Xen-devel] [PATCH v2 08/30] xen/x86: do the PCI scan unconditionally
>>> On 29.09.16 at 17:11, wrote:
> On Thu, Sep 29, 2016 at 07:55:00AM -0600, Jan Beulich wrote:
>> >>> On 27.09.16 at 17:57, wrote:
>> > Instead of being tied to the presence of an IOMMU
>>
>> At the very least I'd expect the "why" aspect to get mentioned
>> here.
>
> TBH, it seems simpler to have it there rather than conditional to the
> presence of an IOMMU. Also, I think scan_pci_devices failing should be
> fatal. Would you be OK with me adding a panic if the scan fails?

Hmm, no, not really. We can do without knowing of any PCI devices right now, so I don't see why this should become a panic. All the more so since we don't do a complete scan anyway, so we know the information is going to be at best partial until Dom0 gives us a complete picture.

Jan
Re: [Xen-devel] [PATCH v2 06/30] x86/paging: introduce paging_set_allocation
>>> On 29.09.16 at 16:51, wrote:
> On Thu, Sep 29, 2016 at 04:51:42AM -0600, Jan Beulich wrote:
>> >>> On 27.09.16 at 17:57, wrote:
>> > @@ -1383,15 +1382,25 @@ int __init construct_dom0(
>> >                         nr_pages);
>> >      }
>> >
>> > -    if ( is_pvh_domain(d) )
>> > -        hap_set_alloc_for_pvh_dom0(d, dom0_paging_pages(d, nr_pages));
>> > -
>> >      /*
>> >       * We enable paging mode again so guest_physmap_add_page will do the
>> >       * right thing for us.
>> >       */
>>
>> I'm afraid you render this comment stale - please adjust it accordingly.
>
> Not AFAICT, this comment is referring to the next line, which is:
>
> d->arch.paging.mode = save_pvh_pg_mode;
>
> The classic PVH domain builder contains quite a lot of craziness and
> disables paging modes at certain points by playing with d->arch.paging.mode.

Right, but your addition below that line now also depends on that restore, afaict.

Jan
Re: [Xen-devel] [PATCH v2 04/30] xen/x86: allow calling {sh/hap}_set_allocation with the idle domain
>>> On 29.09.16 at 16:37, wrote:
> On Thu, Sep 29, 2016 at 04:43:07AM -0600, Jan Beulich wrote:
>> >>> On 27.09.16 at 17:56, wrote:
>> > --- a/xen/arch/x86/mm/hap/hap.c
>> > +++ b/xen/arch/x86/mm/hap/hap.c
>> > @@ -379,7 +379,9 @@ hap_set_allocation(struct domain *d, unsigned long pages, int *preempted)
>> >              break;
>> >
>> >          /* Check to see if we need to yield and try again */
>> > -        if ( preempted && hypercall_preempt_check() )
>> > +        if ( preempted &&
>> > +             (is_idle_vcpu(current) ? softirq_pending(smp_processor_id()) :
>> > +                                      hypercall_preempt_check()) )
>>
>> So what is the supposed action for the caller to take in this case?
>> I think this should at least be spelled out in the commit message.
>
> I'm not sure I follow, but I assume you mean in the idle vcpu case?

Yes. Right now it is expected to schedule a hypercall continuation.

> In that case the caller should call process_pending_softirqs. I will modify
> the commit message to include it.

Thanks.

Jan
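The caller contract discussed above (process softirqs in the idle vcpu case, set up a continuation in the hypercall case) can be sketched with a toy allocation loop. Everything here is a stand-in: struct toy_ctx and the toy_* functions model is_idle_vcpu(current), softirq_pending() and hypercall_preempt_check() as plain booleans and are not Xen APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of the state the real checks would query. */
struct toy_ctx {
    bool is_idle;            /* models is_idle_vcpu(current) */
    bool softirq_pending;    /* models softirq_pending(smp_processor_id()) */
    bool hypercall_preempt;  /* models hypercall_preempt_check() */
};

/* Same shape as the condition in the quoted hunk. */
static bool toy_should_yield(const struct toy_ctx *c)
{
    return c->is_idle ? c->softirq_pending : c->hypercall_preempt;
}

/*
 * Grow the pool one page at a time towards target. Returns 0 in both the
 * finished and the preempted case; *preempted tells the caller which one
 * happened. On preemption an idle-vcpu caller would process pending
 * softirqs and call again, a hypercall caller would create a continuation.
 */
static int toy_set_allocation(unsigned long *pool, unsigned long target,
                              const struct toy_ctx *c, bool *preempted)
{
    assert(pool != NULL && c != NULL);

    while ( *pool < target )
    {
        (*pool)++;

        if ( preempted && toy_should_yield(c) )
        {
            *preempted = true;
            return 0;
        }
    }

    return 0;
}
```

A caller running on the idle vcpu would loop: call toy_set_allocation(), and while *preempted is set, handle the pending work, clear the flag and call again until the target is reached.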
Re: [Xen-devel] [PATCH v2 03/30] xen/x86: fix parameters and return value of *_set_allocation functions
>>> On 29.09.16 at 16:33, wrote:
> On Thu, Sep 29, 2016 at 04:39:02AM -0600, Jan Beulich wrote:
>> >>> On 27.09.16 at 17:56, wrote:
>> > Return should be an int, and the number of pages should be an unsigned
>> > long.
>>
>> I can see the former, but why the latter? Acting on 32-bit quantities
>> is a little cheaper after all.
>
> This was requested by Andrew in the previous version:
>
> https://lists.xenproject.org/archives/html/xen-devel/2016-07/msg03126.html
>
> But yes, an unsigned int is enough to hold the maximum number of pages for
> a domain given that we support up to 1TB for HVM guests. Or maybe there are
> plans to increase the maximum supported memory per guest?

Larger guests or not - here we're talking about the number of pages used internally for page tables, not the number of pages assigned to the guest, aren't we? In any event I'm not sure which truncation issue Andrew had spotted back then.

Jan
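The arithmetic behind "an unsigned int is enough" can be checked directly. This is a small worked example, assuming 4 KiB pages and a 32-bit unsigned int; the TOY_PAGE_SHIFT name and the helpers are made up for the illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12   /* assumes 4 KiB pages */

/* Number of whole pages in a given amount of memory. */
static uint64_t toy_bytes_to_pages(uint64_t bytes)
{
    return bytes >> TOY_PAGE_SHIFT;
}

/* Nonzero if the page count can be stored in an unsigned int. */
static int toy_fits_unsigned_int(uint64_t pages)
{
    return pages <= UINT_MAX;
}
```

For 1 TiB, 2^40 / 2^12 = 2^28 pages, which is well below 2^32. A 32-bit count would only overflow at 16 TiB (2^44 bytes), and as Jan notes the value in question counts pages used for page tables, which is smaller still.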
Re: [Xen-devel] [PATCH v2 02/30] xen/x86: remove XENFEAT_hvm_pirqs for PVHv2 guests
>>> On 29.09.16 at 16:17, wrote:
> On Wed, Sep 28, 2016 at 10:03:21AM -0600, Jan Beulich wrote:
>> >>> On 27.09.16 at 17:56, wrote:
>> > On PVHv2 guests we explicitly want to use native methods for routing
>> > interrupts.
>> >
>> > Introduce a new XEN_X86_EMU_USE_PIRQ to notify Xen whether a HVM guest can
>> > route physical interrupts (even from emulated devices) over event channels.
>>
>> So you specifically want this new flag off for PVHv2 guests? Based on
>> just this description I did get the opposite impression...
>
> Yes, that's right, I don't want PVHv2 guests to know anything about PIRQs. I
> don't really know how to reword this, what about:
>
> "PVHv2 guests, unlike HVM guests, won't have the option to route interrupts
> from physical or emulated devices over event channels using PIRQs. This
> applies to both DomU and Dom0 PVHv2 guests.
>
> Introduce a new XEN_X86_EMU_USE_PIRQ to notify Xen whether a HVM guest can
> route physical interrupts (even from emulated devices) over event channels,
> and is thus allowed to use some of the PHYSDEV ops."

SGTM

Jan
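As a rough sketch of how a per-domain emulation flag could gate the PIRQ-related operations described in the reworded commit message: the bit positions, the TOY_* names and toy_physdev_op() below are hypothetical and do not reflect the actual XEN_X86_EMU_* values or the PHYSDEV op handlers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits; the real XEN_X86_EMU_* values may differ. */
#define TOY_EMU_LAPIC     (1u << 0)
#define TOY_EMU_USE_PIRQ  (1u << 1)

#define TOY_ENOSYS 38   /* local stand-in for an "operation not available" code */

static bool toy_may_use_pirq(unsigned int emulation_flags)
{
    return (emulation_flags & TOY_EMU_USE_PIRQ) != 0;
}

/*
 * Toy PIRQ-mapping entry point: refuse the operation unless the domain was
 * created with the USE_PIRQ flag, as would be the case for a classic HVM
 * guest but not for a PVHv2 guest.
 */
static int toy_physdev_op(unsigned int emulation_flags)
{
    if ( !toy_may_use_pirq(emulation_flags) )
        return -TOY_ENOSYS;

    return 0;
}
```

In this model an HVM guest would be created with the flag set and a PVHv2 guest (DomU or Dom0) without it, so the latter never sees PIRQs.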
Re: [Xen-devel] [PATCH 06/24] xen: credit2: implement yield()
On Tue, 2016-09-13 at 14:33 +0100, George Dunlap wrote:
> On 17/08/16 18:18, Dario Faggioli wrote:
> > Alternatively, we can actually _subtract_ some credits to a
> > yielding vcpu.
> > That will sort of make the effect of a call to yield last in time.
>
> But normally we want the yield to be temporary, right? The kinds of
> places it typically gets called is when the vcpu is waiting for a
> spinlock held by another (probably pre-empted) vcpu. Doing a permanent
> credit subtraction will bias the credit algorithm against cpus that have
> a high amount of spinlock contention (since probably all the vcpus will
> be calling yield pretty regularly)
>
Yes, indeed. Good point, actually. However, one can also think of a scenario where:
 - A yields, and is descheduled in favour of B, as a consequence of that
 - B runs for just a little while and blocks
 - C and A are in runqueue, and A, without counting the idle bias, has more credit than C. So A will be picked up again, even if it yielded very recently, and it may still be in the spinlock wait (or whatever place that is yielding in a tight loop)

Well, in this case, A will yield again, and C will be picked, i.e., what would have happened in the first place, if we had subtracted credits from A. (I.e., functionally, this would work the same way, but with more overhead.)

So, again, can this happen? How frequently, both in absolute and relative terms? Very hard to tell! So, really...

> Yes, this is simple and should be effective for now. We can look at
> improving it later.
>
...glad you also think this. Let's go for it. :-)

> > diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
> > @@ -1389,6 +1389,16 @@ Choose the default scheduler.
> >  ### sched\_credit2\_migrate\_resist
> >  > `= `
> >
> > +### sched\_credit2\_yield\_bias
> > +> `= `
> > +
> > +> Default: `1000`
> > +
> > +Set how much a yielding vcpu will be penalized, in order to actually
> > +give a chance to run to some other vcpu. This is basically a bias, in
> > +favour of the non-yielding vcpus, expressed in microseconds (default
> > +is 1ms).
>
> Probably add _us to the end to indicate that the number is in
> microseconds.
>
Good idea, although right now we have "sched_credit2_migrate_resist", which does not have the suffix. Still, I'm doing as you suggest because I like it better, and we'll fix "migrate_resist" later, if we want consistency.

> > @@ -2247,10 +2267,22 @@ runq_candidate(struct csched2_runqueue_data *rqd,
> >      struct list_head *iter;
> >      struct csched2_vcpu *snext = NULL;
> >      struct csched2_private *prv = CSCHED2_PRIV(per_cpu(scheduler, cpu));
> > +    int yield_bias = 0;
> >
> >      /* Default to current if runnable, idle otherwise */
> >      if ( vcpu_runnable(scurr->vcpu) )
> > +    {
> > +        /*
> > +         * The way we actually take yields into account is like this:
> > +         * if scurr is yielding, when comparing its credits with other
> > +         * vcpus in the runqueue, act like those other vcpus had yield_bias
> > +         * more credits.
> > +         */
> > +        if ( unlikely(scurr->flags & CSFLAG_vcpu_yield) )
> > +            yield_bias = CSCHED2_YIELD_BIAS;
> > +
> >          snext = scurr;
> > +    }
> >      else
> >          snext = CSCHED2_VCPU(idle_vcpu[cpu]);
> >
> > @@ -2268,6 +2300,7 @@ runq_candidate(struct csched2_runqueue_data *rqd,
> >      list_for_each( iter, >runq )
> >      {
> >          struct csched2_vcpu * svc = list_entry(iter, struct csched2_vcpu, runq_elem);
> > +        int svc_credit = svc->credit + yield_bias;
>
> Just curious, why did you decide to add yield_bias to everyone else,
> rather than just subtracting it from snext->credit?
>
I honestly don't recall. :-) It indeed feels more natural to subtract it from snext. I've done it that way now, let me give it a test spin and resend...

> > @@ -2918,6 +2957,14 @@ csched2_init(struct scheduler *ops)
> >      printk(XENLOG_INFO "load tracking window lenght %llu ns\n",
> >             1ULL << opt_load_window_shift);
> >
> > +    if ( opt_yield_bias < CSCHED2_YIELD_BIAS_MIN )
> > +    {
> > +        printk("WARNING: %s: opt_yield_bias %d too small, resetting\n",
> > +               __func__, opt_yield_bias);
> > +        opt_yield_bias = 1000; /* 1 ms */
> > +    }
>
> Why do we need a minimum bias? And why reset it to 1ms rather than
> SCHED2_YIELD_BIAS_MIN?
>
You know what, I don't think we need that. I was probably thinking that we may always want to force yield to have _some_ effect, but there may be (or will be) someone who just wants to disable it altogether... And in that case, this check would get in the way. I'll kill it.

Thanks and regards,
Dario
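The "subtract from snext" variant agreed on above can be sketched as follows. This is a toy model, not the Credit2 code: the toy_vcpu type, the array-based runqueue and the full scan are assumptions for illustration (the real runq_candidate() walks a sorted list and also handles affinity, migration resistance and the idle vcpu).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_vcpu {
    int id;
    int credit;
    bool yielding;
};

/*
 * Pick what runs next: the current vcpu, unless a queued vcpu has more
 * credit. If the current vcpu is yielding, its credit is treated as
 * yield_bias lower for this one comparison; nothing is written back, so
 * the penalty does not outlive the scheduling decision.
 */
static const struct toy_vcpu *
toy_runq_candidate(const struct toy_vcpu *scurr,
                   const struct toy_vcpu *runq, size_t n, int yield_bias)
{
    const struct toy_vcpu *snext = scurr;
    int snext_credit;
    size_t i;

    assert(scurr != NULL && (runq != NULL || n == 0));

    snext_credit = scurr->credit - (scurr->yielding ? yield_bias : 0);

    for ( i = 0; i < n; i++ )
    {
        if ( runq[i].credit > snext_credit )
        {
            snext = &runq[i];
            snext_credit = runq[i].credit;
        }
    }

    return snext;
}
```

Because the bias is applied only while choosing, a vcpu that yields often is not permanently charged for it, which addresses George's concern about biasing against spinlock-heavy guests. Passing a yield_bias of 0 disables the effect entirely, which is the case the removed minimum check would have prevented.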
[Xen-devel] [ovmf test] 101206: all pass - PUSHED
flight 101206 ovmf real [real] http://logs.test-lab.xenproject.org/osstest/logs/101206/ Perfect :-) All tests in this flight passed as required version targeted for testing: ovmf 84bc72fb7ddaa26105bfe5bf36115099da1e60b1 baseline version: ovmf edb0fda25ea9b2ef73db18bf5cf0798340209f28 Last test of basis 101199 2016-09-29 02:49:33 Z0 days Testing same since 101206 2016-09-29 10:45:18 Z0 days1 attempts People who touched revisions under test: Ruiyu Nijobs: build-amd64-xsm pass build-i386-xsm pass build-amd64 pass build-i386 pass build-amd64-libvirt pass build-i386-libvirt pass build-amd64-pvopspass build-i386-pvops pass test-amd64-amd64-xl-qemuu-ovmf-amd64 pass test-amd64-i386-xl-qemuu-ovmf-amd64 pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : + branch=ovmf + revision=84bc72fb7ddaa26105bfe5bf36115099da1e60b1 + . ./cri-lock-repos ++ . ./cri-common +++ . ./cri-getconfig +++ umask 002 +++ getrepos getconfig Repos perl -e ' use Osstest; readglobalconfig(); print $c{"Repos"} or die $!; ' +++ local repos=/home/osstest/repos +++ '[' -z /home/osstest/repos ']' +++ '[' '!' -d /home/osstest/repos ']' +++ echo /home/osstest/repos ++ repos=/home/osstest/repos ++ repos_lock=/home/osstest/repos/lock ++ '[' x '!=' x/home/osstest/repos/lock ']' ++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock ++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push ovmf 84bc72fb7ddaa26105bfe5bf36115099da1e60b1 + branch=ovmf + revision=84bc72fb7ddaa26105bfe5bf36115099da1e60b1 + . ./cri-lock-repos ++ . ./cri-common +++ . 
./cri-getconfig +++ umask 002 +++ getrepos getconfig Repos perl -e ' use Osstest; readglobalconfig(); print $c{"Repos"} or die $!; ' +++ local repos=/home/osstest/repos +++ '[' -z /home/osstest/repos ']' +++ '[' '!' -d /home/osstest/repos ']' +++ echo /home/osstest/repos ++ repos=/home/osstest/repos ++ repos_lock=/home/osstest/repos/lock ++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']' + . ./cri-common ++ . ./cri-getconfig ++ umask 002 + select_xenbranch + case "$branch" in + tree=ovmf + xenbranch=xen-unstable + '[' xovmf = xlinux ']' + linuxbranch= + '[' x = x ']' + qemuubranch=qemu-upstream-unstable + select_prevxenbranch ++ ./cri-getprevxenbranch xen-unstable + prevxenbranch=xen-4.7-testing + '[' x84bc72fb7ddaa26105bfe5bf36115099da1e60b1 = x ']' + : tested/2.6.39.x + . ./ap-common ++ : osst...@xenbits.xen.org +++ getconfig OsstestUpstream +++ perl -e ' use Osstest; readglobalconfig(); print $c{"OsstestUpstream"} or die $!; ' ++ : ++ : git://xenbits.xen.org/xen.git ++ : osst...@xenbits.xen.org:/home/xen/git/xen.git ++ : git://xenbits.xen.org/qemu-xen-traditional.git ++ : git://git.kernel.org ++ : git://git.kernel.org/pub/scm/linux/kernel/git ++ : git ++ : git://xenbits.xen.org/xtf.git ++ : osst...@xenbits.xen.org:/home/xen/git/xtf.git ++ : git://xenbits.xen.org/xtf.git ++ : git://xenbits.xen.org/libvirt.git ++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git ++ : git://xenbits.xen.org/libvirt.git ++ : git://xenbits.xen.org/osstest/rumprun.git ++ : git ++ : git://xenbits.xen.org/osstest/rumprun.git ++ : osst...@xenbits.xen.org:/home/xen/git/osstest/rumprun.git ++ : git://git.seabios.org/seabios.git ++ : osst...@xenbits.xen.org:/home/xen/git/osstest/seabios.git ++ : git://xenbits.xen.org/osstest/seabios.git ++ : https://github.com/tianocore/edk2.git ++ : osst...@xenbits.xen.org:/home/xen/git/osstest/ovmf.git ++ : git://xenbits.xen.org/osstest/ovmf.git ++ : git://xenbits.xen.org/osstest/linux-firmware.git ++ : 
osst...@xenbits.xen.org:/home/osstest/ext/linux-firmware.git ++ : git://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git ++ :
Re: [Xen-devel] [PATCH v2 07/30] xen/x86: split the setup of Dom0 permissions to a function
On Thu, Sep 29, 2016 at 07:47:22AM -0600, Jan Beulich wrote:
> >>> On 27.09.16 at 17:57, wrote:
> > So that it can also be used by the PVH-specific domain builder. This is just
> > code motion, it should not introduce any functional change.
> >
> > Signed-off-by: Roger Pau Monné
>
> Looks generally okay, but please do minor style corrections as you
> move code:
>
> > --- a/xen/arch/x86/domain_build.c
> > +++ b/xen/arch/x86/domain_build.c
> > @@ -869,6 +869,89 @@ static __init void setup_pv_physmap(struct domain *d, unsigned long pgtbl_pfn,
> >      unmap_domain_page(l4start);
> >  }
> >
> > +static int __init setup_permissions(struct domain *d)
> > +{
> > +    unsigned long mfn;
> > +    int i, rc = 0;
>
> i should be unsigned int, and the initializer of rc could be avoided.

Done (I've also converted the first assignment to rc below from |= to =).

> > +    /* The hardware domain is initially permitted full I/O capabilities. */
> > +    rc |= ioports_permit_access(d, 0, 0x);
> > +    rc |= iomem_permit_access(d, 0UL, (1UL << (paddr_bits - PAGE_SHIFT)) - 1);
> > +    rc |= irqs_permit_access(d, 1, nr_irqs_gsi - 1);
> > +
> > +    /*
> > +     * Modify I/O port access permissions.
> > +     */
>
> This is a single line comment - I understand it's trying to be more of a
> separator than the others, but I'd prefer for it to do so by being
> followed by a blank line.
>
> > +    /* Master Interrupt Controller (PIC). */
> > +    rc |= ioports_deny_access(d, 0x20, 0x21);
> > +    /* Slave Interrupt Controller (PIC). */
> > +    rc |= ioports_deny_access(d, 0xA0, 0xA1);
> > +    /* Interval Timer (PIT). */
> > +    rc |= ioports_deny_access(d, 0x40, 0x43);
> > +    /* PIT Channel 2 / PC Speaker Control. */
> > +    rc |= ioports_deny_access(d, 0x61, 0x61);
> > +    /* ACPI PM Timer. */
> > +    if ( pmtmr_ioport )
> > +        rc |= ioports_deny_access(d, pmtmr_ioport, pmtmr_ioport + 3);
> > +    /* PCI configuration space (NB. 0xcf8 has special treatment). */
> > +    rc |= ioports_deny_access(d, 0xcfc, 0xcff);
> > +    /* Command-line I/O ranges. */
> > +    process_dom0_ioports_disable(d);
> > +
> > +    /*
> > +     * Modify I/O memory access permissions.
> > +     */
>
> Ditto.
>
> > -    BUG_ON(rc != 0);
> > +    rc = setup_permissions(d);
> > +    if ( rc != 0 )
> > +        panic("Failed to setup Dom0 permissions");
>
> To be honest, I'm not sure of this BUG_ON() -> panic() conversion.
> I think I'd prefer it to stay the way it was. We're not really expecting
> for any of this to fail anyway.
>
> Jan

Done, fixed all the above, thanks.

Roger.
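The rc handling under review (first call assigned with =, later calls folded in with |=) behaves as in this small sketch. The toy_accumulate_rc() helper and the sample error values are invented for illustration; they are not the Xen permission functions.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Fold a sequence of return codes (0 on success, negative on error) the way
 * the quoted function does: the first one is assigned, the rest are OR-ed
 * in, so no initializer for rc is needed.
 */
static int toy_accumulate_rc(const int *results, size_t n)
{
    int rc;
    size_t i;

    assert(results != NULL && n > 0);

    rc = results[0];
    for ( i = 1; i < n; i++ )
        rc |= results[i];

    /*
     * rc is nonzero iff at least one call failed. It is not necessarily a
     * meaningful error code: on two's complement machines -12 | -22 is -2.
     */
    return rc;
}
```

This is why the result is only ever compared against zero (BUG_ON(rc != 0) in the original): the OR-ed value says that something failed, not what.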
[Xen-devel] [qemu-mainline test] 101203: tolerable FAIL - PUSHED
flight 101203 qemu-mainline real [real] http://logs.test-lab.xenproject.org/osstest/logs/101203/ Failures :-/ but no regressions. Regressions which are regarded as allowable (not blocking): test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail like 101186 test-amd64-amd64-xl-rtds 9 debian-install fail like 101186 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stopfail like 101186 Tests which did not succeed, but are not blocking: test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt-xsm 14 guest-saverestorefail never pass test-armhf-armhf-libvirt-raw 11 migrate-support-checkfail never pass test-armhf-armhf-libvirt-raw 13 guest-saverestorefail never pass test-armhf-armhf-xl-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-xl-xsm 13 saverestore-support-checkfail never pass test-armhf-armhf-libvirt 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt 14 guest-saverestorefail never pass test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-vhd 11 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 12 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 12 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 13 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-qcow2 11 migrate-support-checkfail never pass test-armhf-armhf-libvirt-qcow2 13 guest-saverestorefail never pass test-armhf-armhf-xl-rtds 12 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 13 saverestore-support-checkfail never pass test-amd64-i386-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-amd64-i386-libvirt 12 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 13 
saverestore-support-checkfail never pass test-armhf-armhf-xl-arndale 12 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 13 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-armhf-armhf-xl 12 migrate-support-checkfail never pass test-armhf-armhf-xl 13 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail never pass test-amd64-amd64-libvirt 12 migrate-support-checkfail never pass test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-amd64-xl-pvh-amd 11 guest-start fail never pass test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2 fail never pass test-amd64-amd64-xl-pvh-intel 11 guest-start fail never pass version targeted for testing: qemuuc640f2849ee8775fe1bbd7a2772610aa77816f9f baseline version: qemuu4a58f35b793d5d09d6cef921bf6ed7ffc39669fd Last test of basis 101186 2016-09-28 12:16:03 Z1 days Failing since101192 2016-09-28 17:50:33 Z0 days2 attempts Testing same since 101203 2016-09-29 06:01:40 Z0 days1 attempts People who touched revisions under test: Alex BennéeAnthony PERARD David Gibson David Gibson (ppc parts) Fam Zheng Felipe Franciosi Gerd Hoffmann Hervé Poussineau Laurent Vivier Lin Ma Marc-André Lureau Paolo Bonzini Paulina Szubarczyk Pavel Dovgalyuk Peter Maydell Peter Xu Roger Pau Monné Sergey Fedorov Sergey Fedorov Stefan Hajnoczi Yaowei Bai jobs: build-amd64-xsm pass build-armhf-xsm pass build-i386-xsm pass build-amd64 pass build-armhf pass build-i386
Re: [Xen-devel] [PATCH 2/2] xen: add qemu device for each pvusb backend
Hi,

> > Hmm, I think the xen core needs better QOM support ...
> >
> > struct XenDevice should have a DeviceState element, so it can be used as
> > device object directly instead of attaching a device object like
> > this ...
>
> Hmm, interesting idea. The device object could even be added in
> Xen common code if the backend is indicating the need for it via a
> special flag/field. I'll have a try.

No, not optional. Just turn *all* xen devices into QOM objects.

XenDevice should probably be a subclass of the base device object (DeviceState), and all Xen backends (block, net, fb, pvusb, ...) should be subclasses of XenDevice. The latter is probably how things are modeled already, just the QOM object stuff is missing (register classes, macros to cast objects, ...) because qdev (the QOM predecessor) didn't have that.

Once this is in place you can simply use DEVICE(xendevice) to get the DeviceState pointer.

cheers,
Gerd
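The layering Gerd describes (DeviceState as the base, XenDevice as a subclass, each backend as a subclass of XenDevice) relies on embedding the parent struct as the first member. The sketch below shows only that C idiom with made-up Toy* types and an unchecked cast macro; real QOM additionally registers types and performs checked casts, which is not modeled here.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for the base device object. */
typedef struct ToyDeviceState {
    const char *id;
} ToyDeviceState;

/* "Subclass": the parent is embedded as the first member. */
typedef struct ToyXenDevice {
    ToyDeviceState qdev;
    int dom;
    int dev;
} ToyXenDevice;

/* A backend is in turn a subclass of the Xen device. */
typedef struct ToyXenBlkDev {
    ToyXenDevice xendev;
    unsigned int sector_size;
} ToyXenBlkDev;

/* Unchecked upcast; valid because the base sits at offset 0. */
#define TOY_DEVICE(obj) ((ToyDeviceState *)(obj))

/* Downcast from an embedded member back to its containing struct. */
#define TOY_CONTAINER_OF(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

static ToyXenDevice *toy_xen_device(ToyDeviceState *ds)
{
    assert(ds != NULL);
    return TOY_CONTAINER_OF(ds, ToyXenDevice, qdev);
}
```

With this layout, code that only knows about the base type (for example a block layer function that takes a device pointer) can be handed TOY_DEVICE(&blkdev) instead of a void*, and the backend can recover its own state from that pointer.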
Re: [Xen-devel] [PATCH v2 08/30] xen/x86: do the PCI scan unconditionally
On Thu, Sep 29, 2016 at 07:55:00AM -0600, Jan Beulich wrote:
> >>> On 27.09.16 at 17:57, wrote:
> > Instead of being tied to the presence of an IOMMU
>
> At the very least I'd expect the "why" aspect to get mentioned
> here.

TBH, it seems simpler to have it there rather than conditional on the presence of an IOMMU. Also, I think scan_pci_devices failing should be fatal. Would you be OK with me adding a panic if the scan fails?

> > --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
> > +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
> > @@ -219,7 +219,8 @@ int __init amd_iov_detect(void)
> >
> >      if ( !amd_iommu_perdev_intremap )
> >          printk(XENLOG_WARNING "AMD-Vi: Using global interrupt remap table is not recommended (see XSA-36)!\n");
> > -    return scan_pci_devices();
> > +
> > +    return 0;
> >  }
>
> Note how an error return from the function at least here does not
> get ignored, leading to the IOMMU not getting enabled. This behavior
> should be preserved, and I think it actually should extend to VT-d.
> Furthermore iiuc you make PVHv2 Dom0 setup depend on this
> succeeding, which may then require further propagation of the
> success status here (or maybe, just like PVHv1, you require a
> functional IOMMU, in which case failing IOMMU setup would be
> sufficient).

Yes, PVHv2, just like PVHv1, requires a functional IOMMU, see check_hwdom_reqs in passthrough/iommu.c (if the hardware domain is using a translated paging mode an IOMMU is required).

Roger.
[Xen-devel] [xen-unstable test] 101202: regressions - FAIL
flight 101202 xen-unstable real [real] http://logs.test-lab.xenproject.org/osstest/logs/101202/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-armhf-armhf-xl-credit2 15 guest-start/debian.repeat fail REGR. vs. 101182 Regressions which are regarded as allowable (not blocking): test-amd64-amd64-xl-rtds 9 debian-install fail blocked in 101182 test-armhf-armhf-xl-rtds 15 guest-start/debian.repeatfail like 101182 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop fail like 101182 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail like 101182 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stopfail like 101182 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stopfail like 101182 Tests which did not succeed, but are not blocking: test-amd64-i386-rumprun-i386 1 build-check(1) blocked n/a test-amd64-amd64-rumprun-amd64 1 build-check(1) blocked n/a build-amd64-rumprun 5 rumprun-buildfail never pass test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt-xsm 14 guest-saverestorefail never pass build-i386-rumprun5 rumprun-buildfail never pass test-armhf-armhf-libvirt 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt 14 guest-saverestorefail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-armhf-armhf-libvirt-qcow2 11 migrate-support-checkfail never pass test-armhf-armhf-libvirt-qcow2 13 guest-saverestorefail never pass test-armhf-armhf-xl 12 migrate-support-checkfail never pass test-armhf-armhf-xl 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 12 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-vhd 11 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 12 saverestore-support-checkfail never pass test-armhf-armhf-xl-arndale 12 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 13 
saverestore-support-checkfail never pass test-armhf-armhf-libvirt-raw 11 migrate-support-checkfail never pass test-armhf-armhf-libvirt-raw 13 guest-saverestorefail never pass test-armhf-armhf-xl-rtds 12 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-xl-xsm 13 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail never pass test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-amd64-amd64-libvirt 12 migrate-support-checkfail never pass test-amd64-amd64-xl-pvh-amd 11 guest-start fail never pass test-amd64-i386-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-i386-libvirt 12 migrate-support-checkfail never pass test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2 fail never pass test-amd64-amd64-xl-pvh-intel 11 guest-start fail never pass version targeted for testing: xen a6f681cf110a1833cfe46d514cb8824e9b63fe22 baseline version: xen 1e75ed8b64bc1a9b47e540e6f100f17ec6d97f1b Last test of basis 101182 2016-09-28 09:43:21 Z1 days Failing since101190 2016-09-28 16:15:31 Z0 days2 attempts Testing same since 101202 2016-09-29 04:38:39 Z0 days1 attempts People who touched revisions under test: Boris OstrovskyDaniel Kiper Dario Faggioli Ian Jackson Jan Beulich Jan Beulich [for non-ARM parts] Jan Beulich [non-arm parts] Juergen Gross Julien Grall Keir Fraser Kevin Tian Konrad Rzeszutek Wilk Konrad Rzeszutek Wilk [for Oracle, VirtualIron and
[Xen-devel] [libvirt test] 101200: tolerable FAIL - PUSHED
flight 101200 libvirt real [real]
http://logs.test-lab.xenproject.org/osstest/logs/101200/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt-xsm 12 migrate-support-check fail never pass
 test-armhf-armhf-libvirt-xsm 14 guest-saverestore fail never pass
 test-armhf-armhf-libvirt-qcow2 11 migrate-support-check fail never pass
 test-armhf-armhf-libvirt-qcow2 13 guest-saverestore fail never pass
 test-amd64-amd64-libvirt 12 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-check fail never pass
 test-amd64-i386-libvirt 12 migrate-support-check fail never pass
 test-amd64-i386-libvirt-xsm 12 migrate-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-armhf-armhf-libvirt-raw 11 migrate-support-check fail never pass
 test-armhf-armhf-libvirt-raw 13 guest-saverestore fail never pass
 test-armhf-armhf-libvirt 12 migrate-support-check fail never pass
 test-armhf-armhf-libvirt 14 guest-saverestore fail never pass

version targeted for testing:
 libvirt f4332dd2d8535b10eef8bce14bf7dc61200495f1
baseline version:
 libvirt bff2f781ab9d687b57c62e79aedf02a2cee2b77c

Last test of basis 101176 2016-09-28 04:23:14 Z 1 days
Testing same since 101200 2016-09-29 04:30:48 Z 0 days 1 attempts

People who touched revisions under test:
 Jim Fehlig

jobs:
 build-amd64-xsm pass
 build-armhf-xsm pass
 build-i386-xsm pass
 build-amd64 pass
 build-armhf pass
 build-i386 pass
 build-amd64-libvirt pass
 build-armhf-libvirt pass
 build-i386-libvirt pass
 build-amd64-pvops pass
 build-armhf-pvops pass
 build-i386-pvops pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm pass
 test-amd64-amd64-libvirt-xsm pass
 test-armhf-armhf-libvirt-xsm fail
 test-amd64-i386-libvirt-xsm pass
 test-amd64-amd64-libvirt pass
 test-armhf-armhf-libvirt fail
 test-amd64-i386-libvirt pass
 test-amd64-amd64-libvirt-pair pass
 test-amd64-i386-libvirt-pair pass
 test-armhf-armhf-libvirt-qcow2 fail
 test-armhf-armhf-libvirt-raw fail
 test-amd64-amd64-libvirt-vhd pass

sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
 http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
 http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
 http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
 http://xenbits.xen.org/gitweb?p=osstest.git;a=summary

Pushing revision :

+ branch=libvirt
+ revision=f4332dd2d8535b10eef8bce14bf7dc61200495f1
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos getconfig Repos perl -e ' use Osstest; readglobalconfig(); print $c{"Repos"} or die $!; '
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x '!=' x/home/osstest/repos/lock ']'
++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock
++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push libvirt f4332dd2d8535b10eef8bce14bf7dc61200495f1
+ branch=libvirt
+
Re: [Xen-devel] [PATCH 2/2] xen: add qemu device for each pvusb backend
On 27/09/16 11:08, Gerd Hoffmann wrote:
> Hi,
>
>> struct usbback_info {
>> struct XenDevice xendev; /* must be first */
>> +char id[24];
>> +struct USBBACKDevice *dev;
>> USBBus bus;
>> void *urb_sring;
>> void *conn_sring;
>> @@ -116,6 +124,10 @@ struct usbback_info {
>> QEMUBH *bh;
>> };
>>
>> +typedef struct USBBACKDevice {
>> +DeviceState qdev;
>> +} USBBACKDevice;
>
> Hmm, I think the xen core needs better QOM support ...
>
> struct XenDevice should have a DeviceState element, so it can be used as
> device object directly instead of attaching a device object like
> this ...

Hmm, interesting idea. The device object could even be added in Xen
common code if the backend is indicating the need for it via a special
flag/field.

I'll have a try.

Juergen

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
Re: [Xen-devel] [Qemu-devel] [PATCH 2/2] xen: add qemu device for each pvusb backend
On 27/09/16 11:00, Daniel P. Berrange wrote:
> On Mon, Sep 26, 2016 at 02:43:57PM +0200, Juergen Gross wrote:
>> In order to be able to specify to which pvusb controller a new pvusb
>> device should be added we need a qemu device for each pvusb controller
>> with an associated id.
>>
>> Add such a device when a new controller is requested and attach the
>> usb bus of that controller to the new device. Any device connected to
>> that controller can now specify the bus and port directly via its
>> properties.
>>
>> Signed-off-by: Juergen Gross
>> ---
>> hw/usb/xen-usb.c | 81 +++-
>> 1 file changed, 68 insertions(+), 13 deletions(-)
>>
>> @@ -733,10 +740,10 @@ static void usbback_portid_add(struct usbback_info *usbif, unsigned port,
>> {
>> unsigned speed;
>> char *portname;
>> -USBPort *p;
>> Error *local_err = NULL;
>> QDict *qdict;
>> QemuOpts *opts;
>> +char tmp[32];
>>
>> if (usbif->ports[port - 1].dev) {
>> return;
>> @@ -749,11 +756,14 @@ static void usbback_portid_add(struct usbback_info *usbif, unsigned port,
>> return;
>> }
>> portname++;
>> -p = &(usbif->ports[port - 1].port);
>> -snprintf(p->path, sizeof(p->path), "%s", portname);
>>
>> qdict = qdict_new();
>> qdict_put(qdict, "driver", qstring_from_str("usb-host"));
>> +snprintf(tmp, sizeof(tmp), "%s.0", usbif->id);
>
> Don't snprintf into fixed length buffers. g_strdup_printf() does the
> right thing

Okay, will change it.

Juergen
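To illustrate the review comment about fixed-length buffers, here is a minimal sketch of what an allocating formatter does, written against the C standard library only. The helper name dup_printf() is hypothetical and merely stands in for GLib's g_strdup_printf(); in QEMU itself one would presumably call g_strdup_printf() and release the result with g_free() instead.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for g_strdup_printf(): format into a heap buffer
 * that is sized to fit the result, so nothing is silently truncated.
 * Returns NULL on failure; the caller releases the string with free().
 */
char *dup_printf(const char *fmt, ...)
{
    va_list ap, ap2;
    int len;
    char *buf;

    va_start(ap, fmt);
    va_copy(ap2, ap);

    /* First pass: a NULL buffer of size 0 only measures the output. */
    len = vsnprintf(NULL, 0, fmt, ap);
    va_end(ap);
    if (len < 0) {
        va_end(ap2);
        return NULL;
    }

    /* Second pass: format into a buffer of exactly len + 1 bytes. */
    buf = malloc((size_t)len + 1);
    if (buf != NULL) {
        vsnprintf(buf, (size_t)len + 1, fmt, ap2);
    }
    va_end(ap2);
    return buf;
}
```

With a helper like this, the "%s.0" bus name from the patch would no longer depend on the 32-byte size of tmp; an id of any length is formatted in full.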
Re: [Xen-devel] [PATCH 1/2] xen: add an own bus for xen backend devices
On 27/09/16 10:53, Gerd Hoffmann wrote:
> On Mo, 2016-09-26 at 14:43 +0200, Juergen Gross wrote:
>> Add a bus for Xen backend devices in order to be able to establish a
>> dedicated device path for pluggable devices.
>
> Looks sane to me. Can take this through the usb queue if I get an ack
> from xen.
>
>> +#define TYPE_XENSYSDEV "xensysdev"
>> +#define TYPE_XENSYSBUS "xen-sysbus"
>
> I'd make this consistent and use the dash either for both or not at all.

Okay, I'll change it to use the dash in both cases.

Juergen
Re: [Xen-devel] [PATCH 0/2] Xen pvUSB correction
On 27/09/16 10:51, Gerd Hoffmann wrote:
> On Mo, 2016-09-26 at 14:43 +0200, Juergen Gross wrote:
>> Trying to use pvUSB in a Xen guest with a qemu emulated USB controller
>> will crash qemu as it tries to attach a pvUSB device to the emulated
>> controller.
>
> Hmm. --verbose please.
>
> While this clearly doesn't do what you intended I think it should not
> have crashed qemu. pvUSB devices should work on emulated controller
> (and emulated devices should work on the pvUSB controller). If they
> don't you're probably taking shortcuts somewhere which work only for the
> pvUSB device on pvUSB controller case.

Of course a pvUSB device connected by the pvUSB controller is expecting
to be on that controller when doing I/Os. I believe this was the problem
here: The device was attached to an emulated USB controller and the
pvUSB controller started an I/O which confused the emulated one.

> Please check.

There is something wrong, sure. A pvUSB device ending up on the wrong
controller should never receive I/Os from the pvUSB controller. I'll
check that.

But this problem is independent of the one solved by these patches:
I have to make sure the device is connected to the pvUSB controller,
otherwise the guest won't be able to access it the way it was meant
to.

Juergen
Re: [Xen-devel] [PATCH v2 04/30] xen/x86: allow calling {sh/hap}_set_allocation with the idle domain
On Thu, Sep 29, 2016 at 04:43:07AM -0600, Jan Beulich wrote:
> >>> On 27.09.16 at 17:56, wrote:
> > --- a/xen/arch/x86/mm/hap/hap.c
> > +++ b/xen/arch/x86/mm/hap/hap.c
> > @@ -379,7 +379,9 @@ hap_set_allocation(struct domain *d, unsigned long pages, int *preempted)
> > break;
> >
> > /* Check to see if we need to yield and try again */
> > -if ( preempted && hypercall_preempt_check() )
> > +if ( preempted &&
> > + (is_idle_vcpu(current) ? softirq_pending(smp_processor_id()) :
> > + hypercall_preempt_check()) )
>
> So what is the supposed action for the caller to take in this case?
> I think this should at least be spelled out in the commit message.

I'm not sure I follow, but I assume you mean in the idle vcpu case? In that
case the caller should call process_pending_softirqs. I will modify the
commit message to include it.

Roger.
Re: [Xen-devel] [PATCH v2 03/30] xen/x86: fix parameters and return value of *_set_allocation functions
On Thu, Sep 29, 2016 at 04:39:02AM -0600, Jan Beulich wrote:
> >>> On 27.09.16 at 17:56, wrote:
> > Return should be an int, and the number of pages should be an unsigned long.
>
> I can see the former, but why the latter? Acting on 32-bit quantities
> is a little cheaper after all.

This was requested by Andrew in the previous version:

https://lists.xenproject.org/archives/html/xen-devel/2016-07/msg03126.html

But yes, an unsigned int is enough to hold the maximum number of pages for a
domain given that we support up to 1TB for HVM guests. Or maybe there are
plans to increase the maximum supported memory per guest?

Roger.
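The arithmetic behind the "unsigned int is enough" remark can be checked with a short sketch. It assumes 4 KiB pages (a page shift of 12) and reads "1TB" as 2^40 bytes; the helper name pages_for_bytes() is made up for this illustration and is not a Xen function.

```c
#include <stdint.h>

/* Assumption for this sketch: 4 KiB pages, as on x86. */
#define SKETCH_PAGE_SHIFT 12

/* Number of whole pages needed to cover the given amount of memory. */
uint64_t pages_for_bytes(uint64_t bytes)
{
    return bytes >> SKETCH_PAGE_SHIFT;
}
```

Under these assumptions a 1 TiB guest needs 2^28 pages, which fits in a 32-bit count; the count would only overflow 32 bits at 2^32 pages, that is at 16 TiB per guest.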
Re: [Xen-devel] [PATCH v2 10/30] xen/x86: allow the emulated APICs to be enbled for the hardware domain
>>> On 27.09.16 at 17:57, wrote:
> +static bool emulation_flags_ok(const struct domain *d, uint32_t emflags)
> +{
> +
> +if ( is_hvm_domain(d) )
> +{
> +if ( is_hardware_domain(d) &&
> + emflags != (XEN_X86_EMU_PIT|XEN_X86_EMU_LAPIC|XEN_X86_EMU_IOAPIC))
> +return false;
> +if ( !is_hardware_domain(d) &&
> + emflags != XEN_X86_EMU_ALL && emflags != 0 )
> +return false;
> +}
> +else
> +{
> +/* PV or classic PVH. */
> +if ( is_hardware_domain(d) && emflags != XEN_X86_EMU_PIT )
> +return false;
> +if ( !is_hardware_domain(d) && emflags != 0 )
> +return false;

Previous code permitted XEN_X86_EMU_PIT also for the non-hardware
domains afaict. You shouldn't change behavior without saying so and
explaining why.

Jan
Re: [Xen-devel] [PATCH v2 09/30] x86/vtd: fix and simplify mapping RMRR regions
>>> On 27.09.16 at 17:57, wrote:
> The current code used by Intel VTd will only map RMRR regions for the
> hardware domain, but will fail to map RMRR regions for unprivileged domains
> unless the page tables are shared between EPT and IOMMU.

Okay, if that's the case it surely should get fixed.

> Fix this and
> simplify the code, removing the {set/clear}_identity_p2m_entry helpers and
> just using the normal MMIO mapping functions.

This simplification, however, goes too far. Namely ...

> -int set_identity_p2m_entry(struct domain *d, unsigned long gfn,
> - p2m_access_t p2ma, unsigned int flag)
> -{
> -p2m_type_t p2mt;
> -p2m_access_t a;
> -mfn_t mfn;
> -struct p2m_domain *p2m = p2m_get_hostp2m(d);
> -int ret;
> -
> -if ( !paging_mode_translate(p2m->domain) )
> -{
> -if ( !need_iommu(d) )
> -return 0;
> -return iommu_map_page(d, gfn, gfn, IOMMUF_readable|IOMMUF_writable);
> -}
> -
> -gfn_lock(p2m, gfn, 0);
> -
> -mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL, NULL);
> -
> -if ( p2mt == p2m_invalid || p2mt == p2m_mmio_dm )
> -ret = p2m_set_entry(p2m, gfn, _mfn(gfn), PAGE_ORDER_4K,
> -p2m_mmio_direct, p2ma);
> -else if ( mfn_x(mfn) == gfn && p2mt == p2m_mmio_direct && a == p2ma )
> -{
> -ret = 0;
> -/*
> - * PVH fixme: during Dom0 PVH construction, p2m entries are being set
> - * but iomem regions are not mapped with IOMMU. This makes sure that
> - * RMRRs are correctly mapped with IOMMU.
> - */
> -if ( is_hardware_domain(d) && !iommu_use_hap_pt(d) )
> -ret = iommu_map_page(d, gfn, gfn, IOMMUF_readable|IOMMUF_writable);
> -}
> -else
> -{
> -if ( flag & XEN_DOMCTL_DEV_RDM_RELAXED )
> -ret = 0;
> -else
> -ret = -EBUSY;
> -printk(XENLOG_G_WARNING
> - "Cannot setup identity map d%d:%lx,"
> - " gfn already mapped to %lx.\n",
> - d->domain_id, gfn, mfn_x(mfn));

... this logic (and its clear side counterpart) should not be removed
without replacement. Note in this context how you render "flag" an
unused parameter of rmrr_identity_mapping().

> --- a/xen/include/xen/p2m-common.h
> +++ b/xen/include/xen/p2m-common.h
> @@ -2,6 +2,7 @@
> #define _XEN_P2M_COMMON_H
>
> #include
> +#include
>
> /*
> * Additional access types, which are used to further restrict
> @@ -46,6 +47,35 @@ int unmap_mmio_regions(struct domain *d,
> mfn_t mfn);
>
> /*
> + * Preemptive Helper for mapping/unmapping MMIO regions.
> + */

Single line comment.

> +static inline int modify_mmio_11(struct domain *d, unsigned long pfn,
> + unsigned long nr_pages, bool map)

Why do you make this an inline function? And I have to admit that I
dislike this strange use of number 11 - what's wrong with continuing
to use the term "direct map" in one way or another?

> +{
> +int rc;
> +
> +while ( nr_pages > 0 )
> +{
> +rc = (map ? map_mmio_regions : unmap_mmio_regions)
> + (d, _gfn(pfn), nr_pages, _mfn(pfn));
> +if ( rc == 0 )
> +break;
> +if ( rc < 0 )
> +{
> +printk(XENLOG_ERR
> +"Failed to %smap %#lx - %#lx into domain %d memory map: %d\n",

"Failed to identity %smap [%#lx,%#lx) for d%d: %d\n"

And I think XENLOG_WARNING would do - whether this actually is a
problem depends on further factors.

> + map ? "" : "un", pfn, pfn + nr_pages, d->domain_id, rc);
> +return rc;
> +}
> +nr_pages -= rc;
> +pfn += rc;
> +process_pending_softirqs();

Is this what you call "preemptive"?

> +}
> +
> +return rc;

The way this is coded it appears to possibly return non-zero even in
the success case. I think this would therefore better be a for ( ; ; ) loop.

Jan
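The for ( ; ; ) suggestion at the end of the review can be sketched in isolation. The code below is not the Xen implementation: stub_map() is a hypothetical replacement for map_mmio_regions()/unmap_mmio_regions(), and it assumes the convention implied by the quoted loop (0 when the whole range is done, a negative value on error, a positive value for the number of pages handled before the callee wants the caller to yield). The function name modify_identity_mmio() is likewise made up.

```c
/* Counts stub invocations so the loop's behaviour can be observed. */
int stub_calls;

/*
 * Hypothetical stand-in for map_mmio_regions(): handles at most four
 * pages per call. Returns 0 when the remaining range was finished,
 * a positive count when it stopped early, and -22 (an arbitrary error
 * code for this sketch) when asked to start at pfn 0.
 */
int stub_map(unsigned long pfn, unsigned long nr_pages)
{
    const unsigned long batch = 4;

    stub_calls++;
    if (pfn == 0)
        return -22;
    if (nr_pages <= batch)
        return 0;
    return (int)batch;
}

/*
 * Loop shaped as suggested in the review: every exit path returns a
 * value that came straight from the callee, so success is always 0.
 */
int modify_identity_mmio(unsigned long pfn, unsigned long nr_pages)
{
    for ( ; ; )
    {
        int rc = stub_map(pfn, nr_pages);

        if (rc <= 0)
            return rc;              /* 0: all done, < 0: error */

        nr_pages -= (unsigned long)rc;
        pfn += (unsigned long)rc;
        /* In Xen, process_pending_softirqs() would be called here. */
    }
}
```

Compared with the quoted while loop, there is no path on which a stale positive rc (or an uninitialised one, if nr_pages were 0 on entry) could be returned to the caller.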
Re: [Xen-devel] [PATCH v2 02/30] xen/x86: remove XENFEAT_hvm_pirqs for PVHv2 guests
On Wed, Sep 28, 2016 at 10:03:21AM -0600, Jan Beulich wrote:
> >>> On 27.09.16 at 17:56, wrote:
> > On PVHv2 guests we explicitly want to use native methods for routing
> > interrupts.
> >
> > Introduce a new XEN_X86_EMU_USE_PIRQ to notify Xen whether a HVM guest can
> > route physical interrupts (even from emulated devices) over event channels.
>
> So you specifically want this new flag off for PVHv2 guests? Based on
> just this description I did get the opposite impression...

Yes, that's right, I don't want PVHv2 guests to know anything about PIRQs. I
don't really know how to reword this, what about:

"PVHv2 guests, unlike HVM guests, won't have the option to route interrupts
from physical or emulated devices over event channels using PIRQs. This
applies to both DomU and Dom0 PVHv2 guests.

Introduce a new XEN_X86_EMU_USE_PIRQ to notify Xen whether a HVM guest can
route physical interrupts (even from emulated devices) over event channels,
and is thus allowed to use some of the PHYSDEV ops."

> > --- a/xen/arch/x86/hvm/hvm.c
> > +++ b/xen/arch/x86/hvm/hvm.c
> > @@ -4117,6 +4117,8 @@ static long hvm_memory_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
> >
> > static long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
> > {
> > +struct domain *d = current->domain;
>
> currd please.
>
> > @@ -4128,7 +4130,9 @@ static long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
> > case PHYSDEVOP_eoi:
> > case PHYSDEVOP_irq_status_query:
> > case PHYSDEVOP_get_free_pirq:
> > -return do_physdev_op(cmd, arg);
> > +return ((d->arch.emulation_flags & XEN_X86_EMU_USE_PIRQ) ||
> > + is_pvh_vcpu(current)) ?
>
> is_pvh_domain(currd)

Thanks, it's fixed now. I've also taken the opportunity to fix two other
instances of current and current->domain.

Roger.
Re: [Xen-devel] [PATCH v2 08/30] xen/x86: do the PCI scan unconditionally
>>> On 27.09.16 at 17:57, wrote:
> Instead of being tied to the presence of an IOMMU

At the very least I'd expect the "why" aspect to get mentioned here.

> --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
> +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
> @@ -219,7 +219,8 @@ int __init amd_iov_detect(void)
>
> if ( !amd_iommu_perdev_intremap )
> printk(XENLOG_WARNING "AMD-Vi: Using global interrupt remap table is not recommended (see XSA-36)!\n");
> -return scan_pci_devices();
> +
> +return 0;
> }

Note how an error return from the function at least here does not get
ignored, leading to the IOMMU not getting enabled. This behavior
should be preserved, and I think it actually should extend to VT-d.
Furthermore iiuc you make PVHv2 Dom0 setup depend on this succeeding,
which may then require further propagation of the success status here
(or maybe, just like PVHv1, you require a functional IOMMU, in which
case failing IOMMU setup would be sufficient).

Jan
[Xen-devel] [xen-4.7-testing test] 101197: tolerable FAIL - PUSHED
flight 101197 xen-4.7-testing real [real] http://logs.test-lab.xenproject.org/osstest/logs/101197/ Failures :-/ but no regressions. Regressions which are regarded as allowable (not blocking): test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop fail like 101076 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail like 101076 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stopfail like 101076 test-amd64-amd64-xl-rtds 9 debian-install fail like 101076 Tests which did not succeed, but are not blocking: test-amd64-amd64-rumprun-amd64 1 build-check(1) blocked n/a test-amd64-i386-rumprun-i386 1 build-check(1) blocked n/a build-i386-rumprun5 rumprun-buildfail never pass test-armhf-armhf-libvirt 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt 14 guest-saverestorefail never pass test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 13 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-raw 11 migrate-support-checkfail never pass test-armhf-armhf-libvirt-raw 13 guest-saverestorefail never pass test-armhf-armhf-xl-arndale 12 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 12 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 13 saverestore-support-checkfail never pass test-armhf-armhf-xl 12 migrate-support-checkfail never pass test-armhf-armhf-xl 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-vhd 11 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 12 saverestore-support-checkfail never pass build-amd64-rumprun 5 rumprun-buildfail never pass test-armhf-armhf-xl-rtds 12 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 13 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail 
never pass test-amd64-i386-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-amd64-libvirt 12 migrate-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-amd64-amd64-xl-pvh-intel 11 guest-start fail never pass test-amd64-i386-libvirt 12 migrate-support-checkfail never pass test-armhf-armhf-xl-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-xl-xsm 13 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-qcow2 11 migrate-support-checkfail never pass test-armhf-armhf-libvirt-qcow2 13 guest-saverestorefail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail never pass test-amd64-amd64-xl-pvh-amd 11 guest-start fail never pass test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt-xsm 14 guest-saverestorefail never pass test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2 fail never pass version targeted for testing: xen 506182e00772ece038ca5207c54dfab3de57adef baseline version: xen a7edbdcac31ad55e93e482f66050bc4ffd04d8a9 Last test of basis 101076 2016-09-21 08:07:08 Z8 days Testing same since 101188 2016-09-28 15:09:14 Z0 days2 attempts People who touched revisions under test: Andrew CooperAndrew Cooper Borislav Petkov Dario Faggioli Emanuel Czirai George Dunlap Jan Beulich Kevin Tian jobs: build-amd64-xsm pass build-armhf-xsm pass build-i386-xsm pass build-amd64-xtf pass build-amd64 pass build-armhf pass build-i386 pass build-amd64-libvirt
[Xen-devel] Adding new custom devices to Xen via QEMU
Hello,

My name is Jason Dickens and I'm a Research Scientist here at GrammaTech. Some of our research involves securing hypervisors, and we have needed to add to and/or modify Xen. I have been successful in modifying the source for various purposes, but my question now is about devices.

We have a custom device model implemented in QEMU which works great with QEMU (on Intel) standalone and with KVM. However, we now want access to it in Xen using the same modified QEMU build. The only problem I seem to be having is getting Xen to send the MMIO reads and writes to QEMU. The device is being realized, but guest accesses to the physical address range that I expect to reference the device seem to go nowhere.

I see in the source calls such as "register_io_handler" that other devices use to affect the EPT mapping. Is this what I need? My main question is whether it is truly necessary to change Xen itself in order to introduce new devices in Xen using QEMU, or whether there is just a configuration setting. And what is the simplest way to have a range of physical addresses access a custom QEMU device?

Thanks,
Jason
[Xen-devel] [xen-4.6-testing baseline-only test] 67782: regressions - FAIL
This run is configured for baseline tests only. flight 67782 xen-4.6-testing real [real] http://osstest.xs.citrite.net/~osstest/testlogs/logs/67782/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-amd64-xl-multivcpu 6 xen-boot fail REGR. vs. 67743 test-xtf-amd64-amd64-519 xtf/test-hvm32-invlpg~shadow fail REGR. vs. 67743 test-xtf-amd64-amd64-5 26 xtf/test-hvm32pae-invlpg~shadow fail REGR. vs. 67743 test-xtf-amd64-amd64-534 xtf/test-hvm64-invlpg~shadow fail REGR. vs. 67743 Regressions which are regarded as allowable (not blocking): test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 67743 test-amd64-amd64-qemuu-nested-intel 16 debian-hvm-install/l1/l2 fail like 67743 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop fail like 67743 test-amd64-amd64-xl-qemut-winxpsp3 9 windows-install fail like 67743 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail like 67743 Tests which did not succeed, but are not blocking: test-amd64-i386-rumprun-i386 1 build-check(1) blocked n/a test-amd64-amd64-rumprun-amd64 1 build-check(1) blocked n/a test-armhf-armhf-xl 12 migrate-support-checkfail never pass test-armhf-armhf-xl 13 saverestore-support-checkfail never pass build-amd64-rumprun 5 rumprun-buildfail never pass build-i386-rumprun5 rumprun-buildfail never pass test-armhf-armhf-xl-credit2 12 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 13 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-armhf-armhf-xl-midway 12 migrate-support-checkfail never pass test-armhf-armhf-xl-midway 13 saverestore-support-checkfail never pass test-amd64-i386-libvirt 12 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 13 saverestore-support-checkfail never pass 
test-armhf-armhf-xl-rtds 12 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 13 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-raw 11 migrate-support-checkfail never pass test-armhf-armhf-libvirt-raw 13 guest-saverestorefail never pass test-armhf-armhf-libvirt-qcow2 11 migrate-support-checkfail never pass test-armhf-armhf-libvirt-qcow2 13 guest-saverestorefail never pass test-armhf-armhf-xl-vhd 11 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 12 saverestore-support-checkfail never pass test-armhf-armhf-xl-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-xl-xsm 13 saverestore-support-checkfail never pass test-armhf-armhf-libvirt 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt 14 guest-saverestorefail never pass test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt-xsm 14 guest-saverestorefail never pass test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2 fail never pass test-amd64-amd64-xl-pvh-amd 11 guest-start fail never pass test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-amd64-libvirt 12 migrate-support-checkfail never pass test-amd64-i386-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-amd64-amd64-xl-pvh-intel 11 guest-start fail never pass test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail never pass version targeted for testing: xen ef005cc1f86de8db0880c7b1e233ef9d2b44b4ef baseline version: xen 245fa11021f8f123a82aa7e894d044d8f0ae6923 Last test of basis67743 2016-09-22 02:17:23 Z7 days Testing same since67782 2016-09-29 01:17:52 Z0 days1 attempts People who touched revisions under test: Andrew CooperAndrew Cooper Borislav Petkov Dario Faggioli Emanuel Czirai George Dunlap Jan Beulich Kevin Tian jobs: build-amd64-xsm pass build-armhf-xsm pass build-i386-xsm
[Xen-devel] [PATCH] x86emul: simplify LEAVE handling
There's no 1-byte operand size case to take care of here, and there's
no point doing the first writeback using dst fields - we can read rBP
and write rSP directly.

Signed-off-by: Jan Beulich

--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -3130,21 +3130,16 @@ x86_emulate(
 case 0xc9: /* leave */
 /* First writeback, to %%esp. */
-dst.type = OP_REG;
 dst.bytes = (mode_64bit() && (op_bytes == 4)) ? 8 : op_bytes;
-dst.reg = (unsigned long *)&_regs.esp;
-dst.val = _regs.ebp;
-
-/* Flush first writeback, since there is a second. */
 switch ( dst.bytes )
 {
-case 1: *(uint8_t *)dst.reg = (uint8_t)dst.val; break;
-case 2: *(uint16_t *)dst.reg = (uint16_t)dst.val; break;
-case 4: *dst.reg = (uint32_t)dst.val; break; /* 64b: zero-ext */
-case 8: *dst.reg = dst.val; break;
+case 2: *(uint16_t *)&_regs.esp = (uint16_t)_regs.ebp; break;
+case 4: _regs.esp = (uint32_t)_regs.ebp; break; /* 64b: zero-ext */
+case 8: _regs.esp = _regs.ebp; break;
 }

 /* Second writeback, to %%ebp. */
+dst.type = OP_REG;
 dst.reg = (unsigned long *)&_regs.ebp;
 if ( (rc = read_ulong(x86_seg_ss, sp_post_inc(dst.bytes), &dst.val, dst.bytes, ctxt, ops)) )
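The first step of LEAVE as simplified by the patch (copy rBP into rSP at the effective operand size) can be modelled in a few lines of standalone C. This is only a sketch: leave_set_sp() is a made-up name, the registers are plain 64-bit integers rather than the emulator's _regs structure, and the 16-bit case uses masking, which on little-endian x86 has the same effect as the patch's uint16_t pointer store.

```c
#include <stdint.h>

/*
 * Model of the first LEAVE writeback: return the new stack pointer
 * after copying the frame pointer into it with the given operand size.
 */
uint64_t leave_set_sp(uint64_t rsp, uint64_t rbp, unsigned int bytes)
{
    switch (bytes)
    {
    case 2: /* only the low 16 bits of the stack pointer change */
        return (rsp & ~UINT64_C(0xffff)) | (rbp & UINT64_C(0xffff));
    case 4: /* 32-bit write zero-extends into the full register */
        return (uint32_t)rbp;
    case 8: /* full 64-bit copy */
        return rbp;
    }
    /* No 1-byte operand size exists for LEAVE; leave rsp untouched. */
    return rsp;
}
```

The three cases mirror the three case labels that remain in the patched switch; the removed case 1 has no counterpart because the instruction has no byte-sized form.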
[Xen-devel] [PATCH] x86emul: fix {,i}mul and {,i}div
Commit a3db233ede ("x86emul: use DstEax also for {,I}{MUL,DIV}") went a
little too far: DstEax and SrcEax weren't really meant to be used
together with ModRM - they assume modrm_reg remains zero by the time
the destination / source register pointer gets calculated. Don't fully
undo that commit though, but instead just correct the register pointer,
and don't use dst.val as input for mul and imul (div and idiv did avoid
that already).

Reported-by: Konrad Rzeszutek Wilk
Signed-off-by: Jan Beulich

--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -3845,18 +3845,19 @@ x86_emulate(
 emulate_1op("neg", dst, _regs.eflags);
 break;
 case 4: /* mul */
+dst.reg = (unsigned long *)&_regs.eax;
 _regs.eflags &= ~(EFLG_OF|EFLG_CF);
 switch ( dst.bytes )
 {
 case 1:
-dst.val = (uint8_t)dst.val;
+dst.val = (uint8_t)_regs.eax;
 dst.val *= src.val;
 if ( (uint8_t)dst.val != (uint16_t)dst.val )
 _regs.eflags |= EFLG_OF|EFLG_CF;
 dst.bytes = 2;
 break;
 case 2:
-dst.val = (uint16_t)dst.val;
+dst.val = (uint16_t)_regs.eax;
 dst.val *= src.val;
 if ( (uint16_t)dst.val != (uint32_t)dst.val )
 _regs.eflags |= EFLG_OF|EFLG_CF;
@@ -3864,7 +3865,7 @@ x86_emulate(
 break;
 #ifdef __x86_64__
 case 4:
-dst.val = (uint32_t)dst.val;
+dst.val = _regs._eax;
 dst.val *= src.val;
 if ( (uint32_t)dst.val != dst.val )
 _regs.eflags |= EFLG_OF|EFLG_CF;
@@ -3873,7 +3874,7 @@ x86_emulate(
 #endif
 default:
 u[0] = src.val;
-u[1] = dst.val;
+u[1] = _regs.eax;
 if ( mul_dbl(u) )
 _regs.eflags |= EFLG_OF|EFLG_CF;
 _regs.edx = u[1];
@@ -3882,12 +3883,13 @@ x86_emulate(
 }
 break;
 case 5: /* imul */
+dst.reg = (unsigned long *)&_regs.eax;
 imul:
 _regs.eflags &= ~(EFLG_OF|EFLG_CF);
 switch ( dst.bytes )
 {
 case 1:
-dst.val = (int8_t)src.val * (int8_t)dst.val;
+dst.val = (int8_t)src.val * (int8_t)_regs.eax;
 if ( (int8_t)dst.val != (int16_t)dst.val )
 _regs.eflags |= EFLG_OF|EFLG_CF;
 ASSERT(b > 0x6b);
@@ -3895,7 +3897,7 @@ x86_emulate(
 break;
 case 2:
 dst.val = ((uint32_t)(int16_t)src.val *
- (uint32_t)(int16_t)dst.val);
+ (uint32_t)(int16_t)_regs.eax);
 if ( (int16_t)dst.val != (int32_t)dst.val )
 _regs.eflags |= EFLG_OF|EFLG_CF;
 if ( b > 0x6b )
@@ -3904,7 +3906,7 @@ x86_emulate(
 #ifdef __x86_64__
 case 4:
 dst.val = ((uint64_t)(int32_t)src.val *
- (uint64_t)(int32_t)dst.val);
+ (uint64_t)(int32_t)_regs.eax);
 if ( (int32_t)dst.val != dst.val )
 _regs.eflags |= EFLG_OF|EFLG_CF;
 if ( b > 0x6b )
@@ -3913,7 +3915,7 @@ x86_emulate(
 #endif
 default:
 u[0] = src.val;
-u[1] = dst.val;
+u[1] = _regs.eax;
 if ( imul_dbl(u) )
 _regs.eflags |= EFLG_OF|EFLG_CF;
 if ( b > 0x6b )
@@ -3923,6 +3925,7 @@ x86_emulate(
 }
 break;
 case 6: /* div */
+dst.reg = (unsigned long *)&_regs.eax;
 switch ( src.bytes )
 {
 case 1:
@@ -3968,6 +3971,7 @@ x86_emulate(
 }
 break;
 case 7: /* idiv */
+dst.reg = (unsigned long *)&_regs.eax;
 switch ( src.bytes )
 {
 case 1:
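As a reading aid for the 8-bit MUL case touched by the patch, the following standalone sketch models the architectural behaviour: the implicit operand is AL (the low byte of rAX), the 16-bit product goes to AX, and CF and OF are set together when the upper half of the product is non-zero. The struct and function names are invented for this example, and flags are reduced to a single boolean instead of the emulator's EFLG_* bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Result of an 8-bit MUL: the new AX and the shared CF/OF state. */
struct mul8_result {
    uint16_t ax;
    bool cf_of;
};

/*
 * Model of "mul r/m8": multiply AL by the source byte. Only the low
 * byte of rax is an input, which corresponds to the patch reading
 * (uint8_t)_regs.eax instead of whatever dst.val happened to hold.
 */
struct mul8_result mul8(uint64_t rax, uint8_t src)
{
    struct mul8_result r;
    unsigned int product = (unsigned int)(uint8_t)rax * (unsigned int)src;

    r.ax = (uint16_t)product;
    /* Same test as the patch: does the result differ from its low byte? */
    r.cf_of = (uint8_t)product != (uint16_t)product;
    return r;
}
```

Feeding the function a rax value with garbage above bit 7 shows that only AL participates, which is the property the corrected register handling is meant to guarantee.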
Re: [Xen-devel] [PATCH v2 01/30] xen/x86: move setup of the VM86 TSS to the domain builder
On Wed, Sep 28, 2016 at 09:35:21AM -0600, Jan Beulich wrote:
> >>> On 27.09.16 at 17:56,wrote:
> > This is also required for PVHv2 guests if they want to use real-mode, and
> > hvmloader is not executed for those kind of guests.
>
> While the intention is fine, I'm not convinced of consuming yet another
> special page here: Other than the way hvmloader's allocation works,
> here you permanently take away a page from the guest unconditionally
> which (a) is used only on VMX, (b) only on old hardware, and (c) VMX
> code appears to even be able to help itself without this TSS (at the
> price of doing more emulation).

Yes, real mode should also work without this. Given that I don't think we
expect real-mode to be used for mostly anything but early AP initialization,
I guess we could just leave PVHv2 guests without the TSS.

Roger.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
[Xen-devel] [distros-debian-wheezy test] 67783: all pass
flight 67783 distros-debian-wheezy real [real]
http://osstest.xs.citrite.net/~osstest/testlogs/logs/67783/

Perfect :-)
All tests in this flight passed as required

baseline version:
 flight 67745

jobs:
 build-amd64                                      pass
 build-armhf                                      pass
 build-i386                                       pass
 build-amd64-pvops                                pass
 build-armhf-pvops                                pass
 build-i386-pvops                                 pass
 test-amd64-amd64-amd64-wheezy-netboot-pvgrub     pass
 test-amd64-i386-i386-wheezy-netboot-pvgrub       pass
 test-amd64-i386-amd64-wheezy-netboot-pygrub      pass
 test-amd64-amd64-i386-wheezy-netboot-pygrub      pass

sg-report-flight on osstest.xs.citrite.net
logs: /home/osstest/logs
images: /home/osstest/images

Logs, config files, etc. are available at
 http://osstest.xs.citrite.net/~osstest/testlogs/logs

Test harness code can be found at
 http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary

Push not applicable.
[Xen-devel] [xen-unstable-smoke test] 101207: tolerable all pass - PUSHED
flight 101207 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/101207/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt  12 migrate-support-check      fail  never pass
 test-armhf-armhf-xl       12 migrate-support-check      fail  never pass
 test-armhf-armhf-xl       13 saverestore-support-check  fail  never pass

version targeted for testing:
 xen  fb8be95ca0b5fc816fd2234925f95c3f82ead824
baseline version:
 xen  a6f681cf110a1833cfe46d514cb8824e9b63fe22

Last test of basis   101198  2016-09-29 02:02:02 Z  0 days
Testing same since   101207  2016-09-29 11:03:56 Z  0 days  1 attempts

People who touched revisions under test:
 Jan Beulich
 Paul Lai

jobs:
 build-amd64                                pass
 build-armhf                                pass
 build-amd64-libvirt                        pass
 test-armhf-armhf-xl                        pass
 test-amd64-amd64-xl-qemuu-debianhvm-i386   pass
 test-amd64-amd64-libvirt                   pass

sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
 http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
 http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
 http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
 http://xenbits.xen.org/gitweb?p=osstest.git;a=summary

Pushing revision :

+ branch=xen-unstable-smoke + revision=fb8be95ca0b5fc816fd2234925f95c3f82ead824 + . ./cri-lock-repos ++ . ./cri-common +++ . ./cri-getconfig +++ umask 002 +++ getrepos getconfig Repos perl -e ' use Osstest; readglobalconfig(); print $c{"Repos"} or die $!; ' +++ local repos=/home/osstest/repos +++ '[' -z /home/osstest/repos ']' +++ '[' '!' 
-d /home/osstest/repos ']' +++ echo /home/osstest/repos ++ repos=/home/osstest/repos ++ repos_lock=/home/osstest/repos/lock ++ '[' x '!=' x/home/osstest/repos/lock ']' ++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock ++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push xen-unstable-smoke fb8be95ca0b5fc816fd2234925f95c3f82ead824 + branch=xen-unstable-smoke + revision=fb8be95ca0b5fc816fd2234925f95c3f82ead824 + . ./cri-lock-repos ++ . ./cri-common +++ . ./cri-getconfig +++ umask 002 +++ getrepos getconfig Repos perl -e ' use Osstest; readglobalconfig(); print $c{"Repos"} or die $!; ' +++ local repos=/home/osstest/repos +++ '[' -z /home/osstest/repos ']' +++ '[' '!' -d /home/osstest/repos ']' +++ echo /home/osstest/repos ++ repos=/home/osstest/repos ++ repos_lock=/home/osstest/repos/lock ++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']' + . ./cri-common ++ . ./cri-getconfig ++ umask 002 + select_xenbranch + case "$branch" in + tree=xen + xenbranch=xen-unstable-smoke + qemuubranch=qemu-upstream-unstable + '[' xxen = xlinux ']' + linuxbranch= + '[' xqemu-upstream-unstable = x ']' + select_prevxenbranch ++ ./cri-getprevxenbranch xen-unstable-smoke + prevxenbranch=xen-4.7-testing + '[' xfb8be95ca0b5fc816fd2234925f95c3f82ead824 = x ']' + : tested/2.6.39.x + . 
./ap-common ++ : osst...@xenbits.xen.org +++ getconfig OsstestUpstream +++ perl -e ' use Osstest; readglobalconfig(); print $c{"OsstestUpstream"} or die $!; ' ++ : ++ : git://xenbits.xen.org/xen.git ++ : osst...@xenbits.xen.org:/home/xen/git/xen.git ++ : git://xenbits.xen.org/qemu-xen-traditional.git ++ : git://git.kernel.org ++ : git://git.kernel.org/pub/scm/linux/kernel/git ++ : git ++ : git://xenbits.xen.org/xtf.git ++ : osst...@xenbits.xen.org:/home/xen/git/xtf.git ++ : git://xenbits.xen.org/xtf.git ++ : git://xenbits.xen.org/libvirt.git ++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git ++ : git://xenbits.xen.org/libvirt.git ++ : git://xenbits.xen.org/osstest/rumprun.git ++ : git ++ : git://xenbits.xen.org/osstest/rumprun.git ++ : osst...@xenbits.xen.org:/home/xen/git/osstest/rumprun.git ++ : git://git.seabios.org/seabios.git ++ : osst...@xenbits.xen.org:/home/xen/git/osstest/seabios.git ++ : git://xenbits.xen.org/osstest/seabios.git ++ : https://github.com/tianocore/edk2.git ++ : osst...@xenbits.xen.org:/home/xen/git/osstest/ovmf.git ++ : git://xenbits.xen.org/osstest/ovmf.git ++ : git://xenbits.xen.org/osstest/linux-firmware.git ++ : osst...@xenbits.xen.org:/home/osstest/ext/linux-firmware.git ++ :
Re: [Xen-devel] [PATCH v8 06/13] efi: create new early memory allocator
On Thu, Sep 29, 2016 at 10:08:30AM +0200, Daniel Kiper wrote:
> On Thu, Sep 29, 2016 at 01:40:44AM -0600, Jan Beulich wrote:
> > >>> On 29.09.16 at 00:51,wrote:
> > > v8 - suggestions/fixes:
> > >    - disable whole ebmalloc machinery on ARM platforms,
> >
> > This is certainly not in line with my understanding of the outcome of
> > that discussion.
>
> Well, I understand that you would like to reduce the number of changes
> needed to enable ebmalloc on ARM in the future. However, on the other
> hand Julien wishes to not have any calls to ebmalloc on ARM now. So,
> if the ebmalloc code and data is not referenced anywhere on ARM then
> I do not think that it makes sense to build it into the ARM xen image.
> Especially if I am not sure whether the 1 MiB ebmalloc region is correct
> or not on ARM. So, I would like to wait for Julien's final opinion
> about that. If he likes your proposal (maybe with some adjustments)
> then I am not going to object.

Julien, ping? Sorry for the fire drill, but tomorrow is the last day before
the hard code freeze. Please tell us what you think about this patch. Then I
will prepare v9 and post it this evening or tomorrow morning.

Daniel
Re: [Xen-devel] Regression introduced by a3db233 x86emul: use DstEax also for {, I}{MUL, DIV
>>> On 29.09.16 at 11:34,wrote:
> On 29.09.16 at 11:11, wrote:
>> The commit a3db233 x86emul: use DstEax also for {,I}{MUL,DIV}
>> introduces a regression when doing SR-IOV passthrough.
>
> I'll see if I can repro,

Actually, I can see some variant of this (and without any SR-IOV), as
soon as I use "unrestricted_guest=0" on the command line. Debugging
now ...

Jan
Re: [Xen-devel] [PATCH 18/24] xen: credit2: soft-affinity awareness fallback_cpu() and cpu_pick()
On 17/08/16 18:19, Dario Faggioli wrote:
> For get_fallback_cpu(), by putting in place the "usual"
> two steps (soft affinity step and hard affinity step)
> loop. We just move the core logic of the function inside
> the body of the loop itself.
>
> For csched2_cpu_pick(), what is important is to find
> the runqueue with the least average load. Currently,
> we do that by looping on all runqueues and checking,
> well, their load. For soft affinity, we want to know
> which one is the runqueue with the least load, among
> the ones where the vcpu would prefer to be assigned.
>
> We find both the least loaded runqueue among the soft
> affinity "friendly" ones, and the overall least loaded
> one, in the same pass.
>
> (Also, kill a spurious ';' when defining MAX_LOAD.)
>
> Signed-off-by: Dario Faggioli
> Signed-off-by: Justin T. Weaver

Looks good:

Reviewed-by: George Dunlap
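The "same pass" idea in the commit message can be sketched roughly as follows. This is a simplified illustration, not the credit2 code: struct rq_sketch, struct pick_sketch and pick_runqueues() are hypothetical names, and the per-runqueue "soft_friendly" flag stands in for whatever affinity mask intersection the scheduler actually performs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified runqueue; not the credit2 data structure. */
struct rq_sketch {
    int64_t avgload;     /* average load of this runqueue */
    bool soft_friendly;  /* holds a CPU from the vcpu's soft affinity? */
};

/* Indices of the two candidates, or -1 when none was found. */
struct pick_sketch {
    int best_soft;       /* least loaded among soft-affinity runqueues */
    int best_overall;    /* least loaded among all runqueues */
};

/* Track both minima in a single walk over the runqueues. */
static struct pick_sketch pick_runqueues(const struct rq_sketch *rqs, size_t n)
{
    struct pick_sketch p = { -1, -1 };
    size_t i;

    assert(rqs != NULL || n == 0);

    for ( i = 0; i < n; i++ )
    {
        /* Overall minimum: every runqueue is a candidate. */
        if ( p.best_overall < 0 ||
             rqs[i].avgload < rqs[p.best_overall].avgload )
            p.best_overall = (int)i;

        /* Soft-affinity minimum: only "friendly" runqueues qualify. */
        if ( rqs[i].soft_friendly &&
             (p.best_soft < 0 || rqs[i].avgload < rqs[p.best_soft].avgload) )
            p.best_soft = (int)i;
    }

    return p;
}
```

A caller would then presumably prefer best_soft when it is valid and fall back to best_overall otherwise; how the real code weighs the two is not stated in this message.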
Re: [Xen-devel] [PATCH v2 06/30] x86/paging: introduce paging_set_allocation
>>> On 27.09.16 at 17:57,wrote:
> @@ -1383,15 +1382,25 @@ int __init construct_dom0(
>                               nr_pages);
>      }
>
> -    if ( is_pvh_domain(d) )
> -        hap_set_alloc_for_pvh_dom0(d, dom0_paging_pages(d, nr_pages));
> -
>      /*
>       * We enable paging mode again so guest_physmap_add_page will do the
>       * right thing for us.
>       */

I'm afraid you render this comment stale - please adjust it accordingly.

> --- a/xen/arch/x86/mm/paging.c
> +++ b/xen/arch/x86/mm/paging.c
> @@ -954,6 +954,22 @@ void paging_write_p2m_entry(struct p2m_domain *p2m, unsigned long gfn,
>      safe_write_pte(p, new);
>  }
>
> +int paging_set_allocation(struct domain *d, unsigned long pages, int *preempted)

Since you need to touch the other two functions anyway, please
make the last parameter bool * here and there.

Jan
Re: [Xen-devel] [PATCH v2 2/2] hvmloader, pci: Don't try to relocate memory if 64-bit BAR is bigger than ~2GB
On Thu, Sep 29, 2016 at 03:36:00AM -0600, Jan Beulich wrote:
> >>> On 29.09.16 at 11:23,wrote:
> > On Thu, Sep 29, 2016 at 01:03:02AM -0600, Jan Beulich wrote:
> >> >>> On 29.09.16 at 01:48, wrote:
> >> > @@ -265,11 +266,30 @@ void pci_setup(void)
> >> >              bars[i].devfn   = devfn;
> >> >              bars[i].bar_reg = bar_reg;
> >> >              bars[i].bar_sz  = bar_sz;
> >> > +            bars[i].above_4gb = false;
> >> >
> >> >              if ( ((bar_data & PCI_BASE_ADDRESS_SPACE) ==
> >> >                    PCI_BASE_ADDRESS_SPACE_MEMORY) ||
> >> >                   (bar_reg == PCI_ROM_ADDRESS) )
> >> > -                mmio_total += bar_sz;
> >> > +            {
> >> > +                /*
> >> > +                 * If bigger than 2GB minus emulated devices BAR space and
> >> > +                 * APIC space, then don't try to put under 4GB.
> >> > +                 */
> >> > +                if ( is_64bar && (mmio_total >= GB(2) || bar_sz >=
> >> > +                     (GB(2) - HVM_BELOW_4G_MMIO_LENGTH - mmio_total)) )
> >>
> >> As mentioned in the reply to your earlier mail already, the
> >> subtraction of mmio_total here is risking wrap through zero (the
> >> >= GB(2) check doesn't fully guard against that).
> >
> > I am still waking up so bear with me, but is the reason the
> > mmio_total >= GB(2) check does not guard because the compiler may
> > choose to execute _both_ parts of the '||' conditional (or swap them
> > and execute the 'mmio_total >= GB(2)' second)?
>
> No, it's because you subtract more than just mmio_total from GB(2).
>
> >> Furthermore you're now making behavior dependent on the order
> >> devices appear on the bus: The same device appearing early may
> >> get its BAR placed below 4Gb whereas when it appears late, it'll
> >> get placed high. IOW I think this needs further refinement: We
> >> should in a first pass place only 32-bit BARs. In a second pass we
> >> can then see which 64-bit BARs still fit (and I think we then ought
> >> to prefer small ones). Which means we should presumably account
> >> 32- and 64-bit BARs here independent of any other considerations,
> >> deferring the decision which 64-bit ones to place low until after this
> >> first pass.
> >
> > Ok, that is going to require some surgery and movement of code to add
> > some functions in that giant piece of code. Expect more patches next
> > week (or would it be easier if I just sent them out for the next release
> > considering the amount of patches that are floating this week that need
> > review?)
>
> Well, I would view this as a bug fix, so it might still be allowed in.
> Ask Wei if in doubt.
>

Before RC1, sure. After we cut RCs, anything that changes the memory layout
of the guests needs to be considered carefully.

Wei.

> Jan
>
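To make the "wrap through zero" remark concrete, here is a small sketch of the quoted condition next to a rearranged variant that cannot wrap. It assumes the quantities are unsigned 64-bit values, which is not stated in the thread, and RESERVED_BELOW_4G_SKETCH is a made-up stand-in for HVM_BELOW_4G_MMIO_LENGTH with an arbitrary value chosen purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GB(x) ((uint64_t)(x) << 30)
/* Hypothetical value, for illustration only; not taken from hvmloader. */
#define RESERVED_BELOW_4G_SKETCH ((uint64_t)64 << 20)

/*
 * Shape of the quoted check. The first test only covers
 * mmio_total >= GB(2); for mmio_total between GB(2) minus the reserved
 * amount and GB(2), the subtraction wraps to a huge unsigned value.
 */
static bool too_big_wrapping(uint64_t mmio_total, uint64_t bar_sz)
{
    return mmio_total >= GB(2) ||
           bar_sz >= GB(2) - RESERVED_BELOW_4G_SKETCH - mmio_total;
}

/* Compare against the full budget first, so the subtraction cannot wrap. */
static bool too_big_safe(uint64_t mmio_total, uint64_t bar_sz)
{
    const uint64_t budget = GB(2) - RESERVED_BELOW_4G_SKETCH;

    if ( mmio_total >= budget )
        return true;
    return bar_sz >= budget - mmio_total;
}

/* Self-check showing where the two variants disagree. */
static void wrap_demo(void)
{
    const uint64_t mb = (uint64_t)1 << 20;

    /* Inside the uncovered window the wrapping check says "fits". */
    assert(!too_big_wrapping(GB(2) - mb, mb));
    assert(too_big_safe(GB(2) - mb, mb));
}
```

This only addresses the arithmetic; the ordering dependence raised in the same reply (placing 32-bit BARs first, then deciding which 64-bit BARs go low) is a separate matter.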
Re: [Xen-devel] [PATCH v2 05/30] xen/x86: assert that local_events_need_delivery is not called by the idle domain
>>> On 27.09.16 at 17:57,wrote:
> It doesn't make sense since the idle domain doesn't receive any events.

The change itself is fine, but I think it would help if the commit
message made explicit why this is becoming relevant.

Jan
Re: [Xen-devel] [PATCH v2 04/30] xen/x86: allow calling {sh/hap}_set_allocation with the idle domain
>>> On 27.09.16 at 17:56,wrote:
> --- a/xen/arch/x86/mm/hap/hap.c
> +++ b/xen/arch/x86/mm/hap/hap.c
> @@ -379,7 +379,9 @@ hap_set_allocation(struct domain *d, unsigned long pages, int *preempted)
>              break;
>
>          /* Check to see if we need to yield and try again */
> -        if ( preempted && hypercall_preempt_check() )
> +        if ( preempted &&
> +             (is_idle_vcpu(current) ? softirq_pending(smp_processor_id()) :
> +                                      hypercall_preempt_check()) )

So what is the supposed action for the caller to take in this case?
I think this should at least be spelled out in the commit message.

Jan
Re: [Xen-devel] [PATCH v2 07/16] x86emul: generate and make use of a canonical opcode representation
On 28/09/16 09:12, Jan Beulich wrote:
> @@ -1732,13 +1745,35 @@ x86_decode_twobyte(
>  }
>
>  static int
> +x86_decode_0f38(
> +    struct x86_emulate_state *state,
> +    struct x86_emulate_ctxt *ctxt,
> +    const struct x86_emulate_ops *ops)
> +{
> +    switch ( ctxt->opcode & X86EMUL_OPC_MASK )
> +    {
> +    case 0x00 ... 0xef:
> +    case 0xf2 ... 0xff:
> +        ctxt->opcode |= MASK_INSR(vex.pfx, X86EMUL_OPC_PFX_MASK);
> +        break;
> +
> +    case 0xf0: case 0xf1: /* movbe / crc32 */
> +        if ( rep_prefix() )
> +            ctxt->opcode |= MASK_INSR(vex.pfx, X86EMUL_OPC_PFX_MASK);
> +        break;
> +    }
> +
> +    return X86EMUL_OKAY;
> +}
> +
> +static int
>  x86_decode(
>      struct x86_emulate_state *state,
>      struct x86_emulate_ctxt *ctxt,
>      const struct x86_emulate_ops *ops)
>  {
>      uint8_t b, d, sib, sib_index, sib_base;
> -    unsigned int def_op_bytes, def_ad_bytes;
> +    unsigned int def_op_bytes, def_ad_bytes, opcode;
>      int rc = X86EMUL_OKAY;
>
>      memset(state, 0, sizeof(*state));
> @@ -1819,29 +1854,31 @@ x86_decode(
>
>      /* Opcode byte(s). */
>      d = opcode_table[b];
> -    if ( d == 0 )
> +    if ( d == 0 && b == 0x0f)

Spaces.

Otherwise, Reviewed-by: Andrew Cooper
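As a rough illustration of what the quoted x86_decode_0f38() does, the sketch below folds a prefix value into a bit field of a canonical opcode word. Everything here is an assumption made for the example: the field layout (OPC_MASK_SKETCH, OPC_PFX_MASK_SKETCH), the helper mask_insert(), and the has_rep_prefix parameter standing in for rep_prefix() are hypothetical and are not the definitions used by the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field layout; the real masks are defined by the patch. */
#define OPC_MASK_SKETCH     0x000000ffu  /* low byte: the opcode byte */
#define OPC_PFX_MASK_SKETCH 0x00000300u  /* two bits for the prefix value */

/* Shift 'value' to the position of the lowest set bit of 'mask'. */
static uint32_t mask_insert(uint32_t value, uint32_t mask)
{
    uint32_t lowest;

    assert(mask != 0);
    lowest = mask & (~mask + 1);    /* isolate the lowest set bit */
    return (value * lowest) & mask; /* multiply == shift left to it */
}

/*
 * Mirror of the quoted switch: opcode bytes 0xf0 and 0xf1 only get the
 * prefix folded in when a REP-style prefix is present; every other byte
 * in the 0x0f38 space always does.
 */
static uint32_t canonical_0f38_sketch(uint32_t opcode, uint32_t pfx,
                                      int has_rep_prefix)
{
    uint32_t byte = opcode & OPC_MASK_SKETCH;

    if ( (byte == 0xf0 || byte == 0xf1) && !has_rep_prefix )
        return opcode;

    return opcode | mask_insert(pfx, OPC_PFX_MASK_SKETCH);
}
```

Note that the quoted patch uses case ranges ("case 0x00 ... 0xef:"), which are a GNU C extension; the sketch uses plain comparisons so that it builds with any C99 compiler.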