Re: [Xen-devel] [PATCH v3 08/15] tools: create general interfaces to support psr allocation features
Per Jan's suggestion, remove people not related to tools/ patches to save
mailbox space.

> On Tue, Sep 05, 2017 at 05:32:30PM +0800, Yi Sun wrote:
> > diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
> > index 484b5b7..9744087 100644
> > --- a/tools/libxl/libxl.h
> > +++ b/tools/libxl/libxl.h
> > @@ -931,6 +931,13 @@ void libxl_mac_copy(libxl_ctx *ctx, libxl_mac *dst, const libxl_mac *src);
> >  #define LIBXL_HAVE_PSR_L2_CAT 1
> >
> >  /*
> > + * LIBXL_HAVE_PSR_GENERIC
> > + *
> > + * If this is defined, the Memory Bandwidth Allocation feature is supported.
>
> You should also mention that if this is defined the following public
> functions are available:
>
> libxl_psr_{set/get}_val
> libxl_psr_get_hw_info
> libxl_psr_hw_info_list_free
>
Sure, thanks!

> Thanks, Roger.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 07/15] x86: implement set value flow for MBA
On 17-09-19 10:57:16, Roger Pau Monné wrote:
> On Tue, Sep 05, 2017 at 05:32:29PM +0800, Yi Sun wrote:
[...]
> > +static bool cat_check_cbm(const struct feat_node *feat, unsigned long cbm)
> > +{
> > +    unsigned int first_bit, zero_bit;
> > +    unsigned int cbm_len = feat->cat.cbm_len;
> > +
> > +    /* Set bits should only in the range of [0, cbm_len]. */
> > +    if ( cbm & (~0ul << cbm_len) )
> > +        return false;
> > +
> > +    /* At least one bit need to be set. */
> > +    if ( cbm == 0 )
> > +        return false;
>
> You can join both checks into a single if.
>
Sure.

> > +
> > +    first_bit = find_first_bit(&cbm, cbm_len);
> > +    zero_bit = find_next_zero_bit(&cbm, cbm_len, first_bit);
> > +
> > +    /* Set bits should be contiguous. */
> > +    if ( zero_bit < cbm_len &&
> > +         find_next_bit(&cbm, cbm_len, zero_bit) < cbm_len )
> > +        return false;
> > +
> > +    return true;
> > +}
> > +
[...]
> >  static void do_write_psr_msrs(void *data)
>
> Why does this function take a 'void *data' instead of 'const struct
> cos_write_info *info'?
>
Because 'do_write_psr_msrs' is passed as a parameter to 'on_selected_cpus',
which is declared as below:

void on_selected_cpus(
    const cpumask_t *selected,
    void (*func) (void *info),
    void *info,
    int wait)

> >  {
> >      const struct cos_write_info *info = data;
> > -    struct feat_node *feat = info->feature;
> > -    const struct feat_props *props = info->props;
> > -    unsigned int i, cos = info->cos, cos_num = props->cos_num;
> > +    unsigned int i, index = 0, array_len = info->array_len, cos = info->cos;
> > +    const uint32_t *val_array = info->val;
> >
> > -    for ( i = 0; i < cos_num; i++ )
> > +    for ( i = 0; i < ARRAY_SIZE(feat_props); i++ )
> >      {
> > -        if ( feat->cos_reg_val[cos * cos_num + i] != info->val[i] )
> > +        struct feat_node *feat = info->features[i];
> > +        const struct feat_props *props = info->props[i];
> > +        unsigned int cos_num, j;
> > +
> > +        if ( !feat || !props )
> > +            continue;
> > +
> > +        cos_num = props->cos_num;
> > +        if ( array_len < cos_num )
>
> Not sure you need array_len, couldn't you use:
>
> if ( index + cos_num >= info->array_len )
>     return;
>
> ?
>
Looks good. Thanks!

> Thanks, Roger.
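A minimal standalone sketch of the capacity bitmask (CBM) validity check discussed in this message, with the two early checks joined into a single `if` as suggested in the review. It is not the Xen implementation: the function name `cbm_is_valid` is hypothetical, and it uses a plain bit trick for the contiguity test instead of Xen's `find_first_bit`/`find_next_zero_bit` helpers.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/*
 * Hypothetical helper (not from the patch): returns true when cbm has at
 * least one bit set, no bit set at or above cbm_len, and all set bits
 * form one contiguous run.
 */
static bool cbm_is_valid(unsigned long cbm, unsigned int cbm_len)
{
    const unsigned int width = sizeof(cbm) * CHAR_BIT;
    unsigned long lowest;

    /* Precondition assumed by this sketch: a sane bitmask length. */
    assert(cbm_len > 0 && cbm_len <= width);

    /* Joined check: non-empty, and nothing set outside [0, cbm_len). */
    if ( cbm == 0 || (cbm_len < width && (cbm >> cbm_len) != 0) )
        return false;

    /*
     * Adding the lowest set bit to a contiguous run carries through the
     * whole run, so the sum shares no bits with the original value.
     */
    lowest = cbm & (~cbm + 1);
    return ((cbm + lowest) & cbm) == 0;
}
```

For example, `cbm_is_valid(0x3c, 8)` accepts the mask, while `cbm_is_valid(0x5, 8)` rejects it because the set bits are not contiguous.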
Re: [Xen-devel] [PATCH v3 06/15] x86: implement get value interface for MBA
On 17-09-19 10:15:42, Roger Pau Monné wrote:
> On Tue, Sep 05, 2017 at 05:32:28PM +0800, Yi Sun wrote:
> > diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
> > index 696eff2..7902af7 100644
> > --- a/xen/arch/x86/domctl.c
> > +++ b/xen/arch/x86/domctl.c
> > @@ -1496,6 +1496,13 @@ long arch_do_domctl(
> >              copyback = true;
> >              break;
> >
> > +        case XEN_DOMCTL_PSR_ALLOC_GET_MBA_THRTL:
> > +            ret = psr_get_val(d, domctl->u.psr_alloc.target,
> > +                              &val32, PSR_TYPE_MBA_THRTL);
> > +            domctl->u.psr_alloc.data = val32;
>
> Hm, why does psr_get_val take a uint32_t * instead of a uint64_t *? So
> that you can directly pass &domctl->u.psr_alloc.data.
>
> Or the other way around, why is domctl->u.psr_alloc.data a uint64_t
> instead of a uint32_t?
>
There is a historical reason. The COS MSR is 64-bit, so the original L3 CAT
code (submitted years ago) used uint64_t. During the L2 CAT review, Jan
commented that uint64_t is not necessary in psr.c, so we converted it to
uint32_t there, which results in the code you see here.

> Thanks, Roger.
Re: [Xen-devel] [PATCH v3 05/15] x86: implement get hw info flow for MBA
On 17-09-19 10:08:22, Roger Pau Monné wrote:
> On Tue, Sep 05, 2017 at 05:32:27PM +0800, Yi Sun wrote:
> > diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
> > index 10776d2..0486d2d 100644
> > --- a/xen/arch/x86/psr.c
> > +++ b/xen/arch/x86/psr.c
> > @@ -491,7 +495,18 @@ static const struct feat_props l2_cat_props = {
> >  static bool mba_get_feat_info(const struct feat_node *feat,
> >                                uint32_t data[], unsigned int array_len)
> >  {
> > -    return false;
> > +    if ( array_len != PSR_INFO_ARRAY_SIZE )
> > +        return false;
> > +
> > +    data[PSR_INFO_IDX_COS_MAX] = feat->cos_max;
> > +    data[PSR_INFO_IDX_MBA_THRTL_MAX] = feat->mba.thrtl_max;
> > +
> > +    if ( feat->mba.linear )
> > +        data[PSR_INFO_IDX_MBA_FLAG] |= XEN_SYSCTL_PSR_ALLOC_MBA_LINEAR;
> > +    else
> > +        data[PSR_INFO_IDX_MBA_FLAG] &= ~XEN_SYSCTL_PSR_ALLOC_MBA_LINEAR;
>
> This branch of the if shouldn't be needed...
>
> > +
> > +    return true;
> >  }
> >
> >  static void mba_write_msr(unsigned int cos, uint32_t val,
> > diff --git a/xen/arch/x86/sysctl.c b/xen/arch/x86/sysctl.c
> > index 1d3dbd0..4634cad 100644
> > --- a/xen/arch/x86/sysctl.c
> > +++ b/xen/arch/x86/sysctl.c
> > @@ -214,6 +214,25 @@ long arch_do_sysctl(
> >          break;
> >      }
> >
> > +    case XEN_SYSCTL_PSR_ALLOC_get_mba_info:
> > +    {
> > +        ret = psr_get_info(sysctl->u.psr_alloc.target,
> > +                           PSR_TYPE_MBA_THRTL, data, ARRAY_SIZE(data));
>
> ... because data should be initialized, ie:
>
> uint32_t data[PSR_INFO_ARRAY_SIZE] = { 0 };
>
> So that we don't leak stack data in the sysctl.
>
Ok, thanks!

> Thanks, Roger.
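A minimal sketch of the point made in this review: if the caller zero-initialises the info array, the fill routine only needs to set the linear flag and never has to clear it, and no stale stack contents can reach the caller. All names here (`INFO_ARRAY_SIZE`, `IDX_*`, `FLAG_LINEAR`, `struct mba_hw`, `fill_mba_info`, `query_mba_flag`) are hypothetical stand-ins for the patch's identifiers, and the hardware values are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the PSR_INFO_* constants in the patch. */
#define INFO_ARRAY_SIZE 3
enum { IDX_COS_MAX, IDX_THRTL_MAX, IDX_FLAG };
#define FLAG_LINEAR 1u

/* Hypothetical, reduced view of the per-feature hardware info. */
struct mba_hw {
    uint32_t cos_max;
    uint32_t thrtl_max;
    bool linear;
};

/* Fills data[]; relies on the caller having zeroed the array. */
static bool fill_mba_info(const struct mba_hw *hw, uint32_t data[],
                          unsigned int array_len)
{
    if ( array_len != INFO_ARRAY_SIZE )
        return false;

    data[IDX_COS_MAX] = hw->cos_max;
    data[IDX_THRTL_MAX] = hw->thrtl_max;

    /* No else branch needed: the flag word starts out as zero. */
    if ( hw->linear )
        data[IDX_FLAG] |= FLAG_LINEAR;

    return true;
}

/* Caller side: zero-initialise so no stack garbage is ever copied out. */
static uint32_t query_mba_flag(bool linear)
{
    struct mba_hw hw = { .cos_max = 7, .thrtl_max = 90, .linear = linear };
    uint32_t data[INFO_ARRAY_SIZE] = { 0 };
    bool ok = fill_mba_info(&hw, data, INFO_ARRAY_SIZE);

    assert(ok);
    return data[IDX_FLAG];
}
```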
[Xen-devel] [ovmf baseline-only test] 72129: all pass
This run is configured for baseline tests only.

flight 72129 ovmf real [real]
http://osstest.xs.citrite.net/~osstest/testlogs/logs/72129/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf                 424a5ec33b3d5a842bff3f4695d0bd709c91a163
baseline version:
 ovmf                 a3a4737051010a94832f7bceaa1fa414d7259da0

Last test of basis    72127  2017-09-19 16:20:35 Z    0 days
Testing same since    72129  2017-09-19 23:49:40 Z    0 days    1 attempts

People who touched revisions under test:
  Ard Biesheuvel

jobs:
 build-amd64-xsm                                              pass
 build-i386-xsm                                               pass
 build-amd64                                                  pass
 build-i386                                                   pass
 build-amd64-libvirt                                          pass
 build-i386-libvirt                                           pass
 build-amd64-pvops                                            pass
 build-i386-pvops                                             pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64                         pass
 test-amd64-i386-xl-qemuu-ovmf-amd64                          pass

sg-report-flight on osstest.xs.citrite.net
logs: /home/osstest/logs
images: /home/osstest/images

Logs, config files, etc. are available at
    http://osstest.xs.citrite.net/~osstest/testlogs/logs

Test harness code can be found at
    http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary

Push not applicable.

commit 424a5ec33b3d5a842bff3f4695d0bd709c91a163
Author: Ard Biesheuvel
Date:   Fri Sep 15 16:06:02 2017 -0700

    BaseTools/tools_def AARCH64: enable frame pointers for RELEASE builds

    Commit 8f0b62a5dac0 ("BaseTools/tools_def AARCH64: enable frame pointers
    for DEBUG builds") removed the -fomit-frame-pointer switch from the CFLAGS
    definitions that are shared between AARCH64 DEBUG and RELEASE builds, and
    moved it to the RELEASE specific ones, so that DEBUG builds can produce
    a backtrace when a crash occurs.

    This is actually a useful thing to have for RELEASE builds as well.
    AArch64 has 30 general purpose registers, and so the performance hit of
    having a frame pointer is unlikely to be noticeable, nor are the additional
    8 bytes of stack space likely to present a problem.

    So remove -fomit-frame-pointer altogether this time.

    Contributed-under: TianoCore Contribution Agreement 1.1
    Signed-off-by: Ard Biesheuvel
    Reviewed-by: Leif Lindholm
    Reviewed-by: Liming Gao

commit 4bbcc285d5f74d34ec40733dde807f5a4f0cdf8c
Author: Ard Biesheuvel
Date:   Mon Sep 11 17:50:29 2017 +0100

    ArmPkg/PlatformBootManagerLib: process pending capsules

    Process any capsule HOBs that were left for us by CapsulePei. This
    involves calling ProcessCapsules() twice, as explained in the comment
    in DxeCapsuleLibFmp [sic].

    1) The first call must be before EndOfDxe. The system capsules is
       processed. If device capsule FMP protocols are exposted at this time
       and device FMP capsule has zero EmbeddedDriverCount, the device
       capsules are processed. Each individual capsule result is recorded in
       capsule record variable. System may reset in this function, if reset
       is required by capsule and all capsules are processed. If not all
       capsules are processed, reset will be defered to second call.

    2) The second call must be after EndOfDxe and after ConnectAll, so that
       all device capsule FMP protocols are exposed. The system capsules are
       skipped. If the device capsules are NOT processed in first call, they
       are processed here. Each individual capsule result is recorded in
       capsule record variable. System may reset in this function, if reset
       is required by capsule processed in first call and second call.

    Contributed-under: TianoCore Contribution Agreement 1.1
    Signed-off-by: Ard Biesheuvel
    Reviewed-by: Leif Lindholm
[Xen-devel] [linux-4.1 baseline-only test] 72128: regressions - trouble: blocked/broken/fail/pass
This run is configured for baseline tests only. flight 72128 linux-4.1 real [real] http://osstest.xs.citrite.net/~osstest/testlogs/logs/72128/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-armhf-armhf-xl 7 xen-boot fail REGR. vs. 71948 test-armhf-armhf-examine 7 rebootfail REGR. vs. 71948 test-armhf-armhf-xl-credit2 7 xen-boot fail REGR. vs. 71948 test-armhf-armhf-xl-multivcpu 7 xen-boot fail REGR. vs. 71948 test-armhf-armhf-libvirt 7 xen-boot fail REGR. vs. 71948 test-armhf-armhf-xl-midway7 xen-boot fail REGR. vs. 71948 test-armhf-armhf-libvirt-raw 7 xen-boot fail REGR. vs. 71948 test-armhf-armhf-xl-vhd 7 xen-boot fail REGR. vs. 71948 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-localmigrate/x10 fail REGR. vs. 71948 Regressions which are regarded as allowable (not blocking): test-armhf-armhf-xl-rtds 7 xen-boot fail REGR. vs. 71948 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stopfail REGR. vs. 71948 Tests which did not succeed, but are not blocking: test-arm64-arm64-libvirt-xsm 1 build-check(1) blocked n/a test-arm64-arm64-xl 1 build-check(1) blocked n/a build-arm64-libvirt 1 build-check(1) blocked n/a test-arm64-arm64-examine 1 build-check(1) blocked n/a test-arm64-arm64-xl-credit2 1 build-check(1) blocked n/a test-arm64-arm64-xl-xsm 1 build-check(1) blocked n/a build-arm64 2 hosts-allocate broken never pass build-arm64-pvops 2 hosts-allocate broken never pass build-arm64-xsm 2 hosts-allocate broken never pass build-arm64 3 capture-logs broken never pass build-arm64-pvops 3 capture-logs broken never pass build-arm64-xsm 3 capture-logs broken never pass test-armhf-armhf-libvirt-xsm 7 xen-boot fail blocked in 71948 test-armhf-armhf-xl-xsm 7 xen-boot fail blocked in 71948 test-amd64-amd64-qemuu-nested-intel 17 debian-hvm-install/l1/l2 fail like 71948 test-amd64-i386-xl-qemut-win7-amd64 16 guest-localmigrate/x10 fail like 71948 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 71948 
test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass test-amd64-amd64-xl-qemuu-ws16-amd64 10 windows-installfail never pass test-amd64-i386-libvirt-xsm 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt 13 migrate-support-checkfail never pass test-amd64-i386-xl-qemuu-ws16-amd64 10 windows-install fail never pass test-amd64-i386-libvirt 13 migrate-support-checkfail never pass test-amd64-i386-xl-qemut-ws16-amd64 10 windows-install fail never pass test-amd64-amd64-xl-qemut-ws16-amd64 10 windows-installfail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2 fail never pass test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail never pass test-amd64-i386-xl-qemut-win10-i386 17 guest-stop fail never pass test-amd64-amd64-xl-qemut-win10-i386 17 guest-stop fail never pass version targeted for testing: linux5fbef6af7dd9a92605bb7c426f26bd122fd0cd74 baseline version: linux1af952704416d76ad86963f04feb10a3da143901 Last test of basis71948 2017-08-07 21:48:50 Z 43 days Testing same since72128 2017-09-19 21:52:44 Z0 days1 attempts People who touched revisions under test: "Eric W. Biederman"Akinobu Mita Alan Stern Alan Swanson Alex Deucher Alex Williamson Alexander Potapenko Andrea Righi Andrew Morton Andrey Ryabinin Andrzej Hajda Anna Schumaker Anton Blanchard Ard Biesheuvel Arnaldo Carvalho de Melo Arnd Bergmann Arvind Yadav
Re: [Xen-devel] [PATCH v3 04/15] x86: implement data structure and CPU init flow for MBA
On 17-09-19 09:55:28, Roger Pau Monné wrote:
> On Tue, Sep 05, 2017 at 05:32:26PM +0800, Yi Sun wrote:
> > This patch implements main data structures of MBA.
> >
> > Like CAT features, MBA HW info has cos_max which means the max thrtl
> > register number, and thrtl_max which means the max throttle value
> > (delay value). It also has a flag to represent if the throttle
> > value is linear or not.
> >
> > One thrtl register of MBA stores a throttle value for one or more
> > domains. The throttle value means the transaction time between L2
> > cache and next level memory to be delayed.
>
> "The throttle value contains the delay between L2 cache and the next
> cache level."
>
> Seems better, but I'm not a native speaker anyway.
>
Or:
"The throttle value means the delay between L2 cache and the next cache level."

[...]
> >  struct feat_node {
> > -    /* cos_max and cbm_len are common values for all features so far. */
> > +    /* cos_max is common values for all features so far. */
>
> ...common among all features...
>
Ok, thanks!

[...]
> > +static int mba_init_feature(const struct cpuid_leaf *regs,
> > +                            struct feat_node *feat,
> > +                            struct psr_socket_info *info,
> > +                            enum psr_feat_type type)
> > +{
> > +    /* No valid value so do not enable feature. */
> > +    if ( !regs->a || !regs->d )
> > +        return -ENOENT;
> > +
> > +    if ( type != FEAT_TYPE_MBA )
> > +        return -ENOENT;
>
> You can join the two checks above in a single if.
>
Sure.

> > +
> > +    feat->cos_max = min(opt_cos_max, regs->d & CAT_COS_MAX_MASK);
> > +    if ( feat->cos_max < 1 )
> > +        return -ENOENT;
> > +
> > +    feat->mba.thrtl_max = (regs->a & MBA_THRTL_MAX_MASK) + 1;
> > +
> > +    if ( regs->c & MBA_LINEAR_MASK )
> > +    {
> > +        feat->mba.linear = true;
> > +
> > +        if ( feat->mba.thrtl_max >= 100 )
> > +            return -ENOENT;
> > +    }
> > +
> > +    /* We reserve cos=0 as default thrtl (0) which means no delay. */
> > +    feat->cos_reg_val[0] = 0;
>
> AFAICT feat is allocated using xzalloc, so this will already be 0.
>
Yes, you are right. My original purpose was to explicitly let the reader know
that 'cos=0' is reserved. But the code is redundant, so I will remove it.

> > @@ -1389,6 +1480,7 @@ static void psr_cpu_init(void)
> >      unsigned int socket, cpu = smp_processor_id();
> >      struct feat_node *feat;
> >      struct cpuid_leaf regs;
> > +    uint32_t reg_b;
>
> Not sure of the benefit between using regs.b or reg_b (it's only 1
> char shorter).
>
As you can see, 'regs' is overwritten in the code below, so 'regs.b' is not
preserved. By adding a new local variable 'reg_b' here, we avoid calling
'cpuid_count_leaf' again for L2 CAT and MBA.

> >
> >      if ( !psr_alloc_feat_enabled() || !boot_cpu_has(X86_FEATURE_PQE) )
> >          goto assoc_init;
> > @@ -1407,7 +1499,8 @@ static void psr_cpu_init(void)
> >      spin_lock_init(&info->ref_lock);
> >
> >      cpuid_count_leaf(PSR_CPUID_LEVEL_CAT, 0, &regs);
> > -    if ( regs.b & PSR_RESOURCE_TYPE_L3 )
> > +    reg_b = regs.b;
> > +    if ( reg_b & PSR_RESOURCE_TYPE_L3 )
> >      {
> >          cpuid_count_leaf(PSR_CPUID_LEVEL_CAT, 1, &regs);
> >
> > @@ -1428,8 +1521,7 @@ static void psr_cpu_init(void)
> >          }
> >      }
> >
> > -    cpuid_count_leaf(PSR_CPUID_LEVEL_CAT, 0, &regs);
> > -    if ( regs.b & PSR_RESOURCE_TYPE_L2 )
> > +    if ( reg_b & PSR_RESOURCE_TYPE_L2 )
> >      {
> >          cpuid_count_leaf(PSR_CPUID_LEVEL_CAT, 2, &regs);
> >
> > @@ -1441,6 +1533,18 @@ static void psr_cpu_init(void)
> >          feat_l2_cat = feat;
> >      }
> >
> > +    if ( reg_b & PSR_RESOURCE_TYPE_MBA )
> > +    {
> > +        cpuid_count_leaf(PSR_CPUID_LEVEL_CAT, 3, &regs);
> > +
> > +        feat = feat_mba;
> > +        feat_mba = NULL;
> > +        if ( !mba_init_feature(&regs, feat, info, FEAT_TYPE_MBA) )
>
> Seems kind of pointless that mba_init_feature returns an error code
> when it's ignored by its callers. You could switch it to bool if you
> are going to use it like that.
>
Hmm, bool type seems better. Thanks!

> Thanks, Roger.
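A minimal sketch of the decoding done by `mba_init_feature` in this message, reworked to return `bool` as the review suggests. The mask values are assumptions based on the usual description of CPUID leaf 0x10, sub-leaf 3 (EAX bits 11:0 hold the maximum throttling value minus one, ECX bit 2 marks a linear response, EDX bits 15:0 hold the highest COS number); they stand in for the patch's `MBA_THRTL_MAX_MASK`, `MBA_LINEAR_MASK` and `CAT_COS_MAX_MASK` and should be checked against the SDM. The `opt_cos_max` clamp from the patch is left out, and all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed mask values; verify against the Intel SDM before relying on them. */
#define SKETCH_THRTL_MAX_MASK 0xfffu
#define SKETCH_LINEAR_MASK    0x4u
#define SKETCH_COS_MAX_MASK   0xffffu

/* Hypothetical, reduced version of the per-feature MBA info. */
struct mba_hw_info {
    unsigned int cos_max;
    unsigned int thrtl_max;
    bool linear;
};

/* Returns true when the feature is usable and *out has been filled in. */
static bool mba_decode(uint32_t eax, uint32_t ecx, uint32_t edx,
                       struct mba_hw_info *out)
{
    /* No valid value, so do not enable the feature. */
    if ( !eax || !edx )
        return false;

    out->cos_max = edx & SKETCH_COS_MAX_MASK;
    if ( out->cos_max < 1 )
        return false;

    out->thrtl_max = (eax & SKETCH_THRTL_MAX_MASK) + 1;
    out->linear = (ecx & SKETCH_LINEAR_MASK) != 0;

    /* In linear mode a maximum of 100 or more is rejected, as in the patch. */
    if ( out->linear && out->thrtl_max >= 100 )
        return false;

    return true;
}

/* Convenience wrapper for callers that only need thrtl_max (0 on failure). */
static unsigned int mba_thrtl_max_or_zero(uint32_t eax, uint32_t ecx,
                                          uint32_t edx)
{
    struct mba_hw_info info = { 0 };

    assert((SKETCH_THRTL_MAX_MASK & SKETCH_LINEAR_MASK) != 0); /* sanity */
    return mba_decode(eax, ecx, edx, &info) ? info.thrtl_max : 0;
}
```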
Re: [Xen-devel] [PATCH v3 02/15] Rename PSR sysctl/domctl interfaces and xsm policy to make them be general
On 17-09-19 09:03:38, Roger Pau Monné wrote:
> On Tue, Sep 05, 2017 at 05:32:24PM +0800, Yi Sun wrote:
> > diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
> > index 0669c31..a953157 100644
> > --- a/xen/include/public/domctl.h
> > +++ b/xen/include/public/domctl.h
> > -struct xen_domctl_psr_cat_op {
> > -#define XEN_DOMCTL_PSR_CAT_OP_SET_L3_CBM     0
> > -#define XEN_DOMCTL_PSR_CAT_OP_GET_L3_CBM     1
> > -#define XEN_DOMCTL_PSR_CAT_OP_SET_L3_CODE    2
> > -#define XEN_DOMCTL_PSR_CAT_OP_SET_L3_DATA    3
> > -#define XEN_DOMCTL_PSR_CAT_OP_GET_L3_CODE    4
> > -#define XEN_DOMCTL_PSR_CAT_OP_GET_L3_DATA    5
> > -#define XEN_DOMCTL_PSR_CAT_OP_SET_L2_CBM     6
> > -#define XEN_DOMCTL_PSR_CAT_OP_GET_L2_CBM     7
> > +struct xen_domctl_psr_alloc {
> > +#define XEN_DOMCTL_PSR_ALLOC_SET_L3_CBM     0
> > +#define XEN_DOMCTL_PSR_ALLOC_GET_L3_CBM     1
> > +#define XEN_DOMCTL_PSR_ALLOC_SET_L3_CODE    2
> > +#define XEN_DOMCTL_PSR_ALLOC_SET_L3_DATA    3
> > +#define XEN_DOMCTL_PSR_ALLOC_GET_L3_CODE    4
> > +#define XEN_DOMCTL_PSR_ALLOC_GET_L3_DATA    5
> > +#define XEN_DOMCTL_PSR_ALLOC_SET_L2_CBM     6
> > +#define XEN_DOMCTL_PSR_ALLOC_GET_L2_CBM     7
>
> IMHO, the _ALLOC_ part is not needed here, ALLOC_GET/SET seems quite
> weird to me, and redundant, since the type itself already contains
> _alloc.
Ok.

>
> >      uint32_t cmd;       /* IN: XEN_DOMCTL_PSR_CAT_OP_* */
>
> This comment needs fixing.
>
Yes, thanks!

[...]
> > diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
> > index 9e51af6..4759b10 100644
> > --- a/xen/include/public/sysctl.h
> > +++ b/xen/include/public/sysctl.h
> > @@ -36,7 +36,7 @@
> >  #include "physdev.h"
> >  #include "tmem.h"
> >
> > -#define XEN_SYSCTL_INTERFACE_VERSION 0x000F
> > +#define XEN_SYSCTL_INTERFACE_VERSION 0x0010
> >
> >  /*
> >   * Read console content from Xen buffer ring.
> > @@ -743,22 +743,22 @@ struct xen_sysctl_pcitopoinfo {
> >  typedef struct xen_sysctl_pcitopoinfo xen_sysctl_pcitopoinfo_t;
> >  DEFINE_XEN_GUEST_HANDLE(xen_sysctl_pcitopoinfo_t);
> >
> > -#define XEN_SYSCTL_PSR_CAT_get_l3_info               0
> > -#define XEN_SYSCTL_PSR_CAT_get_l2_info               1
> > -struct xen_sysctl_psr_cat_op {
> > +#define XEN_SYSCTL_PSR_ALLOC_get_l3_info               0
> > +#define XEN_SYSCTL_PSR_ALLOC_get_l2_info               1
>
> Same here, I would drop the _ALLOC_.
>
Ok.

> Thanks, Roger.
Re: [Xen-devel] [PATCH v3 01/15] docs: create Memory Bandwidth Allocation (MBA) feature document
On 17-09-18 18:16:40, Roger Pau Monn� wrote: > On Tue, Sep 05, 2017 at 05:32:23PM +0800, Yi Sun wrote: > > +* xl interfaces: > > + > > + 1. `psr-mba-show [domain-id]`: > > Is this limited to domain-id, or one can also use the domain name? > Most of the xl commands accept either a domain-id or a domain-name. > Both domain-id and domain-name can show it. I thought this is by default and no need to explicitly declare. If I am wrong, I will change it as below: `psr-mba-show [domain-id/domain-name]` [...] > > + 2. `psr-mba-set [OPTIONS] `: > > + > > + Set memory bandwidth throttling for domain. > > + > > + Options: > > + '-s': Specify the socket to process, otherwise all sockets are > > processed. > > + > > + Throttling value set in register implies the approximate amount of > > delaying > > + the traffic between core and memory. The higher throttling value > > results in > > + lower bandwidth. The max throttling value (MBA_MAX) supported can be > > got > > s/got/obtained/ > Thanks! > > + through CPUID. > > How can one get this value empirically? Do I need to use a external > tool? > Sorry for confusion. In fact, the MBA_MAX is got through CPUID in hypervisor. User can know it through psr-hwinfo. Will explain it. > > + > > + Linear mode: the input precision is defined as 100-(MBA_MAX). For > > instance, > > + if the MBA_MAX value is 90, the input precision is 10%. Values not an > > even > > + multiple of the precision (e.g., 12%) will be rounded down (e.g., to > > 10% > > + delay applied) by HW automatically. > > + > > + Non-linear mode: input delay values are powers-of-two from zero to the > > + MBA_MAX value from CPUID. In this case any values not a power of two > > will > > + be rounded down the next nearest power of two by HW automatically. > > Both of the above descriptions should be moved to mba-show IMHO, the > description there is incomplete and not helpful. > Ok, thanks! 
> > + > > +# Technical details > > + > > +MBA is a member of Intel PSR features, it shares the base PSR > > infrastructure > > +in Xen. > > + > > +## Hardware perspective > > + > > + MBA defines a range of MSRs to support specifying a delay value (Thrtl) > > per > > + COS, with details below. > > + > > + ``` > > + +++ > > + | MSR (per socket) |Address | > > + +++ > > + | IA32_L2_QOS_Ext_BW_Thrtl_0 | 0xD50 | > > + +++ > > + | ...| ... | > > + +++ > > + | IA32_L2_QOS_Ext_BW_Thrtl_n | 0xD50+n| > > + +++ > > + ``` > > + > > + When context switch happens, the COS ID of domain is written to > > per-thread MSR > > + `IA32_PQR_ASSOC`, and then hardware enforces bandwidth allocation > > according > > I think this is missing some context of the relation between a thread > and the MSR. I assume it's related to IA32_PQR_ASSOC, but I have no > idea what that constant means. > > What's more, Xen doesn't have threads, so you should maybe speak about > vCPUs instead? > As Jan's comment, this is for 'per-hyper-thread'. [...] > > +## Implementation Description > > + > > +* Hypervisor interfaces: > > + > > + 1. Boot line param: "psr=mba" to enable the feature. > > + > > + 2. SYSCTL: > > + - XEN_SYSCTL_PSR_MBA_get_info: Get system MBA information. > > So this is likely how one gets the mentioned MBA_MAX? > Yup. > > + > > + 3. DOMCTL: > > + - XEN_DOMCTL_PSR_MBA_OP_GET_THRTL: Get throttling for a domain. > > + - XEN_DOMCTL_PSR_MBA_OP_SET_THRTL: Set throttling for a domain. > > + > > +* xl interfaces: > > + > > + 1. psr-mba-show [domain-id] > > + Show system/domain runtime MBA throttling value. For linear mode, > > + it shows the decimal value. For non-linear mode, it shows > > hexadecimal > > + value. > > + => XEN_SYSCTL_PSR_MBA_get_info/XEN_DOMCTL_PSR_MBA_OP_GET_THRTL > > + > > + 2. psr-mba-set [OPTIONS] > > + Set bandwidth throttling for a domain. > > + => XEN_DOMCTL_PSR_MBA_OP_SET_THRTL > > + > > + 3. psr-hwinfo > > + Show PSR HW information, including L3 CAT/CDP/L2 CAT/MBA. 
> > + => XEN_SYSCTL_PSR_MBA_get_info > > 'psr-hwinfo' seems to be completely missing from the 'xl interfaces:' > section above. > Because this is not a newly added interface, I do not describe it in 'xl interfaces'. Is that necessary? > > +* Key data structure: > > + > > + 1. Feature HW info > > + > > + ``` > > + struct { > > + unsigned int thrtl_max; > > + bool linear; > > + } mba; > > + > > + - Member `thrtl_max` > > + > > + `thrtl_max` is the max throttling value to be set, i.e. MBA_MAX. > > + > > + - Member `linear` > > + > >
Re: [Xen-devel] [PATCH v3 01/15] docs: create Memory Bandwidth Allocation (MBA) feature document
On 17-09-19 00:07:36, Jan Beulich wrote:
> >>> Roger Pau Monné 09/18/17 7:21 PM >>>
> >On Tue, Sep 05, 2017 at 05:32:23PM +0800, Yi Sun wrote:
> >> +## Hardware perspective
> >> +
> >> +  MBA defines a range of MSRs to support specifying a delay value (Thrtl) per
> >> +  COS, with details below.
> >> +
> >> +  ```
> >> +   +----------------------------+----------------+
> >> +   | MSR (per socket)           |    Address     |
> >> +   +----------------------------+----------------+
> >> +   | IA32_L2_QOS_Ext_BW_Thrtl_0 |     0xD50      |
> >> +   +----------------------------+----------------+
> >> +   |            ...             |      ...       |
> >> +   +----------------------------+----------------+
> >> +   | IA32_L2_QOS_Ext_BW_Thrtl_n |     0xD50+n    |
> >> +   +----------------------------+----------------+
> >> +  ```
> >> +
> >> +  When context switch happens, the COS ID of domain is written to per-thread MSR
> >> +  `IA32_PQR_ASSOC`, and then hardware enforces bandwidth allocation according
> >
> >I think this is missing some context of the relation between a thread
> >and the MSR. I assume it's related to IA32_PQR_ASSOC, but I have no
> >idea what that constant means.
> >
> >What's more, Xen doesn't have threads, so you should maybe speak about
> >vCPUs instead?
>
> I think talk is of hardware aspects here, i.e. "thread" as in "hyper-thread".
>
> Jan
>
Indeed. Will make it more clear.
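A minimal sketch of two details from the MBA document reviewed in this thread: the per-COS throttle MSR address from the table above (0xD50 + n), and the rounding that the document says hardware applies to throttle values (down to a multiple of `100 - MBA_MAX` in linear mode, down to a power of two in non-linear mode). The helpers only model the described behaviour for illustration; their names are hypothetical, they are not part of Xen, and they assume the caller has already checked that the requested value does not exceed MBA_MAX.

```c
#include <assert.h>
#include <stdint.h>

/* Base address of IA32_L2_QOS_Ext_BW_Thrtl_0, taken from the table above. */
#define MBA_THRTL_MSR_BASE 0xD50u

/* MSR address holding the throttle value for a given COS ID. */
static uint32_t mba_thrtl_msr(uint32_t cos)
{
    return MBA_THRTL_MSR_BASE + cos;
}

/*
 * Linear mode: precision is 100 - MBA_MAX, and values are rounded down
 * to a multiple of that precision (e.g. 12 becomes 10 when MBA_MAX is 90).
 */
static uint32_t mba_round_linear(uint32_t thrtl, uint32_t mba_max)
{
    uint32_t precision;

    /* Assumption of this sketch: caller validated the inputs. */
    assert(mba_max < 100 && thrtl <= mba_max);

    precision = 100 - mba_max;
    return thrtl - thrtl % precision;
}

/* Non-linear mode: round down to a power of two; zero stays zero. */
static uint32_t mba_round_pow2(uint32_t thrtl, uint32_t mba_max)
{
    uint32_t p = 1;

    assert(thrtl <= mba_max);

    if ( thrtl == 0 )
        return 0;

    /* Double p while the next power of two still fits within thrtl. */
    while ( p <= thrtl / 2 )
        p *= 2;

    return p;
}
```

With MBA_MAX of 90, a request of 12 is modelled as being applied as 10, matching the example in the document.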
Re: [Xen-devel] [PATCH] vt-d: use two 32-bit writes to update DMAR fault address registers
> From: Roger Pau Monné [mailto:roger@citrix.com]
> Sent: Monday, September 18, 2017 5:10 PM
>
> On Mon, Sep 18, 2017 at 05:05:18PM +0800, Haozhong Zhang wrote:
> > On 09/18/17 02:30 -0600, Jan Beulich wrote:
> > > >>> On 18.09.17 at 10:18, wrote:
> > > >> From: Jan Beulich [mailto:jbeul...@suse.com]
> > > >> Sent: Monday, September 11, 2017 6:03 PM
> > > >>
> > > >> >>> On 11.09.17 at 08:00, wrote:
> > > >> > The 64-bit DMAR fault address is composed of two 32 bits registers
> > > >> > DMAR_FEADDR_REG and DMAR_FEUADDR_REG. According to VT-d spec:
> > > >> > "Software is expected to access 32-bit registers as aligned doublewords",
> > > >> > a hypervisor should use two 32-bit writes to DMAR_FEADDR_REG and
> > > >> > DMAR_FEUADDR_REG separately in order to update a 64-bit fault address,
> > > >> > rather than a 64-bit write to DMAR_FEADDR_REG.
> > > >> >
> > > >> > Though I haven't seen any errors caused by such one 64-bit write on
> > > >> > real machines, it's still better to follow the specification.
> > > >>
> > > >> Any sane chipset should split qword accesses into dword ones if
> > > >> they can't be handled at some layer. Also if you undo something
> > > >> explicitly done by an earlier commit, please quote that commit
> > > >> and say what was wrong. After all Kevin as the VT-d maintainer
> > > >> agreed with the change back then.
> > > >
> > > > I'm OK with this change.
> > >
> > > Hmm, would you mind explaining? You were also okay with the
> > > change in the opposite direction back then, and we've had no
> > > reports of problems.
> > >
> >
> > I haven't seen any issues of the current 64-bit write on recent Intel
> > Haswell, Broadwell and Skylake Xeon platforms, so I guess the hardware
> > can properly handle the 64-bits write to contiguous 32-bit registers.
> >
> > I actually encountered errors when running Xen on KVM/QEMU with QEMU
> > vIOMMU enabled, which (QEMU) disallows 64-bit writes to 32-bit
> > registers and aborts if such writes happen.
> >
> > If this patch is considered senseless (as it does not fix any errors
> > on real hardware), I'm fine to fix the above abort on QEMU side (i.e.,
> > let vIOMMU in QEMU follow the behavior of real hardware).
>
> I think that either the spec is changed to mention that quad-word
> accesses are allowed, or this patch is applied.
>
> There's nothing wrong with the QEMU implementation, it adheres to the
> spec. So unless the spec is changed, we might see issues with other
> emulated DMAR units.
>

I checked with our hardware guy. It's recommended to strictly follow
what the spec says. There is no hardware-level guarantee that a 64b write
will touch both FEADDR and FEUADDR. So we should fix the Xen side.

Thanks
Kevin
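A minimal sketch of the approach agreed on in this thread: updating the 64-bit fault address with two aligned 32-bit writes, one to DMAR_FEADDR_REG and one to DMAR_FEUADDR_REG, instead of a single 64-bit write. The register offsets (0x40 and 0x44) are given as I recall them from the VT-d register layout and should be verified against the specification; the function names are hypothetical, and the round-trip helper uses an ordinary array as a fake register file rather than real MMIO.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed register offsets; check them against the VT-d specification. */
#define SKETCH_FEADDR_REG  0x40u  /* fault event address, low 32 bits  */
#define SKETCH_FEUADDR_REG 0x44u  /* fault event address, high 32 bits */

/* Writes a 64-bit fault address as two separate aligned 32-bit stores. */
static void write_fault_addr(volatile uint32_t *regs, uint64_t addr)
{
    regs[SKETCH_FEADDR_REG / 4] = (uint32_t)addr;
    regs[SKETCH_FEUADDR_REG / 4] = (uint32_t)(addr >> 32);
}

/* Illustration only: exercise the writer against a fake register file. */
static uint64_t fault_addr_roundtrip(uint64_t addr)
{
    uint32_t fake_regs[(SKETCH_FEUADDR_REG / 4) + 1] = { 0 };

    assert(sizeof(fake_regs) >= SKETCH_FEUADDR_REG + 4);
    write_fault_addr(fake_regs, addr);

    return ((uint64_t)fake_regs[SKETCH_FEUADDR_REG / 4] << 32) |
           fake_regs[SKETCH_FEADDR_REG / 4];
}
```

Because each store is an aligned doubleword access, an emulated DMAR unit that rejects 64-bit writes to 32-bit registers, as the QEMU vIOMMU described above does, would accept this sequence.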
[Xen-devel] [xen-unstable-smoke test] 113612: tolerable all pass - PUSHED
flight 113612 xen-unstable-smoke real [real] http://logs.test-lab.xenproject.org/osstest/logs/113612/ Failures :-/ but no regressions. Tests which did not succeed, but are not blocking: test-amd64-amd64-libvirt 13 migrate-support-checkfail never pass test-armhf-armhf-xl 13 migrate-support-checkfail never pass test-armhf-armhf-xl 14 saverestore-support-checkfail never pass version targeted for testing: xen 64cf3181e4d469a8bd7e7dee8ff2d3bf5b45f4b0 baseline version: xen 5f62fb184fdf2d10e13d4bad28cbe6c8b53be784 Last test of basis 113610 2017-09-19 19:02:29 Z0 days Testing same since 113612 2017-09-20 01:14:16 Z0 days1 attempts People who touched revisions under test: Julien GrallStefano Stabellini jobs: build-amd64 pass build-armhf pass build-amd64-libvirt pass test-armhf-armhf-xl pass test-amd64-amd64-xl-qemuu-debianhvm-i386 pass test-amd64-amd64-libvirt pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : + branch=xen-unstable-smoke + revision=64cf3181e4d469a8bd7e7dee8ff2d3bf5b45f4b0 + . ./cri-lock-repos ++ . ./cri-common +++ . ./cri-getconfig export PERLLIB=.:. PERLLIB=.:. +++ umask 002 +++ getrepos getconfig Repos perl -e ' use Osstest; readglobalconfig(); print $c{"Repos"} or die $!; ' +++ local repos=/home/osstest/repos +++ '[' -z /home/osstest/repos ']' +++ '[' '!' 
-d /home/osstest/repos ']' +++ echo /home/osstest/repos ++ repos=/home/osstest/repos ++ repos_lock=/home/osstest/repos/lock ++ '[' x '!=' x/home/osstest/repos/lock ']' ++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock ++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push xen-unstable-smoke 64cf3181e4d469a8bd7e7dee8ff2d3bf5b45f4b0 + branch=xen-unstable-smoke + revision=64cf3181e4d469a8bd7e7dee8ff2d3bf5b45f4b0 + . ./cri-lock-repos ++ . ./cri-common +++ . ./cri-getconfig export PERLLIB=.:.:. PERLLIB=.:.:. +++ umask 002 +++ getrepos getconfig Repos perl -e ' use Osstest; readglobalconfig(); print $c{"Repos"} or die $!; ' +++ local repos=/home/osstest/repos +++ '[' -z /home/osstest/repos ']' +++ '[' '!' -d /home/osstest/repos ']' +++ echo /home/osstest/repos ++ repos=/home/osstest/repos ++ repos_lock=/home/osstest/repos/lock ++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']' + . ./cri-common ++ . ./cri-getconfig +++ export PERLLIB=.:.:.:. +++ PERLLIB=.:.:.:. ++ umask 002 + select_xenbranch + case "$branch" in + tree=xen + xenbranch=xen-unstable-smoke + qemuubranch=qemu-upstream-unstable + '[' xxen = xlinux ']' + linuxbranch= + '[' xqemu-upstream-unstable = x ']' + select_prevxenbranch ++ ./cri-getprevxenbranch xen-unstable-smoke + prevxenbranch=xen-4.9-testing + '[' x64cf3181e4d469a8bd7e7dee8ff2d3bf5b45f4b0 = x ']' + : tested/2.6.39.x + . 
./ap-common ++ : osst...@xenbits.xen.org +++ getconfig OsstestUpstream +++ perl -e ' use Osstest; readglobalconfig(); print $c{"OsstestUpstream"} or die $!; ' ++ : ++ : git://xenbits.xen.org/xen.git ++ : osst...@xenbits.xen.org:/home/xen/git/xen.git ++ : git://xenbits.xen.org/qemu-xen-traditional.git ++ : git://git.kernel.org ++ : git://git.kernel.org/pub/scm/linux/kernel/git ++ : git ++ : git://xenbits.xen.org/xtf.git ++ : osst...@xenbits.xen.org:/home/xen/git/xtf.git ++ : git://xenbits.xen.org/xtf.git ++ : git://xenbits.xen.org/libvirt.git ++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git ++ : git://xenbits.xen.org/libvirt.git ++ : git://xenbits.xen.org/osstest/rumprun.git ++ : git ++ : git://xenbits.xen.org/osstest/rumprun.git ++ : osst...@xenbits.xen.org:/home/xen/git/osstest/rumprun.git ++ : git://git.seabios.org/seabios.git ++ : osst...@xenbits.xen.org:/home/xen/git/osstest/seabios.git ++ : git://xenbits.xen.org/osstest/seabios.git ++ : https://github.com/tianocore/edk2.git ++ : osst...@xenbits.xen.org:/home/xen/git/osstest/ovmf.git ++ :
[Xen-devel] [qemu-mainline test] 113607: regressions - FAIL
flight 113607 qemu-mainline real [real] http://logs.test-lab.xenproject.org/osstest/logs/113607/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-localmigrate/x10 fail REGR. vs. 113302 test-armhf-armhf-xl-cubietruck 12 guest-startfail REGR. vs. 113302 Regressions which are regarded as allowable (not blocking): test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stopfail REGR. vs. 113302 Tests which did not succeed, but are not blocking: test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail like 113302 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail like 113302 test-amd64-amd64-xl-rtds 10 debian-install fail like 113302 test-armhf-armhf-libvirt 14 saverestore-support-checkfail like 113302 test-amd64-amd64-xl-qemuu-ws16-amd64 10 windows-installfail never pass test-amd64-i386-libvirt-xsm 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt 13 migrate-support-checkfail never pass test-amd64-i386-libvirt 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-armhf-armhf-xl-arndale 13 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 14 saverestore-support-checkfail never pass test-amd64-i386-libvirt-qcow2 12 migrate-support-checkfail never pass test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail never pass test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2 fail never pass test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 13 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 14 
saverestore-support-checkfail never pass test-amd64-i386-xl-qemuu-ws16-amd64 13 guest-saverestore fail never pass test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 13 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail never pass test-armhf-armhf-xl 13 migrate-support-checkfail never pass test-armhf-armhf-xl 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-xsm 13 migrate-support-checkfail never pass test-armhf-armhf-xl-xsm 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-vhd 12 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 13 saverestore-support-checkfail never pass test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass test-armhf-armhf-libvirt 13 migrate-support-checkfail never pass version targeted for testing: qemuu7ec6a364916c0d1eba01128481e503a550a2b466 baseline version: qemuua6e8c1dacfd37d34542e33600dcc50b7683b735a Last test of basis 113302 2017-09-11 10:18:16 Z8 days Failing since113345 2017-09-12 00:21:07 Z8 days 15 attempts Testing same since 113607 2017-09-19 16:17:55 Z0 days1 attempts People who touched revisions under test: Alex BennéeAlexander Graf Alexey Kardashevskiy Alistair Francis Amador Pahim Benjamin Herrenschmidt Cornelia Huck Cédric Le Goater Daniel Henrique Barboza Daniel P. Berrange David Gibson David Hildenbrand Dr. David Alan Gilbert Eduardo Habkost Eduardo Otubo Eduardo Otubo Eric Auger Eric Blake Fam Zheng Feng Kan Gerd Hoffmann Gonglei Greg Kurz Hannes Reinecke Hannes Reinecke Igor Mammedov Jaroslaw Pelczar John Snow Joseph Myers Kamil Rytarowski Kevin Wolf Ladi
[Xen-devel] [linux-linus test] 113605: tolerable FAIL - PUSHED
flight 113605 linux-linus real [real] http://logs.test-lab.xenproject.org/osstest/logs/113605/ Failures :-/ but no regressions. Tests which did not succeed, but are not blocking: test-amd64-i386-xl-qemut-win7-amd64 18 guest-start/win.repeat fail blocked in 113594 test-amd64-amd64-xl-qemuu-win7-amd64 18 guest-start/win.repeat fail blocked in 113594 test-armhf-armhf-libvirt 14 saverestore-support-checkfail like 113594 test-amd64-i386-xl-qemuu-win7-amd64 18 guest-start/win.repeat fail like 113594 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-localmigrate/x10 fail like 113594 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail like 113594 test-amd64-amd64-xl-rtds 10 debian-install fail like 113594 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail like 113594 test-amd64-amd64-xl-qemut-ws16-amd64 10 windows-installfail never pass test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail never pass test-amd64-i386-libvirt 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt 13 migrate-support-checkfail never pass test-amd64-i386-libvirt-xsm 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-armhf-armhf-xl-arndale 13 migrate-support-checkfail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-armhf-armhf-xl-arndale 14 saverestore-support-checkfail never pass test-amd64-i386-libvirt-qcow2 12 migrate-support-checkfail never pass test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail never pass test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2 fail never pass test-armhf-armhf-libvirt 13 migrate-support-checkfail never pass test-armhf-armhf-xl 13 migrate-support-checkfail never pass test-armhf-armhf-xl 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass 
test-amd64-i386-xl-qemuu-ws16-amd64 13 guest-saverestore fail never pass test-amd64-amd64-xl-qemuu-ws16-amd64 10 windows-installfail never pass test-amd64-i386-xl-qemut-ws16-amd64 13 guest-saverestore fail never pass test-armhf-armhf-xl-rtds 13 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 13 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 14 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-xsm 13 migrate-support-checkfail never pass test-armhf-armhf-xl-xsm 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-vhd 12 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 13 saverestore-support-checkfail never pass test-amd64-amd64-xl-qemut-win10-i386 10 windows-installfail never pass test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass test-amd64-i386-xl-qemut-win10-i386 10 windows-install fail never pass test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail never pass version targeted for testing: linux12fcf66e74b16b96e57fc1ce32bdf27b3a426fd0 baseline version: linuxebb2c2437d8008d46796902ff390653822af6cc4 Last test of basis 113594 2017-09-19 05:21:34 Z0 days Testing same since 113605 2017-09-19 15:52:58 Z0 days1 attempts People who touched revisions under test: Arnd BergmannDennis Yang Geert Uytterhoeven Linus Torvalds Ronnie Sahlberg Shaohua Li Steve French jobs: build-amd64-xsm pass build-armhf-xsm pass build-i386-xsm pass build-amd64 pass build-armhf pass build-i386 pass build-amd64-libvirt pass build-armhf-libvirt pass
Re: [Xen-devel] [RFC] Unicore Subproject Proposal
On Wed, 20 Sep 2017, Lars Kurth wrote: > Felipe, Simon, > a quick note to let you know that the Advisory Board in today’s AB meeting > decided to endorse your proposal. > Let me know how you proceed: from my perspective, we can kick off a formal > vote before you make modifications to the proposal, but I think it is better > to post v2 first. Congratulations! > On 07/09/2017, 05:26, "Felipe Huici"wrote: > > Dear all, > > Following up on discussions that Simon Kuenzer had with several of you at > the last Xen summit, we’re now submitting a Xen subproject proposal based > on our Unicore work. Could you please review it? > > Thanks, > > Felipe Huici & Simon Kuenzer - NEC Labs Heidelberg. > > > PROPOSAL: Unicore > = > > Roles > - > Project Leads: Simon Kuenzer (main lead) >Felipe Huici (co-lead) >Florian Schmidt (co-lead) > Project Mentor: Lars Kurth > Project Sponsor: -To be found- > > Background > -- > In recent years, several papers and projects dedicated to unikernels have > shown the immense potential for performance gains that these have. By > leveraging specialization and the use of minimalistic OSes, unikernels are > able to yield impressive numbers, including fast instantiation times (tens > of milliseconds or less), tiny memory footprints (a few MBs or even KBs), > high network throughput (10-40 Gb/s), and high consolidation (e.g., being > able to run thousands of instances on a single commodity server), not to > mention a reduced attack surface and the potential for easier > certification. Unikernel projects worthy of mention include MirageOS, > ClickOS, Erlang on Xen, OSv, HALVM, and Minicache, among others. > > The fundamental drawback of unikernels is that they require that > applications be manually ported to the underlying minimalistic OS (e.g. > having to port nginx, snort, mysql or memcached to MiniOS or OSv); this > requires both expert work and often considerable amount of time. 
In > essence, we need to pick between either high performance with unikernels, > or no porting effort but decreased performance and decreased efficiency > with standard OS/VM images. The goal of this proposal is to change this > status quo by providing a highly configurable unikernel code base; we call > this base Unicore. > > This project also aims to concentrate the various efforts currently going > on in the Xen community regarding minimalistic OSes (essentially different > variants of MiniOS). We think that splitting the community across these > variants is counter-productive and hope that Unicore will provide a common > place for all or most improvements and customizations of minimalistic > OSes. The long term goal is to replace something like MiniOS with a tool > that can automatically build such a minimalistic OS. > > Unicore - The "Unikernel Core" > - > The high level goal of Unicore is to be able to build unikernels targeted > at specific applications without requiring the time-consuming, expert work > that building such a unikernel requires today. An additional goal (or > hope) of Unicore is that all developers interested in unikernel > development would contribute by supplying libraries rather than working on > independent projects with different code bases as it is done now. The main > idea behind Unicore is depicted in Figure 1 and consists of two basic > components: > > > [Attachment: unicore-oneslider.pdf] > > > Figure 1. Unicore architecture. > > > Library pools would contain libraries that the user of Unicore can select > from to create the unikernel. From the bottom up, library pools are > organized into (1) the architecture library tool, containing libraries > specific to a computer architecture (e.g., x86_64, ARM32 or MIPS); (2) the > platform tool, where target platforms can be Xen, KVM, bare metal (i.e. 
no > virtualization) and user-space Linux; and (3) the main library pool, > containing a rich set of functionality to build the unikernel from. This > last library includes drivers (both virtual such as netback/netfront and > physical such as ixgbe), filesystems, memory allocators, schedulers, > network stacks, standard libs (e.g. libc, openssl, etc.), runtimes (e.g. a > Python interpreter) and debugging and profiling tools. These pools of > libraries constitute a code base for creating unikernels. As shown, a > library can be relatively large (e.g. libc) or quite small (a scheduler), > which should allow for a
Re: [Xen-devel] [PATCH v2 00/24] xen/arm: Memory subsystem clean-up
On Tue, 12 Sep 2017, Julien Grall wrote: > Hi all, > > This patch series contains clean-up for the ARM memory subsystem in > preparation > of reworking the page tables handling. > > A branch with the patches can be found on xenbits: > > https://xenbits.xen.org/git-http/people/julieng/xen-unstable.git > branch mm-cleanup-v2 All patches up to patch #12 are committed. There was a minor suggestion about adding a few simple comments on patch #9, which I did on commit. > For all the changes see in each patch. > > Cheers, > > Julien Grall (24): > xen/x86: mm: Introduce {G,M}FN <-> {G,M}ADDR helpers > xen/mm: Use typesafe MFN for alloc_boot_pages return > xen/mm: Use __virt_to_mfn in map_domain_page instead of virt_to_mfn > xen/arm: mm: Redefine mfn_to_virt to use typesafe > xen/arm: hsr_iabt: Document RES0 field > xen/arm: traps: Don't define FAR_EL2 for ARM32 > xen/arm: arm32: Don't define FAR_EL1 > xen/arm: Add FnV field in hsr_*abt > xen/arm: Introduce hsr_xabt to gather common bits between hsr_dabt and > xen/arm: traps: Introduce a helper to read the hypersivor fault > register > xen/arm: traps: Improve logging for data/prefetch abort fault > xen/arm: Replace ioremap_attr(PAGE_HYPERVISOR_NOCACHE) call by > ioremap_nocache > xen/arm: page: Remove unused attributes DEV_NONSHARED and DEV_CACHED > xen/arm: page: Use directly BUFFERABLE and drop DEV_WC > xen/arm: page: Prefix memory types with MT_ > xen/arm: page: Use ARMv8 naming to improve readability > xen/arm: page: Clean-up the definition of MAIRVAL > xen/arm: mm: Rename and clarify AP[1] in the stage-1 page table > xen/arm: Switch to SYS_STATE_boot just after end_boot_allocator() > xen/arm: mm: Rename 'ai' into 'flags' in create_xen_entries > xen/arm: page: Describe the layout of flags used to update page tables > xen/arm: mm: Embed permission in the flags > xen/arm: mm: Handle permission flags when adding a new mapping > xen/arm: mm: Use memory flags for modify_xen_mappings rather than > custom one > > 
xen/arch/arm/kernel.c | 2 +- > xen/arch/arm/livepatch.c | 6 +-- > xen/arch/arm/mm.c | 74 +++-- > xen/arch/arm/platforms/exynos5.c | 2 +- > xen/arch/arm/platforms/omap5.c| 6 +-- > xen/arch/arm/platforms/vexpress.c | 2 +- > xen/arch/arm/setup.c | 12 +++-- > xen/arch/arm/traps.c | 52 +--- > xen/arch/x86/mm.c | 7 +-- > xen/arch/x86/numa.c | 2 +- > xen/arch/x86/srat.c | 5 +- > xen/common/page_alloc.c | 7 ++- > xen/drivers/acpi/osl.c| 2 +- > xen/drivers/video/arm_hdlcd.c | 2 +- > xen/include/asm-arm/cpregs.h | 2 - > xen/include/asm-arm/lpae.h| 2 +- > xen/include/asm-arm/mm.h | 3 +- > xen/include/asm-arm/page.h| 99 > ++- > xen/include/asm-arm/processor.h | 25 -- > xen/include/asm-x86/page.h| 4 ++ > xen/include/xen/domain_page.h | 2 +- > xen/include/xen/mm.h | 3 +- > 22 files changed, 200 insertions(+), 121 deletions(-) > > -- > 2.11.0 > ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC] Unicore Subproject Proposal
Felipe, Simon, a quick note to let you know that the Advisory Board in today’s AB meeting decided to endorse your proposal. Let me know how you proceed: from my perspective, we can kick off a formal vote before you make modifications to the proposal, but I think it is better to post v2 first. Lars On 07/09/2017, 05:26, "Felipe Huici"wrote: Dear all, Following up on discussions that Simon Kuenzer had with several of you at the last Xen summit, we’re now submitting a Xen subproject proposal based on our Unicore work. Could you please review it? Thanks, Felipe Huici & Simon Kuenzer - NEC Labs Heidelberg. PROPOSAL: Unicore = Roles - Project Leads: Simon Kuenzer (main lead) Felipe Huici (co-lead) Florian Schmidt (co-lead) Project Mentor: Lars Kurth Project Sponsor: -To be found- Background -- In recent years, several papers and projects dedicated to unikernels have shown the immense potential for performance gains that these have. By leveraging specialization and the use of minimalistic OSes, unikernels are able to yield impressive numbers, including fast instantiation times (tens of milliseconds or less), tiny memory footprints (a few MBs or even KBs), high network throughput (10-40 Gb/s), and high consolidation (e.g., being able to run thousands of instances on a single commodity server), not to mention a reduced attack surface and the potential for easier certification. Unikernel projects worthy of mention include MirageOS, ClickOS, Erlang on Xen, OSv, HALVM, and Minicache, among others. The fundamental drawback of unikernels is that they require that applications be manually ported to the underlying minimalistic OS (e.g. having to port nginx, snort, mysql or memcached to MiniOS or OSv); this requires both expert work and often considerable amount of time. In essence, we need to pick between either high performance with unikernels, or no porting effort but decreased performance and decreased efficiency with standard OS/VM images. 
The goal of this proposal is to change this status quo by providing a highly configurable unikernel code base; we call this base Unicore. This project also aims to concentrate the various efforts currently going on in the Xen community regarding minimalistic OSes (essentially different variants of MiniOS). We think that splitting the community across these variants is counter-productive and hope that Unicore will provide a common place for all or most improvements and customizations of minimalistic OSes. The long term goal is to replace something like MiniOS with a tool that can automatically build such a minimalistic OS. Unicore - The "Unikernel Core" - The high level goal of Unicore is to be able to build unikernels targeted at specific applications without requiring the time-consuming, expert work that building such a unikernel requires today. An additional goal (or hope) of Unicore is that all developers interested in unikernel development would contribute by supplying libraries rather than working on independent projects with different code bases as it is done now. The main idea behind Unicore is depicted in Figure 1 and consists of two basic components: [Attachment: unicore-oneslider.pdf] Figure 1. Unicore architecture. Library pools would contain libraries that the user of Unicore can select from to create the unikernel. From the bottom up, library pools are organized into (1) the architecture library tool, containing libraries specific to a computer architecture (e.g., x86_64, ARM32 or MIPS); (2) the platform tool, where target platforms can be Xen, KVM, bare metal (i.e. no virtualization) and user-space Linux; and (3) the main library pool, containing a rich set of functionality to build the unikernel from. This last library includes drivers (both virtual such as netback/netfront and physical such as ixgbe), filesystems, memory allocators, schedulers, network stacks, standard libs (e.g. libc, openssl, etc.), runtimes (e.g. 
a Python interpreter) and debugging and profiling tools. These pools of libraries constitute a code base for creating unikernels. As shown, a library can be relatively large (e.g. libc) or quite small (a scheduler), which should allow for a fair amount of customization for the unikernel. The Unicore build tool is in charge of compiling the application and the selected libraries together to create a binary for a specific platform and architecture (e.g., Xen
Re: [Xen-devel] [PATCH v2 24/24] xen/arm: mm: Use memory flags for modify_xen_mappings rather than custom one
On Tue, 12 Sep 2017, Julien Grall wrote: > This will help to consolidate the page-table code and avoid different > path depending on the action to perform. > > Signed-off-by: Julien Grall> Reviewed-by: Andre Przywara Much better now, thanks! Reviewed-by: Stefano Stabellini > --- > > Cc: Konrad Rzeszutek Wilk > Cc: Ross Lagerwall > > arch_livepatch_secure is now the same as on x86. It might be > possible to combine both, but I left that alone for now. > > Changes in v2: > - Add Andre's reviewed-by > --- > xen/arch/arm/livepatch.c | 6 +++--- > xen/arch/arm/mm.c | 5 ++--- > xen/include/asm-arm/page.h | 11 --- > 3 files changed, 5 insertions(+), 17 deletions(-) > > diff --git a/xen/arch/arm/livepatch.c b/xen/arch/arm/livepatch.c > index 3e53524365..279d52cc6c 100644 > --- a/xen/arch/arm/livepatch.c > +++ b/xen/arch/arm/livepatch.c > @@ -146,15 +146,15 @@ int arch_livepatch_secure(const void *va, unsigned int > pages, enum va_type type) > switch ( type ) > { > case LIVEPATCH_VA_RX: > -flags = PTE_RO; /* R set, NX clear */ > +flags = PAGE_HYPERVISOR_RX; > break; > > case LIVEPATCH_VA_RW: > -flags = PTE_NX; /* R clear, NX set */ > +flags = PAGE_HYPERVISOR_RW; > break; > > case LIVEPATCH_VA_RO: > -flags = PTE_NX | PTE_RO; /* R set, NX set */ > +flags = PAGE_HYPERVISOR_RO; > break; > > default: > diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c > index a6b228ba9b..71de68fe0d 100644 > --- a/xen/arch/arm/mm.c > +++ b/xen/arch/arm/mm.c > @@ -1041,8 +1041,8 @@ static int create_xen_entries(enum xenmap_operation op, > else > { > pte = *entry; > -pte.pt.ro = PTE_RO_MASK(flags); > -pte.pt.xn = PTE_NX_MASK(flags); > +pte.pt.ro = PAGE_RO_MASK(flags); > +pte.pt.xn = PAGE_XN_MASK(flags); > if ( !pte.pt.ro && !pte.pt.xn ) > { > printk("%s: Incorrect combination for addr=%lx\n", > @@ -1085,7 +1085,6 @@ int destroy_xen_mappings(unsigned long v, unsigned long > e) > > int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int flags) > { > -ASSERT((flags & (PTE_NX | PTE_RO)) 
== flags); > return create_xen_entries(MODIFY, s, INVALID_MFN, (e - s) >> PAGE_SHIFT, >flags); > } > diff --git a/xen/include/asm-arm/page.h b/xen/include/asm-arm/page.h > index 814ed126ec..2b9d5e6a5c 100644 > --- a/xen/include/asm-arm/page.h > +++ b/xen/include/asm-arm/page.h > @@ -90,17 +90,6 @@ > #define PAGE_HYPERVISOR_WC (_PAGE_DEVICE|MT_NORMAL_NC) > > /* > - * Defines for changing the hypervisor PTE .ro and .nx bits. This is only to > be > - * used with modify_xen_mappings. > - */ > -#define _PTE_NX_BIT 0U > -#define _PTE_RO_BIT 1U > -#define PTE_NX (1U << _PTE_NX_BIT) > -#define PTE_RO (1U << _PTE_RO_BIT) > -#define PTE_NX_MASK(x) (((x) >> _PTE_NX_BIT) & 0x1U) > -#define PTE_RO_MASK(x) (((x) >> _PTE_RO_BIT) & 0x1U) > - > -/* > * Stage 2 Memory Type. > * > * These are valid in the MemAttr[3:0] field of an LPAE stage 2 page > -- > 2.11.0 > ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
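To make the effect of this patch easier to follow, here is a minimal standalone sketch of the two pieces it touches: the livepatch region type to PAGE_HYPERVISOR_* mapping, and the permission check on the MODIFY path. The flag layout (memory attribute index in bits [0:2], XN in bit 3, RO in bit 4) is copied from patch 22 of the series. The helper names livepatch_flags() and modify_perms_valid(), and the 0 return value for an unknown type, are assumptions made for illustration and are not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag layout as introduced in patch 22 of the series. */
#define MT_NORMAL          0x7U
#define _PAGE_XN_BIT       3
#define _PAGE_RO_BIT       4
#define _PAGE_XN           (1U << _PAGE_XN_BIT)
#define _PAGE_RO           (1U << _PAGE_RO_BIT)
#define PAGE_XN_MASK(x)    (((x) >> _PAGE_XN_BIT) & 0x1U)
#define PAGE_RO_MASK(x)    (((x) >> _PAGE_RO_BIT) & 0x1U)

#define PAGE_HYPERVISOR_RO (MT_NORMAL | _PAGE_RO | _PAGE_XN)
#define PAGE_HYPERVISOR_RX (MT_NORMAL | _PAGE_RO)
#define PAGE_HYPERVISOR_RW (MT_NORMAL | _PAGE_XN)

enum va_type { LIVEPATCH_VA_RX, LIVEPATCH_VA_RW, LIVEPATCH_VA_RO };

/*
 * Hypothetical helper: same mapping as the switch in
 * arch_livepatch_secure() after the patch. Returning 0 for an unknown
 * type is an assumption of this sketch only.
 */
static inline unsigned int livepatch_flags(enum va_type type)
{
    switch ( type )
    {
    case LIVEPATCH_VA_RX: return PAGE_HYPERVISOR_RX;
    case LIVEPATCH_VA_RW: return PAGE_HYPERVISOR_RW;
    case LIVEPATCH_VA_RO: return PAGE_HYPERVISOR_RO;
    }
    return 0;
}

/*
 * Hypothetical helper: mirrors the "!pte.pt.ro && !pte.pt.xn" check in
 * create_xen_entries(). A mapping that is neither read-only nor
 * non-executable (that is, writable and executable) is rejected.
 */
static inline bool modify_perms_valid(unsigned int flags)
{
    return PAGE_RO_MASK(flags) || PAGE_XN_MASK(flags);
}
```

With the generic flags, each of the three livepatch types passes the check, while a bare MT_NORMAL (no RO, no XN) would be refused.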
Re: [Xen-devel] [PATCH v2 22/24] xen/arm: mm: Embed permission in the flags
On Tue, 12 Sep 2017, Julien Grall wrote: > Currently, it is not possible to specify the permission of a new > mapping. It would be necessary to use the function modify_xen_mappings > with a different set of flags. > > Introduce a couple of new flags for the permissions (Non-eXecutable, > Read-Only) and also provide definitions that combine the memory attribute > and permission for common combinations. > > PAGE_HYPERVISOR is now an alias to PAGE_HYPERVISOR_RW (read-write, > non-executable mappings). This does not affect the current mapping using > PAGE_HYPERVISOR because Xen is currently forcing all the mapping to be > non-executable by default (see mfn_to_xen_entry). > > A follow-up patch will change modify_xen_mappings to use the new flags. > > Signed-off-by: Julien Grall > > --- > > Changes in v2: > - Update the commit message > --- > xen/include/asm-arm/page.h | 22 +++--- > 1 file changed, 19 insertions(+), 3 deletions(-) > > diff --git a/xen/include/asm-arm/page.h b/xen/include/asm-arm/page.h > index 4022b7dc33..814ed126ec 100644 > --- a/xen/include/asm-arm/page.h > +++ b/xen/include/asm-arm/page.h > @@ -66,12 +66,28 @@ > * Layout of the flags used for updating the hypervisor page tables > * > * [0:2] Memory Attribute Index > + * [3:4] Permission flags > */ > #define PAGE_AI_MASK(x) ((x) & 0x7U) > > -#define PAGE_HYPERVISOR (MT_NORMAL) > -#define PAGE_HYPERVISOR_NOCACHE (MT_DEVICE_nGnRE) > -#define PAGE_HYPERVISOR_WC (MT_NORMAL_NC) > +#define _PAGE_XN_BIT 3 > +#define _PAGE_RO_BIT 4 > +#define _PAGE_XN (1U << _PAGE_XN_BIT) > +#define _PAGE_RO (1U << _PAGE_RO_BIT) > +#define PAGE_XN_MASK(x) (((x) >> _PAGE_XN_BIT) & 0x1U) > +#define PAGE_RO_MASK(x) (((x) >> _PAGE_RO_BIT) & 0x1U) > + > +/* Device memory will always be mapped read-write non-executable. */ > +#define _PAGE_DEVICE _PAGE_XN > +#define _PAGE_NORMAL MT_NORMAL I think I understand the intent behind these two definitions, but I find them more confusing than useful. 
Specifically, I find it confusing that _PAGE_DEVICE specifies permissions but not memory attributes, while _PAGE_NORMAL specifies memory attributes but not permissions. I would probably remove the two definitions completely and only retain the useful comment above them. The patch looks good aside from this nit. > +#define PAGE_HYPERVISOR_RO (_PAGE_NORMAL|_PAGE_RO|_PAGE_XN) > +#define PAGE_HYPERVISOR_RX (_PAGE_NORMAL|_PAGE_RO) > +#define PAGE_HYPERVISOR_RW (_PAGE_NORMAL|_PAGE_XN) > + > +#define PAGE_HYPERVISOR PAGE_HYPERVISOR_RW > +#define PAGE_HYPERVISOR_NOCACHE (_PAGE_DEVICE|MT_DEVICE_nGnRE) > +#define PAGE_HYPERVISOR_WC (_PAGE_DEVICE|MT_NORMAL_NC) > > /* > * Defines for changing the hypervisor PTE .ro and .nx bits. This is only to > be ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
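For readers following the review, the flag layout discussed above can be tried out in isolation. The sketch below copies the bit positions and MT_* indexes from the series, but writes the combined definitions without _PAGE_DEVICE and _PAGE_NORMAL, which is one possible reading of the reviewer's suggestion and not necessarily what was committed. The struct page_flags type and the decode_flags() helper are hypothetical names added for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Memory attribute indexes used by the series. */
#define MT_NORMAL_NC       0x1U
#define MT_DEVICE_nGnRE    0x4U
#define MT_NORMAL          0x7U

/* [0:2] memory attribute index, [3] XN, [4] RO. */
#define PAGE_AI_MASK(x)    ((x) & 0x7U)
#define _PAGE_XN_BIT       3
#define _PAGE_RO_BIT       4
#define _PAGE_XN           (1U << _PAGE_XN_BIT)
#define _PAGE_RO           (1U << _PAGE_RO_BIT)
#define PAGE_XN_MASK(x)    (((x) >> _PAGE_XN_BIT) & 0x1U)
#define PAGE_RO_MASK(x)    (((x) >> _PAGE_RO_BIT) & 0x1U)

/* Normal memory: attribute index plus explicit permissions. */
#define PAGE_HYPERVISOR_RO (MT_NORMAL | _PAGE_RO | _PAGE_XN)
#define PAGE_HYPERVISOR_RX (MT_NORMAL | _PAGE_RO)
#define PAGE_HYPERVISOR_RW (MT_NORMAL | _PAGE_XN)
#define PAGE_HYPERVISOR    PAGE_HYPERVISOR_RW

/* Device memory will always be mapped read-write non-executable. */
#define PAGE_HYPERVISOR_NOCACHE (_PAGE_XN | MT_DEVICE_nGnRE)
#define PAGE_HYPERVISOR_WC      (_PAGE_XN | MT_NORMAL_NC)

/* Hypothetical decoded view of a flags word, for illustration only. */
struct page_flags {
    unsigned int ai; /* memory attribute index */
    bool xn;         /* non-executable */
    bool ro;         /* read-only */
};

static inline struct page_flags decode_flags(unsigned int flags)
{
    struct page_flags f = {
        .ai = PAGE_AI_MASK(flags),
        .xn = PAGE_XN_MASK(flags) != 0,
        .ro = PAGE_RO_MASK(flags) != 0,
    };
    return f;
}
```

Decoding PAGE_HYPERVISOR_RX, for example, gives attribute index 7 (MT_NORMAL) with ro set and xn clear, which matches the "R set, NX clear" comment in the livepatch code that the follow-up patch replaces.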
Re: [Xen-devel] [PATCH v2 17/24] xen/arm: page: Clean-up the definition of MAIRVAL
On Tue, 12 Sep 2017, Julien Grall wrote: > Currently MAIRVAL is defined in terms of MAIR0VAL and MAIR1VAL which are > both hardcoded values. This makes it quite difficult to understand the value > written in both registers. > > Rework the definition by using value of each attribute shifted by their > associated index. > > Signed-off-by: Julien Grall Ah! That's why you haven't properly updated MAIR0VAL and MAIR1VAL in the previous patches. In that case, please say explicitly in the commit messages of those patches that MAIR0VAL and MAIR1VAL will be properly updated in a follow-up patch. > --- > Changes in v2: > - Move this patch after "xen/arm: page: Use ARMv8 naming to > improve readability" > --- > xen/include/asm-arm/page.h | 42 +- > 1 file changed, 25 insertions(+), 17 deletions(-) > > diff --git a/xen/include/asm-arm/page.h b/xen/include/asm-arm/page.h > index 899fd1801a..088746828d 100644 > --- a/xen/include/asm-arm/page.h > +++ b/xen/include/asm-arm/page.h > @@ -22,6 +22,21 @@ > #define LPAE_SH_INNER 0x3 > > /* > + * Attribute Indexes. > + * > + * These are valid in the AttrIndx[2:0] field of an LPAE stage 1 page > + * table entry. They are indexes into the bytes of the MAIR* > + * registers, as defined above. "as defined above" should be "as defined below" now. Aside from this: Reviewed-by: Stefano Stabellini > + * > + */ > +#define MT_DEVICE_nGnRnE 0x0 > +#define MT_NORMAL_NC 0x1 > +#define MT_NORMAL_WT 0x2 > +#define MT_NORMAL_WB 0x3 > +#define MT_DEVICE_nGnRE 0x4 > +#define MT_NORMAL 0x7 > + > +/* > * LPAE Memory region attributes. Indexed by the AttrIndex bits of a > * LPAE entry; the 8-bit fields are packed little-endian into MAIR0 and > MAIR1. > * > @@ -35,24 +50,17 @@ > * reserved 110 > * MT_NORMAL 111 -- Write-back write-allocate > */ > -#define MAIR0VAL 0xeeaa4400 > -#define MAIR1VAL 0xff000004 > -#define MAIRVAL (MAIR0VAL|MAIR1VAL<<32) > +#define MAIR(attr, mt) (_AC(attr, ULL) << ((mt) * 8)) > > -/* > - * Attribute Indexes. 
> - * > - * These are valid in the AttrIndx[2:0] field of an LPAE stage 1 page > - * table entry. They are indexes into the bytes of the MAIR* > - * registers, as defined above. > - * > - */ > -#define MT_DEVICE_nGnRnE 0x0 > -#define MT_NORMAL_NC 0x1 > -#define MT_NORMAL_WT 0x2 > -#define MT_NORMAL_WB 0x3 > -#define MT_DEVICE_nGnRE 0x4 > -#define MT_NORMAL 0x7 > +#define MAIRVAL (MAIR(0x00, MT_DEVICE_nGnRnE)| \ > + MAIR(0x44, MT_NORMAL_NC)| \ > + MAIR(0xaa, MT_NORMAL_WT)| \ > + MAIR(0xee, MT_NORMAL_WB)| \ > + MAIR(0x04, MT_DEVICE_nGnRE) | \ > + MAIR(0xff, MT_NORMAL)) > + > +#define MAIR0VAL (MAIRVAL & 0xffffffff) > +#define MAIR1VAL (MAIRVAL >> 32) > > #define PAGE_HYPERVISOR (MT_NORMAL) > #define PAGE_HYPERVISOR_NOCACHE (MT_DEVICE_nGnRE) > -- > 2.11.0 > ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
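As a quick sanity check of the reworked definition, the packing can be reproduced in a standalone program: each 8-bit attribute is shifted to the byte selected by its index, and the 64-bit result is then split into its low and high words. This is only a sketch. A plain uint64_t cast stands in for Xen's _AC(attr, ULL), and the functions mair0val(), mair1val() and mair_attr() are hypothetical names used here in place of the MAIR0VAL and MAIR1VAL macros.

```c
#include <assert.h>
#include <stdint.h>

/* Attribute indexes, as in the patch. */
#define MT_DEVICE_nGnRnE 0x0
#define MT_NORMAL_NC     0x1
#define MT_NORMAL_WT     0x2
#define MT_NORMAL_WB     0x3
#define MT_DEVICE_nGnRE  0x4
#define MT_NORMAL        0x7

/* Assumption: a uint64_t cast replaces Xen's _AC(attr, ULL). */
#define MAIR(attr, mt) ((uint64_t)(attr) << ((mt) * 8))

#define MAIRVAL (MAIR(0x00, MT_DEVICE_nGnRnE) | \
                 MAIR(0x44, MT_NORMAL_NC)     | \
                 MAIR(0xaa, MT_NORMAL_WT)     | \
                 MAIR(0xee, MT_NORMAL_WB)     | \
                 MAIR(0x04, MT_DEVICE_nGnRE)  | \
                 MAIR(0xff, MT_NORMAL))

/* Low 32 bits: attributes for indexes 0 to 3. */
static inline uint32_t mair0val(void)
{
    return (uint32_t)(MAIRVAL & 0xffffffffu);
}

/* High 32 bits: attributes for indexes 4 to 7. */
static inline uint32_t mair1val(void)
{
    return (uint32_t)(MAIRVAL >> 32);
}

/* Hypothetical helper: read back the attribute byte for one index. */
static inline uint8_t mair_attr(unsigned int idx)
{
    return (uint8_t)(MAIRVAL >> (idx * 8));
}
```

The low word works out to 0xeeaa4400, the same value as the hard-coded MAIR0VAL that the patch removes, so the rework should not change what is written to the register.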
[Xen-devel] [ovmf test] 113608: all pass - PUSHED
flight 113608 ovmf real [real] http://logs.test-lab.xenproject.org/osstest/logs/113608/ Perfect :-) All tests in this flight passed as required version targeted for testing: ovmf 424a5ec33b3d5a842bff3f4695d0bd709c91a163 baseline version: ovmf a3a4737051010a94832f7bceaa1fa414d7259da0 Last test of basis 113599 2017-09-19 08:20:53 Z0 days Testing same since 113608 2017-09-19 16:50:44 Z0 days1 attempts People who touched revisions under test: Ard Biesheuveljobs: build-amd64-xsm pass build-i386-xsm pass build-amd64 pass build-i386 pass build-amd64-libvirt pass build-i386-libvirt pass build-amd64-pvopspass build-i386-pvops pass test-amd64-amd64-xl-qemuu-ovmf-amd64 pass test-amd64-i386-xl-qemuu-ovmf-amd64 pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : + branch=ovmf + revision=424a5ec33b3d5a842bff3f4695d0bd709c91a163 + . ./cri-lock-repos ++ . ./cri-common +++ . ./cri-getconfig export PERLLIB=.:. PERLLIB=.:. +++ umask 002 +++ getrepos getconfig Repos perl -e ' use Osstest; readglobalconfig(); print $c{"Repos"} or die $!; ' +++ local repos=/home/osstest/repos +++ '[' -z /home/osstest/repos ']' +++ '[' '!' -d /home/osstest/repos ']' +++ echo /home/osstest/repos ++ repos=/home/osstest/repos ++ repos_lock=/home/osstest/repos/lock ++ '[' x '!=' x/home/osstest/repos/lock ']' ++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock ++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push ovmf 424a5ec33b3d5a842bff3f4695d0bd709c91a163 + branch=ovmf + revision=424a5ec33b3d5a842bff3f4695d0bd709c91a163 + . ./cri-lock-repos ++ . 
./cri-common +++ . ./cri-getconfig export PERLLIB=.:.:. PERLLIB=.:.:. +++ umask 002 +++ getrepos getconfig Repos perl -e ' use Osstest; readglobalconfig(); print $c{"Repos"} or die $!; ' +++ local repos=/home/osstest/repos +++ '[' -z /home/osstest/repos ']' +++ '[' '!' -d /home/osstest/repos ']' +++ echo /home/osstest/repos ++ repos=/home/osstest/repos ++ repos_lock=/home/osstest/repos/lock ++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']' + . ./cri-common ++ . ./cri-getconfig +++ export PERLLIB=.:.:.:. +++ PERLLIB=.:.:.:. ++ umask 002 + select_xenbranch + case "$branch" in + tree=ovmf + xenbranch=xen-unstable + '[' xovmf = xlinux ']' + linuxbranch= + '[' x = x ']' + qemuubranch=qemu-upstream-unstable + select_prevxenbranch ++ ./cri-getprevxenbranch xen-unstable + prevxenbranch=xen-4.9-testing + '[' x424a5ec33b3d5a842bff3f4695d0bd709c91a163 = x ']' + : tested/2.6.39.x + . ./ap-common ++ : osst...@xenbits.xen.org +++ getconfig OsstestUpstream +++ perl -e ' use Osstest; readglobalconfig(); print $c{"OsstestUpstream"} or die $!; ' ++ : ++ : git://xenbits.xen.org/xen.git ++ : osst...@xenbits.xen.org:/home/xen/git/xen.git ++ : git://xenbits.xen.org/qemu-xen-traditional.git ++ : git://git.kernel.org ++ : git://git.kernel.org/pub/scm/linux/kernel/git ++ : git ++ : git://xenbits.xen.org/xtf.git ++ : osst...@xenbits.xen.org:/home/xen/git/xtf.git ++ : git://xenbits.xen.org/xtf.git ++ : git://xenbits.xen.org/libvirt.git ++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git ++ : git://xenbits.xen.org/libvirt.git ++ : git://xenbits.xen.org/osstest/rumprun.git ++ : git ++ : git://xenbits.xen.org/osstest/rumprun.git ++ : osst...@xenbits.xen.org:/home/xen/git/osstest/rumprun.git ++ : git://git.seabios.org/seabios.git ++ : osst...@xenbits.xen.org:/home/xen/git/osstest/seabios.git ++ : git://xenbits.xen.org/osstest/seabios.git ++ : https://github.com/tianocore/edk2.git ++ : osst...@xenbits.xen.org:/home/xen/git/osstest/ovmf.git ++ : 
git://xenbits.xen.org/osstest/ovmf.git ++ : git://xenbits.xen.org/osstest/linux-firmware.git ++ :
Re: [Xen-devel] [PATCH v2 16/24] xen/arm: page: Use ARMv8 naming to improve readability
On Tue, 12 Sep 2017, Julien Grall wrote: > This is based on the Linux ARMv8 naming scheme (see arch/arm64/mm/proc.S). > Each > type will contain "NORMAL" or "DEVICE" to make clear whether each attribute > targets device or normal memory. > > Signed-off-by: Julien Grall> > --- > > Changes in v2: > * Move the patch before "xen/arm: page: Clean-up the definition > of MAIRVAL" > --- > xen/arch/arm/kernel.c | 2 +- > xen/arch/arm/mm.c | 28 ++-- > xen/arch/arm/platforms/vexpress.c | 2 +- > xen/drivers/video/arm_hdlcd.c | 2 +- > xen/include/asm-arm/page.h| 32 > 5 files changed, 33 insertions(+), 33 deletions(-) > > diff --git a/xen/arch/arm/kernel.c b/xen/arch/arm/kernel.c > index 9c183f96da..a12baa86e7 100644 > --- a/xen/arch/arm/kernel.c > +++ b/xen/arch/arm/kernel.c > @@ -54,7 +54,7 @@ void copy_from_paddr(void *dst, paddr_t paddr, unsigned > long len) > s = paddr & (PAGE_SIZE-1); > l = min(PAGE_SIZE - s, len); > > -set_fixmap(FIXMAP_MISC, maddr_to_mfn(paddr), MT_BUFFERABLE); > +set_fixmap(FIXMAP_MISC, maddr_to_mfn(paddr), MT_NORMAL_NC); > memcpy(dst, src + s, l); > clean_dcache_va_range(dst, l); > > diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c > index 7ffeb36bfa..fc76f03526 100644 > --- a/xen/arch/arm/mm.c > +++ b/xen/arch/arm/mm.c > @@ -290,7 +290,7 @@ static inline lpae_t mfn_to_xen_entry(mfn_t mfn, unsigned > attr) > > switch ( attr ) > { > -case MT_BUFFERABLE: > +case MT_NORMAL_NC: > /* > * ARM ARM: Overlaying the shareability attribute (DDI > * 0406C.b B3-1376 to 1377) > @@ -305,8 +305,8 @@ static inline lpae_t mfn_to_xen_entry(mfn_t mfn, unsigned > attr) > */ > e.pt.sh = LPAE_SH_OUTER; > break; > -case MT_UNCACHED: > -case MT_DEV_SHARED: > +case MT_DEVICE_nGnRnE: > +case MT_DEVICE_nGnRE: > /* > * Shareability is ignored for non-Normal memory, Outer is as > * good as anything. 
> @@ -369,7 +369,7 @@ static void __init create_mappings(lpae_t *second, > > count = nr_mfns / LPAE_ENTRIES; > p = second + second_linear_offset(virt_offset); > -pte = mfn_to_xen_entry(_mfn(base_mfn), MT_WRITEALLOC); > +pte = mfn_to_xen_entry(_mfn(base_mfn), MT_NORMAL); > if ( granularity == 16 * LPAE_ENTRIES ) > pte.pt.contig = 1; /* These maps are in 16-entry contiguous chunks. > */ > for ( i = 0; i < count; i++ ) > @@ -422,7 +422,7 @@ void *map_domain_page(mfn_t mfn) > else if ( map[slot].pt.avail == 0 ) > { > /* Commandeer this 2MB slot */ > -pte = mfn_to_xen_entry(_mfn(slot_mfn), MT_WRITEALLOC); > +pte = mfn_to_xen_entry(_mfn(slot_mfn), MT_NORMAL); > pte.pt.avail = 1; > write_pte(map + slot, pte); > break; > @@ -543,7 +543,7 @@ static inline lpae_t pte_of_xenaddr(vaddr_t va) > { > paddr_t ma = va + phys_offset; > > -return mfn_to_xen_entry(maddr_to_mfn(ma), MT_WRITEALLOC); > +return mfn_to_xen_entry(maddr_to_mfn(ma), MT_NORMAL); > } > > /* Map the FDT in the early boot page table */ > @@ -652,7 +652,7 @@ void __init setup_pagetables(unsigned long > boot_phys_offset, paddr_t xen_paddr) > /* Initialise xen second level entries ... */ > /* ... Xen's text etc */ > > -pte = mfn_to_xen_entry(maddr_to_mfn(xen_paddr), MT_WRITEALLOC); > +pte = mfn_to_xen_entry(maddr_to_mfn(xen_paddr), MT_NORMAL); > pte.pt.xn = 0;/* Contains our text mapping! */ > xen_second[second_table_offset(XEN_VIRT_START)] = pte; > > @@ -669,7 +669,7 @@ void __init setup_pagetables(unsigned long > boot_phys_offset, paddr_t xen_paddr) > > /* ... Boot Misc area for xen relocation */ > dest_va = BOOT_RELOC_VIRT_START; > -pte = mfn_to_xen_entry(maddr_to_mfn(xen_paddr), MT_WRITEALLOC); > +pte = mfn_to_xen_entry(maddr_to_mfn(xen_paddr), MT_NORMAL); > /* Map the destination in xen_second. */ > xen_second[second_table_offset(dest_va)] = pte; > /* Map the destination in boot_second. 
*/ > @@ -700,7 +700,7 @@ void __init setup_pagetables(unsigned long > boot_phys_offset, paddr_t xen_paddr) > unsigned long va = XEN_VIRT_START + (i << PAGE_SHIFT); > if ( !is_kernel(va) ) > break; > -pte = mfn_to_xen_entry(mfn, MT_WRITEALLOC); > +pte = mfn_to_xen_entry(mfn, MT_NORMAL); > pte.pt.table = 1; /* 4k mappings always have this bit set */ > if ( is_kernel_text(va) || is_kernel_inittext(va) ) > { > @@ -771,7 +771,7 @@ int init_secondary_pagetables(int cpu) > for ( i = 0; i < DOMHEAP_SECOND_PAGES; i++ ) > { > pte = mfn_to_xen_entry(virt_to_mfn(domheap+i*LPAE_ENTRIES), > - MT_WRITEALLOC); > +
Re: [Xen-devel] [PATCH v2 14/24] xen/arm: page: Use directly BUFFERABLE and drop DEV_WC
On Tue, 12 Sep 2017, Julien Grall wrote:
> DEV_WC is only used for PAGE_HYPERVISOR_WC and does not bring much
> improvement.
>
> Signed-off-by: Julien Grall
> Reviewed-by: Andre Przywara
>
> ---
>
> Changes in v2:
>     - Remove DEV_WC from the comment as well
>     - Add Andre's reviewed-by
> ---
>  xen/include/asm-arm/page.h | 5 +
>  1 file changed, 1 insertion(+), 4 deletions(-)
>
> diff --git a/xen/include/asm-arm/page.h b/xen/include/asm-arm/page.h
> index d7939bb944..ee0422579b 100644
> --- a/xen/include/asm-arm/page.h
> +++ b/xen/include/asm-arm/page.h
> @@ -34,8 +34,6 @@
>   * ??             101
>   * reserved       110
>   * WRITEALLOC     111   -- Write-back write-allocate
> - *
> - * DEV_WC         001   (== BUFFERABLE)
>   */
>  #define MAIR0VAL 0xeeaa4400
>  #define MAIR1VAL 0xff04

Please update MAIR0VAL

> @@ -55,11 +53,10 @@
>  #define WRITEBACK     0x3
>  #define DEV_SHARED    0x4
>  #define WRITEALLOC    0x7
> -#define DEV_WC        BUFFERABLE
>
>  #define PAGE_HYPERVISOR         (WRITEALLOC)
>  #define PAGE_HYPERVISOR_NOCACHE (DEV_SHARED)
> -#define PAGE_HYPERVISOR_WC      (DEV_WC)
> +#define PAGE_HYPERVISOR_WC      (BUFFERABLE)
>
>  /*
>   * Defines for changing the hypervisor PTE .ro and .nx bits. This is only to be
> --
> 2.11.0
>
Re: [Xen-devel] [PATCH v2 13/24] xen/arm: page: Remove unused attributes DEV_NONSHARED and DEV_CACHED
On Tue, 12 Sep 2017, Julien Grall wrote:
> They were imported from non-LPAE Linux, but Xen is LPAE only. It is time
> to do some clean-up in the memory attribute and keep only what make
> sense for Xen. Follow-up patch will do more clean-up.
>
> Also, update the comment saying our attribute matches Linux.
>
> Signed-off-by: Julien Grall
> Reviewed-by: Andre Przywara
>
> ---
> Changes in v2:
>     - Add Andre's reviewed-by
> ---
>  xen/include/asm-arm/page.h | 10 +++---
>  1 file changed, 3 insertions(+), 7 deletions(-)
>
> diff --git a/xen/include/asm-arm/page.h b/xen/include/asm-arm/page.h
> index b8d641bfaf..d7939bb944 100644
> --- a/xen/include/asm-arm/page.h
> +++ b/xen/include/asm-arm/page.h
> @@ -21,9 +21,9 @@
>  #define LPAE_SH_OUTER 0x2
>  #define LPAE_SH_INNER 0x3
>
> -/* LPAE Memory region attributes, to match Linux's (non-LPAE) choices.
> - * Indexed by the AttrIndex bits of a LPAE entry;
> - * the 8-bit fields are packed little-endian into MAIR0 and MAIR1
> +/*
> + * LPAE Memory region attributes. Indexed by the AttrIndex bits of a
> + * LPAE entry; the 8-bit fields are packed little-endian into MAIR0 and MAIR1.
>   *
>   *                 ai    encoding
>   * UNCACHED       000   -- Strongly Ordered
> @@ -35,9 +35,7 @@
>   * reserved       110
>   * WRITEALLOC     111   -- Write-back write-allocate
>   *
> - * DEV_NONSHARED  100   (== DEV_SHARED)
>   * DEV_WC         001   (== BUFFERABLE)
> - * DEV_CACHED     011   (== WRITEBACK)
>   */
>  #define MAIR0VAL 0xeeaa4400
>  #define MAIR1VAL 0xff04

I am OK with removing unused memory attributes, but please update
MAIR0VAL and MAIR1VAL accordingly. They still have their old values
here.

> @@ -57,9 +55,7 @@
>  #define WRITEBACK     0x3
>  #define DEV_SHARED    0x4
>  #define WRITEALLOC    0x7
> -#define DEV_NONSHARED DEV_SHARED
>  #define DEV_WC        BUFFERABLE
> -#define DEV_CACHED    WRITEBACK
>
>  #define PAGE_HYPERVISOR         (WRITEALLOC)
>  #define PAGE_HYPERVISOR_NOCACHE (DEV_SHARED)
> --
> 2.11.0
>
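For anyone following the MAIR0VAL discussion, here is a minimal sketch of what "packed little-endian" in the quoted comment amounts to: attribute index n selects byte n of the register value. Only the value 0xeeaa4400 is taken from the patch context; the helper name mair_attr is made up for illustration and is not part of Xen.

```c
#include <stdint.h>

/* Value quoted in the patch context. */
#define MAIR0VAL 0xeeaa4400u

/*
 * Hypothetical helper (not part of Xen): return the 8-bit attribute
 * encoding selected by attribute index 'ai' (0..3) within a 32-bit
 * MAIR0 value. Index 0 lives in the least significant byte.
 */
static inline uint8_t mair_attr(uint32_t mair, unsigned int ai)
{
    return (uint8_t)((mair >> (8u * (ai & 0x3u))) & 0xffu);
}
```

With the quoted value, index 0 (UNCACHED in the table) reads back 0x00, index 1 (BUFFERABLE) reads back 0x44, and index 3 (WRITEBACK) reads back 0xee.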
Re: [Xen-devel] [PATCH v2 23/24] xen/arm: mm: Handle permission flags when adding a new mapping
On Tue, 12 Sep 2017, Julien Grall wrote:
> Currently, all the new mappings will be read-write non-executable. Allow the
> caller to use other permissions.
>
> Signed-off-by: Julien Grall
>
> ---
> Changes in v2:
>     - Switch the runtime check to a BUG_ON()

Since you are at it, could you please also turn the other runtime check
a few lines below into another BUG_ON (under MODIFY)?

> ---
>  xen/arch/arm/mm.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> index 8a56f37821..a6b228ba9b 100644
> --- a/xen/arch/arm/mm.c
> +++ b/xen/arch/arm/mm.c
> @@ -1022,6 +1022,9 @@ static int create_xen_entries(enum xenmap_operation op,
>              if ( op == RESERVE )
>                  break;
>              pte = mfn_to_xen_entry(mfn, PAGE_AI_MASK(flags));
> +            pte.pt.ro = PAGE_RO_MASK(flags);
> +            pte.pt.xn = PAGE_XN_MASK(flags);
> +            BUG_ON(!pte.pt.ro && !pte.pt.xn);
>              pte.pt.table = 1;
>              write_pte(entry, pte);
>              break;
> --
> 2.11.0
>
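The invariant behind the quoted BUG_ON() can be sketched in isolation as follows. This is only an illustration: the bit positions for the RO and XN flags are assumptions (the quoted hunk does not show how PAGE_RO_MASK and PAGE_XN_MASK are defined), and every sketch_* / SKETCH_* name is hypothetical.

```c
#include <stdbool.h>

/* Bit positions below are assumptions made for this sketch only. */
#define SKETCH_XN_BIT      3U
#define SKETCH_RO_BIT      4U
#define SKETCH_XN_MASK(x)  (((x) >> SKETCH_XN_BIT) & 0x1U)
#define SKETCH_RO_MASK(x)  (((x) >> SKETCH_RO_BIT) & 0x1U)

struct sketch_perms {
    unsigned int ro:1;   /* read-only */
    unsigned int xn:1;   /* execute-never */
};

/*
 * Decode the permission bits and report whether the combination is
 * acceptable. A mapping that is both writable and executable
 * (ro == 0 && xn == 0) is the case the quoted BUG_ON() rejects.
 */
static bool sketch_decode_perms(unsigned int flags, struct sketch_perms *out)
{
    out->ro = SKETCH_RO_MASK(flags);
    out->xn = SKETCH_XN_MASK(flags);
    return out->ro || out->xn;
}
```

Returning a bool rather than crashing is a choice made for the sketch so the check can be exercised; the patch itself uses BUG_ON().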
Re: [Xen-devel] [PATCH v2 20/24] xen/arm: mm: Rename 'ai' into 'flags' in create_xen_entries
On Tue, 12 Sep 2017, Julien Grall wrote:
> The parameter 'ai' is used either for attribute index or for
> permissions. Follow-up patch will rework that parameters to carry more
> information. So rename the parameter to 'flags'.
>
> Signed-off-by: Julien Grall
> Reviewed-by: Andre Przywara

Reviewed-by: Stefano Stabellini

> ---
>
> Changes in v2:
>     - Add Andre's reviewed-by
> ---
>  xen/arch/arm/mm.c | 8
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> index b3286b4a89..3379d29f8a 100644
> --- a/xen/arch/arm/mm.c
> +++ b/xen/arch/arm/mm.c
> @@ -986,7 +986,7 @@ static int create_xen_entries(enum xenmap_operation op,
>                                unsigned long virt,
>                                mfn_t mfn,
>                                unsigned long nr_mfns,
> -                              unsigned int ai)
> +                              unsigned int flags)
>  {
>      int rc;
>      unsigned long addr = virt, addr_end = addr + nr_mfns * PAGE_SIZE;
> @@ -1021,7 +1021,7 @@ static int create_xen_entries(enum xenmap_operation op,
>              }
>              if ( op == RESERVE )
>                  break;
> -            pte = mfn_to_xen_entry(mfn, ai);
> +            pte = mfn_to_xen_entry(mfn, flags);
>              pte.pt.table = 1;
>              write_pte(entry, pte);
>              break;
> @@ -1038,8 +1038,8 @@ static int create_xen_entries(enum xenmap_operation op,
>              else
>              {
>                  pte = *entry;
> -                pte.pt.ro = PTE_RO_MASK(ai);
> -                pte.pt.xn = PTE_NX_MASK(ai);
> +                pte.pt.ro = PTE_RO_MASK(flags);
> +                pte.pt.xn = PTE_NX_MASK(flags);
>                  if ( !pte.pt.ro && !pte.pt.xn )
>                  {
>                      printk("%s: Incorrect combination for addr=%lx\n",
> --
> 2.11.0
>
Re: [Xen-devel] [PATCH v2 21/24] xen/arm: page: Describe the layout of flags used to update page tables
On Tue, 12 Sep 2017, Julien Grall wrote:
> Currently, the flags used to update page tables (i.e PAGE_HYPERVISOR_*)
> only contains the memory attribute index. Follow-up patches will add
> more information in it. So document the current layout.
>
> At the same time introduce PAGE_AI_MASK to get the memory attribute
> index easily.
>
> Signed-off-by: Julien Grall
> Reviewed-by: Andre Przywara

Reviewed-by: Stefano Stabellini

> ---
> Andre, I have slightly update the commit message to show that we
> just describe the current layout. Hope you are fine with keeping
> your reviewed-by.
>
> Changes in v2:
>     - Slightly update the commit message to specify we describe the
>       current layout.
>     - Add Andre's reviewed-by
> ---
>  xen/arch/arm/mm.c          | 2 +-
>  xen/include/asm-arm/page.h | 7 +++
>  2 files changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> index 3379d29f8a..8a56f37821 100644
> --- a/xen/arch/arm/mm.c
> +++ b/xen/arch/arm/mm.c
> @@ -1021,7 +1021,7 @@ static int create_xen_entries(enum xenmap_operation op,
>              }
>              if ( op == RESERVE )
>                  break;
> -            pte = mfn_to_xen_entry(mfn, flags);
> +            pte = mfn_to_xen_entry(mfn, PAGE_AI_MASK(flags));
>              pte.pt.table = 1;
>              write_pte(entry, pte);
>              break;
> diff --git a/xen/include/asm-arm/page.h b/xen/include/asm-arm/page.h
> index 088746828d..4022b7dc33 100644
> --- a/xen/include/asm-arm/page.h
> +++ b/xen/include/asm-arm/page.h
> @@ -62,6 +62,13 @@
>  #define MAIR0VAL (MAIRVAL & 0x)
>  #define MAIR1VAL (MAIRVAL >> 32)
>
> +/*
> + * Layout of the flags used for updating the hypervisor page tables
> + *
> + * [0:2] Memory Attribute Index
> + */
> +#define PAGE_AI_MASK(x) ((x) & 0x7U)
> +
>  #define PAGE_HYPERVISOR         (MT_NORMAL)
>  #define PAGE_HYPERVISOR_NOCACHE (MT_DEVICE_nGnRE)
>  #define PAGE_HYPERVISOR_WC      (MT_NORMAL_NC)
> --
> 2.11.0
>
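As a small illustration of the layout described in the patch, the sketch below shows why masking matters once bits above [0:2] start carrying other information. PAGE_AI_MASK is copied from the quoted hunk; the wrapper sketch_attr_index and the example flag values are hypothetical.

```c
/* Macro as introduced by the quoted patch. */
#define PAGE_AI_MASK(x) ((x) & 0x7U)

/*
 * Hypothetical caller: 'flags' may carry extra information in the bits
 * above [0:2]; only the memory attribute index is passed on.
 */
static unsigned int sketch_attr_index(unsigned int flags)
{
    return PAGE_AI_MASK(flags);
}
```

A value such as 0x17 and a value such as 0x7 both yield attribute index 7, so a callee that receives the masked value is unaffected by whatever the upper bits are later used for.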
Re: [Xen-devel] [PATCH v2 19/24] xen/arm: Switch to SYS_STATE_boot just after end_boot_allocator()
On Tue, 12 Sep 2017, Julien Grall wrote:
> We should consider the early boot period to end when we stop using the
> boot allocator. This is inline with x86 and will be helpful to know
> whether we should allocate memory from the boot allocator or xenheap.
>
> Signed-off-by: Julien Grall
> Reviewed-by: Andre Przywara

Reviewed-by: Stefano Stabellini

> ---
>
> Changes in v2:
>     - Add Andre's reviewed-by
> ---
>  xen/arch/arm/setup.c | 8 ++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
> index b00eebd96e..b0306a917b 100644
> --- a/xen/arch/arm/setup.c
> +++ b/xen/arch/arm/setup.c
> @@ -757,6 +757,12 @@ void __init start_xen(unsigned long boot_phys_offset,
>
>      end_boot_allocator();
>
> +    /*
> +     * The memory subsystem has been initialized, we can now switch from
> +     * early_boot -> boot.
> +     */
> +    system_state = SYS_STATE_boot;
> +
>      vm_init();
>
>      if ( acpi_disabled )
> @@ -779,8 +785,6 @@ void __init start_xen(unsigned long boot_phys_offset,
>      console_init_preirq();
>      console_init_ring();
>
> -    system_state = SYS_STATE_boot;
> -
>      processor_id();
>
>      smp_init_cpus();
> --
> 2.11.0
>
Re: [Xen-devel] [PATCH v2 18/24] xen/arm: mm: Rename and clarify AP[1] in the stage-1 page table
On Tue, 12 Sep 2017, Julien Grall wrote:
> The description of AP[1] in Xen is based on testing rather than the ARM
> ARM.
>
> Per the ARM ARM, on EL2 stage-1 page table, AP[1] is RES1 as the
> translation regime applies to only one exception level (see D4.4.4 and
> G4.6.1 in ARM DDI 0487B.a).
>
> Update the comment and also rename the field to match the description in
> the ARM ARM.
>
> Signed-off-by: Julien Grall
> Reviewed-by: Andre Przywara

Acked-by: Stefano Stabellini

> ---
>
> Changes in v2:
>     - Add Andre's reviewed-by
> ---
>  xen/arch/arm/mm.c          | 10 +-
>  xen/include/asm-arm/lpae.h | 2 +-
>  2 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> index fc76f03526..b3286b4a89 100644
> --- a/xen/arch/arm/mm.c
> +++ b/xen/arch/arm/mm.c
> @@ -273,7 +273,7 @@ static inline lpae_t mfn_to_xen_entry(mfn_t mfn, unsigned attr)
>          .table = 0,   /* Set to 1 for links and 4k maps */
>          .ai = attr,
>          .ns = 1,      /* Hyp mode is in the non-secure world */
> -        .user = 1,    /* See below */
> +        .up = 1,      /* See below */
>          .ro = 0,      /* Assume read-write */
>          .af = 1,      /* No need for access tracking */
>          .ng = 1,      /* Makes TLB flushes easier */
> @@ -282,10 +282,10 @@ static inline lpae_t mfn_to_xen_entry(mfn_t mfn, unsigned attr)
>          .avail = 0,   /* Reference count for domheap mapping */
>      }};
>      /*
> -     * Setting the User bit is strange, but the ATS1H[RW] instructions
> -     * don't seem to work otherwise, and since we never run on Xen
> -     * pagetables in User mode it's OK. If this changes, remember
> -     * to update the hard-coded values in head.S too.
> +     * For EL2 stage-1 page table, up (aka AP[1]) is RES1 as the translation
> +     * regime applies to only one exception level (see D4.4.4 and G4.6.1
> +     * in ARM DDI 0487B.a). If this changes, remember to update the
> +     * hard-coded values in head.S too.
>       */
>
>      switch ( attr )
> diff --git a/xen/include/asm-arm/lpae.h b/xen/include/asm-arm/lpae.h
> index 118ee5ae1a..b30853e79d 100644
> --- a/xen/include/asm-arm/lpae.h
> +++ b/xen/include/asm-arm/lpae.h
> @@ -35,7 +35,7 @@ typedef struct __packed {
>       */
>      unsigned long ai:3;         /* Attribute Index */
>      unsigned long ns:1;         /* Not-Secure */
> -    unsigned long user:1;       /* User-visible */
> +    unsigned long up:1;         /* Unpriviledged access */
>      unsigned long ro:1;         /* Read-Only */
>      unsigned long sh:2;         /* Shareability */
>      unsigned long af:1;         /* Access Flag */
> --
> 2.11.0
>
Re: [Xen-devel] [PATCH v2 04/24] xen/arm: mm: Redefine mfn_to_virt to use typesafe
On Sat, 16 Sep 2017, Julien Grall wrote:
> Hi Stefano,
>
> On 09/16/2017 12:56 AM, Stefano Stabellini wrote:
> > On Tue, 12 Sep 2017, Julien Grall wrote:
> > > This add a bit more safety in the memory subsystem code.
> > >
> > > Signed-off-by: Julien Grall
> > > ---
> > >  xen/arch/arm/mm.c | 16 +---
> > >  1 file changed, 9 insertions(+), 7 deletions(-)
> > >
> > > diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> > > index 965d0573a4..5716ef1123 100644
> > > --- a/xen/arch/arm/mm.c
> > > +++ b/xen/arch/arm/mm.c
> > > @@ -47,6 +47,8 @@ struct domain *dom_xen, *dom_io, *dom_cow;
> > >  /* Override macros from asm/page.h to make them work with mfn_t */
> > >  #undef virt_to_mfn
> > >  #define virt_to_mfn(va) _mfn(__virt_to_mfn(va))
> > > +#undef mfn_to_virt
> > > +#define mfn_to_virt(mfn) __mfn_to_virt(mfn_x(mfn))
> > >  /* Static start-of-day pagetables that we use before the allocators
> > >   * are up. These are used by all CPUs during bringup before switching
> > > @@ -837,7 +839,7 @@ void __init setup_xenheap_mappings(unsigned long base_mfn,
> > >       * Virtual address aligned to previous 1GB to match physical
> > >       * address alignment done above.
> > >       */
> > > -    vaddr = (vaddr_t)mfn_to_virt(base_mfn) & FIRST_MASK;
> > > +    vaddr = (vaddr_t)__mfn_to_virt(base_mfn) & FIRST_MASK;
> >
> > Don't you think it would be better to do mfn_to_virt(_mfn(base_mfn)) in
> > this patch? This is just bike-shedding, but I think it would be more
> > obviously consistent. Other than that, it looks good.
>
> Well, last time I used mfn_x/_mfn in similar condition, you requested to use
> the __* version (see [1]).

LOL

This is a good sign: it means I am getting more familiar with the mfn_x/_mfn
syntax :-D

> I really don't mind which one to use. But we should stay consistent with the
> macros to use for non-typesafe version.

Of course. Let's keep the patch as is.
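For readers less familiar with the mfn_x/_mfn idiom discussed above, the following is a rough standalone sketch of the idea, not Xen's actual definitions. The names mfn_t, _mfn and mfn_x mirror the thread; the page shift, the direct-map base and the sketch_* helpers are assumptions made up for the example.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT     12                    /* assumed 4K pages */
#define SKETCH_DIRECTMAP_BASE UINT64_C(0x40000000)  /* made-up base address */

/*
 * Wrapping the frame number in a struct means a plain integer can no
 * longer be passed by accident where an MFN is expected. (Names starting
 * with an underscore mirror the thread; portable code would avoid them.)
 */
typedef struct { unsigned long mfn; } mfn_t;

static inline mfn_t _mfn(unsigned long m) { mfn_t r = { m }; return r; }
static inline unsigned long mfn_x(mfn_t m) { return m.mfn; }

/* Non-typesafe variant: takes a raw frame number. */
static inline uint64_t sketch_raw_mfn_to_vaddr(unsigned long mfn)
{
    return SKETCH_DIRECTMAP_BASE + ((uint64_t)mfn << SKETCH_PAGE_SHIFT);
}

/* Typesafe wrapper: unwraps with mfn_x() and defers to the raw variant. */
static inline uint64_t sketch_mfn_to_vaddr(mfn_t mfn)
{
    return sketch_raw_mfn_to_vaddr(mfn_x(mfn));
}
```

The two spellings debated in the review, calling the raw variant directly or wrapping the integer with _mfn() first, compute the same address; the difference is only in which one the compiler can type-check.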
Re: [Xen-devel] [RFC] Unicore Subproject Proposal
On Mon, 18 Sep 2017, Felipe Huici wrote: > Hi Lars, all, > > [cc’ing authors of Erlang on Xen, HalVM and Rump]. > > Thanks everyone for all of the support and useful comments. We’ve > incorporated a number of them into a new version of the document (attached > and pasted at the bottom for convenience) and for those that didn’t make > it we’re keeping track of them. > > Lars, FYI, Simon also did a blog post regarding Unicore on unikernel.org > (https://devel.unikernel.org/t/unicore-a-new-unikernel-project/274). > > Please let us know what the next steps are. The proposal looks good to me. > Thanks, > > — Felipe > > > PROPOSAL: Unicore > = > > Roles > - > Project Leads:Simon Kuenzer> (co-lead)Felipe Huici > (co-lead)Florian Schmidt > Project Mentor: Lars Kurth > Project Sponsors: Stefano Stabellini > Wei Liu > > Background > -- > In recent years, several papers and projects dedicated to unikernels > have shown the immense potential for performance gains that these > have. By leveraging specialization and the use of minimalistic OSes, > unikernels are able to yield impressive numbers, including fast > instantiation times (tens of milliseconds or less), tiny memory > footprints (a few MBs or even KBs), high network throughput (10-40 > Gb/s), and high consolidation (e.g., being able to run thousands of > instances on a single commodity server), not to mention a reduced > attack surface and the potential for easier certification. Unikernel > projects worthy of mention include MirageOS, ClickOS, Erlang on Xen, > OSv, HALVM, and Minicache, Rump, among others. > > The fundamental drawback of unikernels is that they require that > applications be manually ported to the underlying minimalistic OS (e.g. > having to port nginx, snort, mysql or memcached to MiniOS or OSv); this > requires both expert work and often considerable amount of time. 
In > essence, we need to pick between either high performance > with unikernels, or no porting effort but decreased performance > and decreased efficiency with standard OS/VM images. > The goal of this proposal is to change this status quo by providing > a highly configurable unikernel code base; we call this base Unicore. > > This project also aims to concentrate the various efforts currently going > on in the Xen community regarding minimalistic OSes (essentially different > variants of MiniOS). We think that splitting the community across these > variants is counter-productive and hope that Unicore will provide a common > place for all or most improvements and customizations of minimalistic > OSes. The long term goal is to replace something like MiniOS with a tool > that can automatically build such a minimalistic OS. > > > Unicore - The "Unikernel Core" > - > The high level goal of Unicore is to be able to build unikernels targeted > at specific applications without requiring the time-consuming, expert work > that building such a unikernel requires today. An additional goal (or > hope) of Unicore is that all developers interested in unikernel > development would contribute by supplying libraries rather than working on > independent projects with different code bases as it is done now. The main > idea behind Unicore is depicted in Figure 1 and consists of two basic > components: > > > [Attachment: unicore-oneslider.pdf] > > Figure 1. Unicore Architecture. > > > Library pools would contain libraries that the user of Unicore can select > from to create the unikernel. From the bottom up, library pools are > organized into (1) the architecture library tool, containing libraries > specific to a computer architecture (e.g., x86_64, ARM32 or MIPS); (2) the > platform tool, where target platforms can be Xen, KVM, bare metal (i.e. 
no > virtualization) and user-space Linux; and (3) the main library pool, > containing a rich set of functionality to build the unikernel from. This > last library includes drivers (both virtual such as netback/netfront and > physical such as ixgbe), filesystems, memory allocators, schedulers, > network stacks, standard libs (e.g. libc, openssl, etc.), runtimes (e.g. a > Python interpreter and debugging and profiling tools. These pools of > libraries constitute a code base for creating unikernels. As shown, a > library can be relatively large (e.g libc) or quite small (a scheduler), > which should allow for a fair amount of customization for the unikernel. > > The Unicore build tool is in charge of compiling the application and the > selected libraries together to create a binary for a specific platform and > architecture (e.g., Xen on x86_64). The tool is currently inspired by > Linux’s kconfig system and consists of a set of Makefiles. It allows users
Re: [Xen-devel] Booting signed xen.efi through shim
On Mon, Sep 18, 2017 at 2:58 AM, Jan Beulich wrote:
> On 14.09.17 at 18:20, wrote:
>> Of course, you can grab them from here:
>> https://drive.google.com/drive/folders/0B5duyI9SzNtWaXE0cjM1QzZJbVk?usp=sharing
>
> So the dumps of the two (using my own tool) are identical except for
> the expected difference due to the certificate. In particular neither
> image has any strange relocation types afaics, and both have the
> sort of unexpected, but also supposedly benign
> IMAGE_SCN_LNK_NRELOC_OVFL flag set for .bss. Hence I'm afraid ...
>
>> I've verified that xen-signed.efi boots with Secureboot enabled when
>> booted directly but doesn't boot through the shim.
>
> ... you'll need to do some debugging in order to figure out what's
> going on here. With the above the prime suspect is the shim though,
> fiddling with the image after loading it into memory. So perhaps
> dumping the .reloc section contents in order to compare it with
> what's in the image may be a suitable approach.
>
> Jan

Yeap, the shim pretty simply removed the .reloc section as it was
marked discardable and did the relocations for Xen. So with that
removed from the shim I no longer get the error and I see that the
dom0 kernel gets verified using the shim lock protocol. I still didn't
get dom0 to boot for some reason but that might be an unrelated issue
(and I have no serial console right now). Nevertheless, progress!

Tamas
[Xen-devel] [linux-4.1 test] 113603: tolerable FAIL - PUSHED
flight 113603 linux-4.1 real [real] http://logs.test-lab.xenproject.org/osstest/logs/113603/ Failures :-/ but no regressions. Tests which did not succeed, but are not blocking: test-amd64-amd64-xl-qemuu-win7-amd64 18 guest-start/win.repeat fail blocked in 112503 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-localmigrate/x10 fail like 112491 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 112491 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail like 112503 test-armhf-armhf-libvirt 14 saverestore-support-checkfail like 112503 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 112503 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail never pass test-amd64-amd64-xl-qemuu-ws16-amd64 10 windows-installfail never pass test-amd64-amd64-libvirt 13 migrate-support-checkfail never pass test-amd64-i386-libvirt 13 migrate-support-checkfail never pass test-amd64-amd64-xl-qemut-ws16-amd64 10 windows-installfail never pass test-amd64-i386-libvirt-xsm 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-amd64-i386-libvirt-qcow2 12 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 13 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 14 saverestore-support-checkfail never pass test-armhf-armhf-xl 13 migrate-support-checkfail never pass test-armhf-armhf-xl 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-xsm 13 migrate-support-checkfail never pass test-armhf-armhf-xl-xsm 14 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail never pass test-amd64-i386-xl-qemuu-ws16-amd64 13 guest-saverestore fail never pass test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2 fail never pass test-amd64-i386-xl-qemut-ws16-amd64 13 guest-saverestore fail never pass test-armhf-armhf-libvirt-xsm 
13 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass test-armhf-armhf-libvirt 13 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 13 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-vhd 12 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 13 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-rtds 13 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail never pass test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass test-amd64-amd64-xl-qemut-win10-i386 10 windows-installfail never pass test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass test-amd64-i386-xl-qemut-win10-i386 10 windows-install fail never pass version targeted for testing: linux5fbef6af7dd9a92605bb7c426f26bd122fd0cd74 baseline version: linux1af952704416d76ad86963f04feb10a3da143901 Last test of basis 112503 2017-08-07 07:24:24 Z 43 days Testing same since 113603 2017-09-19 13:21:36 Z0 days1 attempts People who touched revisions under test: "Eric W. Biederman"Akinobu Mita Alan Stern Alan Swanson Alex Deucher Alex Williamson Alexander Potapenko Andrea Righi Andrew Morton Andrey Ryabinin Andrzej Hajda Anna Schumaker Anton Blanchard Ard Biesheuvel Arnaldo Carvalho de Melo Arnd Bergmann Arvind Yadav Banajit Goswami Bart Van Assche Bartlomiej Zolnierkiewicz Bjorn Andersson Bjorn Helgaas
Re: [Xen-devel] [PATCH v5 06/10] arm: smccc: handle SMCs according to SMCCC
Hi Julien,

On 13.09.17 14:11, Julien Grall wrote:

Hi,

On 08/31/2017 09:09 PM, Volodymyr Babchuk wrote:

+static void fill_uuid(struct cpu_user_regs *regs, const xen_uuid_t *u)

Actually why do you pass a pointer for u? This requires every caller to
introduce a temporary variable because the UUID is usually a define.

Hmm, another way probably is to pass a whole structure as a parameter.
Are you suggesting this approach? Something like
fill_uuid(regs, (xen_uuid_t)MY_UUID)?

With your current solution each caller has to do:

xen_uuid_t foo = MY_UUID;
fill_uuid(regs, );
return true;

What I suggested in the previous version is to have fill_uuid return
true. So you make each caller simpler.

Yes, but it will not be correct semantically. Many questions will arise:

1. Why does a helper function that only writes data return bool?
2. If it returns true, can it return false?
3. Should we check its return value before passing it further?
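One way to picture the trade-off in this exchange is the sketch below: the helper keeps a void return type and a pointer parameter, while the caller avoids a named temporary by taking the address of a C99 compound literal. Everything here is hypothetical and simplified: the sketch_* types stand in for cpu_user_regs and xen_uuid_t, MY_UUID is an invented value, and the little-endian packing of four bytes per register is an assumption for the example rather than something stated in the thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-ins for the real types (assumptions). */
typedef struct { uint8_t a[16]; } sketch_uuid_t;
struct sketch_regs { uint32_t r[4]; };

/* Invented UUID, written as a brace initializer like a typical define. */
#define MY_UUID {{ 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, \
                   0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f }}

/* Helper that only writes data, so it returns nothing. */
static void fill_uuid_sketch(struct sketch_regs *regs, const sketch_uuid_t *u)
{
    for (int i = 0; i < 4; i++) {
        const uint8_t *b = &u->a[4 * i];

        /* Assumed packing: four bytes per register, least significant first. */
        regs->r[i] = (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
                     ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
    }
}

/* Caller: the compound literal replaces the named temporary variable. */
static bool handle_uid_call(struct sketch_regs *regs)
{
    fill_uuid_sketch(regs, &(sketch_uuid_t)MY_UUID);
    return true;
}
```

Under these assumptions the caller stays at two statements without the helper having to return a value it never varies.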
[Xen-devel] [xen-unstable test] 113602: regressions - FAIL
flight 113602 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/113602/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-credit2 15 guest-saverestore fail REGR. vs. 113387

Tests which are failing intermittently (not blocking):
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-localmigrate/x10 fail in 113589 pass in 113602
 test-armhf-armhf-xl-rtds 7 xen-boot fail in 113589 pass in 113602
 test-amd64-i386-xl-qemuu-win7-amd64 18 guest-start/win.repeat fail in 113589 pass in 113602
 test-armhf-armhf-xl-credit2 17 guest-start.2 fail in 113589 pass in 113602
 test-amd64-amd64-xl-qemuu-ovmf-amd64 16 guest-localmigrate/x10 fail pass in 113589

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail REGR. vs. 113387

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-xl-rtds 16 guest-start/debian.repeat fail blocked in 113387
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-check fail like 113387
 test-armhf-armhf-libvirt 14 saverestore-support-check fail like 113387
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-localmigrate/x10 fail like 113387
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop fail like 113387
 test-amd64-amd64-xl-rtds 10 debian-install fail like 113387
 test-armhf-armhf-libvirt-raw 13 saverestore-support-check fail like 113387
 test-amd64-amd64-xl-qemut-ws16-amd64 10 windows-install fail never pass
 test-amd64-amd64-xl-qemuu-ws16-amd64 10 windows-install fail never pass
 test-amd64-i386-libvirt-xsm 13 migrate-support-check fail never pass
 test-amd64-amd64-libvirt 13 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-check fail never pass
 test-amd64-i386-libvirt 13 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-check fail never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 13 guest-saverestore fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2 fail never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-check fail never pass
 test-armhf-armhf-xl 13 migrate-support-check fail never pass
 test-armhf-armhf-xl 14 saverestore-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-check fail never pass
 test-armhf-armhf-libvirt 13 migrate-support-check fail never pass
 test-amd64-i386-xl-qemut-ws16-amd64 13 guest-saverestore fail never pass
 test-amd64-i386-libvirt-qcow2 12 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit2 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit2 14 saverestore-support-check fail never pass
 test-armhf-armhf-xl-xsm 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-xsm 14 saverestore-support-check fail never pass
 test-armhf-armhf-xl-arndale 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale 14 saverestore-support-check fail never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-check fail never pass
 test-armhf-armhf-xl-vhd 12 migrate-support-check fail never pass
 test-armhf-armhf-xl-vhd 13 saverestore-support-check fail never pass
 test-amd64-i386-xl-qemut-win10-i386 10 windows-install fail never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass
 test-amd64-amd64-xl-qemut-win10-i386 10 windows-install fail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-install fail never pass

version targeted for testing:
 xen cd02f96d54813139e14e2847566d744358b55c1c
baseline version:
 xen 16b1414de91b5a82a0996c67f6db3af7d7e32873

Last test of basis 113387 2017-09-12 23:20:09 Z 6 days
Failing since 113430 2017-09-14 01:24:48 Z 5 days 11 attempts
Testing same since 113589 2017-09-19 03:22:01 Z 0 days 2 attempts

People who touched revisions under test:
 Andrew Cooper
 Bhupinder Thakur
Re: [Xen-devel] [PATCH] xen, arm64: drop dummy lookup_address()
On Tue, 19 Sep 2017, Boris Ostrovsky wrote:
> On 09/18/2017 06:35 PM, Tycho Andersen wrote:
> > This is unused, and conflicts with the definition that we'll add for XPFO.
> >
> > Signed-off-by: Tycho Andersen
> > CC: Boris Ostrovsky
> > CC: Juergen Gross
> > CC: Stefano Stabellini
> > ---
> > The patch this depends on is in for-linus-4.14b, so it would be easiest to
> > carry this one too; Stefano can you ack it and Boris can you carry it?

Reviewed-by: Stefano Stabellini

> Applied to for-linus-14b.
[Xen-devel] [xen-unstable-smoke test] 113610: tolerable all pass - PUSHED
flight 113610 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/113610/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 13 migrate-support-check fail never pass
 test-armhf-armhf-xl 13 migrate-support-check fail never pass
 test-armhf-armhf-xl 14 saverestore-support-check fail never pass

version targeted for testing:
 xen 5f62fb184fdf2d10e13d4bad28cbe6c8b53be784
baseline version:
 xen cd02f96d54813139e14e2847566d744358b55c1c

Last test of basis 113584 2017-09-18 22:01:16 Z 0 days
Failing since 113606 2017-09-19 16:01:39 Z 0 days 2 attempts
Testing same since 113610 2017-09-19 19:02:29 Z 0 days 1 attempts

People who touched revisions under test:
 Boris Ostrovsky
 Juergen Gross
 Julien Grall
 Wei Liu

jobs:
 build-amd64 pass
 build-armhf pass
 build-amd64-libvirt pass
 test-armhf-armhf-xl pass
 test-amd64-amd64-xl-qemuu-debianhvm-i386 pass
 test-amd64-amd64-libvirt pass

sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
 http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
 http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
 http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
 http://xenbits.xen.org/gitweb?p=osstest.git;a=summary

Pushing revision :

+ branch=xen-unstable-smoke
+ revision=5f62fb184fdf2d10e13d4bad28cbe6c8b53be784
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
export PERLLIB=.:.
PERLLIB=.:.
+++ umask 002
+++ getrepos
getconfig Repos
perl -e '
 use Osstest;
 readglobalconfig();
 print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x '!=' x/home/osstest/repos/lock ']'
++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock
++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push xen-unstable-smoke 5f62fb184fdf2d10e13d4bad28cbe6c8b53be784
+ branch=xen-unstable-smoke
+ revision=5f62fb184fdf2d10e13d4bad28cbe6c8b53be784
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
export PERLLIB=.:.:.
PERLLIB=.:.:.
+++ umask 002
+++ getrepos
getconfig Repos
perl -e '
 use Osstest;
 readglobalconfig();
 print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']'
+ . ./cri-common
++ . ./cri-getconfig
+++ export PERLLIB=.:.:.:.
+++ PERLLIB=.:.:.:.
++ umask 002
+ select_xenbranch
+ case "$branch" in
+ tree=xen
+ xenbranch=xen-unstable-smoke
+ qemuubranch=qemu-upstream-unstable
+ '[' xxen = xlinux ']'
+ linuxbranch=
+ '[' xqemu-upstream-unstable = x ']'
+ select_prevxenbranch
++ ./cri-getprevxenbranch xen-unstable-smoke
+ prevxenbranch=xen-4.9-testing
+ '[' x5f62fb184fdf2d10e13d4bad28cbe6c8b53be784 = x ']'
+ : tested/2.6.39.x
+ . ./ap-common
++ : osst...@xenbits.xen.org
+++ getconfig OsstestUpstream
+++ perl -e '
 use Osstest;
 readglobalconfig();
 print $c{"OsstestUpstream"} or die $!;
'
++ :
++ : git://xenbits.xen.org/xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/xen.git
++ : git://xenbits.xen.org/qemu-xen-traditional.git
++ : git://git.kernel.org
++ : git://git.kernel.org/pub/scm/linux/kernel/git
++ : git
++ : git://xenbits.xen.org/xtf.git
++ : osst...@xenbits.xen.org:/home/xen/git/xtf.git
++ : git://xenbits.xen.org/xtf.git
++ : git://xenbits.xen.org/libvirt.git
++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git
++ : git://xenbits.xen.org/libvirt.git
++ : git://xenbits.xen.org/osstest/rumprun.git
++ : git
++ : git://xenbits.xen.org/osstest/rumprun.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/rumprun.git
++ : git://git.seabios.org/seabios.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/seabios.git
++ :
[Xen-devel] [PATCH v4 0/1] netif: staging grants for I/O requests
Hey,

This is v4, taking into consideration all comments received on v3 (changelog in
the first patch). The specification is right after the diffstat.

Reference implementation also here (on top of net-next):

https://github.com/jpemartins/linux.git xen-net-stg-gnts-v3

Cheers,

Joao Martins (1):
  public/io/netif.h: add gref mapping control messages

 xen/include/public/io/netif.h | 115 ++
 1 file changed, 115 insertions(+)
---
% Staging grants for network I/O requests
% Joao Martins <>
% Revision 4

\clearpage

Architecture(s): Any

# Background and Motivation

At the Xen hackathon '16 networking session, we spoke about having a permanently
mapped region to describe the header/linear region of packet buffers. This
document outlines the proposal, covering its motivation and its applicability
to other use-cases, alongside the necessary changes.

The motivation of this work is to eliminate grant ops for packet-I/O-intensive
workloads such as those observed with smaller request sizes (i.e. <= 256 bytes
or <= MTU). Currently on Xen, only bulk transfers (e.g. 32K..64K packets)
perform really well (up to 80 Gbit/s on a few CPUs), usually backing end-hosts
and server appliances. Anything that involves higher packet rates (<= 1500 MTU)
or runs without sg performs badly, at almost 1 Gbit/s throughput.

# Proposal

The proposal is to leverage the already implicit copy from and to packet linear
data on netfront and netback, to be done instead from a permanently mapped
region. In some (physical) NICs this is known as header/data split.

Specifically for some workloads (e.g. NFV) it would provide a big increase in
throughput when we switch to (zero)copying in the backend/frontend, instead of
the grant hypercalls. Thus this extension aims at futureproofing the netif
protocol by adding the possibility of guests setting up a list of grants that
are set up at device creation and revoked at device freeing, without taking
too many grant entries into account for the general case (i.e. to cover only the
header region <= 256 bytes, 16 grants per ring), while remaining configurable
by the kernel when one wants to resort to a copy-based approach as opposed to
grant copy/map.

\clearpage

# General Operation

Here we describe how netback and netfront generally operate, and where the
proposed solution will fit. The security mechanism currently involves grant
references, which in essence are round-robin recycled 'tickets' stamped with
the GPFNs, permission attributes, and the authorized domain:

(This is an in-memory view of struct grant_entry_v1):

     0     1     2     3     4     5     6     7 octet
    +------------+-----------+------------------------+
    | flags      | domain id | frame                  |
    +------------+-----------+------------------------+

Where there are N grant entries in a grant table, for example:

    @0:
    +------------+-----------+------------------------+
    | rw         | 0         | 0xABCDEF               |
    +------------+-----------+------------------------+
    | rw         | 0         | 0xFA124                |
    +------------+-----------+------------------------+
    | ro         | 1         | 0xBEEF                 |
    +------------+-----------+------------------------+
      .
    @N:
    +------------+-----------+------------------------+
    | rw         | 0         | 0x9923A                |
    +------------+-----------+------------------------+

Each entry consumes 8 bytes, therefore 512 entries can fit on one page.
`gnttab_max_frames` defaults to 32 pages, hence 16,384 grants. The
ParaVirtualized (PV) drivers will use the grant reference (index in the grant
table - 0 .. N) in their command ring.

\clearpage

## Guest Transmit

The view of the shared transmit ring is the following:

     0     1     2     3     4     5     6     7 octet
    +------------------------+------------------------+
    | req_prod               | req_event              |
    +------------------------+------------------------+
    | rsp_prod               | rsp_event              |
    +------------------------+------------------------+
    | pvt                    | pad[44]                |
    +------------------------+                        |
    |                        |                        | [64bytes]
    +------------------------+------------------------+-\
    | gref                   | offset    | flags      | |
    +------------+-----------+------------------------+ +-'struct
    | id         | size      | id        | status     | | netif_tx_sring_entry'
    +-------------------------------------------------+-/
    |/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/| .. N
    +-------------------------------------------------+

Each entry consumes 16 octets therefore 256 entries
[Xen-devel] [PATCH v4 1/1] public/io/netif.h: add gref mapping control messages
Adds 3 messages to allow guest to let backend keep grants mapped,
such that 1) guests allowing fast recycling of pages can avoid doing
grant ops for those cases, or otherwise 2) preferring copies over
grants and 3) always using a fixed set of pages for network I/O.

The three control ring messages added are:
 - Add grefs to be mapped by backend
 - Remove grefs mappings (If they are not in use)
 - Get maximum amount of grefs kept mapped.

Signed-off-by: Joao Martins
---
v4:
* Declare xen_netif_gref parameters are input or output.
* Clarify status field and that it doesn't require to be set to zero
  prior to its usage.
* Clarify on ADD_GREF_MAPPING is 'all or nothing'
* Improve last paragraph of DEL_GREF_MAPPING

v3:
* Use DEL for unmapping grefs instead of PUT
* Rename from xen_netif_gref_alloc to xen_netif_gref
* Add 'status' field on xen_netif_gref
* Clarify what 'inflight' means
* Use "beginning of the page" instead of "beginning of the grant"
* Mention that page needs to be r/w (as it will have to modify \.status)
---
 xen/include/public/io/netif.h | 123 ++
 1 file changed, 123 insertions(+)

diff --git a/xen/include/public/io/netif.h b/xen/include/public/io/netif.h
index ca0061410d..2454448baa 100644
--- a/xen/include/public/io/netif.h
+++ b/xen/include/public/io/netif.h
@@ -353,6 +353,9 @@ struct xen_netif_ctrl_request {
 #define XEN_NETIF_CTRL_TYPE_SET_HASH_MAPPING_SIZE 5
 #define XEN_NETIF_CTRL_TYPE_SET_HASH_MAPPING      6
 #define XEN_NETIF_CTRL_TYPE_SET_HASH_ALGORITHM    7
+#define XEN_NETIF_CTRL_TYPE_GET_GREF_MAPPING_SIZE 8
+#define XEN_NETIF_CTRL_TYPE_ADD_GREF_MAPPING      9
+#define XEN_NETIF_CTRL_TYPE_DEL_GREF_MAPPING     10
 
     uint32_t data[3];
 };
@@ -391,6 +394,44 @@ struct xen_netif_ctrl_response {
 };
 
 /*
+ * Static Grants (struct xen_netif_gref)
+ * =====================================
+ *
+ * A frontend may provide a fixed set of grant references to be mapped on
+ * the backend. The message of type XEN_NETIF_CTRL_TYPE_ADD_GREF_MAPPING
+ * prior its usage in the command ring allows for creation of these mappings.
+ * The backend will maintain a fixed amount of these mappings.
+ *
+ * XEN_NETIF_CTRL_TYPE_GET_GREF_MAPPING_SIZE lets a frontend query how many
+ * of these mappings can be kept.
+ *
+ * Each entry in the XEN_NETIF_CTRL_TYPE_{ADD,DEL}_GREF_MAPPING input table has
+ * the following format:
+ *
+ *    0     1     2     3     4     5     6     7  octet
+ * +-----+-----+-----+-----+-----+-----+-----+-----+
+ * | grant ref             |  flags    |  status   |
+ * +-----+-----+-----+-----+-----+-----+-----+-----+
+ *
+ * grant ref: grant reference (IN)
+ * flags: flags describing the control operation (IN)
+ * status: XEN_NETIF_CTRL_STATUS_* (OUT)
+ *
+ * 'status' is an output parameter which does not require to be set to zero
+ * prior to its usage in the corresponding control messages.
+ */
+
+struct xen_netif_gref {
+       grant_ref_t ref;
+       uint16_t flags;
+
+#define _XEN_NETIF_CTRLF_GREF_readonly    0
+#define XEN_NETIF_CTRLF_GREF_readonly    (1U<<_XEN_NETIF_CTRLF_GREF_readonly)
+
+       uint16_t status;
+};
+
+/*
  * Control messages
  * ================
  *
@@ -609,6 +650,88 @@ struct xen_netif_ctrl_response {
  *       invalidate any table data outside that range.
  *       The grant reference may be read-only and must remain valid until
  *       the response has been processed.
+ *
+ * XEN_NETIF_CTRL_TYPE_GET_GREF_MAPPING_SIZE
+ * -----------------------------------------
+ *
+ * This is sent by the frontend to fetch the number of grefs that can be kept
+ * mapped in the backend.
+ *
+ * Request:
+ *
+ *  type    = XEN_NETIF_CTRL_TYPE_GET_GREF_MAPPING_SIZE
+ *  data[0] = queue index (assumed 0 for single queue)
+ *  data[1] = 0
+ *  data[2] = 0
+ *
+ * Response:
+ *
+ *  status = XEN_NETIF_CTRL_STATUS_NOT_SUPPORTED     - Operation not
+ *                                                     supported
+ *           XEN_NETIF_CTRL_STATUS_INVALID_PARAMETER - The queue index is
+ *                                                     out of range
+ *           XEN_NETIF_CTRL_STATUS_SUCCESS           - Operation successful
+ *  data   = maximum number of entries allowed in the gref mapping table
+ *           (if operation was successful) or zero if it is not supported.
+ *
+ * XEN_NETIF_CTRL_TYPE_ADD_GREF_MAPPING
+ * ------------------------------------
+ *
+ * This is sent by the frontend for backend to map a list of grant
+ * references.
+ *
+ * Request:
+ *
+ *  type    = XEN_NETIF_CTRL_TYPE_ADD_GREF_MAPPING
+ *  data[0] = queue index
+ *  data[1] = grant reference of page containing the mapping list
+ *            (r/w and assumed to start at beginning of page)
+ *  data[2] = size of list in entries
+ *
+ * Response:
+ *
+ *  status = XEN_NETIF_CTRL_STATUS_NOT_SUPPORTED     - Operation not
+ *                                                     supported
+ *
Re: [Xen-devel] [PATCH] MAINTAINERS: Add public/arch-arm.h under the ARM subsystem
On Tue, 19 Sep 2017, Wei Liu wrote:
> On Tue, Sep 19, 2017 at 12:25:53PM +0100, Julien Grall wrote:
> > The header public/arch-arm.h contains mostly ARM specific code. Avoid CC
> > the "THE REST" maintainers on it.
> >
> > Signed-off-by: Julien Grall
>
> Acked-by: Wei Liu

Reviewed-by: Stefano Stabellini
[Xen-devel] [ovmf baseline-only test] 72127: all pass
This run is configured for baseline tests only.

flight 72127 ovmf real [real]
http://osstest.xs.citrite.net/~osstest/testlogs/logs/72127/

Perfect :-)
All tests in this flight passed as required

version targeted for testing:
 ovmf a3a4737051010a94832f7bceaa1fa414d7259da0
baseline version:
 ovmf 91cc526b15ffbbbdec5a57906596f37e059f80be

Last test of basis 72125 2017-09-19 08:19:58 Z 0 days
Testing same since 72127 2017-09-19 16:20:35 Z 0 days 1 attempts

People who touched revisions under test:
 Hao Wu
 Yonghong Zhu

jobs:
 build-amd64-xsm pass
 build-i386-xsm pass
 build-amd64 pass
 build-i386 pass
 build-amd64-libvirt pass
 build-i386-libvirt pass
 build-amd64-pvops pass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64 pass

sg-report-flight on osstest.xs.citrite.net
logs: /home/osstest/logs
images: /home/osstest/images

Logs, config files, etc. are available at
 http://osstest.xs.citrite.net/~osstest/testlogs/logs

Test harness code can be found at
 http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary

Push not applicable.

commit a3a4737051010a94832f7bceaa1fa414d7259da0
Author: Yonghong Zhu
Date:   Fri Sep 15 16:14:17 2017 +0800

    BaseTools: Fix a bug to correct SourceFileList

    We met a case that use two microcode files in the Microcode.inf file,
    one is .mcb file, another is .txt file. then it cause build failure
    because the SourceFileList include the .txt file's output file, while
    this output file is still not be generated, so it cause
    GetFileDependency report failure.

    Cc: Liming Gao
    Contributed-under: TianoCore Contribution Agreement 1.1
    Signed-off-by: Yonghong Zhu
    Reviewed-by: Liming Gao

commit 880ec68338541b63672ea521c0dffee181df8ede
Author: Hao Wu
Date:   Fri Sep 15 08:57:40 2017 +0800

    MdeModulePkg/UdfDxe: Refine enum member naming style

    Similar to the naming style for variables, it's better for the name of
    members in a enum type to avoid using only upper-case letters.

    Cc: Paulo Alcantara
    Cc: Ruiyu Ni
    Cc: Star Zeng
    Cc: Eric Dong
    Cc: Dandan Bi
    Contributed-under: TianoCore Contribution Agreement 1.1
    Signed-off-by: Hao Wu
    Reviewed-by: Paulo Alcantara
    Reviewed-by: Star Zeng

commit 3f92b104930ba582924da578a12ee0062881ab7b
Author: Hao Wu
Date:   Fri Sep 15 08:43:22 2017 +0800

    MdeModulePkg/Udf: Avoid declaring and initializing local GUID variable

    The local GUID variable 'UdfDevPathGuid', it has been initialized during
    its declaration.

    For better coding style, this commit uses a global variable instead.

    Cc: Paulo Alcantara
    Cc: Ruiyu Ni
    Cc: Star Zeng
    Cc: Eric Dong
    Cc: Dandan Bi
    Contributed-under: TianoCore Contribution Agreement 1.1
    Signed-off-by: Hao Wu
    Reviewed-by: Paulo Alcantara
    Reviewed-by: Star Zeng

commit 32492fee2d7e7116b5970266915285d043e0c0e1
Author: Hao Wu
Date:   Thu Sep 14 16:02:42 2017 +0800

    MdeModulePkg/UdfDxe: Avoid short (single character) variable name

    In ResolveSymlink(), replace the following variable:
    CHAR16 *C;

    with:
    CHAR16 *Char;

    Cc: Paulo Alcantara
    Cc: Ruiyu Ni
    Cc: Star Zeng
    Cc: Eric Dong
    Cc: Dandan Bi
    Contributed-under: TianoCore Contribution Agreement 1.1
    Signed-off-by: Hao Wu
    Reviewed-by: Paulo Alcantara
    Reviewed-by: Star Zeng

commit 077f8c4372cc68efea91243dd1fe77d41315444d
Author: Hao Wu
Date:   Thu Sep 14 13:19:13 2017 +0800

    MdeModulePkg/Udf:
[Xen-devel] [xen-unstable-smoke test] 113606: trouble: broken/pass
flight 113606 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/113606/

Failures and problems with tests :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-armhf-armhf-xl broken
 test-armhf-armhf-xl 4 host-install(4) broken REGR. vs. 113584

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 13 migrate-support-check fail never pass

version targeted for testing:
 xen f2e3b3b2e97bbea607983e44de8dbd023cc3bce3
baseline version:
 xen cd02f96d54813139e14e2847566d744358b55c1c

Last test of basis 113584 2017-09-18 22:01:16 Z 0 days
Testing same since 113606 2017-09-19 16:01:39 Z 0 days 1 attempts

People who touched revisions under test:
 Boris Ostrovsky
 Juergen Gross

jobs:
 build-amd64 pass
 build-armhf pass
 build-amd64-libvirt pass
 test-armhf-armhf-xl broken
 test-amd64-amd64-xl-qemuu-debianhvm-i386 pass
 test-amd64-amd64-libvirt pass

sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
 http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
 http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
 http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
 http://xenbits.xen.org/gitweb?p=osstest.git;a=summary

broken-job test-armhf-armhf-xl broken
broken-step test-armhf-armhf-xl host-install(4)

Not pushing.

commit f2e3b3b2e97bbea607983e44de8dbd023cc3bce3
Author: Juergen Gross
Date:   Tue Sep 19 17:48:23 2017 +0200

    correct gnttab_get_status_frames()

    In gnttab_get_status_frames() all accesses to nr_status_frames should
    be done with the grant table lock held.

    While at it correct coding style: labels should be indented by one
    space.

    Signed-off-by: Juergen Gross
    Reviewed-by: Paul Durrant
    Reviewed-by: Wei Liu

commit 7d190bdde9fd070b4b98bad33a1f68e3b02952c5
Author: Boris Ostrovsky
Date:   Tue Sep 19 17:47:47 2017 +0200

    mm: scrub pages returned back to heap if MEMF_no_scrub is set

    Set free_heap_pages()'s need_scrub to true if alloc_domheap_pages()
    returns pages back to heap as result of assign_pages() error when those
    pages were requested with MEMF_no_scrub flag.

    We need to do this because there is a possibility that
    alloc_heap_pages() might clear buddy's PGC_need_scrubs flag without
    actually clearing the page.

    Signed-off-by: Boris Ostrovsky
    Reviewed-by: Jan Beulich

(qemu changes not included)
[Xen-devel] [PATCH v2 3/3] RFC: migration: defer precopy policy to libxl
Provide an implementation of the old policy as a callback in libxl and
plumb it through the IPC machinery to libxc. This serves as an example for
defining a libxl policy, and provides no advantage over the default policy
in libxc.

Signed-off-by: Joshua Otto
---
I have included this patch, as rfc, as requested by Ian, to show how libxl can
provide a migration precopy policy. This was part of the same larger patch
from Joshua - I have not changed or tested it.

 tools/libxl/libxl_dom_save.c       | 20
 tools/libxl/libxl_save_msgs_gen.pl | 4 +++-
 2 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/tools/libxl/libxl_dom_save.c b/tools/libxl/libxl_dom_save.c
index 77fe30e..6d28cce 100644
--- a/tools/libxl/libxl_dom_save.c
+++ b/tools/libxl/libxl_dom_save.c
@@ -328,6 +328,25 @@ int libxl__save_emulator_xenstore_data(libxl__domain_save_state *dss,
     return rc;
 }
 
+/*
+ * This is the live migration precopy policy - it's called periodically during
+ * the precopy phase of live migrations, and is responsible for deciding when
+ * the precopy phase should terminate and what should be done next.
+ *
+ * The policy implemented here behaves identically to the policy previously
+ * hard-coded into xc_domain_save() - it proceeds to the stop-and-copy phase of
+ * the live migration when there are either fewer than 50 dirty pages, or more
+ * than 5 precopy rounds have completed.
+ */
+static int libxl__save_live_migration_simple_precopy_policy(
+    struct precopy_stats stats, void *user)
+{
+    return ((stats.dirty_count >= 0 && stats.dirty_count < 50) ||
+            stats.iteration >= 5)
+        ? XGS_POLICY_STOP_AND_COPY
+        : XGS_POLICY_CONTINUE_PRECOPY;
+}
+
 /*- main code for saving, in order of execution -*/
 
 void libxl__domain_save(libxl__egc *egc, libxl__domain_save_state *dss)
@@ -401,6 +420,7 @@ void libxl__domain_save(libxl__egc *egc, libxl__domain_save_state *dss)
     if (dss->checkpointed_stream == LIBXL_CHECKPOINTED_STREAM_NONE)
         callbacks->suspend = libxl__domain_suspend_callback;
 
+    callbacks->precopy_policy = libxl__save_live_migration_simple_precopy_policy;
     callbacks->switch_qemu_logdirty = libxl__domain_suspend_common_switch_qemu_logdirty;
 
     dss->sws.ao = dss->ao;
diff --git a/tools/libxl/libxl_save_msgs_gen.pl b/tools/libxl/libxl_save_msgs_gen.pl
index 3ae7373..bb1d4e9 100755
--- a/tools/libxl/libxl_save_msgs_gen.pl
+++ b/tools/libxl/libxl_save_msgs_gen.pl
@@ -33,6 +33,7 @@ our @msgs = (
                                           'xen_pfn_t', 'console_gfn'] ],
     [ 9, 'srW', "complete", [qw(int retval
                                 int errnoval)] ],
+    [ 10, 'scxW', "precopy_policy", ['struct precopy_stats', 'stats'] ]
 );
 
 #
@@ -141,7 +142,8 @@ static void bytes_put(unsigned char *const buf, int *len,
 
 END
 
-foreach my $simpletype (qw(int uint16_t uint32_t unsigned), 'unsigned long', 'xen_pfn_t') {
+foreach my $simpletype (qw(int uint16_t uint32_t unsigned),
+                        'unsigned long', 'xen_pfn_t', 'struct precopy_stats') {
     my $typeid = typeid($simpletype);
     $out_body{'callout'} .= <
[Xen-devel] [PATCH v2 2/3] Introduce migration precopy policy
This patch allows a migration precopy policy to be specified.

The precopy phase of the xc_domain_save() live migration algorithm has
historically been implemented to run until either a) (almost) no pages
are dirty or b) some fixed, hard-coded maximum number of precopy
iterations has been exceeded. This policy and its implementation are
less than ideal for a few reasons:
- the logic of the policy is intertwined with the control flow of the
  mechanism of the precopy stage
- it can't take into account facts external to the immediate
  migration context, such as external state transfer state, interactive
  user input, or the passage of wall-clock time.
- it does not permit the user to change their mind, over time, about
  what to do at the end of the precopy (they get an unconditional
  transition into the stop-and-copy phase of the migration)

To permit callers to implement arbitrary higher-level policies governing
when the live migration precopy phase should end, and what should be
done next:
- add a precopy_policy() callback to the xc_domain_save() user-supplied
  callbacks
- during the precopy phase of live migrations, consult this policy after
  each batch of pages transmitted and take the dictated action, which
  may be to a) abort the migration entirely, b) continue with the
  precopy, or c) proceed to the stop-and-copy phase.
- provide an implementation of the old policy, used when the
  precopy_policy callback is not provided.

Signed-off-by: Jennifer Herbert
Signed-off-by: Joshua Otto
---
v2: Have made a few formatting corrections, added typedef as suggested.

v1: This is an updated/modified subset of patch 7/20, part of
Joshua Otto's "Add postcopy live migration support." patch,
dated 27th March 2017. As indicated on the original thread, I wish
to make use of this within the XenServer product. I hope this will aid
Josh in pushing the remainder of his series.
---
 tools/libxc/include/xenguest.h | 31 --
 tools/libxc/xc_sr_common.h     | 6 +--
 tools/libxc/xc_sr_save.c       | 97 +-
 3 files changed, 97 insertions(+), 37 deletions(-)

diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h
index 6626f0c..a2a654c 100644
--- a/tools/libxc/include/xenguest.h
+++ b/tools/libxc/include/xenguest.h
@@ -39,6 +39,16 @@
  */
 struct xenevtchn_handle;
 
+/* For save's precopy_policy(). */
+struct precopy_stats
+{
+    unsigned int iteration;
+    unsigned int total_written;
+    long dirty_count; /* -1 if unknown */
+};
+
+typedef int (*precopy_policy_t)(struct precopy_stats, void *);
+
 /* callbacks provided by xc_domain_save */
 struct save_callbacks {
     /* Called after expiration of checkpoint interval,
@@ -46,7 +56,20 @@ struct save_callbacks {
      */
     int (*suspend)(void* data);
 
-    /* Called after the guest's dirty pages have been
+    /*
+     * Called after every batch of page data sent during the precopy
+     * phase of a live migration to ask the caller what to do next
+     * based on the current state of the precopy migration.
+     */
+#define XGS_POLICY_ABORT          (-1) /* Abandon the migration entirely
+                                        * and tidy up. */
+#define XGS_POLICY_CONTINUE_PRECOPY 0  /* Remain in the precopy phase. */
+#define XGS_POLICY_STOP_AND_COPY    1  /* Immediately suspend and transmit the
+                                        * remaining dirty pages. */
+    precopy_policy_t precopy_policy;
+
+    /*
+     * Called after the guest's dirty pages have been
      *  copied into an output buffer.
      * Callback function resumes the guest & the device model,
      *  returns to xc_domain_save.
@@ -55,7 +78,8 @@ struct save_callbacks {
      */
     int (*postcopy)(void* data);
 
-    /* Called after the memory checkpoint has been flushed
+    /*
+     * Called after the memory checkpoint has been flushed
      * out into the network. Typical actions performed in this
      * callback include:
      *   (a) send the saved device model state (for HVM guests),
@@ -65,7 +89,8 @@ struct save_callbacks {
      *
      * returns:
      * 0: terminate checkpointing gracefully
-     * 1: take another checkpoint */
+     * 1: take another checkpoint
+     */
     int (*checkpoint)(void* data);
 
     /*
diff --git a/tools/libxc/xc_sr_common.h b/tools/libxc/xc_sr_common.h
index a83f22a..3635704 100644
--- a/tools/libxc/xc_sr_common.h
+++ b/tools/libxc/xc_sr_common.h
@@ -198,12 +198,10 @@ struct xc_sr_context
             /* Further debugging information in the stream. */
             bool debug;
 
-            /* Parameters for tweaking live migration. */
-            unsigned max_iterations;
-            unsigned dirty_threshold;
-
             unsigned long p2m_size;
 
+            struct precopy_stats stats;
+
             xen_pfn_t *batch_pfns;
             unsigned nr_batch_pfns;
             unsigned long *deferred_pages;
diff --git
[Xen-devel] [PATCH v2 1/3] Tidy libxc xc_domain_save
Tidy up libxc's xc_domain_save, removing unused parameters
max_iters and max_factor, making matching changes to libxl.

Signed-off-by: Joshua Otto
Signed-off-by: Jennifer Herbert
Reviewed-by: Paul Durrant
Acked-by: Wei Liu
---
 tools/libxc/include/xenguest.h   | 4 ++--
 tools/libxc/xc_nomigrate.c       | 3 +--
 tools/libxc/xc_sr_save.c         | 8 +++-
 tools/libxl/libxl_save_callout.c | 4 ++--
 tools/libxl/libxl_save_helper.c  | 7 ++-
 5 files changed, 10 insertions(+), 16 deletions(-)

diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h
index 5cd8111..6626f0c 100644
--- a/tools/libxc/include/xenguest.h
+++ b/tools/libxc/include/xenguest.h
@@ -100,8 +100,8 @@ typedef enum {
  *        doesn't use checkpointing
  * @return 0 on success, -1 on failure
  */
-int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t max_iters,
-                   uint32_t max_factor, uint32_t flags /* XCFLAGS_xxx */,
+int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom,
+                   uint32_t flags /* XCFLAGS_xxx */,
                    struct save_callbacks* callbacks, int hvm,
                    xc_migration_stream_t stream_type, int recv_fd);
 
diff --git a/tools/libxc/xc_nomigrate.c b/tools/libxc/xc_nomigrate.c
index 317c8ce..fe8f68c 100644
--- a/tools/libxc/xc_nomigrate.c
+++ b/tools/libxc/xc_nomigrate.c
@@ -20,8 +20,7 @@
 #include
 #include
 
-int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t max_iters,
-                   uint32_t max_factor, uint32_t flags,
+int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t flags,
                    struct save_callbacks* callbacks, int hvm,
                    xc_migration_stream_t stream_type, int recv_fd)
 {
diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c
index ca6913b..1e7502d 100644
--- a/tools/libxc/xc_sr_save.c
+++ b/tools/libxc/xc_sr_save.c
@@ -916,9 +916,8 @@ static int save(struct xc_sr_context *ctx, uint16_t guest_type)
 };
 
 int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom,
-                   uint32_t max_iters, uint32_t max_factor, uint32_t flags,
-                   struct save_callbacks* callbacks, int hvm,
-                   xc_migration_stream_t stream_type, int recv_fd)
+                   uint32_t flags, struct save_callbacks* callbacks,
+                   int hvm, xc_migration_stream_t stream_type, int recv_fd)
 {
     struct xc_sr_context ctx =
         {
@@ -955,8 +954,7 @@ int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom,
     if ( ctx.save.checkpointed == XC_MIG_STREAM_COLO )
         assert(callbacks->wait_checkpoint);
 
-    DPRINTF("fd %d, dom %u, max_iters %u, max_factor %u, flags %u, hvm %d",
-            io_fd, dom, max_iters, max_factor, flags, hvm);
+    DPRINTF("fd %d, dom %u, flags %u, hvm %d", io_fd, dom, flags, hvm);
 
     if ( xc_domain_getinfo(xch, dom, 1, ) != 1 )
     {
diff --git a/tools/libxl/libxl_save_callout.c b/tools/libxl/libxl_save_callout.c
index 891c669..6452d70 100644
--- a/tools/libxl/libxl_save_callout.c
+++ b/tools/libxl/libxl_save_callout.c
@@ -89,8 +89,8 @@ void libxl__xc_domain_save(libxl__egc *egc, libxl__domain_save_state *dss,
         libxl__srm_callout_enumcallbacks_save(>callbacks.save.a);
 
     const unsigned long argnums[] = {
-        dss->domid, 0, 0, dss->xcflags, dss->hvm,
-        cbflags, dss->checkpointed_stream,
+        dss->domid, dss->xcflags, dss->hvm, cbflags,
+        dss->checkpointed_stream,
     };
 
     shs->ao = ao;
diff --git a/tools/libxl/libxl_save_helper.c b/tools/libxl/libxl_save_helper.c
index 1dece23..38089a0 100644
--- a/tools/libxl/libxl_save_helper.c
+++ b/tools/libxl/libxl_save_helper.c
@@ -251,8 +251,6 @@ int main(int argc, char **argv)
         io_fd = atoi(NEXTARG);
         recv_fd = atoi(NEXTARG);
         uint32_t dom = strtoul(NEXTARG,0,10);
-        uint32_t max_iters = strtoul(NEXTARG,0,10);
-        uint32_t max_factor = strtoul(NEXTARG,0,10);
         uint32_t flags = strtoul(NEXTARG,0,10);
         int hvm = atoi(NEXTARG);
         unsigned cbflags = strtoul(NEXTARG,0,10);
@@ -264,9 +262,8 @@ int main(int argc, char **argv)
         startup("save");
         setup_signals(save_signal_handler);
 
-        r = xc_domain_save(xch, io_fd, dom, max_iters, max_factor, flags,
-                           _save_callbacks, hvm, stream_type,
-                           recv_fd);
+        r = xc_domain_save(xch, io_fd, dom, flags, _save_callbacks,
+                           hvm, stream_type, recv_fd);
         complete(r);
 
     } else if (!strcmp(mode,"--restore-domain")) {
--
1.8.3.1
[Xen-devel] [PATCH v2 0/3] Introduce migration precopy policy
Hi all,

v2:
Have tidied some formatting, s.o.b Joshua, and added an RFC patch showing
how to use this in libxl.

v1:
Here I present an updated/modified subset of patch 7/20, part of
Joshua Otto's "Add postcopy live migration support." patch,
dated 27th March 2017. As indicated on the original thread, I wish
to make use of this within the XenServer product, and hence am
trying to get this subset pulled in now. I also hope this will aid
Josh in pushing the remainder of his series.

Here I present two patches, the first of which does some tidy up, removing
unused and unhelpful parameters to xc_domain_save(), and the second of which
allows a precopy callback to be specified, providing the test for when
the live phase of migration should end. If none is provided, a default
policy matching the current behaviour is used.

Jennifer Herbert (3):
  Tidy libxc xc_domain_save
  Introduce migration precopy policy
  RFC: migration: defer precopy policy to libxl

 tools/libxc/include/xenguest.h     | 35 +++--
 tools/libxc/xc_nomigrate.c         | 3 +-
 tools/libxc/xc_sr_common.h         | 6 +--
 tools/libxc/xc_sr_save.c           | 105 -
 tools/libxl/libxl_dom_save.c       | 20 +++
 tools/libxl/libxl_save_callout.c   | 4 +-
 tools/libxl/libxl_save_helper.c    | 7 +--
 tools/libxl/libxl_save_msgs_gen.pl | 4 +-
 8 files changed, 130 insertions(+), 54 deletions(-)

--
1.8.3.1
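For readers who want to see what a caller-side policy might look like, here is
a minimal standalone sketch. It redeclares struct precopy_stats, the
XGS_POLICY_* values and precopy_policy_t as they appear in patch 2/3 so that it
compiles without libxc. Everything else is an assumption made for
illustration: struct page_budget, simple_policy() and budget_policy() are
hypothetical names, not part of the series, and total_written is assumed to
count pages.

```c
#include <assert.h>
#include <stddef.h>

/* Local copies of the definitions proposed for xenguest.h in patch 2/3. */
struct precopy_stats {
    unsigned int iteration;
    unsigned int total_written;
    long dirty_count; /* -1 if unknown */
};

#define XGS_POLICY_ABORT            (-1)
#define XGS_POLICY_CONTINUE_PRECOPY 0
#define XGS_POLICY_STOP_AND_COPY    1

typedef int (*precopy_policy_t)(struct precopy_stats, void *);

/* Hypothetical caller state, passed through the 'user' pointer. */
struct page_budget {
    unsigned int max_pages_written; /* assumed unit: pages */
    int cancelled;                  /* set by the caller to abort */
};

/* The old hard-coded behaviour, as written in the RFC patch 3/3. */
static int simple_policy(struct precopy_stats stats, void *user)
{
    (void)user;
    return ((stats.dirty_count >= 0 && stats.dirty_count < 50) ||
            stats.iteration >= 5)
        ? XGS_POLICY_STOP_AND_COPY
        : XGS_POLICY_CONTINUE_PRECOPY;
}

/*
 * Hypothetical policy layering external facts on top of the old one:
 * abort on cancellation, stop once a write budget is spent, otherwise
 * defer to simple_policy().
 */
static int budget_policy(struct precopy_stats stats, void *user)
{
    const struct page_budget *budget = user;

    assert(budget != NULL);
    if (budget->cancelled)
        return XGS_POLICY_ABORT;
    if (stats.total_written >= budget->max_pages_written)
        return XGS_POLICY_STOP_AND_COPY;
    return simple_policy(stats, NULL);
}
```

Both functions match precopy_policy_t, so under the proposed interface a
caller would presumably assign one of them to callbacks->precopy_policy, as
patch 3/3 does with its own function. How the 'user' pointer reaches the
callback is not shown in the quoted hunks, so that part is left out here.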
[Xen-devel] [distros-debian-snapshot test] 72123: regressions - trouble: blocked/broken/fail/pass
flight 72123 distros-debian-snapshot real [real]
http://osstest.xs.citrite.net/~osstest/testlogs/logs/72123/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-amd64-daily-netboot-pvgrub 10 debian-di-install fail REGR. vs. 72095
 test-amd64-amd64-i386-daily-netboot-pygrub 10 debian-di-install fail REGR. vs. 72095
 test-amd64-i386-i386-daily-netboot-pvgrub 10 debian-di-install fail REGR. vs. 72095
 test-amd64-i386-amd64-daily-netboot-pygrub 10 debian-di-install fail REGR. vs. 72095

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-armhf-daily-netboot-pygrub 1 build-check(1) blocked n/a
 build-arm64-pvops 2 hosts-allocate broken like 72095
 build-arm64 2 hosts-allocate broken like 72095
 build-arm64-pvops 3 capture-logs broken like 72095
 build-arm64 3 capture-logs broken like 72095
 test-armhf-armhf-armhf-daily-netboot-pygrub 7 xen-boot fail like 72095
 test-amd64-amd64-amd64-current-netinst-pygrub 10 debian-di-install fail like 72095
 test-amd64-amd64-i386-weekly-netinst-pygrub 10 debian-di-install fail like 72095
 test-amd64-amd64-amd64-weekly-netinst-pygrub 10 debian-di-install fail like 72095
 test-amd64-i386-i386-weekly-netinst-pygrub 10 debian-di-install fail like 72095
 test-amd64-i386-i386-current-netinst-pygrub 10 debian-di-install fail like 72095
 test-amd64-i386-amd64-current-netinst-pygrub 10 debian-di-install fail like 72095
 test-amd64-i386-amd64-weekly-netinst-pygrub 10 debian-di-install fail like 72095
 test-amd64-amd64-i386-current-netinst-pygrub 10 debian-di-install fail like 72095

baseline version:
 flight 72095

jobs:
 build-amd64 pass
 build-arm64 broken
 build-armhf pass
 build-i386 pass
 build-amd64-pvops pass
 build-arm64-pvops broken
 build-armhf-pvops pass
 build-i386-pvops pass
 test-amd64-amd64-amd64-daily-netboot-pvgrub fail
 test-amd64-i386-i386-daily-netboot-pvgrub fail
 test-amd64-i386-amd64-daily-netboot-pygrub fail
 test-arm64-arm64-armhf-daily-netboot-pygrub blocked
 test-armhf-armhf-armhf-daily-netboot-pygrub fail
 test-amd64-amd64-i386-daily-netboot-pygrub fail
 test-amd64-amd64-amd64-current-netinst-pygrub fail
 test-amd64-i386-amd64-current-netinst-pygrub fail
 test-amd64-amd64-i386-current-netinst-pygrub fail
 test-amd64-i386-i386-current-netinst-pygrub fail
 test-amd64-amd64-amd64-weekly-netinst-pygrub fail
 test-amd64-i386-amd64-weekly-netinst-pygrub fail
 test-amd64-amd64-i386-weekly-netinst-pygrub fail
 test-amd64-i386-i386-weekly-netinst-pygrub fail

sg-report-flight on osstest.xs.citrite.net
logs: /home/osstest/logs
images: /home/osstest/images

Logs, config files, etc. are available at
 http://osstest.xs.citrite.net/~osstest/testlogs/logs

Test harness code can be found at
 http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary

Push not applicable.
Re: [Xen-devel] Next Xen ARM community call - Wednesday 20th September 2017
Hi all,

Quick reminder, the call will be tomorrow (Wednesday 20th) at 5pm BST.

The details to join the call are:

Call +44 1223 406065 (Local dial in) and enter the access code below followed by # key.
Participant code: 4915191

Mobile Auto Dial:
    VoIP: voip://+441223406065;4915191#
    iOS devices: +44 1223 406065,4915191 and press #
    Other devices: +44 1223 406065x4915191#

Additional Calling Information:
UK +44 1142828002
US CA +1 4085761502
US TX +1 5123141073
JP +81 453455355
DE +49 8945604050
NO +47 73187518
SE +46 46313131
FR +33 497235101
TW +886 35657119
HU +36 13275600
IE +353 91337900

Toll Free
UK 0800 1412084
US +1 8668801148
CN +86 4006782367
IN 0008009868365
IN +918049282778
TW 08000 22065
HU 0680981587
IE 1800800022
KF +972732558877

Cheers,

On 11/09/17 11:13, Julien Grall wrote:
> Hi all,
>
> This call will be moved by a week as requested by Stefano.
>
> The next call will be on Wednesday 20th September 2017 5pm BST.
>
> Do you have any specific topic you would like to discuss?
>
> Cheers,
>
> On 25/08/17 11:42, Julien Grall wrote:
>> Hi all,
>>
>> I would suggest having the next community call on Wednesday 13th
>> September 2017 5pm BST. Does it sound good?
>>
>> Do you have any specific topic you would like to discuss?
>>
>> Cheers,

-- 
Julien Grall
Re: [Xen-devel] [PATCH v6 01/11] pci: introduce a type to store a SBDF
>>> On 19.09.17 at 17:40, wrote:
>> --- a/xen/arch/x86/hvm/io.c
>> +++ b/xen/arch/x86/hvm/io.c
>> @@ -257,17 +257,11 @@ void register_g2m_portio_handler(struct domain *d)
>>  }
>>
>>  unsigned int hvm_pci_decode_addr(unsigned int cf8, unsigned int addr,
>> -                                 unsigned int *bus, unsigned int *slot,
>> -                                 unsigned int *func)
>> +                                 pci_sbdf_t *bdf)
>
> I'd prefer the pointer name to be 'sbdf' rather than 'bdf', but otherwise...

Indeed. Or have a sub-type "struct pci_bdf_t", as the segment (sadly) isn't relevant yet.

>>  {
>> -    unsigned int bdf;
>> -
>>      ASSERT(CF8_ENABLED(cf8));
>>
>> -    bdf = CF8_BDF(cf8);
>> -    *bus = PCI_BUS(bdf);
>> -    *slot = PCI_SLOT(bdf);
>> -    *func = PCI_FUNC(bdf);
>> +    bdf->sbdf = CF8_BDF(cf8);

Filling ->bdf here and setting ->seg explicitly with zero may also make the current limitation more obvious.

>> --- a/xen/arch/x86/hvm/ioreq.c
>> +++ b/xen/arch/x86/hvm/ioreq.c
>> @@ -1177,17 +1177,15 @@ struct hvm_ioreq_server
>> *hvm_select_ioreq_server(struct domain *d,
>>           (p->addr & ~3) == 0xcfc &&
>>           CF8_ENABLED(cf8) )
>>      {
>> -        uint32_t sbdf, x86_fam;
>> -        unsigned int bus, slot, func, reg;
>> +        uint32_t x86_fam;
>> +        pci_sbdf_t bdf;
>> +        unsigned int reg;
>>
>> -        reg = hvm_pci_decode_addr(cf8, p->addr, &bus, &slot, &func);
>> +        reg = hvm_pci_decode_addr(cf8, p->addr, &bdf);
>>
>>          /* PCI config data cycle */
>> -
>> -        sbdf = XEN_DMOP_PCI_SBDF(0, bus, slot, func);
>> -
>>          type = XEN_DMOP_IO_RANGE_PCI;
>> -        addr = ((uint64_t)sbdf << 32) | reg;
>> +        addr = ((uint64_t)bdf.bdf << 32) | reg;

I also wonder why the field used here is bdf instead of sbdf. It would make for fewer future changes if you used .sbdf here right away.

Jan
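To make the suggestion above concrete, here is a minimal, self-contained sketch of a union that exposes the same 32 bits as a whole SBDF, as a 16-bit BDF plus segment, and as individual fields. It is not the type from the patch: the name `sbdf_sketch_t`, the field names and the helper `decode_cf8_sketch()` are illustrative, and the layout assumes a little-endian host whose compiler allocates bit-fields from the low bits upwards (as GCC does on x86).

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative only; assumes little-endian, low-bits-first bit-fields. */
typedef union {
    uint32_t sbdf;                      /* segment:bus:device.function */
    struct {
        union {
            uint16_t bdf;               /* bus:device.function */
            struct {
                union {
                    uint8_t devfn;      /* device.function */
                    struct {
                        uint8_t func : 3;
                        uint8_t dev  : 5;
                    };
                };
                uint8_t bus;
            };
        };
        uint16_t seg;                   /* PCI segment */
    };
} sbdf_sketch_t;

static_assert(sizeof(sbdf_sketch_t) == sizeof(uint32_t),
              "all views must alias the same 32 bits");

/*
 * Decode a 0xcf8 config-address value. Bits 8-23 carry bus/dev/func;
 * the port-based mechanism has no segment, so it is set to zero
 * explicitly rather than left implied.
 */
static sbdf_sketch_t decode_cf8_sketch(uint32_t cf8)
{
    sbdf_sketch_t s;

    s.bdf = (uint16_t)((cf8 >> 8) & 0xffff);
    s.seg = 0;
    return s;
}
```

Writing `->bdf` and `->seg` separately, rather than assigning the combined field, documents at the call site that the segment is not derived from the access.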
Re: [Xen-devel] [PATCH v7 03/16] xen: clean up grant_table.h
On 19/09/17 17:55, Jan Beulich wrote:
>>>> On 19.09.17 at 11:58, wrote:
>> --- a/xen/common/grant_table.c
>> +++ b/xen/common/grant_table.c
>> @@ -40,6 +40,45 @@
>>  #include
>>  #include
>>
>> +/* Per-domain grant information. */
>> +struct grant_table {
>> +    /*
>> +     * Lock protecting updates to grant table state (version, active
>> +     * entry list, etc.)
>> +     */
>> +    percpu_rwlock_t lock;
>> +    /* Lock protecting the maptrack limit */
>> +    spinlock_t maptrack_lock;
>
> Hmm, I'm not sure about putting two locks so obviously close to one
> another. But then again the structure doesn't look to be larger than
> a cache line anyway, so moving it wouldn't be any win as it seems.

Additionally not many domains need both locks frequently: driver domains (including dom0) use the maptrack_lock mostly, all other domains won't use it at all. So I assume conflicts should be really very very rare.

>
>> @@ -1580,7 +1659,7 @@ gnttab_unpopulate_status_frames(struct domain *d,
>> struct grant_table *gt)
>>   * Grow the grant table. The caller must hold the grant table's
>>   * write lock before calling this function.
>>   */
>> -int
>> +static int
>>  gnttab_grow_table(struct domain *d, unsigned int req_nr_frames)
>>  {
>
> Wouldn't this better be part of patch 2? But no need to resend
> because of this unless v8 becomes necessary anyway.

Hmm, true. I wanted to send V8 tomorrow due to patch 4 (some leftovers from patch development in include/public/domctl.h). I'll do the change.

Juergen
Re: [Xen-devel] [PATCH v7 03/16] xen: clean up grant_table.h
>>> On 19.09.17 at 11:58, wrote:
> --- a/xen/common/grant_table.c
> +++ b/xen/common/grant_table.c
> @@ -40,6 +40,45 @@
>  #include
>  #include
>
> +/* Per-domain grant information. */
> +struct grant_table {
> +    /*
> +     * Lock protecting updates to grant table state (version, active
> +     * entry list, etc.)
> +     */
> +    percpu_rwlock_t lock;
> +    /* Lock protecting the maptrack limit */
> +    spinlock_t maptrack_lock;

Hmm, I'm not sure about putting two locks so obviously close to one another. But then again the structure doesn't look to be larger than a cache line anyway, so moving it wouldn't be any win as it seems.

> @@ -1580,7 +1659,7 @@ gnttab_unpopulate_status_frames(struct domain *d,
> struct grant_table *gt)
>   * Grow the grant table. The caller must hold the grant table's
>   * write lock before calling this function.
>   */
> -int
> +static int
>  gnttab_grow_table(struct domain *d, unsigned int req_nr_frames)
>  {

Wouldn't this better be part of patch 2? But no need to resend because of this unless v8 becomes necessary anyway.

Jan
[Xen-devel] [ovmf baseline-only test] 72125: all pass
This run is configured for baseline tests only.

flight 72125 ovmf real [real]
http://osstest.xs.citrite.net/~osstest/testlogs/logs/72125/

Perfect :-)
All tests in this flight passed as required

version targeted for testing:
 ovmf 91cc526b15ffbbbdec5a57906596f37e059f80be
baseline version:
 ovmf 7f2f96f1a8af3c22bdf5d4dccb020846799f7be0

Last test of basis 72121 2017-09-18 09:49:09 Z 1 days
Testing same since 72125 2017-09-19 08:19:58 Z 0 days 1 attempts

People who touched revisions under test:
 Pankaj Bansal

jobs:
 build-amd64-xsm pass
 build-i386-xsm pass
 build-amd64 pass
 build-i386 pass
 build-amd64-libvirt pass
 build-i386-libvirt pass
 build-amd64-pvops pass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64 pass

sg-report-flight on osstest.xs.citrite.net
logs: /home/osstest/logs
images: /home/osstest/images

Logs, config files, etc. are available at
 http://osstest.xs.citrite.net/~osstest/testlogs/logs

Test harness code can be found at
 http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary

Push not applicable.

commit 91cc526b15ffbbbdec5a57906596f37e059f80be
Author: Pankaj Bansal
Date:   Mon Sep 18 15:42:45 2017 +0800

    MdeModulePkg/SerialDxe: Fix not able to change serial attributes

    Issue : When try to change serial attributes using sermode command,
    the default values are set with the execute flow as below.
    The sermode command calls SerialSetAttributes, which sets H/W
    attributes of Serial device. After that the SerialIo protocol is
    reinstalled, which causes MdeModulePkg/Universal/Console/TerminalDxe
    and MdeModulePkg/Universal/Console/ConPlatformDxe drivers' bindings
    to stop and then start. This in turn calls SerialReset, which undoes
    changes of SerialSetAttributes.

    Cause : The SerialReset command resets the attributes' values to
    default.

    Fix : Serial Reset command should set the attributes which have been
    changed by user after calling SerialSetAttributes.

    Contributed-under: TianoCore Contribution Agreement 1.1
    Signed-off-by: Pankaj Bansal
    Regression-tested-by: Laszlo Ersek
    Reviewed-by: Star Zeng
[Xen-devel] [qemu-mainline test] 113596: regressions - trouble: blocked/broken/fail/pass
flight 113596 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/113596/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-armhf-pvops broken
 build-armhf-pvops 5 host-build-prep fail REGR. vs. 113302
 test-amd64-amd64-xl-qemuu-win7-amd64 18 guest-start/win.repeat fail in 113586 REGR. vs. 113302
 build-armhf-xsm 6 xen-build fail in 113586 REGR. vs. 113302

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop fail pass in 113586

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail REGR. vs. 113302

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-xl-multivcpu 1 build-check(1) blocked n/a
 test-armhf-armhf-xl-credit2 1 build-check(1) blocked n/a
 test-armhf-armhf-libvirt 1 build-check(1) blocked n/a
 test-armhf-armhf-xl-cubietruck 1 build-check(1) blocked n/a
 test-armhf-armhf-xl-rtds 1 build-check(1) blocked n/a
 test-armhf-armhf-libvirt-raw 1 build-check(1) blocked n/a
 test-armhf-armhf-xl-arndale 1 build-check(1) blocked n/a
 test-armhf-armhf-libvirt-xsm 1 build-check(1) blocked n/a
 test-armhf-armhf-xl 1 build-check(1) blocked n/a
 test-armhf-armhf-xl-xsm 1 build-check(1) blocked n/a
 test-armhf-armhf-xl-vhd 1 build-check(1) blocked n/a
 test-armhf-armhf-libvirt-raw 13 saverestore-support-check fail in 113586 like 113302
 test-armhf-armhf-libvirt 14 saverestore-support-check fail in 113586 like 113302
 test-armhf-armhf-xl-arndale 13 migrate-support-check fail in 113586 never pass
 test-armhf-armhf-xl-arndale 14 saverestore-support-check fail in 113586 never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-check fail in 113586 never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-check fail in 113586 never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-check fail in 113586 never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-check fail in 113586 never pass
 test-armhf-armhf-xl-credit2 13 migrate-support-check fail in 113586 never pass
 test-armhf-armhf-xl-credit2 14 saverestore-support-check fail in 113586 never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-check fail in 113586 never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-check fail in 113586 never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-check fail in 113586 never pass
 test-armhf-armhf-xl 13 migrate-support-check fail in 113586 never pass
 test-armhf-armhf-xl 14 saverestore-support-check fail in 113586 never pass
 test-armhf-armhf-xl-vhd 12 migrate-support-check fail in 113586 never pass
 test-armhf-armhf-xl-vhd 13 saverestore-support-check fail in 113586 never pass
 test-armhf-armhf-libvirt 13 migrate-support-check fail in 113586 never pass
 test-amd64-amd64-xl-rtds 10 debian-install fail like 113302
 test-amd64-amd64-xl-qemuu-ws16-amd64 10 windows-install fail never pass
 test-amd64-i386-libvirt-xsm 13 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-check fail never pass
 test-amd64-i386-libvirt 13 migrate-support-check fail never pass
 test-amd64-amd64-libvirt 13 migrate-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-amd64-i386-libvirt-qcow2 12 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-check fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2 fail never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 13 guest-saverestore fail never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-install fail never pass

version targeted for testing:
 qemuu a9158a5cba955b79d580a252cc58ff44d154e370
baseline version:
 qemuu a6e8c1dacfd37d34542e33600dcc50b7683b735a

Last test of basis 113302 2017-09-11 10:18:16 Z 8 days
Failing since 113345 2017-09-12 00:21:07 Z 7 days 14 attempts
Testing same since 113580 2017-09-18 13:19:38 Z 1 days 3 attempts

People who touched revisions under test:
 Alex Bennée
 Alexander Graf
 Alexey Kardashevskiy
 Alistair Francis
 Amador Pahim
Re: [Xen-devel] [PATCH v6 01/11] pci: introduce a type to store a SBDF
> -Original Message- > From: Roger Pau Monne [mailto:roger@citrix.com] > Sent: 19 September 2017 16:29 > To: xen-de...@lists.xenproject.org > Cc: konrad.w...@oracle.com; boris.ostrov...@oracle.com; Roger Pau Monne >; Paul Durrant ; Jan > Beulich ; Andrew Cooper > ; George Dunlap > ; Ian Jackson ; > Stefano Stabellini ; Tim (Xen.org) ; > Wei Liu > Subject: [PATCH v6 01/11] pci: introduce a type to store a SBDF > > That provides direct access to all the members that constitute a SBDF. > The only function switched to use it is hvm_pci_decode_addr, because > it makes following patches simpler. > > Suggested-by: Andrew Cooper > Signed-off-by: Roger Pau Monné > --- > Cc: Paul Durrant > Cc: Jan Beulich > Cc: Andrew Cooper > Cc: George Dunlap > Cc: Ian Jackson > Cc: Konrad Rzeszutek Wilk > Cc: Stefano Stabellini > Cc: Tim Deegan > Cc: Wei Liu > --- > Changes since v5: > - New in this version. > --- > xen/arch/x86/hvm/io.c| 10 ++ > xen/arch/x86/hvm/ioreq.c | 12 +--- > xen/include/asm-x86/hvm/io.h | 4 ++-- > xen/include/xen/pci.h| 20 > 4 files changed, 29 insertions(+), 17 deletions(-) > > diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c > index bf41954f59..4e49e59012 100644 > --- a/xen/arch/x86/hvm/io.c > +++ b/xen/arch/x86/hvm/io.c > @@ -257,17 +257,11 @@ void register_g2m_portio_handler(struct domain > *d) > } > > unsigned int hvm_pci_decode_addr(unsigned int cf8, unsigned int addr, > - unsigned int *bus, unsigned int *slot, > - unsigned int *func) > + pci_sbdf_t *bdf) I'd prefer the pointer name to be 'sbdf' rather than 'bdf', but otherwise... Reviewed-by: Paul Durrant > { > -unsigned int bdf; > - > ASSERT(CF8_ENABLED(cf8)); > > -bdf = CF8_BDF(cf8); > -*bus = PCI_BUS(bdf); > -*slot = PCI_SLOT(bdf); > -*func = PCI_FUNC(bdf); > +bdf->sbdf = CF8_BDF(cf8); > /* > * NB: the lower 2 bits of the register address are fetched from the > * offset into the 0xcfc register when reading/writing to it. 
> diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
> index 752976d16d..3e7a88e053 100644
> --- a/xen/arch/x86/hvm/ioreq.c
> +++ b/xen/arch/x86/hvm/ioreq.c
> @@ -1177,17 +1177,15 @@ struct hvm_ioreq_server
> *hvm_select_ioreq_server(struct domain *d,
>           (p->addr & ~3) == 0xcfc &&
>           CF8_ENABLED(cf8) )
>      {
> -        uint32_t sbdf, x86_fam;
> -        unsigned int bus, slot, func, reg;
> +        uint32_t x86_fam;
> +        pci_sbdf_t bdf;
> +        unsigned int reg;
>
> -        reg = hvm_pci_decode_addr(cf8, p->addr, &bus, &slot, &func);
> +        reg = hvm_pci_decode_addr(cf8, p->addr, &bdf);
>
>          /* PCI config data cycle */
> -
> -        sbdf = XEN_DMOP_PCI_SBDF(0, bus, slot, func);
> -
>          type = XEN_DMOP_IO_RANGE_PCI;
> -        addr = ((uint64_t)sbdf << 32) | reg;
> +        addr = ((uint64_t)bdf.bdf << 32) | reg;
>          /* AMD extended configuration space access? */
>          if ( CF8_ADDR_HI(cf8) &&
>               d->arch.cpuid->x86_vendor == X86_VENDOR_AMD &&
> diff --git a/xen/include/asm-x86/hvm/io.h b/xen/include/asm-x86/hvm/io.h
> index 51659b6c7f..2ff1c96883 100644
> --- a/xen/include/asm-x86/hvm/io.h
> +++ b/xen/include/asm-x86/hvm/io.h
> @@ -20,6 +20,7 @@
>  #define __ASM_X86_HVM_IO_H__
>
>  #include
> +#include
>  #include
>  #include
>  #include
> @@ -151,8 +152,7 @@ extern void hvm_dpci_msi_eoi(struct domain *d, int
> vector);
>
>  /* Decode a PCI port IO access into a bus/slot/func/reg. */
>  unsigned int hvm_pci_decode_addr(unsigned int cf8, unsigned int addr,
> -                                 unsigned int *bus, unsigned int *slot,
> -                                 unsigned int *func);
> +                                 pci_sbdf_t *bdf);
>
>  /*
>   * HVM port IO handler that performs forwarding of guest IO ports into
> machine
> diff --git a/xen/include/xen/pci.h b/xen/include/xen/pci.h
> index 43f21251a5..dd5ec43a70 100644
> --- a/xen/include/xen/pci.h
> +++ b/xen/include/xen/pci.h
> @@ -38,6 +38,26 @@
>  #define PCI_SBDF2(s,bdf) ((((s) & 0xffff) << 16) | ((bdf) & 0xffff))
>  #define PCI_SBDF3(s,b,df) ((((s) & 0xffff) << 16) | PCI_BDF2(b, df))
>
> +typedef union {
> +    uint32_t sbdf;
> +    struct {
> +        union {
> +            uint16_t bdf;
> +            struct {
> +                union {
> +
Re: [Xen-devel] [PATCH] x86/domctl: Don't pause the whole domain if only getting vcpu state
>>> On 19.09.17 at 17:28, wrote:
> On Ma, 2017-09-19 at 00:11 -0600, Jan Beulich wrote:
>> > > > Razvan Cojocaru 09/18/17 7:05 PM
>> > On 09/18/2017 06:35 PM, Jan Beulich wrote:
>> > > > > > On 12.09.17 at 15:53, wrote:
>> > > > --- a/xen/arch/x86/domctl.c
>> > > > +++ b/xen/arch/x86/domctl.c
>> > > > @@ -625,6 +625,26 @@ long arch_do_domctl(
>> > > >               !is_hvm_domain(d) )
>> > > >              break;
>> > > >
>> > > > +        if ( domctl->u.hvmcontext_partial.type == HVM_SAVE_CODE(CPU) &&
>> > > > +             domctl->u.hvmcontext_partial.instance < d->max_vcpus )
>> > > I have to admit that I'm not in favor of such special casing, even
>> > > less so without any code comment saying why this is so special.
>> > > What if someone else wanted some other piece of vCPU state
>> > > without pausing the entire domain? Wouldn't it be possible to
>> > > generalize this to cover all such state elements?
>> > There's no reason why all the other cases where this would be possible
>> > shouldn't be optimized. What has made this one stand out for us is that
>> > we're using it a lot with introspection, and the optimization counts.
>> >
>> > But judging by the code reorganization (the addition of
>> > hvm_save_one_cpu_ctxt()), the changes would need to be done on a
>> > one-by-one case anyway (different queries may require different ways of
>> > changing the code).
>> But this function addition is precisely what I'd like to avoid in favor of
>> an extension to the existing mechanism using the registered function
>> pointers.
>>
> What will be a suitable extension of the current call back system?

I'm not sure what you expect as an answer here. Something following the current model, but skipping everything that's not per-vCPU, and for everything being per-vCPU handling just the single vCPU of interest.

Jan
Re: [Xen-devel] [RFC PATCH V3 2/3] Tool/ACPI: DSDT extension to support more vcpus
On Tue, Sep 19, 2017 at 09:02:06AM -0600, Jan Beulich wrote:
> >>> On 19.09.17 at 16:13, wrote:
> > We need to be careful to not create local APIC entries with either
> > APIC or ACPI ID equal to 255 (and to also not create Processor objects
> > with ACPI ID of 255).
>
> Why? An ACPI or APIC ID is still fine as long as it does only occur
> in x2APIC contexts.

That's what I was trying to refer to with "local APIC entries" and "Processor objects" as opposed to "x2APIC entries" and "Processor Devices", which are x2APIC contexts.

AFAICT we are talking about the same thing.

Roger.
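The distinction being discussed can be sketched as a small selection rule: a processor gets a Processor Local APIC entry (MADT type 0) only when both its APIC ID and its ACPI ID fit below 255, and a Processor Local x2APIC entry (MADT type 9) otherwise. The helper and struct names below are hypothetical and the entry struct is a simplified stand-in, not the real table layout used by the tools.

```c
#include <assert.h>
#include <stdint.h>

/* MADT interrupt controller structure types used by this sketch. */
enum madt_type_sketch {
    MADT_LOCAL_APIC   = 0,  /* Processor Local APIC: 8-bit IDs */
    MADT_LOCAL_X2APIC = 9,  /* Processor Local x2APIC: 32-bit IDs */
};

/* Simplified stand-in for a local APIC entry: both IDs are 8 bits wide. */
struct lapic_entry_sketch {
    uint8_t acpi_id;
    uint8_t apic_id;
};

/* Pick the entry type: 255 and above must not appear in a local APIC entry. */
static enum madt_type_sketch madt_type_for(uint32_t apic_id, uint32_t acpi_id)
{
    return (apic_id < 255 && acpi_id < 255) ? MADT_LOCAL_APIC
                                            : MADT_LOCAL_X2APIC;
}

/* Build a local APIC entry; callers must have checked the type first. */
static struct lapic_entry_sketch make_lapic_entry(uint32_t apic_id,
                                                  uint32_t acpi_id)
{
    struct lapic_entry_sketch e;

    assert(madt_type_for(apic_id, acpi_id) == MADT_LOCAL_APIC);
    e.acpi_id = (uint8_t)acpi_id;
    e.apic_id = (uint8_t)apic_id;
    return e;
}
```

Keeping the check in one helper means the same rule can decide both which MADT entry to emit and whether a legacy Processor object or a Processor Device is appropriate.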
Re: [Xen-devel] [PATCH v7 02/16] xen: move XENMAPSPACE_grant_table code into grant_table.c
>>> On 19.09.17 at 12:08, wrote:
>> -----Original Message-----
>> From: Xen-devel [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of
>> Juergen Gross
>> Sent: 19 September 2017 10:59
>> To: xen-de...@lists.xenproject.org
>> Cc: Juergen Gross; sstabell...@kernel.org; Wei Liu; George Dunlap;
>> Andrew Cooper; Ian Jackson; Tim (Xen.org);
>> julien.gr...@arm.com; jbeul...@suse.com; dgde...@tycho.nsa.gov
>> Subject: [Xen-devel] [PATCH v7 02/16] xen: move
>> XENMAPSPACE_grant_table code into grant_table.c
>>
>> The x86 and arm versions of XENMAPSPACE_grant_table handling are nearly
>> identical. Move the code into a function in grant_table.c and add an
>> architecture dependent hook to handle the differences.
>>
>> Switch to mfn_t in order to be more type safe.
>>
>> Signed-off-by: Juergen Gross
>
> Reviewed-by: Paul Durrant

Acked-by: Jan Beulich
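As a rough illustration of why a wrapped frame-number type is "more type safe" than a bare integer, the sketch below wraps machine and guest frame numbers in distinct single-member structs so that mixing them up becomes a compile-time error. The names `mfn_sk_t`, `gfn_sk_t`, `mfn_wrap()`, `mfn_raw()` and `mfn_add_sk()` are hypothetical and chosen only for this example.

```c
#include <assert.h>

/* Distinct wrapper types: passing a gfn where an mfn is expected
 * no longer compiles, unlike with plain unsigned long. */
typedef struct { unsigned long mfn; } mfn_sk_t;
typedef struct { unsigned long gfn; } gfn_sk_t;

static_assert(sizeof(mfn_sk_t) == sizeof(unsigned long),
              "the wrapper must not add storage overhead");

/* Wrap a raw frame number. */
static inline mfn_sk_t mfn_wrap(unsigned long n)
{
    mfn_sk_t m = { n };
    return m;
}

/* Unwrap when a raw number is really needed. */
static inline unsigned long mfn_raw(mfn_sk_t m)
{
    return m.mfn;
}

/* Arithmetic stays inside the type. */
static inline mfn_sk_t mfn_add_sk(mfn_sk_t m, unsigned long i)
{
    return mfn_wrap(mfn_raw(m) + i);
}
```

The explicit wrap and unwrap calls mark every place where a typed frame number crosses into untyped code, which is where such conversions are easiest to review.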
Re: [Xen-devel] Ping: [PATCH 2/2] public/sysctl: drop unnecessary typedefs and handles
On September 19, 2017 11:31:40 AM EDT, Jan Beulichwrote: On 12.09.17 at 17:10, wrote: >> --- a/xen/common/livepatch.c >> +++ b/xen/common/livepatch.c >> @@ -104,7 +104,7 @@ static struct livepatch_work livepatch_w >> */ >> static DEFINE_PER_CPU(bool_t, work_to_do); >> >> -static int get_name(const xen_livepatch_name_t *name, char *n) >> +static int get_name(const struct xen_livepatch_name *name, char *n) >> { >> if ( !name->size || name->size > XEN_LIVEPATCH_NAME_SIZE ) >> return -EINVAL; >> @@ -121,7 +121,7 @@ static int get_name(const xen_livepatch_ >> return 0; >> } >> >> -static int verify_payload(const xen_sysctl_livepatch_upload_t >*upload, char *n) >> +static int verify_payload(const struct xen_sysctl_livepatch_upload >*upload, char *n) >> { >> if ( get_name(>name, n) ) >> return -EINVAL; >> @@ -897,7 +897,7 @@ static int load_payload_data(struct payl >> return rc; >> } >> >> -static int livepatch_upload(xen_sysctl_livepatch_upload_t *upload) >> +static int livepatch_upload(struct xen_sysctl_livepatch_upload >*upload) >> { >> struct payload *data, *found; >> char n[XEN_LIVEPATCH_NAME_SIZE]; >> @@ -954,7 +954,7 @@ static int livepatch_upload(xen_sysctl_l >> return rc; >> } >> >> -static int livepatch_get(xen_sysctl_livepatch_get_t *get) >> +static int livepatch_get(struct xen_sysctl_livepatch_get *get) >> { >> struct payload *data; >> int rc; >> @@ -985,9 +985,9 @@ static int livepatch_get(xen_sysctl_live >> return 0; >> } >> >> -static int livepatch_list(xen_sysctl_livepatch_list_t *list) >> +static int livepatch_list(struct xen_sysctl_livepatch_list *list) >> { >> -xen_livepatch_status_t status; >> +struct xen_livepatch_status status; >> struct payload *data; >> unsigned int idx = 0, i = 0; >> int rc = 0; >> @@ -1451,7 +1451,7 @@ static int build_id_dep(struct payload * >> return 0; >> } >> >> -static int livepatch_action(xen_sysctl_livepatch_action_t *action) >> +static int livepatch_action(struct xen_sysctl_livepatch_action >*action) >> { >> struct 
payload *data; >> char n[XEN_LIVEPATCH_NAME_SIZE]; >> @@ -1560,7 +1560,7 @@ static int livepatch_action(xen_sysctl_l >> return rc; >> } >> >> -int livepatch_op(xen_sysctl_livepatch_op_t *livepatch) >> +int livepatch_op(struct xen_sysctl_livepatch_op *livepatch) >> { >> int rc; >> > >Konrad, Ross? Reviewed-by: Konrad Rzeszutek Wilk > >> --- a/xen/common/sched_arinc653.c >> +++ b/xen/common/sched_arinc653.c >> @@ -694,7 +694,7 @@ static int >> a653sched_adjust_global(const struct scheduler *ops, >> struct xen_sysctl_scheduler_op *sc) >> { >> -xen_sysctl_arinc653_schedule_t local_sched; >> +struct xen_sysctl_arinc653_schedule local_sched; >> int rc = -EINVAL; >> >> switch ( sc->cmd ) > >Robert, Josh? > >> --- a/xen/common/trace.c >> +++ b/xen/common/trace.c >> @@ -367,9 +367,9 @@ void __init init_trace_bufs(void) >> >> /** >> * tb_control - sysctl operations on trace buffers. >> - * @tbc: a pointer to a xen_sysctl_tbuf_op_t to be filled out >> + * @tbc: a pointer to a struct xen_sysctl_tbuf_op to be filled out >> */ >> -int tb_control(xen_sysctl_tbuf_op_t *tbc) >> +int tb_control(struct xen_sysctl_tbuf_op *tbc) >> { >> static DEFINE_SPINLOCK(lock); >> int rc = 0; > >George? > >Jan Thanks! ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH v6 11/11] vpci/msix: add MSI-X handlers
Add handlers for accesses to the MSI-X message control field on the PCI configuration space, and traps for accesses to the memory region that contains the MSI-X table and PBA. These traps detect attempts from the guest to configure MSI-X interrupts and properly set them up.

Note that accesses to the Table Offset, Table BIR, PBA Offset and PBA BIR are not trapped by Xen at the moment.

Finally, turn the panic in the Dom0 PVH builder into a warning.

Signed-off-by: Roger Pau Monné
---
Cc: Jan Beulich
Cc: Andrew Cooper
---
Changes since v5:
 - Update lock usage.
 - Unbind/unmap PIRQs when MSIX is disabled.
 - Share the arch-specific MSIX code with the MSI functions.
 - Do not reference the MSIX memory areas from the PCI BARs fields, instead fetch the BIR and offset each time needed.
 - Add the '_entry' suffix to the MSIX arch functions.
 - Prefix the vMSIX macros with 'V'.
 - s/gdprintk/gprintk/ in msix.c
 - Make vpci_msix_access_check return bool, and change its name to vpci_msix_access_allowed.
 - Join the first two ifs in vpci_msix_{read/write} into a single one.
 - Allow Dom0 to write to the PBA area.
 - Add a note that reads from the PBA area will need to be translated if the PBA is not identity mapped.

Changes since v4:
 - Remove parentheses around offsetof.
 - Add "being" to MSI-X enabling comment.
 - Use INVALID_PIRQ.
 - Add a simple sanity check to vpci_msix_arch_enable in order to detect wrong MSI-X entries more quickly.
 - Constify vpci_msix_arch_print entry argument.
 - s/cpu/fixed/ in vpci_msix_arch_print.
 - Dump the MSI-X info together with the MSI info.
 - Fix vpci_msix_control_write to take into account changes to the address and data fields when switching the function mask bit.
 - Only disable/enable the entries if the address or data fields have been updated.
 - Use the BAR enable field to check if a BAR is mapped or not (instead of reading the command register for each device).
 - Fix error path in vpci_msix_read to set the return data to ~0.
 - Simplify mask usage in vpci_msix_write.
 - Cast data to uint64_t when shifting it 32 bits.
 - Fix writes to the table entry control register to take into account if the mask-all bit is set.
 - Add some comments to clarify the intended behavior of the code.
 - Align the PBA size to 64-bits.
 - Remove the error label in vpci_init_msix.
 - Try to compact the layout of the vpci_msix structure.
 - Remove the local table_bar and pba_bar variables from vpci_init_msix, they are used only once.

Changes since v3:
 - Propagate changes from previous versions: remove xen_ prefix, use the new fields in vpci_val and remove the return value from handlers.
 - Remove the usage of GENMASK.
 - Move the arch-specific parts of the dump routine to the x86/hvm/vmsi.c dump handler.
 - Chain the MSI-X dump handler to the 'M' debug key.
 - Fix the header BAR mappings so that the MSI-X regions inside of BARs are unmapped from the domain p2m in order for the handlers to work properly.
 - Unconditionally trap and forward accesses to the PBA MSI-X area.
 - Simplify the conditionals in vpci_msix_control_write.
 - Fix vpci_msix_accept to use a bool type.
 - Allow all supported accesses as described in the spec to the MSI-X table.
 - Truncate the returned address when the access is a 32b read.
 - Always return X86EMUL_OKAY from the handlers, returning ~0 in the read case if the access is not supported, or ignoring writes.
 - Do not check that max_entries is != 0 in the init handler.
 - Use trylock in the dump handler.

Changes since v2:
 - Split out arch-specific code.

This patch has been tested with devices using both a single MSI-X entry and multiple ones.
--- xen/arch/x86/hvm/dom0_build.c| 2 +- xen/arch/x86/hvm/hvm.c | 1 + xen/arch/x86/hvm/vmsi.c | 133 -- xen/drivers/vpci/Makefile| 2 +- xen/drivers/vpci/header.c| 16 ++ xen/drivers/vpci/msi.c | 22 +- xen/drivers/vpci/msix.c | 506 +++ xen/include/asm-x86/hvm/domain.h | 3 + xen/include/asm-x86/hvm/io.h | 5 + xen/include/xen/vpci.h | 45 10 files changed, 705 insertions(+), 30 deletions(-) create mode 100644 xen/drivers/vpci/msix.c diff --git a/xen/arch/x86/hvm/dom0_build.c b/xen/arch/x86/hvm/dom0_build.c index 17d77137d6..8fa92bc5b6 100644 --- a/xen/arch/x86/hvm/dom0_build.c +++ b/xen/arch/x86/hvm/dom0_build.c @@ -,7 +,7 @@ int __init dom0_construct_pvh(struct domain *d, const module_t *image, pvh_setup_mmcfg(d); -panic("Building a PVHv2 Dom0 is not yet supported."); +printk("WARNING: PVH is an experimental mode with limited functionality\n"); return 0; } diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c index b1064413fc..042b7c6a31 100644 --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -585,6 +585,7 @@
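Two of the changelog items above ("Align the PBA size to 64-bits" and "Cast data to uint64_t when shifting it 32 bits") can be illustrated with a short sketch. This is not the code from msix.c: the helper names are hypothetical, and the only facts assumed are that the PBA holds one pending bit per table entry in 64-bit words and that an MSI-X table has at most 2048 entries.

```c
#include <assert.h>
#include <stdint.h>

#define MSIX_MAX_ENTRIES_SKETCH 2048u

/* Size in bytes of the Pending Bit Array: one bit per table entry,
 * rounded up to a whole number of 64-bit words. */
static unsigned int pba_size_bytes(unsigned int nr_entries)
{
    assert(nr_entries >= 1 && nr_entries <= MSIX_MAX_ENTRIES_SKETCH);
    return ((nr_entries + 63) / 64) * 8;
}

/* Combine the two 32-bit halves of a message address. The cast matters:
 * shifting a 32-bit value left by 32 bits is undefined in C, so the
 * upper half has to be widened before the shift. */
static uint64_t msix_addr_combine(uint32_t lo, uint32_t hi)
{
    return ((uint64_t)hi << 32) | lo;
}
```

With these helpers a device exposing 65 table entries needs a 16-byte PBA, and a 32-bit read of the low address dword can simply truncate the combined value.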
[Xen-devel] [PATCH v6 09/11] vpci/msi: add MSI handlers
Add handlers for the MSI control, address, data and mask fields in order to detect accesses to them and set up the interrupts as requested by the guest.

Note that the pending register is not trapped, and the guest can freely read/write to it.

Signed-off-by: Roger Pau Monné
Reviewed-by: Paul Durrant
---
Cc: Jan Beulich
Cc: Andrew Cooper
Cc: Paul Durrant
---
Changes since v5:
 - Update to new lock usage.
 - Change handlers to match the new type.
 - s/msi_flags/msi_gflags/, remove the local variables and use the new DOMCTL_VMSI_* defines.
 - Change the MSI arch function to take a vpci_msi instead of a vpci_arch_msi as parameter.
 - Fix the calculation of the guest vector for MSI injection to take into account the number of bits that can be modified.
 - Use INVALID_PIRQ everywhere.
 - Simplify exit path of vpci_msi_disable.
 - Remove the conditional when setting address64 and masking fields.
 - Add a process_pending_softirqs to the MSI dump loop.
 - Place the prototypes for the MSI arch-specific functions in xen/vpci.h.
 - Add parentheses around the INVALID_PIRQ definition.

Changes since v4:
 - Fix commit message.
 - Change the ASSERTs in vpci_msi_arch_mask into ifs.
 - Introduce INVALID_PIRQ.
 - Destroy the partially created bindings in case of failure in vpci_msi_arch_enable.
 - Just take the pcidevs lock once in vpci_msi_arch_disable.
 - Print an error message in case of failure of pt_irq_destroy_bind.
 - Make vpci_msi_arch_init return void.
 - Constify the arch parameter of vpci_msi_arch_print.
 - Use fixed instead of cpu for msi redirection.
 - Separate the header includes in vpci/msi.c between xen and asm.
 - Store the number of configured vectors even if MSI is not enabled and always return it in vpci_msi_control_read.
 - Fix/add comments in vpci_msi_control_write to clarify intended behavior.
 - Simplify usage of masks in vpci_msi_address_{upper_}write.
 - Add comment to vpci_msi_mask_{read/write}.
 - Don't use MASK_EXTR in vpci_msi_mask_write.
 - s/msi_offset/pos/ in vpci_init_msi.
 - Move control variable setup closer to its usage.
 - Use d%d in vpci_dump_msi.
 - Fix printing of bitfield mask in vpci_dump_msi.
 - Fix definition of MSI_ADDR_REDIRECTION_MASK.
 - Shuffle the layout of vpci_msi to minimize gaps.
 - Remove the error label in vpci_init_msi.

Changes since v3:
 - Propagate changes from previous versions: drop xen_ prefix, drop return value from handlers, use the new vpci_val fields.
 - Use MASK_EXTR.
 - Remove the usage of GENMASK.
 - Add GFLAGS_SHIFT_DEST_ID and use it in msi_flags.
 - Add "arch" to the MSI arch specific functions.
 - Move the dumping of vPCI MSI information to dump_msi (key 'M').
 - Remove the guest_vectors field.
 - Allow the guest to change the number of active vectors without having to disable and enable MSI.
 - Check the number of active vectors when parsing the disable mask.
 - Remove the debug messages from vpci_init_msi.
 - Move the arch-specific part of the dump handler to x86/hvm/vmsi.c.
 - Use trylock in the dump handler to get the vpci lock.

Changes since v2:
 - Add an arch-specific abstraction layer. Note that this is only implemented for x86 currently.
 - Add a wrapper to detect MSI enabling for vPCI.

NB: I've only been able to test this with devices using a single MSI interrupt and no mask register. I will try to find hardware that supports the mask register and more than one vector, but I cannot make any promises.

If there are doubts about the untested parts we could always force Xen to report no per-vector masking support and only 1 available vector, but I would rather avoid doing it.
--- xen/arch/x86/hvm/vmsi.c | 153 ++ xen/arch/x86/msi.c | 3 + xen/drivers/vpci/Makefile| 2 +- xen/drivers/vpci/msi.c | 366 +++ xen/include/asm-x86/hvm/io.h | 5 + xen/include/asm-x86/msi.h| 1 + xen/include/xen/irq.h| 1 + xen/include/xen/vpci.h | 35 + 8 files changed, 565 insertions(+), 1 deletion(-) create mode 100644 xen/drivers/vpci/msi.c diff --git a/xen/arch/x86/hvm/vmsi.c b/xen/arch/x86/hvm/vmsi.c index 9b35e9b696..3dcde3d882 100644 --- a/xen/arch/x86/hvm/vmsi.c +++ b/xen/arch/x86/hvm/vmsi.c @@ -31,6 +31,7 @@ #include #include #include +#include #include #include #include @@ -621,3 +622,155 @@ void msix_write_completion(struct vcpu *v) if ( msixtbl_write(v, ctrl_address, 4, 0) != X86EMUL_OKAY ) gdprintk(XENLOG_WARNING, "MSI-X write completion failure\n"); } + +static unsigned int msi_gflags(uint16_t data, uint64_t addr) +{ +/* + * We need to use the DOMCTL constants here because the output of this + * function is used as input to pt_irq_create_bind, which also takes the + * input from the DOMCTL itself. + */ +return
[Xen-devel] [PATCH v6 05/11] pci: split code to size BARs from pci_add_device
So that it can be called from outside in order to get the size of regular PCI BARs. This will be required in order to map the BARs from PCI devices into PVH Dom0 p2m. Signed-off-by: Roger Pau MonnéReviewed-by: Jan Beulich --- Cc: Jan Beulich --- Changes since v5: - Introduce a flags field for pci_size_mem_bar. - Use pci_sbdf_t. Changes since v4: - Restore printing whether the BAR is from a vf. - Make the psize pointer parameter not optional. - s/u64/uint64_t. - Remove some unneeded parentheses. - Assert the return value is never 0. - Use the newly introduced pci_sbdf_t type. Changes since v3: - Rename function to size BARs to pci_size_mem_bar. - Change the parameters passed to the function. Pass the position and whether the BAR is the last one, instead of the (base, max_bars, *index) tuple. - Make the function return the number of BARs consumed (1 for 32b, 2 for 64b BARs). - Change the dprintk back to printk. - Do not log another error message in pci_add_device in case pci_size_mem_bar fails. 
--- xen/drivers/passthrough/pci.c | 98 --- xen/include/xen/pci.h | 4 ++ 2 files changed, 68 insertions(+), 34 deletions(-) diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c index 975485fe05..ba58b4d0cc 100644 --- a/xen/drivers/passthrough/pci.c +++ b/xen/drivers/passthrough/pci.c @@ -603,6 +603,56 @@ static int iommu_add_device(struct pci_dev *pdev); static int iommu_enable_device(struct pci_dev *pdev); static int iommu_remove_device(struct pci_dev *pdev); +int pci_size_mem_bar(pci_sbdf_t sbdf, unsigned int pos, bool last, + uint64_t *paddr, uint64_t *psize, unsigned int flags) +{ +uint32_t hi = 0, bar = pci_conf_read32(sbdf.seg, sbdf.bus, sbdf.dev, + sbdf.func, pos); +uint64_t addr, size; +bool vf = flags & PCI_BAR_VF; + +ASSERT((bar & PCI_BASE_ADDRESS_SPACE) == PCI_BASE_ADDRESS_SPACE_MEMORY); +pci_conf_write32(sbdf.seg, sbdf.bus, sbdf.dev, sbdf.func, pos, ~0); +if ( (bar & PCI_BASE_ADDRESS_MEM_TYPE_MASK) == + PCI_BASE_ADDRESS_MEM_TYPE_64 ) +{ +if ( last ) +{ +printk(XENLOG_WARNING + "%sdevice %04x:%02x:%02x.%u with 64-bit %sBAR in last slot\n", + vf ? "SR-IOV " : "", sbdf.seg, sbdf.bus, sbdf.dev, sbdf.func, + vf ? 
"vf " : ""); +return -EINVAL; +} +hi = pci_conf_read32(sbdf.seg, sbdf.bus, sbdf.dev, sbdf.func, pos + 4); +pci_conf_write32(sbdf.seg, sbdf.bus, sbdf.dev, sbdf.func, pos + 4, ~0); +} +size = pci_conf_read32(sbdf.seg, sbdf.bus, sbdf.dev, sbdf.func, pos) & + PCI_BASE_ADDRESS_MEM_MASK; +if ( (bar & PCI_BASE_ADDRESS_MEM_TYPE_MASK) == + PCI_BASE_ADDRESS_MEM_TYPE_64 ) +{ +size |= (uint64_t)pci_conf_read32(sbdf.seg, sbdf.bus, sbdf.dev, + sbdf.func, pos + 4) << 32; +pci_conf_write32(sbdf.seg, sbdf.bus, sbdf.dev, sbdf.func, pos + 4, hi); +} +else if ( size ) +size |= (uint64_t)~0 << 32; +pci_conf_write32(sbdf.seg, sbdf.bus, sbdf.dev, sbdf.func, pos, bar); +size = -size; +addr = (bar & PCI_BASE_ADDRESS_MEM_MASK) | ((uint64_t)hi << 32); + +if ( paddr ) +*paddr = addr; +*psize = size; + +if ( (bar & PCI_BASE_ADDRESS_MEM_TYPE_MASK) == + PCI_BASE_ADDRESS_MEM_TYPE_64 ) +return 2; + +return 1; +} + int pci_add_device(u16 seg, u8 bus, u8 devfn, const struct pci_dev_info *info, nodeid_t node) { @@ -674,11 +724,16 @@ int pci_add_device(u16 seg, u8 bus, u8 devfn, unsigned int i; BUILD_BUG_ON(ARRAY_SIZE(pdev->vf_rlen) != PCI_SRIOV_NUM_BARS); -for ( i = 0; i < PCI_SRIOV_NUM_BARS; ++i ) +for ( i = 0; i < PCI_SRIOV_NUM_BARS; ) { unsigned int idx = pos + PCI_SRIOV_BAR + i * 4; u32 bar = pci_conf_read32(seg, bus, slot, func, idx); -u32 hi = 0; +pci_sbdf_t sbdf = { +.seg = seg, +.bus = bus, +.dev = slot, +.func = func, +}; if ( (bar & PCI_BASE_ADDRESS_SPACE) == PCI_BASE_ADDRESS_SPACE_IO ) @@ -689,38 +744,13 @@ int pci_add_device(u16 seg, u8 bus, u8 devfn, seg, bus, slot, func, i); continue; } -pci_conf_write32(seg, bus, slot, func, idx, ~0); -if ( (bar & PCI_BASE_ADDRESS_MEM_TYPE_MASK) == - PCI_BASE_ADDRESS_MEM_TYPE_64 ) -{ -if ( i >= PCI_SRIOV_NUM_BARS ) -{ -printk(XENLOG_WARNING -
[Xen-devel] [PATCH v6 07/11] xen: introduce rangeset_consume_ranges
This function allows iterating over a rangeset while removing the processed regions. It will be used by the following patches in order to store memory regions in rangesets, and remove them while iterating.

Signed-off-by: Roger Pau Monné
---
Cc: George Dunlap
Cc: Ian Jackson
Cc: Jan Beulich
Cc: Konrad Rzeszutek Wilk
Cc: Stefano Stabellini
Cc: Tim Deegan
Cc: Wei Liu
---
Changes since v5:
 - New in this version.
---
 xen/common/rangeset.c      | 28
 xen/include/xen/rangeset.h |  4
 2 files changed, 32 insertions(+)

diff --git a/xen/common/rangeset.c b/xen/common/rangeset.c
index 6c6293c15c..fd4a6b3384 100644
--- a/xen/common/rangeset.c
+++ b/xen/common/rangeset.c
@@ -298,6 +298,34 @@ int rangeset_report_ranges(
     return rc;
 }
 
+int rangeset_consume_ranges(
+    struct rangeset *r,
+    int (*cb)(unsigned long s, unsigned long e, void *, unsigned long *c),
+    void *ctxt)
+{
+    int rc = 0;
+
+    write_lock(&r->lock);
+    while ( !rangeset_is_empty(r) )
+    {
+        unsigned long consumed = 0;
+        struct range *x = first_range(r);
+
+        rc = cb(x->s, x->e, ctxt, &consumed);
+
+        ASSERT(consumed <= x->e - x->s + 1);
+        x->s += consumed;
+        if ( x->s > x->e )
+            destroy_range(r, x);
+
+        if ( rc )
+            break;
+    }
+    write_unlock(&r->lock);
+
+    return rc;
+}
+
 int rangeset_add_singleton(
     struct rangeset *r, unsigned long s)
 {
diff --git a/xen/include/xen/rangeset.h b/xen/include/xen/rangeset.h
index aa6408248b..dfdb193800 100644
--- a/xen/include/xen/rangeset.h
+++ b/xen/include/xen/rangeset.h
@@ -67,6 +67,10 @@ bool_t __must_check rangeset_overlaps_range(
 int rangeset_report_ranges(
     struct rangeset *r, unsigned long s, unsigned long e,
     int (*cb)(unsigned long s, unsigned long e, void *), void *ctxt);
+int rangeset_consume_ranges(
+    struct rangeset *r,
+    int (*cb)(unsigned long s, unsigned long e, void *, unsigned long *c),
+    void *ctxt);
 
 /* Add/remove/query a single number. */
 int __must_check rangeset_add_singleton(
-- 
2.11.0 (Apple Git-81)
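To illustrate the consume-while-iterating contract described in this patch, here is a minimal user-space sketch. It is not the Xen implementation: struct rangeset is modelled as a small fixed array, locking is left out, and the names consume_ranges, map_chunk and struct budget are made up for the example. The callback reports through its last parameter how many units it processed, and a non-zero return value stops the loop (as a preemption check might).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for Xen's rangeset: an array of inclusive ranges. */
struct range { unsigned long s, e; };
struct rangeset { struct range r[8]; size_t nr; };

typedef int consume_cb(unsigned long s, unsigned long e, void *ctxt,
                       unsigned long *consumed);

/* Model of the consume loop: trim or drop the first range after each call. */
static int consume_ranges(struct rangeset *rs, consume_cb *cb, void *ctxt)
{
    int rc = 0;

    while ( rs->nr )
    {
        struct range *x = &rs->r[0];
        unsigned long consumed = 0;

        rc = cb(x->s, x->e, ctxt, &consumed);

        assert(consumed <= x->e - x->s + 1);
        x->s += consumed;
        if ( x->s > x->e )
        {
            /* Range fully processed: drop it and shift the rest down. */
            for ( size_t i = 1; i < rs->nr; i++ )
                rs->r[i - 1] = rs->r[i];
            rs->nr--;
        }

        if ( rc )
            break;
    }

    return rc;
}

/* Example callback: handle at most 4 units per call, within a budget. */
struct budget { unsigned long left, total; };

static int map_chunk(unsigned long s, unsigned long e, void *ctxt,
                     unsigned long *consumed)
{
    struct budget *b = ctxt;
    unsigned long n = e - s + 1;

    if ( n > 4 )
        n = 4;
    if ( n > b->left )
        n = b->left;
    *consumed = n;
    b->left -= n;
    b->total += n;

    /* Non-zero asks the caller to stop; the unprocessed part stays queued. */
    return b->left ? 0 : -1;
}
```

Note that a callback which consumes nothing and returns 0 would make this loop spin forever, so in this model the callback has to either make progress or return an error.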
[Xen-devel] [PATCH v6 01/11] pci: introduce a type to store a SBDF
That provides direct access to all the members that constitute a SBDF. The only function switched to use it is hvm_pci_decode_addr, because it makes following patches simpler.

Suggested-by: Andrew Cooper
Signed-off-by: Roger Pau Monné
---
Cc: Paul Durrant
Cc: Jan Beulich
Cc: Andrew Cooper
Cc: George Dunlap
Cc: Ian Jackson
Cc: Konrad Rzeszutek Wilk
Cc: Stefano Stabellini
Cc: Tim Deegan
Cc: Wei Liu
---
Changes since v5:
 - New in this version.
---
 xen/arch/x86/hvm/io.c        | 10 ++
 xen/arch/x86/hvm/ioreq.c     | 12 +---
 xen/include/asm-x86/hvm/io.h |  4 ++--
 xen/include/xen/pci.h        | 20
 4 files changed, 29 insertions(+), 17 deletions(-)

diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
index bf41954f59..4e49e59012 100644
--- a/xen/arch/x86/hvm/io.c
+++ b/xen/arch/x86/hvm/io.c
@@ -257,17 +257,11 @@ void register_g2m_portio_handler(struct domain *d)
 }
 
 unsigned int hvm_pci_decode_addr(unsigned int cf8, unsigned int addr,
-                                 unsigned int *bus, unsigned int *slot,
-                                 unsigned int *func)
+                                 pci_sbdf_t *bdf)
 {
-    unsigned int bdf;
-
     ASSERT(CF8_ENABLED(cf8));
 
-    bdf = CF8_BDF(cf8);
-    *bus = PCI_BUS(bdf);
-    *slot = PCI_SLOT(bdf);
-    *func = PCI_FUNC(bdf);
+    bdf->sbdf = CF8_BDF(cf8);
     /*
      * NB: the lower 2 bits of the register address are fetched from the
      * offset into the 0xcfc register when reading/writing to it.
diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index 752976d16d..3e7a88e053 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -1177,17 +1177,15 @@ struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
          (p->addr & ~3) == 0xcfc &&
          CF8_ENABLED(cf8) )
     {
-        uint32_t sbdf, x86_fam;
-        unsigned int bus, slot, func, reg;
+        uint32_t x86_fam;
+        pci_sbdf_t bdf;
+        unsigned int reg;
 
-        reg = hvm_pci_decode_addr(cf8, p->addr, &bus, &slot, &func);
+        reg = hvm_pci_decode_addr(cf8, p->addr, &bdf);
 
         /* PCI config data cycle */
-
-        sbdf = XEN_DMOP_PCI_SBDF(0, bus, slot, func);
-
         type = XEN_DMOP_IO_RANGE_PCI;
-        addr = ((uint64_t)sbdf << 32) | reg;
+        addr = ((uint64_t)bdf.bdf << 32) | reg;
         /* AMD extended configuration space access? */
         if ( CF8_ADDR_HI(cf8) &&
              d->arch.cpuid->x86_vendor == X86_VENDOR_AMD &&
diff --git a/xen/include/asm-x86/hvm/io.h b/xen/include/asm-x86/hvm/io.h
index 51659b6c7f..2ff1c96883 100644
--- a/xen/include/asm-x86/hvm/io.h
+++ b/xen/include/asm-x86/hvm/io.h
@@ -20,6 +20,7 @@
 #define __ASM_X86_HVM_IO_H__
 
 #include
+#include
 #include
 #include
 #include
@@ -151,8 +152,7 @@ extern void hvm_dpci_msi_eoi(struct domain *d, int vector);
 
 /* Decode a PCI port IO access into a bus/slot/func/reg. */
 unsigned int hvm_pci_decode_addr(unsigned int cf8, unsigned int addr,
-                                 unsigned int *bus, unsigned int *slot,
-                                 unsigned int *func);
+                                 pci_sbdf_t *bdf);
 
 /*
  * HVM port IO handler that performs forwarding of guest IO ports into machine
diff --git a/xen/include/xen/pci.h b/xen/include/xen/pci.h
index 43f21251a5..dd5ec43a70 100644
--- a/xen/include/xen/pci.h
+++ b/xen/include/xen/pci.h
@@ -38,6 +38,26 @@
 #define PCI_SBDF2(s,bdf) ((((s) & 0xffff) << 16) | ((bdf) & 0xffff))
 #define PCI_SBDF3(s,b,df) ((((s) & 0xffff) << 16) | PCI_BDF2(b, df))
 
+typedef union {
+    uint32_t sbdf;
+    struct {
+        union {
+            uint16_t bdf;
+            struct {
+                union {
+                    struct {
+                        uint8_t func : 3,
+                                dev  : 5;
+                    };
+                    uint8_t     extfunc;
+                };
+                uint8_t         bus;
+            };
+        };
+        uint16_t                seg;
+    };
+} pci_sbdf_t;
+
 struct pci_dev_info {
     /*
      * VF's 'is_extfn' field is used to indicate whether its PF is an extended
-- 
2.11.0 (Apple Git-81)
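As a quick illustration of what the union buys, the sketch below reproduces the pci_sbdf_t layout in user space and checks that the individual members line up with the packed 32-bit value. The helper make_sbdf is hypothetical and only exists for this example. The field overlay assumes a little-endian ABI that allocates bitfields starting from the least significant bit (as GCC and Clang do on x86), plus C11 anonymous structs/unions; on other ABIs the asserts inside the helper would fire.

```c
#include <assert.h>
#include <stdint.h>

/* Same shape as the union in the patch; relies on C11 anonymous members. */
typedef union {
    uint32_t sbdf;
    struct {
        union {
            uint16_t bdf;
            struct {
                union {
                    struct {
                        uint8_t func : 3,
                                dev  : 5;
                    };
                    uint8_t extfunc;
                };
                uint8_t bus;
            };
        };
        uint16_t seg;
    };
} pci_sbdf_t;

/* Hypothetical helper: pack seg/bus/dev/func into the 32-bit SBDF value. */
static pci_sbdf_t make_sbdf(uint16_t seg, uint8_t bus, uint8_t dev,
                            uint8_t func)
{
    pci_sbdf_t s = {
        .sbdf = ((uint32_t)seg << 16) | ((uint32_t)bus << 8) |
                ((uint32_t)(dev & 0x1f) << 3) | (func & 0x7),
    };

    /* Layout check: only holds on little-endian, LSB-first bitfield ABIs. */
    assert(s.seg == seg && s.bus == bus);
    assert(s.dev == (dev & 0x1f) && s.func == (func & 0x7));

    return s;
}
```

With this, code that previously carried separate bus/slot/func variables can pass a single value around and still read s.bus, s.dev or s.func directly.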
[Xen-devel] [PATCH v6 03/11] x86/mmcfg: add handlers for the PVH Dom0 MMCFG areas
Introduce a set of handlers for the accesses to the MMCFG areas. Those areas are setup based on the contents of the hardware MMCFG tables, and the list of handled MMCFG areas is stored inside of the hvm_domain struct. The read/writes are forwarded to the generic vpci handlers once the address is decoded in order to obtain the device and register the guest is trying to access. Signed-off-by: Roger Pau MonnéReviewed-by: Paul Durrant --- Cc: Jan Beulich Cc: Andrew Cooper Cc: Paul Durrant --- Changes since v5: - Switch to use pci_sbdf_t. - Switch to the new per vpci locks. - Move the mmcfg related external definitions to asm-x86/pci.h Changes since v4: - Change the attribute of pvh_setup_mmcfg to __hwdom_init. - Try to add as many MMCFG regions as possible, even if one fails to add. - Change some fields of the hvm_mmcfg struct: turn size into a unsigned int, segment into uint16_t and bus into uint8_t. - Convert some address parameters from unsigned long to paddr_t for consistency. - Make vpci_mmcfg_decode_addr return the decoded register in the return of the function. - Introduce a new macro to convert a MMCFG address into a BDF, and use it in vpci_mmcfg_decode_addr to clarify the logic. - In vpci_mmcfg_{read/write} unify the logic for 8B accesses and smaller ones. - Add the __hwdom_init attribute to register_vpci_mmcfg_handler. - Test that reg + size doesn't cross a device boundary. Changes since v3: - Propagate changes from previous patches: drop xen_ prefix for vpci functions, pass slot and func instead of devfn and fix the error paths of the MMCFG handlers. - s/ecam/mmcfg/. - Move the destroy code to a separate function, so the hvm_mmcfg struct can be private to hvm/io.c. - Constify the return of vpci_mmcfg_find. - Use d instead of v->domain in vpci_mmcfg_accept. - Allow 8byte accesses to the mmcfg. Changes since v1: - Added locking. 
--- xen/arch/x86/hvm/dom0_build.c| 21 + xen/arch/x86/hvm/hvm.c | 4 + xen/arch/x86/hvm/io.c| 174 ++- xen/arch/x86/x86_64/mmconfig.h | 4 - xen/include/asm-x86/hvm/domain.h | 4 + xen/include/asm-x86/hvm/io.h | 7 ++ xen/include/asm-x86/pci.h| 6 ++ 7 files changed, 215 insertions(+), 5 deletions(-) diff --git a/xen/arch/x86/hvm/dom0_build.c b/xen/arch/x86/hvm/dom0_build.c index 020c355faf..17d77137d6 100644 --- a/xen/arch/x86/hvm/dom0_build.c +++ b/xen/arch/x86/hvm/dom0_build.c @@ -22,6 +22,7 @@ #include #include #include +#include #include #include @@ -1048,6 +1049,24 @@ static int __init pvh_setup_acpi(struct domain *d, paddr_t start_info) return 0; } +static void __hwdom_init pvh_setup_mmcfg(struct domain *d) +{ +unsigned int i; +int rc; + +for ( i = 0; i < pci_mmcfg_config_num; i++ ) +{ +rc = register_vpci_mmcfg_handler(d, pci_mmcfg_config[i].address, + pci_mmcfg_config[i].start_bus_number, + pci_mmcfg_config[i].end_bus_number, + pci_mmcfg_config[i].pci_segment); +if ( rc ) +printk("Unable to setup MMCFG handler at %#lx for segment %u\n", + pci_mmcfg_config[i].address, + pci_mmcfg_config[i].pci_segment); +} +} + int __init dom0_construct_pvh(struct domain *d, const module_t *image, unsigned long image_headroom, module_t *initrd, @@ -1090,6 +1109,8 @@ int __init dom0_construct_pvh(struct domain *d, const module_t *image, return rc; } +pvh_setup_mmcfg(d); + panic("Building a PVHv2 Dom0 is not yet supported."); return 0; } diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c index cc73df8dc7..b1064413fc 100644 --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -581,8 +581,10 @@ int hvm_domain_initialise(struct domain *d, unsigned long domcr_flags, spin_lock_init(>arch.hvm_domain.irq_lock); spin_lock_init(>arch.hvm_domain.uc_lock); spin_lock_init(>arch.hvm_domain.write_map.lock); +rwlock_init(>arch.hvm_domain.mmcfg_lock); INIT_LIST_HEAD(>arch.hvm_domain.write_map.list); INIT_LIST_HEAD(>arch.hvm_domain.g2m_ioport_list); 
+INIT_LIST_HEAD(>arch.hvm_domain.mmcfg_regions); rc = create_perdomain_mapping(d, PERDOMAIN_VIRT_START, 0, NULL, NULL); if ( rc ) @@ -728,6 +730,8 @@ void hvm_domain_destroy(struct domain *d) list_del(>list); xfree(ioport); } + +destroy_vpci_mmcfg(>arch.hvm_domain.mmcfg_regions); } static int hvm_save_tsc_adjust(struct domain *d, hvm_domain_context_t *h) diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c index 6f9cd1f19e..7ee20eb5d4 100644 --- a/xen/arch/x86/hvm/io.c +++
[Xen-devel] [PATCH v6 08/11] vpci/bars: add handlers to map the BARs
Introduce a set of handlers that trap accesses to the PCI BARs and the command register, in order to snoop BAR sizing and BAR relocation. The command handler is used to detect changes to bit 2 (response to memory space accesses), and maps/unmaps the BARs of the device into the guest p2m. A rangeset is used in order to figure out which memory to map/unmap. This makes it easier to keep track of the possible overlaps with other BARs, and will also simplify MSI-X support, where certain regions of a BAR might be used for the MSI-X table or PBA. The BAR register handlers are used to detect attempts by the guest to size or relocate the BARs. Note that the long running BAR mapping and unmapping operations are deferred to be performed by hvm_io_pending, so that they can be safely preempted. Signed-off-by: Roger Pau Monné--- Cc: Andrew Cooper Cc: George Dunlap Cc: Ian Jackson Cc: Jan Beulich Cc: Konrad Rzeszutek Wilk Cc: Stefano Stabellini Cc: Tim Deegan Cc: Wei Liu --- Changes since v5: - Switch to the new handler type. - Use pci_sbdf_t to size the BARs. - Use a single return for vpci_modify_bar. - Do not return an error code from vpci_modify_bars, just log the failure. - Remove the 'sizing' parameter. Instead just let the guest write directly to the BAR, and read the value back. This simplifies the BAR register handlers, specially the read one. - Ignore ROM BAR writes with memory decoding enabled and ROM enabled. - Do not propagate failures to setup the ROM BAR in vpci_init_bars. - Add preemption support to the BAR mapping/unmapping operations. Changes since v4: - Expand commit message to mention the reason behind the usage of rangesets. - Fix comment related to the inclusiveness of rangesets. - Fix off-by-one error in the calculation of the end of memory regions. - Store the state of the BAR (mapped/unmapped) in the vpci_bar enabled field, previously was only used by ROMs. - Fix double negation of return code. 
- Modify vpci_cmd_write so it has a single call to pci_conf_write16. - Print a warning when trying to write to the BAR with memory decoding enabled (and ignore the write). - Remove header_type local variable, it's used only once. - Move the read of the command register. - Restore previous command register value in the exit paths. - Only set address to INVALID_PADDR if the initial BAR value matches ~0 & PCI_BASE_ADDRESS_MEM_MASK. - Don't disable the enabled bit in the expansion ROM register, memory decoding is already disabled and takes precedence. - Don't use INVALID_PADDR, just set the initial BAR address to the value found in the hardware. - Introduce rom_enabled to store the status of the PCI_ROM_ADDRESS_ENABLE bit. - Reorder fields of the structure to prevent holes. Changes since v3: - Propagate previous changes: drop xen_ prefix and use u8/u16/u32 instead of the previous half_word/word/double_word. - Constify some of the paramerters. - s/VPCI_BAR_MEM/VPCI_BAR_MEM32/. - Simplify the number of fields stored for each BAR, a single address field is stored and contains the address of the BAR both on Xen and in the guest. - Allow the guest to move the BARs around in the physical memory map. - Add support for expansion ROM BARs. - Do not cache the value of the command register. - Remove a label used in vpci_cmd_write. - Fix the calculation of the sizing mask in vpci_bar_write. - Check the memory decode bit in order to decide if a BAR is positioned or not. - Disable memory decoding before sizing the BARs in Xen. - When mapping/unmapping BARs check if there's overlap between BARs, in order to avoid unmapping memory required by another BAR. - Introduce a macro to check whether a BAR is mappable or not. - Add a comment regarding the lack of support for SR-IOV. - Remove the usage of the GENMASK macro. Changes since v2: - Detect unset BARs and allow the hardware domain to position them. 
--- xen/arch/x86/hvm/ioreq.c | 4 + xen/drivers/vpci/Makefile | 2 +- xen/drivers/vpci/header.c | 478 ++ xen/include/xen/sched.h | 8 + xen/include/xen/vpci.h| 41 5 files changed, 532 insertions(+), 1 deletion(-) create mode 100644 xen/drivers/vpci/header.c diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c index 3e7a88e053..f6588ceab4 100644 --- a/xen/arch/x86/hvm/ioreq.c +++ b/xen/arch/x86/hvm/ioreq.c @@ -26,6 +26,7 @@ #include #include #include +#include #include #include @@ -48,6 +49,9 @@ bool_t hvm_io_pending(struct vcpu *v) struct domain *d = v->domain; struct hvm_ioreq_server *s; +if ( has_vpci(v->domain) && vpci_check_pending(v) ) + return 1; + list_for_each_entry ( s,
[Xen-devel] [PATCH v6 06/11] pci: add support to size ROM BARs to pci_size_mem_bar
Signed-off-by: Roger Pau Monné
---
Cc: Jan Beulich
---
Changes since v5:
 - Use the flags field.
 - Introduce a mask local variable.
 - Simplify return.

Changes since v4:
 - New in this version.
---
 xen/drivers/passthrough/pci.c | 29 +++--
 xen/include/xen/pci.h         |  2 ++
 2 files changed, 17 insertions(+), 14 deletions(-)

diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
index ba58b4d0cc..92c1f9354a 100644
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -610,11 +610,17 @@ int pci_size_mem_bar(pci_sbdf_t sbdf, unsigned int pos, bool last,
                                         sbdf.func, pos);
     uint64_t addr, size;
     bool vf = flags & PCI_BAR_VF;
-
-    ASSERT((bar & PCI_BASE_ADDRESS_SPACE) == PCI_BASE_ADDRESS_SPACE_MEMORY);
+    bool rom = flags & PCI_BAR_ROM;
+    bool is64bits = !rom && (bar & PCI_BASE_ADDRESS_MEM_TYPE_MASK) ==
+                    PCI_BASE_ADDRESS_MEM_TYPE_64;
+    uint32_t mask = rom ? (uint32_t)PCI_ROM_ADDRESS_MASK
+                        : (uint32_t)PCI_BASE_ADDRESS_MEM_MASK;
+
+    ASSERT(!(rom && vf));
+    ASSERT(rom ||
+           (bar & PCI_BASE_ADDRESS_SPACE) == PCI_BASE_ADDRESS_SPACE_MEMORY);
     pci_conf_write32(sbdf.seg, sbdf.bus, sbdf.dev, sbdf.func, pos, ~0);
-    if ( (bar & PCI_BASE_ADDRESS_MEM_TYPE_MASK) ==
-         PCI_BASE_ADDRESS_MEM_TYPE_64 )
+    if ( is64bits )
     {
         if ( last )
         {
@@ -627,10 +633,9 @@ int pci_size_mem_bar(pci_sbdf_t sbdf, unsigned int pos, bool last,
         hi = pci_conf_read32(sbdf.seg, sbdf.bus, sbdf.dev, sbdf.func, pos + 4);
         pci_conf_write32(sbdf.seg, sbdf.bus, sbdf.dev, sbdf.func, pos + 4, ~0);
     }
-    size = pci_conf_read32(sbdf.seg, sbdf.bus, sbdf.dev, sbdf.func, pos) &
-           PCI_BASE_ADDRESS_MEM_MASK;
-    if ( (bar & PCI_BASE_ADDRESS_MEM_TYPE_MASK) ==
-         PCI_BASE_ADDRESS_MEM_TYPE_64 )
+    size = pci_conf_read32(sbdf.seg, sbdf.bus, sbdf.dev, sbdf.func,
+                           pos) & mask;
+    if ( is64bits )
     {
         size |= (uint64_t)pci_conf_read32(sbdf.seg, sbdf.bus, sbdf.dev,
                                           sbdf.func, pos + 4) << 32;
@@ -640,17 +645,13 @@ int pci_size_mem_bar(pci_sbdf_t sbdf, unsigned int pos, bool last,
         size |= (uint64_t)~0 << 32;
     pci_conf_write32(sbdf.seg, sbdf.bus, sbdf.dev, sbdf.func, pos, bar);
     size = -size;
-    addr = (bar & PCI_BASE_ADDRESS_MEM_MASK) | ((uint64_t)hi << 32);
+    addr = (bar & mask) | ((uint64_t)hi << 32);
 
     if ( paddr )
         *paddr = addr;
     *psize = size;
 
-    if ( (bar & PCI_BASE_ADDRESS_MEM_TYPE_MASK) ==
-         PCI_BASE_ADDRESS_MEM_TYPE_64 )
-        return 2;
-
-    return 1;
+    return is64bits ? 2 : 1;
 }
 
 int pci_add_device(u16 seg, u8 bus, u8 devfn,
diff --git a/xen/include/xen/pci.h b/xen/include/xen/pci.h
index 2bee6a3247..4489edf9b5 100644
--- a/xen/include/xen/pci.h
+++ b/xen/include/xen/pci.h
@@ -191,6 +191,8 @@ const char *parse_pci_seg(const char *, unsigned int *seg, unsigned int *bus,
 
 #define _PCI_BAR_VF     0
 #define PCI_BAR_VF      (1u << _PCI_BAR_VF)
+#define _PCI_BAR_ROM    1
+#define PCI_BAR_ROM     (1u << _PCI_BAR_ROM)
 int pci_size_mem_bar(pci_sbdf_t sbdf, unsigned int pos, bool last,
                      uint64_t *addr, uint64_t *size, unsigned int flags);
 
-- 
2.11.0 (Apple Git-81)
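For readers not familiar with BAR sizing, the arithmetic in pci_size_mem_bar (write all ones, read back, mask the flag bits, negate, restore) can be tried in user space against a fake register. This is only a sketch under stated assumptions: struct fake_bar, bar_read, bar_write and size_bar32 are invented for the example, only the 32-bit case is modelled, and the model treats all flag bits as read-only, which is a simplification (the ROM enable bit is writable on real hardware). The two mask values are local copies of the conventional memory and ROM address masks.

```c
#include <assert.h>
#include <stdint.h>

/* Local copies of the usual PCI masks; the names are illustrative only. */
#define BAR_MEM_MASK  0xfffffff0u  /* low 4 bits of a memory BAR are flags */
#define ROM_ADDR_MASK 0xfffff800u  /* low 11 bits of a ROM BAR: enable/reserved */

/* Hypothetical model of a 32-bit BAR; address bits below 'size' read as 0. */
struct fake_bar {
    uint32_t value;
    uint32_t size;   /* must be a power of two */
};

static uint32_t bar_read(const struct fake_bar *b)
{
    return b->value;
}

static void bar_write(struct fake_bar *b, uint32_t v, uint32_t addr_mask)
{
    /* Keep the flag bits, accept only address bits the device decodes. */
    b->value = (v & ~(b->size - 1) & addr_mask) | (b->value & ~addr_mask);
}

/* Size a 32-bit BAR: write all ones, read back, mask, negate, restore. */
static uint64_t size_bar32(struct fake_bar *b, int rom, uint64_t *addr)
{
    uint32_t mask = rom ? ROM_ADDR_MASK : BAR_MEM_MASK;
    uint32_t orig = bar_read(b);
    uint64_t size;

    bar_write(b, ~0u, mask);
    size = bar_read(b) & mask;
    if ( size )
        size |= (uint64_t)~0 << 32;   /* extend with ones so negation works */
    bar_write(b, orig, mask);         /* put the original address back */

    *addr = orig & mask;

    return -size;
}
```

The only difference between the memory and the ROM case in this model is which low bits are masked off before the negation, which matches the purpose of the mask local variable introduced by the patch.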
[Xen-devel] [PATCH v6 10/11] vpci: add a priority parameter to the vPCI register initializer
This is needed for MSI-X, since MSI-X will need to be initialized before parsing the BARs, so that the header BAR handlers are aware of the MSI-X related holes and make sure they are not mapped in order for the trap handlers to work properly. Signed-off-by: Roger Pau MonnéReviewed-by: Jan Beulich --- Cc: Jan Beulich Cc: Andrew Cooper --- Changes since v4: - Add a middle priority and add the PCI header to it. Changes since v3: - Add a numerial suffix to the section used to store the pointer to each initializer function, and sort them at link time. --- xen/arch/arm/xen.lds.S| 4 ++-- xen/arch/x86/xen.lds.S| 4 ++-- xen/drivers/vpci/header.c | 2 +- xen/drivers/vpci/msi.c| 2 +- xen/include/xen/vpci.h| 8 ++-- 5 files changed, 12 insertions(+), 8 deletions(-) diff --git a/xen/arch/arm/xen.lds.S b/xen/arch/arm/xen.lds.S index eb14909645..4a08435f7e 100644 --- a/xen/arch/arm/xen.lds.S +++ b/xen/arch/arm/xen.lds.S @@ -68,7 +68,7 @@ SECTIONS #if defined(CONFIG_HAS_PCI) && defined(CONFIG_LATE_HWDOM) __start_vpci_array = .; - *(.data.vpci) + *(SORT(.data.vpci.*)) __end_vpci_array = .; #endif } :text @@ -182,7 +182,7 @@ SECTIONS #if defined(CONFIG_HAS_PCI) && !defined(CONFIG_LATE_HWDOM) __start_vpci_array = .; - *(.data.vpci) + *(SORT(.data.vpci.*)) __end_vpci_array = .; #endif } :text diff --git a/xen/arch/x86/xen.lds.S b/xen/arch/x86/xen.lds.S index 61775953d6..3c44fb410e 100644 --- a/xen/arch/x86/xen.lds.S +++ b/xen/arch/x86/xen.lds.S @@ -127,7 +127,7 @@ SECTIONS #if defined(CONFIG_HAS_PCI) && defined(CONFIG_LATE_HWDOM) __start_vpci_array = .; - *(.data.vpci) + *(SORT(.data.vpci.*)) __end_vpci_array = .; #endif } :text @@ -222,7 +222,7 @@ SECTIONS #if defined(CONFIG_HAS_PCI) && !defined(CONFIG_LATE_HWDOM) __start_vpci_array = .; - *(.data.vpci) + *(SORT(.data.vpci.*)) __end_vpci_array = .; #endif } :text diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c index c0d38c8b91..07a6bbf0be 100644 --- a/xen/drivers/vpci/header.c +++ b/xen/drivers/vpci/header.c @@ 
-465,7 +465,7 @@ static int vpci_init_bars(struct pci_dev *pdev) return 0; } -REGISTER_VPCI_INIT(vpci_init_bars); +REGISTER_VPCI_INIT(vpci_init_bars, VPCI_PRIORITY_MIDDLE); /* * Local variables: diff --git a/xen/drivers/vpci/msi.c b/xen/drivers/vpci/msi.c index 933adba0ff..7a0b0521c5 100644 --- a/xen/drivers/vpci/msi.c +++ b/xen/drivers/vpci/msi.c @@ -307,7 +307,7 @@ static int vpci_init_msi(struct pci_dev *pdev) return 0; } -REGISTER_VPCI_INIT(vpci_init_msi); +REGISTER_VPCI_INIT(vpci_init_msi, VPCI_PRIORITY_LOW); void vpci_dump_msi(void) { diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h index 5b582b8012..c6913631c0 100644 --- a/xen/include/xen/vpci.h +++ b/xen/include/xen/vpci.h @@ -13,9 +13,13 @@ typedef void vpci_write_t(const struct pci_dev *pdev, unsigned int reg, typedef int vpci_register_init_t(struct pci_dev *dev); -#define REGISTER_VPCI_INIT(x) \ +#define VPCI_PRIORITY_HIGH "1" +#define VPCI_PRIORITY_MIDDLE"5" +#define VPCI_PRIORITY_LOW "9" + +#define REGISTER_VPCI_INIT(x, p)\ static vpci_register_init_t *const x##_entry \ - __used_section(".data.vpci") = x + __used_section(".data.vpci." p) = x /* Add vPCI handlers to device. */ int __must_check vpci_add_handlers(struct pci_dev *dev); -- 2.11.0 (Apple Git-81) ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
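The ordering trick above depends on the linker sorting the .data.vpci.* input sections by name, which cannot be shown without a linker script. As a rough user-space model, the sketch below builds the same section names through string literal concatenation and sorts a table with strcmp to mimic SORT(). Everything here is hypothetical (INIT_ENTRY, run_initializers, the init_* functions), and placing MSI-X at the high priority is an assumption based on the commit message, not something this patch does.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define PRIO_HIGH   "1"
#define PRIO_MIDDLE "5"
#define PRIO_LOW    "9"

typedef int init_fn(void);

struct init_entry {
    const char *section;
    init_fn *fn;
};

/* Literal concatenation mirrors the ".data.vpci." p idiom of the patch. */
#define INIT_ENTRY(f, p) { ".data.vpci." p, f }

static int order[3], nr_called;

static int init_msi(void)  { order[nr_called++] = 'm'; return 0; }
static int init_bars(void) { order[nr_called++] = 'b'; return 0; }
/* Assumption: MSI-X would register at high priority to run before BARs. */
static int init_msix(void) { order[nr_called++] = 'x'; return 0; }

static int by_section(const void *a, const void *b)
{
    const struct init_entry *x = a, *y = b;

    return strcmp(x->section, y->section);
}

/* Sort by section name (what SORT() does at link time), then call in order. */
static int run_initializers(void)
{
    struct init_entry table[] = {
        INIT_ENTRY(init_msi,  PRIO_LOW),
        INIT_ENTRY(init_bars, PRIO_MIDDLE),
        INIT_ENTRY(init_msix, PRIO_HIGH),
    };
    size_t n = sizeof(table) / sizeof(table[0]);

    nr_called = 0;
    qsort(table, n, sizeof(table[0]), by_section);
    for ( size_t i = 0; i < n; i++ )
    {
        int rc = table[i].fn();

        if ( rc )
            return rc;
    }

    return 0;
}
```

Using single-digit strings keeps the lexical order equal to the numeric order; two-digit priorities would need zero padding for the same to hold.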
[Xen-devel] [PATCH v6 00/11] vpci: PCI config space emulation
Hello,

The following series contains an implementation of handlers for the PCI configuration space inside of Xen. This allows Xen to detect accesses to the PCI configuration space and react accordingly.

Why is this needed? IMHO, there are two main reasons for doing all this emulation inside of Xen. The first one is to avoid adding a bunch of duplicated Xen PV specific code to each OS we want to support in PVH mode. This just promotes Xen code duplication amongst OSes, which leads to a higher maintainership burden.

The second reason is that this code (or its functionality, to be more precise) already exists in QEMU (and pciback to a degree), and it's code that we already support and maintain. By moving it into the hypervisor itself every guest type can make use of it, and it should be shared between them all. I know that the code in this series is not yet suitable for DomU HVM guests in its current state, but it should be in due time.

As usual, each patch contains a changeset summary between versions, so I'm not going to copy the list of changes here.

Patch 1 modifies a function to decode a PCI IO port access into a pci_sbdf_t and a register (which is shared with the ioreq code). Patch 2 implements the generic handlers for accesses to the PCI configuration space together with a minimal user-space test harness that I've used during development. Currently a per-device linked list is used in order to store the list of handlers, and they are sorted based on their offset inside of the configuration space. Patch 2 also adds the x86 port IO traps and wires them into the newly introduced vPCI dispatchers. Patches 3 and 4 add handlers for the MMCFG areas (as found on the MMCFG ACPI table). Patches 5, 6 and 7 are mostly code movement/refactoring in order to implement support for BAR mapping in patch 8. Finally patches 9 and 11 add support for trapping accesses to the MSI and MSI-X capabilities respectively, so that interrupts are properly set up on behalf of Dom0.

The most noticeable functional difference from previous versions is that this version supports preemptive BAR mapping and unmapping.

The branch containing the patches can be found at:

git://xenbits.xen.org/people/royger/xen.git vpci_v6

Note that this is only safe to use for the hardware domain (which is trusted); any non-trusted domain will need a lot more traps before it can freely access the PCI configuration space.

Thanks, Roger.
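The cover letter mentions that the handlers live in a per-device linked list sorted by their offset in the configuration space. A minimal sketch of that idea follows; it is not the code from patch 2. struct vreg and add_register are hypothetical names, and the size/alignment check is an assumption about what a reasonable implementation would reject. Keeping the list sorted and free of overlaps means a lookup for a given access can stop at the first entry that starts past the end of the access.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical emulated register: 'size' bytes at config space 'offset'. */
struct vreg {
    unsigned int offset, size;
    struct vreg *next;
};

/* Insert 'r' keeping the list sorted by offset; reject overlapping entries. */
static int add_register(struct vreg **head, struct vreg *r)
{
    struct vreg **pp = head;

    /* Assumption: registers are 1, 2 or 4 bytes wide and naturally aligned. */
    if ( (r->size != 1 && r->size != 2 && r->size != 4) ||
         (r->offset & (r->size - 1)) )
        return -EINVAL;

    /* Skip every register that ends at or before the start of the new one. */
    while ( *pp && (*pp)->offset + (*pp)->size <= r->offset )
        pp = &(*pp)->next;

    /* The next register, if any, must start at or after the new one's end. */
    if ( *pp && (*pp)->offset < r->offset + r->size )
        return -EEXIST;

    r->next = *pp;
    *pp = r;

    return 0;
}
```

Walking through a pointer to the link (pp) rather than the node avoids special-casing insertion at the head of the list.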
[Xen-devel] [PATCH v6 02/11] vpci: introduce basic handlers to trap accesses to the PCI config space
This functionality is going to reside in vpci.c (and the corresponding
vpci.h header), and should be arch-agnostic. The handlers introduced
in this patch set up the basic functionality required in order to trap
accesses to the PCI config space, and allow decoding the address and
finding the corresponding handler that should handle the access
(although no handlers are implemented).

Note that the traps to the PCI IO port registers (0xcf8/0xcfc) are
set up inside of an x86 HVM file, since that's not shared with other
arches.

A new XEN_X86_EMU_VPCI x86 domain flag is added in order to signal Xen
whether a domain should use the newly introduced vPCI handlers; this
is only enabled for PVH Dom0 at the moment.

A very simple user-space test is also provided, so that the basic
functionality of the vPCI traps can be asserted. This has proven
quite helpful during development, since the logic to handle partial
accesses or accesses that span multiple registers is not trivial.

The handlers for the registers are added to a linked list that's kept
sorted at all times. Both the read and write handlers support accesses
that span multiple emulated registers and contain gaps that are not
emulated.

Signed-off-by: Roger Pau Monné
---
Cc: Ian Jackson
Cc: Wei Liu
Cc: Jan Beulich
Cc: Andrew Cooper
Cc: Paul Durrant
---
Changes since v5:
 - Use a spinlock per pci device.
 - Use the recently introduced pci_sbdf_t type.
 - Fix test harness to use the right handler type and the newly
   introduced lock.
 - Move the position of the vpci sections in the linker scripts.
 - Constify domain and pci_dev in vpci_{read/write}.
 - Fix typos in comments.
 - Use _XEN_VPCI_H_ as header guard.

Changes since v4:
* User-space test harness:
 - Do not redirect the output of the test.
 - Add main.c and emul.h as dependencies of the Makefile target.
 - Use the same rule to modify the vpci and list headers.
 - Remove underscores from local macro variables.
 - Add _check suffix to the test harness multiread function.
 - Change the value written by every different size in the multiwrite
   test.
 - Use { } to initialize the r16 and r20 arrays (instead of { 0 }).
 - Perform some of the read checks with the local variable directly.
 - Expand some comments.
 - Implement a dummy rwlock.
* Hypervisor code:
 - Guard the linker script changes with CONFIG_HAS_PCI.
 - Rename vpci_access_check to vpci_access_allowed and make it return
   bool.
 - Make hvm_pci_decode_addr return the register as return value.
 - Use ~3 instead of 0xfffc to remove the register offset when
   checking accesses to IO ports.
 - s/head/prev in vpci_add_register.
 - Add parentheses around & in vpci_add_register.
 - Fix register removal.
 - Change the BUGs in vpci_{read/write}_hw helpers to
   ASSERT_UNREACHABLE.
 - Make merge_result static and change the computation of the mask to
   avoid using a uint64_t.
 - Modify vpci_read to only read from hardware the not-emulated gaps.
 - Remove the vpci_val union and use a uint32_t instead.
 - Change handler read type to return a uint32_t instead of modifying
   a variable passed by reference.
 - Constify the data opaque parameter of read handlers.
 - Change the size parameter of the vpci_{read/write} functions to
   unsigned int.
 - Place the array of initialization handlers in init.rodata or
   .rodata depending on whether late-hwdom is enabled.
 - Remove the pci_devs lock, assume the Dom0 is well behaved and won't
   remove the device while trying to access it.
 - Change the recursive spinlock into a rw lock for performance
   reasons.

Changes since v3:
* User-space test harness:
 - Fix spaces in container_of macro.
 - Implement dummy locking functions.
 - Remove the 'current' macro, make current a pointer to the
   statically allocated vcpu.
 - Remove unneeded parentheses in the pci_conf_readX macros.
 - Fix the name of the write test macro.
 - Remove the dummy EXPORT_SYMBOL macro (this was needed by the RB
   code only).
 - Import the max macro.
 - Test all possible read/write size combinations with all possible
   emulated register sizes.
 - Introduce a test for register removal.
* Hypervisor code:
 - Use a sorted list in order to store the config space handlers.
 - Remove some unneeded 'else' branches.
 - Make the IO port handlers always return X86EMUL_OKAY, and set the
   data to all 1's in case of read failure (writes are simply
   ignored).
 - In hvm_select_ioreq_server reuse local variables when calling
   XEN_DMOP_PCI_SBDF.
 - Store the pointers to the initialization functions in the .rodata
   section.
 - Do not ignore the return value of xen_vpci_add_handlers in
   setup_one_hwdom_device.
 - Remove the vpci_init macro.
 - Do not hide the pointers inside of the vpci_{read/write}_t
   typedefs.
 - Rename priv_data to private in vpci_register.
 - Simplify checking for register
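
For readers following along without the patch at hand, the sorted
handler list and the merging of emulated registers with not-emulated
gaps described in the commit message can be sketched roughly as below.
This is only an illustration under stated assumptions, not the code
from the patch: struct demo_reg, demo_add_register() and demo_read()
are made-up names, registers hold a plain value instead of calling a
handler, and the hardware value for the gaps is passed in by the
caller.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical emulated register covering [offset, offset + size). */
struct demo_reg {
    unsigned int offset;   /* byte offset into the config space */
    unsigned int size;     /* 1, 2 or 4 bytes */
    uint32_t value;        /* emulated contents */
    struct demo_reg *next;
};

/* Insert r keeping the list sorted by offset; refuse overlapping ranges. */
static bool demo_add_register(struct demo_reg **head, struct demo_reg *r)
{
    struct demo_reg **pp = head;

    while ( *pp && (*pp)->offset < r->offset )
    {
        if ( (*pp)->offset + (*pp)->size > r->offset )
            return false;          /* overlaps a lower register */
        pp = &(*pp)->next;
    }
    if ( *pp && r->offset + r->size > (*pp)->offset )
        return false;              /* overlaps the next register */

    r->next = *pp;
    *pp = r;
    return true;
}

/*
 * Read size (1..4) bytes at offset. hw holds what a plain hardware read of
 * the same range would return; it is only used for bytes no register covers.
 */
static uint32_t demo_read(const struct demo_reg *head, unsigned int offset,
                          unsigned int size, uint32_t hw)
{
    uint32_t data = hw;
    const struct demo_reg *r;

    /* The list is sorted, so stop at the first register past the access. */
    for ( r = head; r && r->offset < offset + size; r = r->next )
    {
        unsigned int i;

        for ( i = 0; i < r->size; i++ )
        {
            unsigned int pos = r->offset + i;  /* absolute byte position */
            unsigned int shift;

            if ( pos < offset || pos >= offset + size )
                continue;
            shift = (pos - offset) * 8;
            data &= ~((uint32_t)0xff << shift);
            data |= ((r->value >> (i * 8)) & 0xff) << shift;
        }
    }

    if ( size < 4 )
        data &= ((uint32_t)1 << (size * 8)) - 1;

    return data;
}
```

With a 16-bit register at offset 0 and an 8-bit register at offset 3,
a 4-byte read at offset 0 takes bytes 0, 1 and 3 from the emulated
registers and byte 2 from the hardware value, which is the "gap" case
the commit message refers to.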
[Xen-devel] [PATCH v6 04/11] x86/physdev: enable PHYSDEVOP_pci_mmcfg_reserved for PVH Dom0
So that MMCFG regions not present in the MCFG ACPI table can be added at run time by the hardware domain. Signed-off-by: Roger Pau Monné--- Cc: Jan Beulich Cc: Andrew Cooper --- Changes since v5: - Check for has_vpci before calling register_vpci_mmcfg_handler instead of checking for is_hvm_domain. Changes since v4: - Change the hardware_domain check in hvm_physdev_op to a vpci check. - Only register the MMCFG area, but don't scan it. Changes since v3: - New in this version. --- xen/arch/x86/hvm/hypercall.c | 4 xen/arch/x86/hvm/io.c| 7 +++ xen/arch/x86/physdev.c | 11 +++ 3 files changed, 18 insertions(+), 4 deletions(-) diff --git a/xen/arch/x86/hvm/hypercall.c b/xen/arch/x86/hvm/hypercall.c index 5742dd1797..d81160c1f7 100644 --- a/xen/arch/x86/hvm/hypercall.c +++ b/xen/arch/x86/hvm/hypercall.c @@ -89,6 +89,10 @@ static long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg) if ( !has_pirq(curr->domain) ) return -ENOSYS; break; +case PHYSDEVOP_pci_mmcfg_reserved: +if ( !has_vpci(curr->domain) ) +return -ENOSYS; +break; } if ( !curr->hcall_compat ) diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c index 7ee20eb5d4..ff167bdfc7 100644 --- a/xen/arch/x86/hvm/io.c +++ b/xen/arch/x86/hvm/io.c @@ -496,10 +496,9 @@ static const struct hvm_mmio_ops vpci_mmcfg_ops = { .write = vpci_mmcfg_write, }; -int __hwdom_init register_vpci_mmcfg_handler(struct domain *d, paddr_t addr, - unsigned int start_bus, - unsigned int end_bus, - unsigned int seg) +int register_vpci_mmcfg_handler(struct domain *d, paddr_t addr, +unsigned int start_bus, unsigned int end_bus, +unsigned int seg) { struct hvm_mmcfg *mmcfg; diff --git a/xen/arch/x86/physdev.c b/xen/arch/x86/physdev.c index 0eb409758f..b36add32f1 100644 --- a/xen/arch/x86/physdev.c +++ b/xen/arch/x86/physdev.c @@ -559,6 +559,17 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg) ret = pci_mmcfg_reserved(info.address, info.segment, info.start_bus, info.end_bus, info.flags); +if ( !ret && has_vpci(currd) ) 
+{ +/* + * For HVM (PVH) domains try to add the newly found MMCFG to the + * domain. + */ +ret = register_vpci_mmcfg_handler(currd, info.address, + info.start_bus, info.end_bus, + info.segment); +} + break; } -- 2.11.0 (Apple Git-81) ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] Feature control on PV devices
On 09/18/2017 08:59 PM, Konrad Rzeszutek Wilk wrote:
> On Thu, Sep 14, 2017 at 05:08:18PM +0100, Joao Martins wrote:
>> [ Realized that I didn't CC the maintainers,
>> so doing that now, +Linux folks +PV interfaces czar
>> Sorry for the noise! ]
>>
>> On 09/08/2017 09:49 AM, Joao Martins wrote:
>>> [Forgot two important details regarding Xenbus states]
>>> On 09/07/2017 05:53 PM, Joao Martins wrote:
>>>> Hey!
>>>>
>>>> We wanted to bring up this small proposal regarding the lack of
>>>> parameterization on PV devices on Xen.
>>>>
>>>> Currently users don't have a way to enforce and control what
>>>> features/queues/etc the backend provides. So far there are only
>>>> global parameters on backends, and specs do not mention anything
>>>> in this regard.
>
> How would this scale with say FreeBSD backends?
>
This is per-device parameter configuration support, based on xenstore
entries. All the backend needs to do is understand the request/XXX
xenstore entries and let them supersede whatever global defaults were
defined by the backend (after validation). So what I am proposing here
makes no OS assumptions and should work for FreeBSD or any other.

> And I am assuming you are
> also thinking about device driver backends - where you can't easily
> get access to the backend and change the SysFS parameters (if they have
> it at all)?
>
Yeah - provided that the xenstore entries are created with permissions
for the toolstack domain and the backend domain, backends other than
Dom0 should work too. Note that this is device setup (e.g. domain
create time), i.e. the configuration of what the frontend is allowed
to see/use.
[Xen-devel] Ping: [PATCH 2/2] public/sysctl: drop unnecessary typedefs and handles
>>> On 12.09.17 at 17:10,wrote: > --- a/xen/common/livepatch.c > +++ b/xen/common/livepatch.c > @@ -104,7 +104,7 @@ static struct livepatch_work livepatch_w > */ > static DEFINE_PER_CPU(bool_t, work_to_do); > > -static int get_name(const xen_livepatch_name_t *name, char *n) > +static int get_name(const struct xen_livepatch_name *name, char *n) > { > if ( !name->size || name->size > XEN_LIVEPATCH_NAME_SIZE ) > return -EINVAL; > @@ -121,7 +121,7 @@ static int get_name(const xen_livepatch_ > return 0; > } > > -static int verify_payload(const xen_sysctl_livepatch_upload_t *upload, char > *n) > +static int verify_payload(const struct xen_sysctl_livepatch_upload *upload, > char *n) > { > if ( get_name(>name, n) ) > return -EINVAL; > @@ -897,7 +897,7 @@ static int load_payload_data(struct payl > return rc; > } > > -static int livepatch_upload(xen_sysctl_livepatch_upload_t *upload) > +static int livepatch_upload(struct xen_sysctl_livepatch_upload *upload) > { > struct payload *data, *found; > char n[XEN_LIVEPATCH_NAME_SIZE]; > @@ -954,7 +954,7 @@ static int livepatch_upload(xen_sysctl_l > return rc; > } > > -static int livepatch_get(xen_sysctl_livepatch_get_t *get) > +static int livepatch_get(struct xen_sysctl_livepatch_get *get) > { > struct payload *data; > int rc; > @@ -985,9 +985,9 @@ static int livepatch_get(xen_sysctl_live > return 0; > } > > -static int livepatch_list(xen_sysctl_livepatch_list_t *list) > +static int livepatch_list(struct xen_sysctl_livepatch_list *list) > { > -xen_livepatch_status_t status; > +struct xen_livepatch_status status; > struct payload *data; > unsigned int idx = 0, i = 0; > int rc = 0; > @@ -1451,7 +1451,7 @@ static int build_id_dep(struct payload * > return 0; > } > > -static int livepatch_action(xen_sysctl_livepatch_action_t *action) > +static int livepatch_action(struct xen_sysctl_livepatch_action *action) > { > struct payload *data; > char n[XEN_LIVEPATCH_NAME_SIZE]; > @@ -1560,7 +1560,7 @@ static int livepatch_action(xen_sysctl_l > 
return rc; > } > > -int livepatch_op(xen_sysctl_livepatch_op_t *livepatch) > +int livepatch_op(struct xen_sysctl_livepatch_op *livepatch) > { > int rc; > Konrad, Ross? > --- a/xen/common/sched_arinc653.c > +++ b/xen/common/sched_arinc653.c > @@ -694,7 +694,7 @@ static int > a653sched_adjust_global(const struct scheduler *ops, > struct xen_sysctl_scheduler_op *sc) > { > -xen_sysctl_arinc653_schedule_t local_sched; > +struct xen_sysctl_arinc653_schedule local_sched; > int rc = -EINVAL; > > switch ( sc->cmd ) Robert, Josh? > --- a/xen/common/trace.c > +++ b/xen/common/trace.c > @@ -367,9 +367,9 @@ void __init init_trace_bufs(void) > > /** > * tb_control - sysctl operations on trace buffers. > - * @tbc: a pointer to a xen_sysctl_tbuf_op_t to be filled out > + * @tbc: a pointer to a struct xen_sysctl_tbuf_op to be filled out > */ > -int tb_control(xen_sysctl_tbuf_op_t *tbc) > +int tb_control(struct xen_sysctl_tbuf_op *tbc) > { > static DEFINE_SPINLOCK(lock); > int rc = 0; George? Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] Ping: [PATCH 1/2] public/domctl: drop unnecessary typedefs and handles
>>> On 12.09.17 at 17:08,wrote: > --- a/xen/arch/x86/mm/hap/hap.c > +++ b/xen/arch/x86/mm/hap/hap.c > @@ -608,8 +608,8 @@ out: > paging_unlock(d); > } > > -int hap_domctl(struct domain *d, xen_domctl_shadow_op_t *sc, > - XEN_GUEST_HANDLE_PARAM(void) u_domctl) > +int hap_domctl(struct domain *d, struct xen_domctl_shadow_op *sc, > + XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl) > { > int rc; > bool preempted = false; George (also parts further down)? > --- a/xen/arch/x86/mm/mem_sharing.c > +++ b/xen/arch/x86/mm/mem_sharing.c > @@ -1606,7 +1606,7 @@ out: > return rc; > } > > -int mem_sharing_domctl(struct domain *d, xen_domctl_mem_sharing_op_t *mec) > +int mem_sharing_domctl(struct domain *d, struct xen_domctl_mem_sharing_op > *mec) > { > int rc; > Tamas (plus the corresponding header change)? > --- a/xen/arch/x86/mm/shadow/common.c > +++ b/xen/arch/x86/mm/shadow/common.c > @@ -3809,8 +3809,8 @@ out: > /* Shadow-control XEN_DOMCTL dispatcher */ > > int shadow_domctl(struct domain *d, > - xen_domctl_shadow_op_t *sc, > - XEN_GUEST_HANDLE_PARAM(void) u_domctl) > + struct xen_domctl_shadow_op *sc, > + XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl) > { > int rc; > bool preempted = false; Tim (plus the corresponding header change)? Thanks, Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] x86/domctl: Don't pause the whole domain if only getting vcpu state
On Tue, 2017-09-19 at 00:11 -0600, Jan Beulich wrote:
> >>> Razvan Cojocaru 09/18/17 7:05 PM >>>
> > On 09/18/2017 06:35 PM, Jan Beulich wrote:
> > > >>> On 12.09.17 at 15:53, wrote:
> > > > --- a/xen/arch/x86/domctl.c
> > > > +++ b/xen/arch/x86/domctl.c
> > > > @@ -625,6 +625,26 @@ long arch_do_domctl(
> > > > !is_hvm_domain(d) )
> > > > break;
> > > >
> > > > +if ( domctl->u.hvmcontext_partial.type == HVM_SAVE_CODE(CPU) &&
> > > > + domctl->u.hvmcontext_partial.instance < d->max_vcpus )
> > > I have to admit that I'm not in favor of such special casing, even
> > > less so without any code comment saying why this is so special.
> > > What if someone else wanted some other piece of vCPU state
> > > without pausing the entire domain? Wouldn't it be possible to
> > > generalize this to cover all such state elements?
> > There's no reason why all the other cases where this would be
> > possible shouldn't be optimized. What has made this one stand out
> > for us is that we're using it a lot with introspection, and the
> > optimization counts.
> >
> > But judging by the code reorganization (the addition of
> > hvm_save_one_cpu_ctxt()), the changes would need to be done on a
> > case-by-case basis anyway (different queries may require different
> > ways of changing the code).
> But this function addition is precisely what I'd like to avoid in
> favor of an extension to the existing mechanism using the registered
> function pointers.
>
What would be a suitable extension of the current callback system?

Regards,
Alex
Re: [Xen-devel] [PATCH v11 5/5] x86emul: Raise #UD when emulating an unimplemented instruction.
>>> On 12.09.17 at 16:32,wrote:
> Modified the behavior of hvm_emulate_one_insn and
> vmx_realmode_emulate_one to generate an Invalid Opcode trap when
> X86EMUL_UNIMPLEMENTED is returned by the emulator instead of just
> crashing the domain.

Along the lines of my comments on the earlier patch, I think you
really mean X86EMUL_UNRECOGNIZED here as well as in the changes
you make.

Jan
Re: [Xen-devel] [PATCH v11 3/5] x86emul: Add return code information to error messages
>>> On 12.09.17 at 16:32,wrote: > --- a/xen/arch/x86/hvm/emulate.c > +++ b/xen/arch/x86/hvm/emulate.c > @@ -2055,7 +2055,7 @@ int hvm_emulate_one_mmio(unsigned long mfn, unsigned > long gla) > { > case X86EMUL_UNHANDLEABLE: > case X86EMUL_UNIMPLEMENTED: > -hvm_dump_emulation_state(XENLOG_G_WARNING, "MMCFG", ); > +hvm_dump_emulation_state(XENLOG_G_WARNING, "MMCFG", , rc); > break; At the example of this one I think it is pretty clear that the order of patches would be the other way around. But I won't insist. > @@ -2242,16 +2242,17 @@ static const char *guest_x86_mode_to_str(int mode) > } > > void hvm_dump_emulation_state(const char *loglvl, const char *prefix, > - struct hvm_emulate_ctxt *hvmemul_ctxt) > + struct hvm_emulate_ctxt *hvmemul_ctxt, int rc) > { > struct vcpu *curr = current; > const char *mode_str = guest_x86_mode_to_str(hvm_guest_x86_mode(curr)); > const struct segment_register *cs = > hvmemul_get_seg_reg(x86_seg_cs, hvmemul_ctxt); > > -printk("%s%s emulation failed: %pv %s @ %04x:%08lx -> %*ph\n", > - loglvl, prefix, curr, mode_str, cs->sel, > hvmemul_ctxt->insn_buf_eip, > - hvmemul_ctxt->insn_buf_bytes, hvmemul_ctxt->insn_buf); > +printk("%s%s emulation failed (rc=%d): %pv %s @ %04x:%08lx -> %*ph\n", Please try to keep log messages short (but without losing relevant information). In the case here the "rc=" is unnecessary. With it dropped Reviewed-by: Jan Beulich Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v4 11/13] xen/pvcalls: implement poll command
Hi Stefano, On Fri, Sep 15, 2017 at 04:00:38PM -0700, Stefano Stabellini wrote: > For active sockets, check the indexes and use the inflight_conn_req > waitqueue to wait. > > For passive sockets if an accept is outstanding > (PVCALLS_FLAG_ACCEPT_INFLIGHT), check if it has been answered by looking > at bedata->rsp[req_id]. If so, return POLLIN. Otherwise use the > inflight_accept_req waitqueue. > > If no accepts are inflight, send PVCALLS_POLL to the backend. If we have > outstanding POLL requests awaiting for a response use the inflight_req > waitqueue: inflight_req is awaken when a new response is received; on > wakeup we check whether the POLL response is arrived by looking at the > PVCALLS_FLAG_POLL_RET flag. We set the flag from > pvcalls_front_event_handler, if the response was for a POLL command. > > In pvcalls_front_event_handler, get the struct sock_mapping from the > poll id (we previously converted struct sock_mapping* to uint64_t and > used it as id). > > Signed-off-by: Stefano Stabellini> CC: boris.ostrov...@oracle.com > CC: jgr...@suse.com > --- > drivers/xen/pvcalls-front.c | 144 > +--- > drivers/xen/pvcalls-front.h | 3 + > 2 files changed, 138 insertions(+), 9 deletions(-) > > diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c > index 01a5a69..8a90213 100644 > --- a/drivers/xen/pvcalls-front.c > +++ b/drivers/xen/pvcalls-front.c > @@ -85,6 +85,8 @@ struct sock_mapping { >* Only one poll operation can be inflight for a given socket. 
>*/ > #define PVCALLS_FLAG_ACCEPT_INFLIGHT 0 > +#define PVCALLS_FLAG_POLL_INFLIGHT 1 > +#define PVCALLS_FLAG_POLL_RET2 > uint8_t flags; > uint32_t inflight_req_id; > struct sock_mapping *accept_map; > @@ -155,15 +157,32 @@ static irqreturn_t pvcalls_front_event_handler(int irq, > void *dev_id) > rsp = RING_GET_RESPONSE(>ring, bedata->ring.rsp_cons); > > req_id = rsp->req_id; > - dst = (uint8_t *)>rsp[req_id] + sizeof(rsp->req_id); > - src = (uint8_t *)rsp + sizeof(rsp->req_id); > - memcpy(dst, src, sizeof(*rsp) - sizeof(rsp->req_id)); > - /* > - * First copy the rest of the data, then req_id. It is > - * paired with the barrier when accessing bedata->rsp. > - */ > - smp_wmb(); > - WRITE_ONCE(bedata->rsp[req_id].req_id, rsp->req_id); > + if (rsp->cmd == PVCALLS_POLL) { > + struct sock_mapping *map = (struct sock_mapping *) > +rsp->u.poll.id; > + > + set_bit(PVCALLS_FLAG_POLL_RET, > + (void *)>passive.flags); > + /* > + * Set RET, then clear INFLIGHT. It pairs with > + * the checks at the beginning of > + * pvcalls_front_poll_passive. > + */ > + smp_wmb(); pvcalls_front_poll_passive() seems to first check RET, then INFLIGHT (no "crossing of mem. locations"): can you elaborate here? > + clear_bit(PVCALLS_FLAG_POLL_INFLIGHT, > + (void *)>passive.flags); > + } else { > + dst = (uint8_t *)>rsp[req_id] + > + sizeof(rsp->req_id); > + src = (uint8_t *)rsp + sizeof(rsp->req_id); > + memcpy(dst, src, sizeof(*rsp) - sizeof(rsp->req_id)); > + /* > + * First copy the rest of the data, then req_id. It is > + * paired with the barrier when accessing bedata->rsp. > + */ > + smp_wmb(); Would you point me to the "pairing barrier"? (not sure I understand the logic here...) 
> + WRITE_ONCE(bedata->rsp[req_id].req_id, rsp->req_id); Could this be rewritten as WRITE_ONCE(bedata->rsp[req_id].req_id, req_id); > + } > > done = 1; > bedata->ring.rsp_cons++; > @@ -834,6 +853,113 @@ int pvcalls_front_accept(struct socket *sock, struct > socket *newsock, int flags) > return ret; > } > > +static unsigned int pvcalls_front_poll_passive(struct file *file, > +struct pvcalls_bedata *bedata, > +struct sock_mapping *map, > +poll_table *wait) > +{ > + int notify, req_id, ret; > + struct xen_pvcalls_request *req; > + > + if (test_bit(PVCALLS_FLAG_ACCEPT_INFLIGHT, > + (void *)>passive.flags)) { > + uint32_t req_id = READ_ONCE(map->passive.inflight_req_id); > + > + if (req_id != PVCALLS_INVALID_ID && > +
Re: [Xen-devel] [PATCH v11 2/5] x86emul: New return code for unimplemented instruction
>>> On 12.09.17 at 16:32,wrote: > Enforce the distinction between an instruction not implemented by the > emulator and the failure to emulate that instruction by defining a new > return code, X86EMUL_UNIMPLEMENTED. > > This value should only be returned by the core emulator only if it fails to > properly decode the current instruction's opcode, and not by any of other > functions, such as the x86_emulate_ops or the hvm_io_ops callbacks. > > e.g. hvm_process_io_intercept should not return X86EMUL_UNIMPLEMENTED. > The return value of this function depends on either the return code of > one of the hvm_io_ops handlers (read/write) or the value returned by > hvm_copy_guest_from_phys / hvm_copy_to_guest_phys. > > Similary, none of this functions should not return X86EMUL_UNIMPLEMENTED. I think someone had already pointed out the strange double negation here. > --- a/xen/arch/x86/hvm/emulate.c > +++ b/xen/arch/x86/hvm/emulate.c > @@ -192,6 +192,8 @@ static int hvmemul_do_io( > ASSERT(p.count <= *reps); > *reps = vio->io_req.count = p.count; > > +ASSERT(rc != X86EMUL_UNIMPLEMENTED); > + > switch ( rc ) > { > case X86EMUL_OKAY: The assertion want to move into the switch(), making use of ASSERT_UNREACHABLE(). > @@ -2045,6 +2054,7 @@ int hvm_emulate_one_mmio(unsigned long mfn, unsigned > long gla) > switch ( rc ) > { > case X86EMUL_UNHANDLEABLE: > +case X86EMUL_UNIMPLEMENTED: > hvm_dump_emulation_state(XENLOG_G_WARNING, "MMCFG", ); > break; I would have preferred if, just like you do here, ... > @@ -2102,6 +2112,7 @@ void hvm_emulate_one_vm_event(enum emul_kind kind, > unsigned int trapnr, > * consistent with X86EMUL_RETRY. > */ > return; > +case X86EMUL_UNIMPLEMENTED: > case X86EMUL_UNHANDLEABLE: > hvm_dump_emulation_state(XENLOG_G_DEBUG, "Mem event", ); ... you had added the new case label below existing ones uniformly. But anyway. 
> @@ -2585,7 +2586,7 @@ x86_decode( > d = twobyte_table[0x3a].desc; > break; > default: > -rc = X86EMUL_UNHANDLEABLE; > +rc = X86EMUL_UNIMPLEMENTED; > goto done; > } > } > @@ -2599,7 +2600,7 @@ x86_decode( > } > else > { > -rc = X86EMUL_UNHANDLEABLE; > +rc = X86EMUL_UNIMPLEMENTED; > goto done; At least these two should be "unrecognized" now. > @@ -2879,7 +2880,7 @@ x86_decode( > > default: > ASSERT_UNREACHABLE(); > -return X86EMUL_UNHANDLEABLE; > +return X86EMUL_UNIMPLEMENTED; > } This one, otoh, is probably fine this way for now. > @@ -6195,7 +6196,7 @@ x86_emulate( > /* vpsll{w,d} $imm8,{x,y}mm,{x,y}mm */ > break; > default: > -goto cannot_emulate; > +goto unimplemented_insn; > } This again wants to be "unrecognized". > @@ -6243,7 +6244,7 @@ x86_emulate( > case 6: /* psllq $imm8,mm */ > goto simd_0f_shift_imm; > } > -goto cannot_emulate; > +goto unimplemented_insn; And this too. Together with previous discussion I think you should now see the pattern for everything further down from here. > --- a/xen/arch/x86/x86_emulate/x86_emulate.h > +++ b/xen/arch/x86/x86_emulate/x86_emulate.h > @@ -133,6 +133,18 @@ struct x86_emul_fpu_aux { >* Undefined behavior when used anywhere else. >*/ > #define X86EMUL_DONE 4 > + /* > + * Current instruction is not implemented by the emulator. > + * This value should only be returned by the core emulator if decode fails Why "if decode fails"? In that case it's more "unrecognized" than "unimplemented"; the latter can only ever arise (long term, i.e. once we have proper distinction of the two) if we successfully decoded an insn, but have no code to actually handle it. > + * and not by any of the x86_emulate_ops callbacks. > + * If this error code is returned by a function, an #UD trap should be > + * raised by the final consumer of it. This last sentence would now really belong to X86EMUL_UNRECOGNIZED. 
As explained earlier, raising #UD for unimplemented is precisely the wrong choice architecturally, we merely tolerate doing so for the time being. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
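
To make the distinction drawn in this review concrete, here is a toy
decoder sketch. It is not the x86 emulator and every name in it
(demo_rc, demo_table, demo_emulate) is hypothetical; in particular
DEMO_UNRECOGNIZED merely stands in for the X86EMUL_UNRECOGNIZED value
suggested above. The assumption it illustrates is the one stated in
the review: "unrecognized" means decode failed, while "unimplemented"
can only arise after a successful decode for which no handling code
exists.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical return codes mirroring the distinction discussed above. */
enum demo_rc {
    DEMO_OKAY,
    DEMO_UNRECOGNIZED,   /* decode failed: opcode not known at all */
    DEMO_UNIMPLEMENTED,  /* decoded fine, but nothing handles it */
};

struct demo_insn {
    uint8_t opcode;
    enum demo_rc (*handler)(int *acc);  /* NULL: known but not implemented */
};

static enum demo_rc demo_inc(int *acc)
{
    ++*acc;
    return DEMO_OKAY;
}

/* Toy decode table: one implemented opcode, one known-but-unhandled one. */
static const struct demo_insn demo_table[] = {
    { 0x40, demo_inc },
    { 0x0f, NULL },
};

static enum demo_rc demo_emulate(uint8_t opcode, int *acc)
{
    size_t i;

    for ( i = 0; i < sizeof(demo_table) / sizeof(demo_table[0]); i++ )
    {
        if ( demo_table[i].opcode != opcode )
            continue;

        /* Decode succeeded; what remains is whether we can execute it. */
        if ( !demo_table[i].handler )
            return DEMO_UNIMPLEMENTED;

        return demo_table[i].handler(acc);
    }

    /* Nothing in the table matched: this is a decode failure. */
    return DEMO_UNRECOGNIZED;
}
```

Under this split, a consumer would raise #UD for the "unrecognized"
case, while doing so for "unimplemented" is, as said above, only
tolerated for the time being.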
Re: [Xen-devel] [PATCH v2 3/5] xen/livepatch/ARM32: Don't load and crash on livepatches loaded with wrong alignment.
>>> On 18.09.17 at 21:37,wrote:
> On Tue, Sep 12, 2017 at 02:57:04AM -0600, Jan Beulich wrote:
>> >>> On 12.09.17 at 02:22, wrote:
>> > If I compile the test-case under ARM32 it works OK (as the
>> > .livepatch.depends ends up being aligned to four bytes).
>>
>> So why is that? What entity is creating this section (or the
>> directive(s) to create it)?
>
> gcc
>
> Looking at the xen_bye_world.o produced by cross-compiler:
>
> xen_bye_world.o: file format elf32-littlearm
>
> Contents of section .rodata:
> 78656e5f 65787472 615f7665 7273696f xen_extra_versio
> 0010 6e00 n.
>
> And native:
>
> xen_bye_world.o: file format elf32-littlearm
>
> Contents of section .rodata:
> 78656e5f 65787472 615f7665 7273696f xen_extra_versio
> 0010 6e00 n...

This may rather be a gas than a gcc behavioral difference. What's
the alignment of .rodata in both cases?

Jan
Re: [Xen-devel] [RFC PATCH V3 2/3] Tool/ACPI: DSDT extension to support more vcpus
>>> On 19.09.17 at 16:13,wrote:
> On Tue, Sep 19, 2017 at 07:55:32AM -0600, Jan Beulich wrote:
>> >>> On 19.09.17 at 15:48, wrote:
>> > On Tue, Sep 19, 2017 at 07:44:21AM -0600, Jan Beulich wrote:
>> >> >>> On 19.09.17 at 15:29, wrote:
>> >> > On Wed, Sep 13, 2017 at 12:52:48AM -0400, Lan Tianyu wrote:
>> >> >> +if ( apic_id > 254 )
>> >> >
>> >> > 255? An APIC ID of 255 should still be fine.
>> >>
>> >> Wasn't it you who (validly) asked for the boundary to be 254, due
>> >> to 0xff being the broadcast value?
>> >
>> > But that's the ACPI ID, not the APIC ID.
>>
>> The code above says "apic_id" - is the variable mis-named? Or am
>> I reading your reply the wrong way round, in which case the question
>> would be why an ACPI ID could ever express something like
>> "broadcast"?
>
> Yes, sorry I got messed up. This is indeed fine, as a local APIC ID
> of 255 is the broadcast ID. But this also applies to the ACPI ID,
> since an ACPI ID of 255 is also the broadcast ID for local APIC
> entries in the MADT. For example a Local APIC NMI Structure with an
> ACPI ID of 255 applies to all local APICs.

Indeed.

> We need to be careful to not create local APIC entries with either
> APIC or ACPI ID equal to 255 (and to also not create Processor objects
> with ACPI ID of 255).

Why? An ACPI or APIC ID of 255 is still fine as long as it only
occurs in x2APIC contexts.

Jan
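
The boundary being argued about can be written down as a small
predicate. This is a sketch only, not code from the patch under
review: demo_id_usable() and the two macros are hypothetical names,
and it assumes that the legacy 8-bit structures reserve 0xff as
"all processors" while the x2APIC structures carry 32-bit IDs with
0xffffffff in that role.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed broadcast markers: 8-bit for legacy xAPIC, 32-bit for x2APIC. */
#define DEMO_XAPIC_BROADCAST   0xffu
#define DEMO_X2APIC_BROADCAST  0xffffffffu

/*
 * May this ID be handed to a processor? x2apic_only says whether the ID
 * will appear exclusively in x2APIC structures (32-bit ID fields).
 */
static bool demo_id_usable(uint32_t id, bool x2apic_only)
{
    if ( x2apic_only )
        return id != DEMO_X2APIC_BROADCAST;

    /* Legacy structures: the ID must fit in 8 bits and not be 0xff. */
    return id < DEMO_XAPIC_BROADCAST;
}
```

Read this way, "apic_id > 254" is the right cut-off for entries that
have to live in the legacy structures, and an ID of 255 only becomes
a problem if it can end up in one of them.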
[Xen-devel] [linux-linus test] 113594: tolerable FAIL - PUSHED
flight 113594 linux-linus real [real] http://logs.test-lab.xenproject.org/osstest/logs/113594/ Failures :-/ but no regressions. Tests which are failing intermittently (not blocking): test-amd64-i386-xl-qemuu-win7-amd64 16 guest-localmigrate/x10 fail in 113583 pass in 113594 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-localmigrate/x10 fail in 113583 pass in 113594 test-amd64-i386-xl-qemut-win7-amd64 16 guest-localmigrate/x10 fail in 113583 pass in 113594 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-localmigrate/x10 fail pass in 113583 test-armhf-armhf-xl-credit2 16 guest-start/debian.repeat fail pass in 113583 Regressions which are regarded as allowable (not blocking): test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop fail REGR. vs. 113497 Tests which did not succeed, but are not blocking: test-amd64-amd64-xl-qemut-win7-amd64 18 guest-start/win.repeat fail in 113583 blocked in 113497 test-armhf-armhf-libvirt 14 saverestore-support-checkfail like 113497 test-amd64-i386-xl-qemuu-win7-amd64 18 guest-start/win.repeat fail like 113497 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 113497 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail like 113497 test-amd64-amd64-xl-rtds 10 debian-install fail like 113497 test-armhf-armhf-xl-rtds 16 guest-start/debian.repeatfail like 113497 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail like 113497 test-amd64-amd64-xl-qemut-ws16-amd64 10 windows-installfail never pass test-amd64-i386-libvirt 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt 13 migrate-support-checkfail never pass test-amd64-i386-libvirt-xsm 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-armhf-armhf-xl-arndale 13 migrate-support-checkfail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass 
test-armhf-armhf-xl-arndale 14 saverestore-support-checkfail never pass test-amd64-i386-libvirt-qcow2 12 migrate-support-checkfail never pass test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail never pass test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2 fail never pass test-armhf-armhf-libvirt 13 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass test-armhf-armhf-xl 13 migrate-support-checkfail never pass test-armhf-armhf-xl 14 saverestore-support-checkfail never pass test-amd64-i386-xl-qemuu-ws16-amd64 13 guest-saverestore fail never pass test-amd64-amd64-xl-qemuu-ws16-amd64 10 windows-installfail never pass test-amd64-i386-xl-qemut-ws16-amd64 13 guest-saverestore fail never pass test-armhf-armhf-xl-rtds 13 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 13 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 14 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-xsm 13 migrate-support-checkfail never pass test-armhf-armhf-xl-xsm 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-vhd 12 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 13 saverestore-support-checkfail never pass test-amd64-amd64-xl-qemut-win10-i386 10 windows-installfail never pass test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass test-amd64-i386-xl-qemut-win10-i386 10 windows-install fail never pass test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail never pass version targeted for testing: 
linuxebb2c2437d8008d46796902ff390653822af6cc4 baseline version: linux7318413077a5141a50a753b1fab687b7907eef16 Last test of basis 113497 2017-09-16 05:31:48 Z3 days Failing since113516 2017-09-16 19:00:20 Z2 days7 attempts Testing same since 113583 2017-09-18 19:54:12 Z0 days2 attempts People who touched revisions under test: Adrian HunterAlexei Starovoitov Andrey Konovalov
Re: [Xen-devel] [PATCH v3 3/3] x86/hvm: Implement hvmemul_write() using real mappings
>>> On 19.09.17 at 16:39, wrote:
>> +static void *hvmemul_map_linear_addr(
>> +    unsigned long linear, unsigned int bytes, uint32_t pfec,
>> +    struct hvm_emulate_ctxt *hvmemul_ctxt)
>> +{
>> +    struct vcpu *curr = current;
>> +    void *err, *mapping;
>> +
>> +    /* First and final gfns which need mapping. */
>> +    unsigned long frame = linear >> PAGE_SHIFT, first = frame;
>> +    unsigned long final = (linear + bytes - !!bytes) >> PAGE_SHIFT;
>> +
>> +    /*
>> +     * mfn points to the next free slot.  All used slots have a page reference
>> +     * held on them.
>> +     */
>> +    mfn_t *mfn = &hvmemul_ctxt->mfn[0];
>> +
>> +    /*
>> +     * The caller has no legitimate reason for trying a zero-byte write, but
>> +     * final is calculate to fail safe in release builds.
>> +     *
>> +     * The maximum write size depends on the number of adjacent mfns[] which
>> +     * can be vmap()'d, accouting for possible misalignment within the region.
>> +     * The higher level emulation callers are responsible for ensuring that
>> +     * mfns[] is large enough for the requested write size.
>> +     */
>> +    if ( bytes == 0 ||
>> +         final - first > ARRAY_SIZE(hvmemul_ctxt->mfn) - 1 )
>
> >=, rather than the -1?

Yeah, I had pointed out that one too earlier on. Andrew gave a reason
that didn't really convince me, but which also made me go silent
despite

>> +    {
>> +        ASSERT_UNREACHABLE();
>> +        goto unhandleable;
>> +    }
>> +
>> +    do {
>> +        enum hvm_translation_result res;
>> +        struct page_info *page;
>> +        pagefault_info_t pfinfo;
>> +        p2m_type_t p2mt;
>> +
>> +        /* Error checking.  Confirm that the current slot is clean. */
>> +        ASSERT(mfn_x(*mfn) == 0);
>> +
>> +        res = hvm_translate_get_page(curr, frame << PAGE_SHIFT, true, pfec,
>> +                                     &pfinfo, &page, NULL, &p2mt);
>> +
>> +        switch ( res )
>> +        {
>> +        case HVMTRANS_okay:
>> +            break;
>> +
>> +        case HVMTRANS_bad_linear_to_gfn:
>> +            x86_emul_pagefault(pfinfo.ec, pfinfo.linear, &hvmemul_ctxt->ctxt);
>> +            err = ERR_PTR(~(long)X86EMUL_EXCEPTION);
>
> Still the mysterious cast to long here and below that Jan pointed out.

Oh, I've even managed to overlook that. Without an explanation why
this is needed I would withdraw my R-b.

>> +            goto out;
>> +
>> +        case HVMTRANS_bad_gfn_to_mfn:
>> +            err = NULL;
>> +            goto out;
>> +
>> +        case HVMTRANS_gfn_paged_out:
>> +        case HVMTRANS_gfn_shared:
>> +            err = ERR_PTR(~(long)X86EMUL_RETRY);
>> +            goto out;
>> +
>> +        default:
>> +            goto unhandleable;
>> +        }
>> +
>> +        *mfn++ = _mfn(page_to_mfn(page));
>> +        frame++;
>
> Increment still done here rather than being co-located with test below.

Indeed - if you dislike it going into the while(), at least put it
right ahead of it. Yet then again it sitting next to the mfn
increment doesn't look that bad either, and the mfn increment
clearly can't be moved down.

>> +static void hvmemul_unmap_linear_addr(
>> +    void *mapping, unsigned long linear, unsigned int bytes,
>> +    struct hvm_emulate_ctxt *hvmemul_ctxt)
>> +{
>> +    struct domain *currd = current->domain;
>> +    unsigned long frame = linear >> PAGE_SHIFT;
>> +    unsigned long final = (linear + bytes - !!bytes) >> PAGE_SHIFT;
>> +    mfn_t *mfn = &hvmemul_ctxt->mfn[0];
>> +
>> +    ASSERT(bytes > 0);
>> +
>> +    if ( frame == final )
>> +        unmap_domain_page(mapping);
>> +    else
>> +        vunmap(mapping);
>> +
>> +    do
>> +    {
>> +        ASSERT(mfn_valid(*mfn));
>> +        paging_mark_dirty(currd, *mfn);
>> +        put_page(mfn_to_page(mfn_x(*mfn)));
>> +
>> +        frame++;
>
> Again, increment should be co-located with test IMO.
>
> Paul
>
>> +        *mfn++ = _mfn(0); /* Clean slot for map()'s error checking. */
>> +
>> +    } while ( frame < final );

Well, here they're at least only spaced apart by another related
operation. It certainly would be nice if the two lines were at
least swapped.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
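The two points debated above, the `!!bytes` term in the computation of `final` and the `> ARRAY_SIZE(...) - 1` versus `>= ARRAY_SIZE(...)` spelling of the bounds check, can be tried out in isolation. The following is only a sketch under stated assumptions: `PAGE_SHIFT` is fixed at 12, `MAX_FRAMES` is a hypothetical stand-in for `ARRAY_SIZE(hvmemul_ctxt->mfn)`, and the function names are made up for illustration rather than taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

#define PAGE_SHIFT 12   /* assumption: 4 KiB pages */
#define MAX_FRAMES 3    /* hypothetical stand-in for ARRAY_SIZE(hvmemul_ctxt->mfn) */

/* First frame touched by an access starting at 'linear'. */
unsigned long first_frame(unsigned long linear)
{
    return linear >> PAGE_SHIFT;
}

/*
 * Last frame touched.  '!!bytes' is 1 for any non-zero length, so the
 * expression addresses the final byte of the access.  For bytes == 0 it
 * degrades to 'linear' itself instead of stepping back one byte (and
 * possibly one page).
 */
unsigned long final_frame(unsigned long linear, unsigned int bytes)
{
    return (linear + bytes - !!bytes) >> PAGE_SHIFT;
}

/* Bounds check in the form used by the patch. */
bool too_many_frames_minus1(unsigned long first, unsigned long final)
{
    assert(final >= first);
    return final - first > MAX_FRAMES - 1;
}

/* Bounds check in the form suggested during review. */
bool too_many_frames_ge(unsigned long first, unsigned long final)
{
    assert(final >= first);
    return final - first >= MAX_FRAMES;
}
```

For integer operands and a non-empty array both checks accept and reject the same ranges, so the choice between them is one of readability, which matches the tone of the review comments.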
Re: [Xen-devel] [PATCH v2 01/21] libxl: add is_default checkers for string and timer_mode types
Roger Pau Monne writes ("[PATCH v2 01/21] libxl: add is_default checkers for string and timer_mode types"):
> Those types are missing a helper to check whether a definition of the
> type holds the default value. This will be required by a later patch
> that will implement deprecation of fields inside of a libxl type.
>
> Signed-off-by: Roger Pau Monné

Acked-by: Ian Jackson
Re: [Xen-devel] [PATCH v3 1/3] x86/hvm: Rename enum hvm_copy_result to hvm_translation_result
>>> On 19.09.17 at 16:39, wrote:
>>>> On 19.09.17 at 16:14, wrote:
>> From: Andrew Cooper
>>
>> Signed-off-by: Andrew Cooper
>>
>> ---
>> Acked-by: Jan Beulich
>> Reviewed-by: Kevin Tian
>> Acked-by: George Dunlap
>
> Please avoid such misplaced tags in the future - the committer will
> need to remember to remove the first --- separator in order for them
> to not get lost while committing.

Additionally it looks like you've lost Tim's ack for the shadow parts.

Jan
Re: [Xen-devel] [PATCH v2 3/5] xen/livepatch/ARM32: Don't load and crash on livepatches loaded with wrong alignment.
On Tue, Sep 12, 2017 at 02:57:04AM -0600, Jan Beulich wrote: > >>> On 12.09.17 at 02:22,wrote: > > On Mon, Sep 11, 2017 at 03:01:15AM -0600, Jan Beulich wrote: > >> Hmm, as long as the relocation isn't required to be against aligned > >> fields only (mandated by the processor ABI) I think the code doing > >> the relocations would instead need to split the access, rather than > >> calling the section misaligned or increasing alignment beyond what > >> the ELF section headers say. > > > > Maybe the serial log would explain this better: > > > > xend_config_format : 4 > > Executing: '(set -e;cd /root/test/livepatch;xen-livepatch load > > xen_bye_world.livepatch)' ..(XEN) livepatch.c:413: livepatch: > > xen_bye_world: Loaded .note.gnu.build-id at 00a08000 > > (XEN) livepatch.c:413: livepatch: xen_bye_world: Loaded .text at 00a06000 > > (XEN) livepatch.c:413: livepatch: xen_bye_world: Loaded .rodata at 00a08024 > > (XEN) livepatch.c:413: livepatch: xen_bye_world: Loaded .rodata.str1.4 at > > 00a08038 > > (XEN) livepatch.c:413: livepatch: xen_bye_world: Loaded .livepatch.depends > > at 00a08043 > >[...] > > Keep in mind that this only happens if I cross-compile ARM32 under x86. > > That would suggest a build environment / build tools issue then: > Cross builds aren't supposed to produce binaries different from > native builds. 
Hm, the gcc parameters on both native and cross compiler have same args: konrad@osstest:/srv/cubietruck/source$ diff native.invocation /tmp/cross.invocation 1c1 < gcc -marm -DBUILD_ID -fno-strict-aliasing -std=gnu99 -Wall -Wstrict-prototypes -Wdeclaration-after-statement -Wno-unused-but-set-variable -Wno-unused-local-typedefs -O1 -nostdinc -fno-builtin -fno-common -Werror -Wredundant-decls -Wno-pointer-arith -pipe -g -D__XEN__ -include /source/xen.orig.git/xen/include/xen/config.h '-D__OBJECT_FILE__="xen_bye_world.o"' -Wa,--strip-local-absolute -fno-omit-frame-pointer -MMD -MF ./.xen_bye_world.o.d -msoft-float -mcpu=cortex-a15 -I/source/xen.orig.git/xen/include -fno-stack-protector -fno-exceptions -Wnested-externs -DGCC_HAS_VISIBILITY_ATTRIBUTE -marm -DBUILD_ID -fno-strict-aliasing -std=gnu99 -Wall -Wstrict-prototypes -Wdeclaration-after-statement -Wno-unused-but-set-variable -Wno-unused-local-typedefs -c xen_bye_world.c -o xen_bye_world.o --- > arm-linux-gnu-gcc -marm -DBUILD_ID -fno-strict-aliasing -std=gnu99 -Wall > -Wstrict-prototypes -Wdeclaration-after-statement > -Wno-unused-but-set-variable -Wno-unused-local-typedefs -O1 -nostdinc > -fno-builtin -fno-common -Werror -Wredundant-decls -Wno-pointer-arith -pipe > -g -D__XEN__ -include /home/konrad/A20/xen.git/xen/include/xen/config.h > '-D__OBJECT_FILE__="xen_bye_world.o"' -Wa,--strip-local-absolute > -fno-omit-frame-pointer -MMD -MF ./.xen_bye_world.o.d -msoft-float > -mcpu=cortex-a15 -I/home/konrad/A20/xen.git/xen/include -fno-stack-protector > -fno-exceptions -Wnested-externs -DGCC_HAS_VISIBILITY_ATTRIBUTE -marm > -DBUILD_ID -fno-strict-aliasing -std=gnu99 -Wall -Wstrict-prototypes > -Wdeclaration-after-statement -Wno-unused-but-set-variable > -Wno-unused-local-typedefs -c xen_bye_world.c -o xen_bye_world.o > > > If I compile the test-case under ARM32 it works OK (as the > > .livepatch.depends ends up being aligned to four bytes). > > So why is that? 
What entity is creating this section (or the
> directive(s) to create it)?

gcc

Looking at the xen_bye_world.o produced by cross-compiler:

xen_bye_world.o:     file format elf32-littlearm

Contents of section .rodata:
 0000 78656e5f 65787472 615f7665 7273696f  xen_extra_versio
 0010 6e00                                 n.

And native:

xen_bye_world.o:     file format elf32-littlearm

Contents of section .rodata:
 0000 78656e5f 65787472 615f7665 7273696f  xen_extra_versio
 0010 6e00                                 n...

(The cross compiler is 7.0.1, while native is 4.9.2).
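To make the numbers in this thread easier to follow: the serial log shows .livepatch.depends loaded at 00a08043, and the cross-built .rodata dump ends after 0x12 bytes, whereas the ASCII column of the native dump suggests padding up to a four-byte boundary. The sketch below only illustrates the arithmetic; the helper names are hypothetical, and the four-byte requirement is taken from the discussion rather than from any specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if 'addr' is a multiple of 'align'; 'align' must be a power of two. */
bool is_aligned(uint32_t addr, uint32_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr & (align - 1)) == 0;
}

/* Round 'addr' up to the next multiple of 'align' (a power of two). */
uint32_t align_up(uint32_t addr, uint32_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}
```

With these helpers, 0x00a08043 fails a four-byte alignment check, while rounding a section size of 0x12 up to four bytes gives 0x14, which would place a following section on an aligned address.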
Re: [Xen-devel] [PATCH 01/27 v9] xen/arm: vpl011: Define common ring buffer helper functions in console.h
On Mon, Sep 18, 2017 at 04:01:45PM +0530, Bhupinder Thakur wrote: > DEFINE_XEN_FLEX_RING(xencons) defines common helper functions such as > xencons_queued() to tell the current size of the ring buffer, > xencons_mask() to mask off the index, which are useful helper functions. > pl011 emulation code will use these helper functions. > > io/console.h includes io/ring.h which defines DEFINE_XEN_FLEX_RING. > > In console/daemon/io.c, string.h had to be included before io/console.h > because ring.h uses string functions. > > Signed-off-by: Bhupinder Thakur> Reviewed-by: Stefano Stabellini > Acked-by: Wei Liu Acked-by: Konrad Rzeszutek Wilk > --- > CC: Ian Jackson > CC: Wei Liu > CC: Konrad Rzeszutek Wilk > CC: Stefano Stabellini > CC: Julien Grall > > Changes since v4: > - Split this change in a separate patch. > > tools/console/daemon/io.c | 2 +- > xen/include/public/io/console.h | 4 > 2 files changed, 5 insertions(+), 1 deletion(-) > > tools/console/daemon/io.c | 2 +- > xen/include/public/io/console.h | 4 > 2 files changed, 5 insertions(+), 1 deletion(-) > > diff --git a/tools/console/daemon/io.c b/tools/console/daemon/io.c > index 7e474bb..e8033d2 100644 > --- a/tools/console/daemon/io.c > +++ b/tools/console/daemon/io.c > @@ -21,6 +21,7 @@ > > #include "utils.h" > #include "io.h" > +#include > #include > #include > #include > @@ -29,7 +30,6 @@ > > #include > #include > -#include > #include > #include > #include > diff --git a/xen/include/public/io/console.h b/xen/include/public/io/console.h > index e2cd97f..5e45e1c 100644 > --- a/xen/include/public/io/console.h > +++ b/xen/include/public/io/console.h > @@ -27,6 +27,8 @@ > #ifndef __XEN_PUBLIC_IO_CONSOLE_H__ > #define __XEN_PUBLIC_IO_CONSOLE_H__ > > +#include "ring.h" > + > typedef uint32_t XENCONS_RING_IDX; > > #define MASK_XENCONS_IDX(idx, ring) ((idx) & (sizeof(ring)-1)) > @@ -38,6 +40,8 @@ struct xencons_interface { > XENCONS_RING_IDX out_cons, out_prod; > }; > > +DEFINE_XEN_FLEX_RING(xencons); > + > #endif /* 
__XEN_PUBLIC_IO_CONSOLE_H__ */ > > /* > -- > 2.7.4 > ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
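As a rough illustration of what "mask off the index" and "current size of the ring buffer" mean for a ring like this one, here is a minimal sketch. It is not the expansion of DEFINE_XEN_FLEX_RING: the names `demo_ring_mask` and `demo_ring_queued` are hypothetical, and the sketch assumes a power-of-two ring size with free-running 32-bit producer and consumer indices, in the same spirit as the existing MASK_XENCONS_IDX macro.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t ring_idx_t;

/* Reduce a free-running index to an offset inside the ring. */
ring_idx_t demo_ring_mask(ring_idx_t idx, ring_idx_t ring_size)
{
    /* Masking only works when the size is a power of two. */
    assert(ring_size != 0 && (ring_size & (ring_size - 1)) == 0);
    return idx & (ring_size - 1);
}

/*
 * Number of bytes produced but not yet consumed.  Unsigned subtraction
 * gives the right answer even after the producer index has wrapped.
 */
ring_idx_t demo_ring_queued(ring_idx_t prod, ring_idx_t cons)
{
    return prod - cons;
}
```

Keeping the indices free-running and masking only on access is what lets the queued count survive index wrap-around without extra bookkeeping.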
Re: [Xen-devel] Feature control on PV devices
On Thu, Sep 14, 2017 at 05:08:18PM +0100, Joao Martins wrote:
> [ Realized that I didn't CC the maintainers,
> so doing that now, +Linux folks +PV interfaces czar
> Sorry for the noise! ]
>
> On 09/08/2017 09:49 AM, Joao Martins wrote:
> > [Forgot two important details regarding Xenbus states]
> > On 09/07/2017 05:53 PM, Joao Martins wrote:
> >> Hey!
> >>
> >> We wanted to bring up this small proposal regarding the lack of
> >> parameterization on PV devices on Xen.
> >>
> >> Currently users don't have a way to enforce and control what
> >> features/queues/etc the backend provides. So far there are only global
> >> parameters on backends, and specs do not mention anything in this regard.

How would this scale with say FreeBSD backends?

And I am assuming you are also thinking about device driver backends -
where you can't easily get access to the backend and change the SysFS
parameters (if they have it at all)?

> >>
> >> The most obvious example is the netback/blkback max_queues module
> >> parameter, which limits the maximum number of queues for all devices
> >> and is not that flexible.
> >> Other examples include controlling offloads visible by the NIC (e.g.
> >> disabling checksum offload, disabling scatter-gather), others more about
> >> the I/O path (e.g. disable blkif indirect descriptors, limit number of
> >> pages for the ring), or less grant usage by minimizing number of
> >> queues/descriptors.
> >>
> >> Of course there could be more examples, as this seems to be orthogonal
> >> to the kinds of PV backends we have. And seems like all features appear
> >> to be published on the same xenbus state?
> >>
> >> The idea to address this would be very simple:
> >>
> >> - Toolstack when initializing device paths, writes additional entries in
> >> the form of 'request-' = . These entries are only
> >> visible by the backend and toolstack;
> >>
> > And after that we switch the device state to XenbusStateInitialising as
> > usual.
> >
> >>
> >> - Backend reads these entries and uses as the value of
> >> , which will then be visible on the frontend.
> >>
> > And after that we switch state to XenbusStateInitWait as usual. No changes
> > are involved in xenbus state changes other than reading what the toolstack
> > had written in "request-*" and seeding accordingly. Backends without
> > support would simply ignore these new entries.
> >
> >> [ Removal of the 'request-*' xenstore entries could represent a feedback
> >> loop that the backend indeed read and used the value. Or else it could
> >> simply be ignored. ]
> >>
> >> And that's it.
> >>
> >> In practice a user would do, e.g.:
> >>
> >> domain.cfg:
> >> ...
> >> name = "guest"
> >> kernel = "bzImage"
> >> vif = ["bridge=br0,queues=2"]
> >> disk = [
> >> "format=raw,vdev=hda,access=rw,backendtype=phy,target=/dev/HostVG/XenGuest2,queues=1,max-ring-page-order=0"
> >> ]
> >> ...
> >>
> >> Toolstack writes:
> >>
> >> /local/domain/0/backend/vif/8/0/request-multi-queue-max-queues = 2
> >> /local/domain/0/backend/vbd/8/51713/request-multi-queue-max-queues = 2
> >> /local/domain/0/backend/vbd/8/51713/request-max-ring-page-order = 0
> >
> > /local/domain/0/backend/vbd/8/51713/state = 1 (XenbusStateInitialising)
> >
> >>
> >> Backend reads and seeds with (and assuming it passes backend validation
> >> ofc):
> >>
> >> /local/domain/0/backend/vif/8/0/multi-queue-max-queues = 2
> >> /local/domain/0/backend/vbd/8/51713/multi-queue-max-queues = 2
> >> /local/domain/0/backend/vbd/8/51713/max-ring-page-order = 0
> >>
> > /local/domain/0/backend/vbd/8/51713/state = 2 (XenbusStateInitWait)
> >
> >> The XL configuration entry for controlling these tunables is just an
> >> example; the general preference for this is not clear. An alternative
> >> could be:
> >>
> >> vif = ["bridge=br0,features=queues:2\\;max-ring-page-order:0"]
> >>
> >> Which lets us have more generic feature control, without sticking to
> >> particular feature names.
> >>
> >> Naturally libvirt could be a consumer of this (as it already has the
> >> 'queues' and host 'tso4', 'tso6', etc in their XML schemas)
> >>
> >> Thoughts? Do folks think this is the correct way of handling this?
> >>
> >> Cheers,
> >> Joao
> >>
> >> [0] https://github.com/qemu/qemu/blob/master/hw/net/virtio-net.c#L2102
> >>
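The key naming in the proposal above can be sketched as two small string helpers: one for the toolstack side, building a 'request-*' key under a backend path, and one for the backend side, recovering the feature name it would then publish. This is only an illustration of the naming scheme. Both function names are hypothetical, nothing here talks to xenstore, and the paths are the ones quoted in the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define REQUEST_PREFIX "request-"

/*
 * Toolstack side (hypothetical helper): format
 * "<backend_path>/request-<feature>" into buf.
 * Returns 0 on success, -1 if the result did not fit.
 */
int format_request_key(char *buf, size_t len,
                       const char *backend_path, const char *feature)
{
    int n;

    assert(buf && backend_path && feature);
    n = snprintf(buf, len, "%s/" REQUEST_PREFIX "%s", backend_path, feature);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Backend side (hypothetical helper): given a node name, return the
 * feature name it requests, or NULL if it is not a request-* entry.
 */
const char *feature_from_request_key(const char *node)
{
    size_t plen = strlen(REQUEST_PREFIX);

    assert(node);
    return strncmp(node, REQUEST_PREFIX, plen) == 0 ? node + plen : NULL;
}
```

A backend that does not know about the scheme never looks for the prefix, which is how the "backends without support would simply ignore these new entries" property falls out of the naming alone.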
Re: [Xen-devel] [PATCH v3 3/3] x86/hvm: Implement hvmemul_write() using real mappings
> -Original Message- > From: Alexandru Isaila [mailto:aisa...@bitdefender.com] > Sent: 19 September 2017 15:14 > To: xen-devel@lists.xen.org > Cc: Tim (Xen.org); George Dunlap > ; jbeul...@suse.com; Andrew Cooper > ; Ian Jackson ; > konrad.w...@oracle.com; sstabell...@kernel.org; Wei Liu > ; Paul Durrant ; > boris.ostrov...@oracle.com; suravee.suthikulpa...@amd.com; > jun.nakaj...@intel.com; Kevin Tian ; Alexandru Isaila > > Subject: [PATCH v3 3/3] x86/hvm: Implement hvmemul_write() using real > mappings > > From: Andrew Cooper > > An access which crosses a page boundary is performed atomically by x86 > hardware, albeit with a severe performance penalty. An important corner > case > is when a straddled access hits two pages which differ in whether a > translation exists, or in net access rights. > > The use of hvm_copy*() in hvmemul_write() is problematic, because it > performs > a translation then completes the partial write, before moving onto the next > translation. > > If an individual emulated write straddles two pages, the first of which is > writable, and the second of which is not, the first half of the write will > complete before #PF is raised from the second half. > > This results in guest state corruption as a side effect of emulation, which > has been observed to cause windows to crash while under introspection. > > Introduce the hvmemul_{,un}map_linear_addr() helpers, which translate an > entire contents of a linear access, and vmap() the underlying frames to > provide a contiguous virtual mapping for the emulator to use. This is the > same mechanism as used by the shadow emulation code. > > This will catch any translation issues and abort the emulation before any > modifications occur. 
> > Signed-off-by: Andrew Cooper > Signed-off-by: Alexandru Isaila > > --- > Changes since V2: > - Added linear & ~PAGE_MASK to return statement > - Modified mfn - hvmemul_ctxt->mfn to final - first + 1 > - Remove useless else statement > --- > xen/arch/x86/hvm/emulate.c| 177 > ++ > xen/include/asm-x86/hvm/emulate.h | 7 ++ > 2 files changed, 167 insertions(+), 17 deletions(-) > > diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c > index cc874ce..5574698 100644 > --- a/xen/arch/x86/hvm/emulate.c > +++ b/xen/arch/x86/hvm/emulate.c > @@ -498,6 +498,159 @@ static int hvmemul_do_mmio_addr(paddr_t > mmio_gpa, > } > > /* > + * Map the frame(s) covering an individual linear access, for writeable > + * access. May return NULL for MMIO, or ERR_PTR(~X86EMUL_*) for other > errors > + * including ERR_PTR(~X86EMUL_OKAY) for write-discard mappings. > + * > + * In debug builds, map() checks that each slot in hvmemul_ctxt->mfn[] is > + * clean before use, and poisions unused slots with INVALID_MFN. > + */ > +static void *hvmemul_map_linear_addr( > +unsigned long linear, unsigned int bytes, uint32_t pfec, > +struct hvm_emulate_ctxt *hvmemul_ctxt) > +{ > +struct vcpu *curr = current; > +void *err, *mapping; > + > +/* First and final gfns which need mapping. */ > +unsigned long frame = linear >> PAGE_SHIFT, first = frame; > +unsigned long final = (linear + bytes - !!bytes) >> PAGE_SHIFT; > + > +/* > + * mfn points to the next free slot. All used slots have a page > reference > + * held on them. > + */ > +mfn_t *mfn = _ctxt->mfn[0]; > + > +/* > + * The caller has no legitimate reason for trying a zero-byte write, but > + * final is calculate to fail safe in release builds. > + * > + * The maximum write size depends on the number of adjacent mfns[] > which > + * can be vmap()'d, accouting for possible misalignment within the > region. > + * The higher level emulation callers are responsible for ensuring that > + * mfns[] is large enough for the requested write size. 
> + */ > +if ( bytes == 0 || > + final - first > ARRAY_SIZE(hvmemul_ctxt->mfn) - 1 ) >=, rather than the -1? > +{ > +ASSERT_UNREACHABLE(); > +goto unhandleable; > +} > + > +do { > +enum hvm_translation_result res; > +struct page_info *page; > +pagefault_info_t pfinfo; > +p2m_type_t p2mt; > + > +/* Error checking. Confirm that the current slot is clean. */ > +ASSERT(mfn_x(*mfn) == 0); > + > +res = hvm_translate_get_page(curr, frame << PAGE_SHIFT, true, pfec, > + , , NULL, ); > + > +switch ( res ) > +{ > +case HVMTRANS_okay: > +break; > + > +case HVMTRANS_bad_linear_to_gfn: > +x86_emul_pagefault(pfinfo.ec, pfinfo.linear, >
Re: [Xen-devel] [PATCH v3 1/3] x86/hvm: Rename enum hvm_copy_result to hvm_translation_result
>>> On 19.09.17 at 16:14, wrote:
> From: Andrew Cooper
>
> Signed-off-by: Andrew Cooper
>
> ---
> Acked-by: Jan Beulich
> Reviewed-by: Kevin Tian
> Acked-by: George Dunlap

Please avoid such misplaced tags in the future - the committer will
need to remember to remove the first --- separator in order for them
to not get lost while committing.

Jan
Re: [Xen-devel] [PATCH v3 3/3] x86/hvm: Implement hvmemul_write() using real mappings
>>> On 19.09.17 at 16:14, wrote:
> From: Andrew Cooper
>
> An access which crosses a page boundary is performed atomically by x86
> hardware, albeit with a severe performance penalty. An important corner
> case is when a straddled access hits two pages which differ in whether a
> translation exists, or in net access rights.
>
> The use of hvm_copy*() in hvmemul_write() is problematic, because it
> performs a translation then completes the partial write, before moving
> onto the next translation.
>
> If an individual emulated write straddles two pages, the first of which is
> writable, and the second of which is not, the first half of the write will
> complete before #PF is raised from the second half.
>
> This results in guest state corruption as a side effect of emulation, which
> has been observed to cause windows to crash while under introspection.
>
> Introduce the hvmemul_{,un}map_linear_addr() helpers, which translate an
> entire contents of a linear access, and vmap() the underlying frames to
> provide a contiguous virtual mapping for the emulator to use. This is the
> same mechanism as used by the shadow emulation code.
>
> This will catch any translation issues and abort the emulation before any
> modifications occur.
>
> Signed-off-by: Andrew Cooper
> Signed-off-by: Alexandru Isaila

Reviewed-by: Jan Beulich
despite me being unhappy about ...

> +static void hvmemul_unmap_linear_addr(
> +    void *mapping, unsigned long linear, unsigned int bytes,

... "mapping" still being non-const here.

Jan
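The failure mode described in the quoted commit message, where the first half of a straddling write lands before the fault on the second page is noticed, can be reproduced with a toy model. The sketch below is not Xen code: `struct toy_mem`, the tiny page size and both function names are invented for illustration, and a per-page "writable" flag stands in for the real translation and permission checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_PAGE_SIZE 8   /* tiny pages keep the example readable */
#define TOY_NR_PAGES  2

struct toy_mem {
    unsigned char data[TOY_NR_PAGES * TOY_PAGE_SIZE];
    bool writable[TOY_NR_PAGES];
};

/*
 * Check-then-write one page at a time.  If a later page is not writable,
 * the chunks for earlier pages have already been written.
 */
bool toy_write_piecewise(struct toy_mem *m, size_t addr,
                         const void *src, size_t len)
{
    const unsigned char *p = src;

    assert(addr + len <= sizeof(m->data));
    while ( len > 0 )
    {
        size_t page = addr / TOY_PAGE_SIZE;
        size_t room = TOY_PAGE_SIZE - addr % TOY_PAGE_SIZE;
        size_t chunk = len < room ? len : room;

        if ( !m->writable[page] )
            return false;   /* "fault", but earlier chunks stay written */

        memcpy(&m->data[addr], p, chunk);
        addr += chunk;
        p += chunk;
        len -= chunk;
    }

    return true;
}

/* Check every page covered by the access first, and only then write. */
bool toy_write_all_or_nothing(struct toy_mem *m, size_t addr,
                              const void *src, size_t len)
{
    size_t first, final, page;

    assert(len > 0 && addr + len <= sizeof(m->data));
    first = addr / TOY_PAGE_SIZE;
    final = (addr + len - 1) / TOY_PAGE_SIZE;

    for ( page = first; page <= final; page++ )
        if ( !m->writable[page] )
            return false;   /* nothing has been modified yet */

    memcpy(&m->data[addr], src, len);
    return true;
}
```

Writing four bytes at offset 6 with only the first page writable fails in both variants, but the piecewise version leaves two bytes modified behind it, which is the kind of side effect the patch sets out to avoid.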
Re: [Xen-devel] [PATCH v3 0/3] docs: convert manpages to pod
Olaf Hering writes ("[PATCH v3 0/3] docs: convert manpages to pod"):
> To remove the buildtime dependency to pandoc/ghc some manpages are
> converted from markdown to pod format. This will provide more manpages
> which are referenced in xl(1) and xl.cfg(5).
>
> This series does not cover xen-vbd-interface.7 because converting the
> lists used in this manpage was not straight forward.

So, thanks for making the changes I asked for.

I don't intend to rereview these in detail, although I wouldn't
discourage others from doing so. I see Dario has already reviewed
one.

All three:

Acked-by: Ian Jackson
Re: [Xen-devel] [PATCH v3 2/3] x86/hvm: Break out __hvm_copy()'s translation logic
>>> On 19.09.17 at 16:27, wrote:
>> From: Alexandru Isaila [mailto:aisa...@bitdefender.com]
>> Sent: 19 September 2017 15:14
>> Subject: [PATCH v3 2/3] x86/hvm: Break out __hvm_copy()'s translation logic
>>
>> From: Andrew Cooper
>>
>> It will be reused by later changes.
>>
>> Signed-off-by: Andrew Cooper
>> Signed-off-by: Alexandru Isaila
>
> Reviewed-by: Paul Durrant

Acked-by: Jan Beulich
Re: [Xen-devel] [PATCH v3 2/3] x86/hvm: Break out __hvm_copy()'s translation logic
> -Original Message- > From: Alexandru Isaila [mailto:aisa...@bitdefender.com] > Sent: 19 September 2017 15:14 > To: xen-devel@lists.xen.org > Cc: Tim (Xen.org); George Dunlap > ; jbeul...@suse.com; Andrew Cooper > ; Ian Jackson ; > konrad.w...@oracle.com; sstabell...@kernel.org; Wei Liu > ; Paul Durrant ; > boris.ostrov...@oracle.com; suravee.suthikulpa...@amd.com; > jun.nakaj...@intel.com; Kevin Tian ; Alexandru Isaila > > Subject: [PATCH v3 2/3] x86/hvm: Break out __hvm_copy()'s translation logic > > From: Andrew Cooper > > It will be reused by later changes. > > Signed-off-by: Andrew Cooper > Signed-off-by: Alexandru Isaila Reviewed-by: Paul Durrant > > --- > Changes since V2: > - Changed _gfn() to gaddr_to_gfn > - Changed gfn_x to gfn_to_gaddr > --- > xen/arch/x86/hvm/hvm.c| 144 +++--- > > xen/include/asm-x86/hvm/support.h | 12 > 2 files changed, 98 insertions(+), 58 deletions(-) > > diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c > index 488acbf..93394c1 100644 > --- a/xen/arch/x86/hvm/hvm.c > +++ b/xen/arch/x86/hvm/hvm.c > @@ -3069,6 +3069,83 @@ void hvm_task_switch( > hvm_unmap_entry(nptss_desc); > } > > +enum hvm_translation_result hvm_translate_get_page( > +struct vcpu *v, unsigned long addr, bool linear, uint32_t pfec, > +pagefault_info_t *pfinfo, struct page_info **page_p, > +gfn_t *gfn_p, p2m_type_t *p2mt_p) > +{ > +struct page_info *page; > +p2m_type_t p2mt; > +gfn_t gfn; > + > +if ( linear ) > +{ > +gfn = _gfn(paging_gva_to_gfn(v, addr, )); > + > +if ( gfn_eq(gfn, INVALID_GFN) ) > +{ > +if ( pfec & PFEC_page_paged ) > +return HVMTRANS_gfn_paged_out; > + > +if ( pfec & PFEC_page_shared ) > +return HVMTRANS_gfn_shared; > + > +if ( pfinfo ) > +{ > +pfinfo->linear = addr; > +pfinfo->ec = pfec & ~PFEC_implicit; > +} > + > +return HVMTRANS_bad_linear_to_gfn; > +} > +} > +else > +{ > +gfn = gaddr_to_gfn(addr); > +ASSERT(!pfinfo); > +} > + > +/* > + * No need to do the P2M lookup for internally handled MMIO, benefiting > + * - 32-bit WinXP (& 
older Windows) on AMD CPUs for LAPIC accesses, > + * - newer Windows (like Server 2012) for HPET accesses. > + */ > +if ( v == current > + && !nestedhvm_vcpu_in_guestmode(v) > + && hvm_mmio_internal(gfn_to_gaddr(gfn)) ) > +return HVMTRANS_bad_gfn_to_mfn; > + > +page = get_page_from_gfn(v->domain, gfn_x(gfn), , > P2M_UNSHARE); > + > +if ( !page ) > +return HVMTRANS_bad_gfn_to_mfn; > + > +if ( p2m_is_paging(p2mt) ) > +{ > +put_page(page); > +p2m_mem_paging_populate(v->domain, gfn_x(gfn)); > +return HVMTRANS_gfn_paged_out; > +} > +if ( p2m_is_shared(p2mt) ) > +{ > +put_page(page); > +return HVMTRANS_gfn_shared; > +} > +if ( p2m_is_grant(p2mt) ) > +{ > +put_page(page); > +return HVMTRANS_unhandleable; > +} > + > +*page_p = page; > +if ( gfn_p ) > +*gfn_p = gfn; > +if ( p2mt_p ) > +*p2mt_p = p2mt; > + > +return HVMTRANS_okay; > +} > + > #define HVMCOPY_from_guest (0u<<0) > #define HVMCOPY_to_guest (1u<<0) > #define HVMCOPY_phys (0u<<2) > @@ -3077,7 +3154,7 @@ static enum hvm_translation_result __hvm_copy( > void *buf, paddr_t addr, int size, struct vcpu *v, unsigned int flags, > uint32_t pfec, pagefault_info_t *pfinfo) > { > -unsigned long gfn; > +gfn_t gfn; > struct page_info *page; > p2m_type_t p2mt; > char *p; > @@ -3103,65 +3180,15 @@ static enum hvm_translation_result > __hvm_copy( > > while ( todo > 0 ) > { > +enum hvm_translation_result res; > paddr_t gpa = addr & ~PAGE_MASK; > > count = min_t(int, PAGE_SIZE - gpa, todo); > > -if ( flags & HVMCOPY_linear ) > -{ > -gfn = paging_gva_to_gfn(v, addr, ); > -if ( gfn == gfn_x(INVALID_GFN) ) > -{ > -if ( pfec & PFEC_page_paged ) > -return HVMTRANS_gfn_paged_out; > -if ( pfec & PFEC_page_shared ) > -return HVMTRANS_gfn_shared; > -if ( pfinfo ) > -{ > -pfinfo->linear = addr; > -pfinfo->ec = pfec & ~PFEC_implicit; > -} > -return HVMTRANS_bad_linear_to_gfn; > -} > -gpa |= (paddr_t)gfn <<
Re: [Xen-devel] [PATCH v3 1/3] x86/hvm: Rename enum hvm_copy_result to hvm_translation_result
> -Original Message- > From: Alexandru Isaila [mailto:aisa...@bitdefender.com] > Sent: 19 September 2017 15:14 > To: xen-devel@lists.xen.org > Cc: Tim (Xen.org); George Dunlap > ; jbeul...@suse.com; Andrew Cooper > ; Ian Jackson ; > konrad.w...@oracle.com; sstabell...@kernel.org; Wei Liu > ; Paul Durrant ; > boris.ostrov...@oracle.com; suravee.suthikulpa...@amd.com; > jun.nakaj...@intel.com; Kevin Tian > Subject: [PATCH v3 1/3] x86/hvm: Rename enum hvm_copy_result to > hvm_translation_result > > From: Andrew Cooper > > Signed-off-by: Andrew Cooper > > --- > Acked-by: Jan Beulich > Reviewed-by: Kevin Tian > Acked-by: George Dunlap Reviewed-by: Paul Durrant > --- > xen/arch/x86/hvm/dom0_build.c | 2 +- > xen/arch/x86/hvm/emulate.c| 40 ++-- > xen/arch/x86/hvm/hvm.c| 56 +++-- > -- > xen/arch/x86/hvm/intercept.c | 20 +++--- > xen/arch/x86/hvm/svm/nestedsvm.c | 5 ++-- > xen/arch/x86/hvm/svm/svm.c| 2 +- > xen/arch/x86/hvm/viridian.c | 2 +- > xen/arch/x86/hvm/vmsi.c | 2 +- > xen/arch/x86/hvm/vmx/realmode.c | 2 +- > xen/arch/x86/hvm/vmx/vvmx.c | 14 +- > xen/arch/x86/mm/shadow/common.c | 12 - > xen/common/libelf/libelf-loader.c | 4 +-- > xen/include/asm-x86/hvm/support.h | 40 ++-- > 13 files changed, 101 insertions(+), 100 deletions(-) > > diff --git a/xen/arch/x86/hvm/dom0_build.c > b/xen/arch/x86/hvm/dom0_build.c > index 020c355..e8f746c 100644 > --- a/xen/arch/x86/hvm/dom0_build.c > +++ b/xen/arch/x86/hvm/dom0_build.c > @@ -238,7 +238,7 @@ static int __init > pvh_setup_vmx_realmode_helpers(struct domain *d) > if ( !pvh_steal_ram(d, HVM_VM86_TSS_SIZE, 128, GB(4), ) ) > { > if ( hvm_copy_to_guest_phys(gaddr, NULL, HVM_VM86_TSS_SIZE, v) != > - HVMCOPY_okay ) > + HVMTRANS_okay ) > printk("Unable to zero VM86 TSS area\n"); > d->arch.hvm_domain.params[HVM_PARAM_VM86_TSS_SIZED] = > VM86_TSS_UPDATED | ((uint64_t)HVM_VM86_TSS_SIZE << 32) | > gaddr; > diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c > index 54811c1..cc874ce 100644 > --- 
a/xen/arch/x86/hvm/emulate.c > +++ b/xen/arch/x86/hvm/emulate.c > @@ -100,7 +100,7 @@ static int ioreq_server_read(const struct > hvm_io_handler *io_handler, > uint32_t size, > uint64_t *data) > { > -if ( hvm_copy_from_guest_phys(data, addr, size) != HVMCOPY_okay ) > +if ( hvm_copy_from_guest_phys(data, addr, size) != HVMTRANS_okay ) > return X86EMUL_UNHANDLEABLE; > > return X86EMUL_OKAY; > @@ -893,18 +893,18 @@ static int __hvmemul_read( > > switch ( rc ) > { > -case HVMCOPY_okay: > +case HVMTRANS_okay: > break; > -case HVMCOPY_bad_gva_to_gfn: > +case HVMTRANS_bad_linear_to_gfn: > x86_emul_pagefault(pfinfo.ec, pfinfo.linear, _ctxt->ctxt); > return X86EMUL_EXCEPTION; > -case HVMCOPY_bad_gfn_to_mfn: > +case HVMTRANS_bad_gfn_to_mfn: > if ( access_type == hvm_access_insn_fetch ) > return X86EMUL_UNHANDLEABLE; > > return hvmemul_linear_mmio_read(addr, bytes, p_data, pfec, > hvmemul_ctxt, 0); > -case HVMCOPY_gfn_paged_out: > -case HVMCOPY_gfn_shared: > +case HVMTRANS_gfn_paged_out: > +case HVMTRANS_gfn_shared: > return X86EMUL_RETRY; > default: > return X86EMUL_UNHANDLEABLE; > @@ -1012,15 +1012,15 @@ static int hvmemul_write( > > switch ( rc ) > { > -case HVMCOPY_okay: > +case HVMTRANS_okay: > break; > -case HVMCOPY_bad_gva_to_gfn: > +case HVMTRANS_bad_linear_to_gfn: > x86_emul_pagefault(pfinfo.ec, pfinfo.linear, _ctxt->ctxt); > return X86EMUL_EXCEPTION; > -case HVMCOPY_bad_gfn_to_mfn: > +case HVMTRANS_bad_gfn_to_mfn: > return hvmemul_linear_mmio_write(addr, bytes, p_data, pfec, > hvmemul_ctxt, 0); > -case HVMCOPY_gfn_paged_out: > -case HVMCOPY_gfn_shared: > +case HVMTRANS_gfn_paged_out: > +case HVMTRANS_gfn_shared: > return X86EMUL_RETRY; > default: > return X86EMUL_UNHANDLEABLE; > @@ -1384,7 +1384,7 @@ static int hvmemul_rep_movs( > return rc; > } > > -rc = HVMCOPY_okay; > +rc = HVMTRANS_okay; > } > else > /* > @@ -1394,16 +1394,16 @@ static int hvmemul_rep_movs( > */ > rc = hvm_copy_from_guest_phys(buf, sgpa, bytes); > > -if ( rc == HVMCOPY_okay ) > +if ( rc == 
HVMTRANS_okay ) > rc =
[Xen-devel] [PATCH v3 3/3] x86/hvm: Implement hvmemul_write() using real mappings
From: Andrew Cooper

An access which crosses a page boundary is performed atomically by x86
hardware, albeit with a severe performance penalty.  An important corner case
is when a straddled access hits two pages which differ in whether a
translation exists, or in net access rights.

The use of hvm_copy*() in hvmemul_write() is problematic, because it performs
a translation then completes the partial write, before moving onto the next
translation.

If an individual emulated write straddles two pages, the first of which is
writable, and the second of which is not, the first half of the write will
complete before #PF is raised from the second half.

This results in guest state corruption as a side effect of emulation, which
has been observed to cause Windows to crash while under introspection.

Introduce the hvmemul_{,un}map_linear_addr() helpers, which translate the
entire contents of a linear access, and vmap() the underlying frames to
provide a contiguous virtual mapping for the emulator to use.  This is the
same mechanism as used by the shadow emulation code.

This will catch any translation issues and abort the emulation before any
modifications occur.

Signed-off-by: Andrew Cooper
Signed-off-by: Alexandru Isaila
---
Changes since V2:
    - Added linear & ~PAGE_MASK to return statement
    - Modified mfn - hvmemul_ctxt->mfn to final - first + 1
    - Remove useless else statement
---
 xen/arch/x86/hvm/emulate.c        | 177 ++
 xen/include/asm-x86/hvm/emulate.h |   7 ++
 2 files changed, 167 insertions(+), 17 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index cc874ce..5574698 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -498,6 +498,159 @@ static int hvmemul_do_mmio_addr(paddr_t mmio_gpa,
 }

 /*
+ * Map the frame(s) covering an individual linear access, for writeable
+ * access.  May return NULL for MMIO, or ERR_PTR(~X86EMUL_*) for other errors
+ * including ERR_PTR(~X86EMUL_OKAY) for write-discard mappings.
+ *
+ * In debug builds, map() checks that each slot in hvmemul_ctxt->mfn[] is
+ * clean before use, and poisons unused slots with INVALID_MFN.
+ */
+static void *hvmemul_map_linear_addr(
+    unsigned long linear, unsigned int bytes, uint32_t pfec,
+    struct hvm_emulate_ctxt *hvmemul_ctxt)
+{
+    struct vcpu *curr = current;
+    void *err, *mapping;
+
+    /* First and final gfns which need mapping. */
+    unsigned long frame = linear >> PAGE_SHIFT, first = frame;
+    unsigned long final = (linear + bytes - !!bytes) >> PAGE_SHIFT;
+
+    /*
+     * mfn points to the next free slot.  All used slots have a page reference
+     * held on them.
+     */
+    mfn_t *mfn = &hvmemul_ctxt->mfn[0];
+
+    /*
+     * The caller has no legitimate reason for trying a zero-byte write, but
+     * final is calculated to fail safe in release builds.
+     *
+     * The maximum write size depends on the number of adjacent mfns[] which
+     * can be vmap()'d, accounting for possible misalignment within the region.
+     * The higher level emulation callers are responsible for ensuring that
+     * mfns[] is large enough for the requested write size.
+     */
+    if ( bytes == 0 ||
+         final - first > ARRAY_SIZE(hvmemul_ctxt->mfn) - 1 )
+    {
+        ASSERT_UNREACHABLE();
+        goto unhandleable;
+    }
+
+    do {
+        enum hvm_translation_result res;
+        struct page_info *page;
+        pagefault_info_t pfinfo;
+        p2m_type_t p2mt;
+
+        /* Error checking.  Confirm that the current slot is clean. */
+        ASSERT(mfn_x(*mfn) == 0);
+
+        res = hvm_translate_get_page(curr, frame << PAGE_SHIFT, true, pfec,
+                                     &pfinfo, &page, NULL, &p2mt);
+
+        switch ( res )
+        {
+        case HVMTRANS_okay:
+            break;
+
+        case HVMTRANS_bad_linear_to_gfn:
+            x86_emul_pagefault(pfinfo.ec, pfinfo.linear, &hvmemul_ctxt->ctxt);
+            err = ERR_PTR(~(long)X86EMUL_EXCEPTION);
+            goto out;
+
+        case HVMTRANS_bad_gfn_to_mfn:
+            err = NULL;
+            goto out;
+
+        case HVMTRANS_gfn_paged_out:
+        case HVMTRANS_gfn_shared:
+            err = ERR_PTR(~(long)X86EMUL_RETRY);
+            goto out;
+
+        default:
+            goto unhandleable;
+        }
+
+        *mfn++ = _mfn(page_to_mfn(page));
+        frame++;
+
+        if ( p2m_is_discard_write(p2mt) )
+        {
+            err = ERR_PTR(~(long)X86EMUL_OKAY);
+            goto out;
+        }
+
+    } while ( frame < final );
+
+    /* Entire access within a single frame? */
+    if ( first == final )
+        mapping = map_domain_page(hvmemul_ctxt->mfn[0]);
+    /* Multiple frames? Need to vmap(). */
+    else if ( (mapping = vmap(hvmemul_ctxt->mfn,
+                              final - first + 1)) == NULL )
+        goto unhandleable;
+
+#ifndef NDEBUG
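The first/final frame arithmetic in the helper above can be tried in isolation. What follows is only a minimal sketch, assuming 4 KiB pages; SKETCH_PAGE_SHIFT and sketch_frames_spanned() are hypothetical names used for illustration and are not part of the patch. It counts frames inclusively, which matches the "final - first + 1" value handed to vmap().

```c
#include <assert.h>

/* Assumption: 4 KiB pages, as on x86.  Not taken from the Xen headers. */
#define SKETCH_PAGE_SHIFT 12

/*
 * Number of page frames touched by an access of `bytes` bytes starting at
 * `linear`.  The "- !!bytes" term makes `final` the frame of the last byte
 * accessed, and leaves a zero-byte access pointing at its own first frame.
 */
static unsigned long sketch_frames_spanned(unsigned long linear,
                                           unsigned int bytes)
{
    unsigned long first = linear >> SKETCH_PAGE_SHIFT;
    unsigned long final = (linear + bytes - !!bytes) >> SKETCH_PAGE_SHIFT;

    /* This sketch does not model address wrap-around. */
    assert(final >= first);

    return final - first + 1;
}
```

An 8-byte write at 0x1ffc covers two frames under these assumptions, while the same write at 0x1ff8 ends on the last byte of its page and covers one.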
[Xen-devel] [PATCH v3 2/3] x86/hvm: Break out __hvm_copy()'s translation logic
From: Andrew Cooper

It will be reused by later changes.

Signed-off-by: Andrew Cooper
Signed-off-by: Alexandru Isaila
---
Changes since V2:
    - Changed _gfn() to gaddr_to_gfn
    - Changed gfn_x to gfn_to_gaddr
---
 xen/arch/x86/hvm/hvm.c            | 144 +++---
 xen/include/asm-x86/hvm/support.h |  12
 2 files changed, 98 insertions(+), 58 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 488acbf..93394c1 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -3069,6 +3069,83 @@ void hvm_task_switch(
     hvm_unmap_entry(nptss_desc);
 }

+enum hvm_translation_result hvm_translate_get_page(
+    struct vcpu *v, unsigned long addr, bool linear, uint32_t pfec,
+    pagefault_info_t *pfinfo, struct page_info **page_p,
+    gfn_t *gfn_p, p2m_type_t *p2mt_p)
+{
+    struct page_info *page;
+    p2m_type_t p2mt;
+    gfn_t gfn;
+
+    if ( linear )
+    {
+        gfn = _gfn(paging_gva_to_gfn(v, addr, &pfec));
+
+        if ( gfn_eq(gfn, INVALID_GFN) )
+        {
+            if ( pfec & PFEC_page_paged )
+                return HVMTRANS_gfn_paged_out;
+
+            if ( pfec & PFEC_page_shared )
+                return HVMTRANS_gfn_shared;
+
+            if ( pfinfo )
+            {
+                pfinfo->linear = addr;
+                pfinfo->ec = pfec & ~PFEC_implicit;
+            }
+
+            return HVMTRANS_bad_linear_to_gfn;
+        }
+    }
+    else
+    {
+        gfn = gaddr_to_gfn(addr);
+        ASSERT(!pfinfo);
+    }
+
+    /*
+     * No need to do the P2M lookup for internally handled MMIO, benefiting
+     * - 32-bit WinXP (& older Windows) on AMD CPUs for LAPIC accesses,
+     * - newer Windows (like Server 2012) for HPET accesses.
+     */
+    if ( v == current
+         && !nestedhvm_vcpu_in_guestmode(v)
+         && hvm_mmio_internal(gfn_to_gaddr(gfn)) )
+        return HVMTRANS_bad_gfn_to_mfn;
+
+    page = get_page_from_gfn(v->domain, gfn_x(gfn), &p2mt, P2M_UNSHARE);
+
+    if ( !page )
+        return HVMTRANS_bad_gfn_to_mfn;
+
+    if ( p2m_is_paging(p2mt) )
+    {
+        put_page(page);
+        p2m_mem_paging_populate(v->domain, gfn_x(gfn));
+        return HVMTRANS_gfn_paged_out;
+    }
+    if ( p2m_is_shared(p2mt) )
+    {
+        put_page(page);
+        return HVMTRANS_gfn_shared;
+    }
+    if ( p2m_is_grant(p2mt) )
+    {
+        put_page(page);
+        return HVMTRANS_unhandleable;
+    }
+
+    *page_p = page;
+    if ( gfn_p )
+        *gfn_p = gfn;
+    if ( p2mt_p )
+        *p2mt_p = p2mt;
+
+    return HVMTRANS_okay;
+}
+
 #define HVMCOPY_from_guest (0u<<0)
 #define HVMCOPY_to_guest   (1u<<0)
 #define HVMCOPY_phys       (0u<<2)
@@ -3077,7 +3154,7 @@ static enum hvm_translation_result __hvm_copy(
     void *buf, paddr_t addr, int size, struct vcpu *v, unsigned int flags,
     uint32_t pfec, pagefault_info_t *pfinfo)
 {
-    unsigned long gfn;
+    gfn_t gfn;
     struct page_info *page;
     p2m_type_t p2mt;
     char *p;
@@ -3103,65 +3180,15 @@ static enum hvm_translation_result __hvm_copy(

     while ( todo > 0 )
     {
+        enum hvm_translation_result res;
         paddr_t gpa = addr & ~PAGE_MASK;

         count = min_t(int, PAGE_SIZE - gpa, todo);

-        if ( flags & HVMCOPY_linear )
-        {
-            gfn = paging_gva_to_gfn(v, addr, &pfec);
-            if ( gfn == gfn_x(INVALID_GFN) )
-            {
-                if ( pfec & PFEC_page_paged )
-                    return HVMTRANS_gfn_paged_out;
-                if ( pfec & PFEC_page_shared )
-                    return HVMTRANS_gfn_shared;
-                if ( pfinfo )
-                {
-                    pfinfo->linear = addr;
-                    pfinfo->ec = pfec & ~PFEC_implicit;
-                }
-                return HVMTRANS_bad_linear_to_gfn;
-            }
-            gpa |= (paddr_t)gfn << PAGE_SHIFT;
-        }
-        else
-        {
-            gfn = addr >> PAGE_SHIFT;
-            gpa = addr;
-        }
-
-        /*
-         * No need to do the P2M lookup for internally handled MMIO, benefiting
-         * - 32-bit WinXP (& older Windows) on AMD CPUs for LAPIC accesses,
-         * - newer Windows (like Server 2012) for HPET accesses.
-         */
-        if ( v == current
-             && !nestedhvm_vcpu_in_guestmode(v)
-             && hvm_mmio_internal(gpa) )
-            return HVMTRANS_bad_gfn_to_mfn;
-
-        page = get_page_from_gfn(v->domain, gfn, &p2mt, P2M_UNSHARE);
-
-        if ( !page )
-            return HVMTRANS_bad_gfn_to_mfn;
-
-        if ( p2m_is_paging(p2mt) )
-        {
-            put_page(page);
-            p2m_mem_paging_populate(v->domain, gfn);
-            return HVMTRANS_gfn_paged_out;
-        }
-        if ( p2m_is_shared(p2mt) )
-        {
-            put_page(page);
-            return HVMTRANS_gfn_shared;
-        }
-        if ( p2m_is_grant(p2mt) )
-        {
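The per-page chunking done at the top of the __hvm_copy() loop (page offset, then min of the room left in the page and the bytes still to do) can be sketched on its own. This is an illustration only, assuming 4 KiB pages; SKETCH_PAGE_SIZE, sketch_next_chunk() and sketch_count_chunks() are hypothetical names and do not exist in Xen. Each chunk corresponds to one translation, which is why a copy that crosses a page boundary is translated, and completed, one page at a time.

```c
#include <stdint.h>

/* Assumption: 4 KiB pages.  Not taken from the Xen headers. */
#define SKETCH_PAGE_SIZE 4096u

/* Bytes that can be copied before the next page boundary is reached. */
static unsigned int sketch_next_chunk(uint64_t addr, unsigned int todo)
{
    unsigned int offset = (unsigned int)(addr & (SKETCH_PAGE_SIZE - 1));
    unsigned int room = SKETCH_PAGE_SIZE - offset;

    return todo < room ? todo : room;
}

/* Number of per-page chunks (and hence translations) a copy needs. */
static unsigned int sketch_count_chunks(uint64_t addr, unsigned int size)
{
    unsigned int chunks = 0;

    while ( size > 0 )
    {
        unsigned int count = sketch_next_chunk(addr, size);

        addr += count;
        size -= count;
        chunks++;
    }

    return chunks;
}
```

With these definitions an 8-byte copy at 0x1ffc is split into two 4-byte chunks, one per page.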
Re: [Xen-devel] [RFC PATCH V3 2/3] Tool/ACPI: DSDT extension to support more vcpus
On Tue, Sep 19, 2017 at 07:55:32AM -0600, Jan Beulich wrote:
> >>> On 19.09.17 at 15:48, wrote:
> > On Tue, Sep 19, 2017 at 07:44:21AM -0600, Jan Beulich wrote:
> >> >>> On 19.09.17 at 15:29, wrote:
> >> > On Wed, Sep 13, 2017 at 12:52:48AM -0400, Lan Tianyu wrote:
> >> >> +if ( apic_id > 254 )
> >> >
> >> > 255? An APIC ID of 255 should still be fine.
> >>
> >> Wasn't it you who (validly) asked for the boundary to be 254, due
> >> to 0xff being the broadcast value?
> >
> > But that's the ACPI ID, not the APIC ID.
>
> The code above says "apic_id" - is the variable mis-named? Or am
> I reading your reply the wrong way round, in which case the question
> would be why an ACPI ID could ever express something like
> "broadcast"?

Yes, sorry, I got mixed up. This is indeed fine, as a local APIC ID of
255 is the broadcast ID. But this also applies to the ACPI ID, since an
ACPI ID of 255 is also the broadcast ID for local APIC entries in the
MADT. For example, a Local APIC NMI Structure with an ACPI ID of 255
applies to all local APICs.

We need to be careful not to create local APIC entries with either
APIC or ACPI ID equal to 255 (and also not to create Processor objects
with an ACPI ID of 255).

Roger.
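The constraint described above could be expressed as a small check. This is a hedged sketch, not code from the series: SKETCH_BROADCAST_ID, sketch_id_usable() and sketch_lapic_entry_ok() are hypothetical names, and the sketch only covers 8-bit local APIC style entries, where 0xff is reserved as the broadcast value for both IDs.

```c
#include <stdbool.h>
#include <stdint.h>

/* 0xff means "all processors" for 8-bit APIC and ACPI IDs in the MADT. */
#define SKETCH_BROADCAST_ID 0xffu

/* True if `id` fits an 8-bit field without colliding with broadcast. */
static bool sketch_id_usable(uint32_t id)
{
    return id < SKETCH_BROADCAST_ID;
}

/* A local APIC entry is only acceptable if neither ID is the broadcast ID. */
static bool sketch_lapic_entry_ok(uint32_t apic_id, uint32_t acpi_id)
{
    return sketch_id_usable(apic_id) && sketch_id_usable(acpi_id);
}
```

Under these assumptions the "apic_id > 254" test from the patch is the negation of sketch_id_usable(apic_id).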
[Xen-devel] [PATCH v3 1/3] x86/hvm: Rename enum hvm_copy_result to hvm_translation_result
From: Andrew Cooper

Signed-off-by: Andrew Cooper
---
Acked-by: Jan Beulich
Reviewed-by: Kevin Tian
Acked-by: George Dunlap
---
 xen/arch/x86/hvm/dom0_build.c     |  2 +-
 xen/arch/x86/hvm/emulate.c        | 40 ++--
 xen/arch/x86/hvm/hvm.c            | 56 +++
 xen/arch/x86/hvm/intercept.c      | 20 +++---
 xen/arch/x86/hvm/svm/nestedsvm.c  |  5 ++--
 xen/arch/x86/hvm/svm/svm.c        |  2 +-
 xen/arch/x86/hvm/viridian.c       |  2 +-
 xen/arch/x86/hvm/vmsi.c           |  2 +-
 xen/arch/x86/hvm/vmx/realmode.c   |  2 +-
 xen/arch/x86/hvm/vmx/vvmx.c       | 14 +-
 xen/arch/x86/mm/shadow/common.c   | 12 -
 xen/common/libelf/libelf-loader.c |  4 +--
 xen/include/asm-x86/hvm/support.h | 40 ++--
 13 files changed, 101 insertions(+), 100 deletions(-)

diff --git a/xen/arch/x86/hvm/dom0_build.c b/xen/arch/x86/hvm/dom0_build.c
index 020c355..e8f746c 100644
--- a/xen/arch/x86/hvm/dom0_build.c
+++ b/xen/arch/x86/hvm/dom0_build.c
@@ -238,7 +238,7 @@ static int __init pvh_setup_vmx_realmode_helpers(struct domain *d)
     if ( !pvh_steal_ram(d, HVM_VM86_TSS_SIZE, 128, GB(4), &gaddr) )
     {
         if ( hvm_copy_to_guest_phys(gaddr, NULL, HVM_VM86_TSS_SIZE, v) !=
-             HVMCOPY_okay )
+             HVMTRANS_okay )
             printk("Unable to zero VM86 TSS area\n");
         d->arch.hvm_domain.params[HVM_PARAM_VM86_TSS_SIZED] =
             VM86_TSS_UPDATED | ((uint64_t)HVM_VM86_TSS_SIZE << 32) | gaddr;
diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 54811c1..cc874ce 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -100,7 +100,7 @@ static int ioreq_server_read(const struct hvm_io_handler *io_handler,
                              uint32_t size,
                              uint64_t *data)
 {
-    if ( hvm_copy_from_guest_phys(data, addr, size) != HVMCOPY_okay )
+    if ( hvm_copy_from_guest_phys(data, addr, size) != HVMTRANS_okay )
         return X86EMUL_UNHANDLEABLE;

     return X86EMUL_OKAY;
@@ -893,18 +893,18 @@ static int __hvmemul_read(

     switch ( rc )
     {
-    case HVMCOPY_okay:
+    case HVMTRANS_okay:
         break;
-    case HVMCOPY_bad_gva_to_gfn:
+    case HVMTRANS_bad_linear_to_gfn:
         x86_emul_pagefault(pfinfo.ec, pfinfo.linear, &hvmemul_ctxt->ctxt);
         return X86EMUL_EXCEPTION;
-    case HVMCOPY_bad_gfn_to_mfn:
+    case HVMTRANS_bad_gfn_to_mfn:
         if ( access_type == hvm_access_insn_fetch )
             return X86EMUL_UNHANDLEABLE;

         return hvmemul_linear_mmio_read(addr, bytes, p_data, pfec,
                                         hvmemul_ctxt, 0);
-    case HVMCOPY_gfn_paged_out:
-    case HVMCOPY_gfn_shared:
+    case HVMTRANS_gfn_paged_out:
+    case HVMTRANS_gfn_shared:
         return X86EMUL_RETRY;
     default:
         return X86EMUL_UNHANDLEABLE;
@@ -1012,15 +1012,15 @@ static int hvmemul_write(

     switch ( rc )
     {
-    case HVMCOPY_okay:
+    case HVMTRANS_okay:
         break;
-    case HVMCOPY_bad_gva_to_gfn:
+    case HVMTRANS_bad_linear_to_gfn:
         x86_emul_pagefault(pfinfo.ec, pfinfo.linear, &hvmemul_ctxt->ctxt);
         return X86EMUL_EXCEPTION;
-    case HVMCOPY_bad_gfn_to_mfn:
+    case HVMTRANS_bad_gfn_to_mfn:
         return hvmemul_linear_mmio_write(addr, bytes, p_data, pfec,
                                          hvmemul_ctxt, 0);
-    case HVMCOPY_gfn_paged_out:
-    case HVMCOPY_gfn_shared:
+    case HVMTRANS_gfn_paged_out:
+    case HVMTRANS_gfn_shared:
         return X86EMUL_RETRY;
     default:
         return X86EMUL_UNHANDLEABLE;
@@ -1384,7 +1384,7 @@ static int hvmemul_rep_movs(
             return rc;
         }

-        rc = HVMCOPY_okay;
+        rc = HVMTRANS_okay;
     }
     else
         /*
@@ -1394,16 +1394,16 @@ static int hvmemul_rep_movs(
          */
         rc = hvm_copy_from_guest_phys(buf, sgpa, bytes);

-    if ( rc == HVMCOPY_okay )
+    if ( rc == HVMTRANS_okay )
         rc = hvm_copy_to_guest_phys(dgpa, buf, bytes, current);

     xfree(buf);

-    if ( rc == HVMCOPY_gfn_paged_out )
+    if ( rc == HVMTRANS_gfn_paged_out )
         return X86EMUL_RETRY;
-    if ( rc == HVMCOPY_gfn_shared )
+    if ( rc == HVMTRANS_gfn_shared )
         return X86EMUL_RETRY;
-    if ( rc != HVMCOPY_okay )
+    if ( rc != HVMTRANS_okay )
     {
         gdprintk(XENLOG_WARNING, "Failed memory-to-memory REP MOVS: sgpa=%"
                  PRIpaddr" dgpa=%"PRIpaddr" reps=%lu bytes_per_rep=%u\n",
@@ -1513,10 +1513,10 @@ static int hvmemul_rep_stos(

         switch ( rc )
         {
-        case HVMCOPY_gfn_paged_out:
-        case HVMCOPY_gfn_shared:
+        case HVMTRANS_gfn_paged_out:
+        case HVMTRANS_gfn_shared:
             return X86EMUL_RETRY;
-        case HVMCOPY_okay:
+        case HVMTRANS_okay:
             return X86EMUL_OKAY;
         }

@@ -2172,7 +2172,7 @@ void hvm_emulate_init_per_insn(
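The switch statements touched by this rename all follow the same shape: a translation result is mapped onto an emulator return code. The stand-alone sketch below mirrors the mapping visible in the hvmemul_write() hunk. The enum names and sketch_map_write_result() are hypothetical stand-ins rather than the real HVMTRANS_* and X86EMUL_* definitions, and SK_EMUL_TRY_MMIO is an invented marker for the point where the real code falls back to hvmemul_linear_mmio_write().

```c
/* Hypothetical stand-ins for the renamed translation results. */
enum sketch_trans_result {
    SK_TRANS_okay,
    SK_TRANS_bad_linear_to_gfn,
    SK_TRANS_bad_gfn_to_mfn,
    SK_TRANS_unhandleable,
    SK_TRANS_gfn_paged_out,
    SK_TRANS_gfn_shared,
};

/* Hypothetical stand-ins for the emulator return codes. */
enum sketch_emul_rc {
    SK_EMUL_OKAY,
    SK_EMUL_UNHANDLEABLE,
    SK_EMUL_EXCEPTION,
    SK_EMUL_RETRY,
    SK_EMUL_TRY_MMIO, /* invented: "hand the access to the MMIO path" */
};

/* Map a translation result the way the write path's switch does. */
static enum sketch_emul_rc sketch_map_write_result(enum sketch_trans_result res)
{
    switch ( res )
    {
    case SK_TRANS_okay:
        return SK_EMUL_OKAY;
    case SK_TRANS_bad_linear_to_gfn:
        return SK_EMUL_EXCEPTION;   /* a #PF is injected first in Xen */
    case SK_TRANS_bad_gfn_to_mfn:
        return SK_EMUL_TRY_MMIO;
    case SK_TRANS_gfn_paged_out:
    case SK_TRANS_gfn_shared:
        return SK_EMUL_RETRY;
    default:
        return SK_EMUL_UNHANDLEABLE;
    }
}
```

Because the failure can come from either the linear-to-gfn step or the gfn-to-mfn step, the result describes a translation rather than a copy, which is presumably what the new enum name reflects.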
[Xen-devel] [PATCH v3 0/3] Various XSA followups
XSA-219 was discovered while trying to implement the bugfix in patch 3.

Andrew Cooper (3):
  [RFC] x86/hvm: Rename enum hvm_copy_result to hvm_translation_result
  x86/hvm: Break out __hvm_copy()'s translation logic
  x86/hvm: Implement hvmemul_write() using real mappings

Alexandru Isaila (2):
  x86/hvm: Break out __hvm_copy()'s translation logic
  x86/hvm: Implement hvmemul_write() using real mappings

---
Change log:
    I did not address the comments that are still in debate
Re: [Xen-devel] [RFC PATCH V3 2/3] Tool/ACPI: DSDT extension to support more vcpus
>>> On 19.09.17 at 15:48, wrote:
> On Tue, Sep 19, 2017 at 07:44:21AM -0600, Jan Beulich wrote:
>> >>> On 19.09.17 at 15:29, wrote:
>> > On Wed, Sep 13, 2017 at 12:52:48AM -0400, Lan Tianyu wrote:
>> >> +if ( apic_id > 254 )
>> >
>> > 255? An APIC ID of 255 should still be fine.
>>
>> Wasn't it you who (validly) asked for the boundary to be 254, due
>> to 0xff being the broadcast value?
>
> But that's the ACPI ID, not the APIC ID.

The code above says "apic_id" - is the variable mis-named? Or am
I reading your reply the wrong way round, in which case the question
would be why an ACPI ID could ever express something like
"broadcast"?

Jan