Re: [Xen-devel] [PATCH] x86/domain: don't destroy IOREQ servers on soft reset
> -Original Message- > From: Igor Druzhinin > Sent: 28 August 2019 21:40 > To: xen-devel@lists.xenproject.org > Cc: jbeul...@suse.com; Andrew Cooper ; Paul Durrant > ; w...@xen.org; Roger Pau Monne > ; Igor Druzhinin > > Subject: [PATCH] x86/domain: don't destroy IOREQ servers on soft reset > > Performing soft reset should not opportunistically kill IOREQ servers > for device emulators that might be currently running for a domain. > Every emulator is supposed to clean up IOREQ servers for itself on exit. > This allows a toolstack to elect whether or not a particular device > model should be restarted. > > Since commit ba7fdd64b ("xen: cleanup IOREQ server on exit") QEMU now > destroys IOREQ server for itself as every other device emulator > is supposed to do. It's now safe to remove this code from soft reset > path - existing systems with old QEMU should be able to work as > even if there are IOREQ servers left behind, a new QEMU instance will > override its ranges anyway. > > Signed-off-by: Igor Druzhinin Code LGTM, but I think it also be useful to mention the commit that introduced hvm_domain_soft_reset(): 3235cbfe "arch-specific hooks for domain_soft_reset()" ...since it specifically calls out destroying ioreq servers in its commit message. I suspect that was necessary at the time because the 'default' ioreq server had no means of cleaning up (because it was unaware of the API) but now that support for a default server has gone away, this patch should be fine. Paul > --- > xen/arch/x86/domain.c | 2 -- > xen/arch/x86/hvm/hvm.c| 5 - > xen/include/asm-x86/hvm/hvm.h | 1 - > 3 files changed, 8 deletions(-) > > diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c > index 2df3123..957f059 100644 > --- a/xen/arch/x86/domain.c > +++ b/xen/arch/x86/domain.c > @@ -713,8 +713,6 @@ int arch_domain_soft_reset(struct domain *d) > if ( !is_hvm_domain(d) ) > return -EINVAL; > > -hvm_domain_soft_reset(d); > - > spin_lock(>event_lock); > for ( i = 0; i < d->nr_pirqs ; i++ ) > { > diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c > index 029eea3..2b81899 100644 > --- a/xen/arch/x86/hvm/hvm.c > +++ b/xen/arch/x86/hvm/hvm.c > @@ -5045,11 +5045,6 @@ void hvm_toggle_singlestep(struct vcpu *v) > v->arch.hvm.single_step = !v->arch.hvm.single_step; > } > > -void hvm_domain_soft_reset(struct domain *d) > -{ > -hvm_destroy_all_ioreq_servers(d); > -} > - > /* > * Segment caches in VMCB/VMCS are inconsistent about which bits are checked, > * important, and preserved across vmentry/exit. Cook the values to make > them > diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h > index b327bd2..4e72d07 100644 > --- a/xen/include/asm-x86/hvm/hvm.h > +++ b/xen/include/asm-x86/hvm/hvm.h > @@ -240,7 +240,6 @@ extern const struct hvm_function_table *start_vmx(void); > int hvm_domain_initialise(struct domain *d); > void hvm_domain_relinquish_resources(struct domain *d); > void hvm_domain_destroy(struct domain *d); > -void hvm_domain_soft_reset(struct domain *d); > > int hvm_vcpu_initialise(struct vcpu *v); > void hvm_vcpu_destroy(struct vcpu *v); > -- > 2.7.4 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [qemu-mainline test] 140739: regressions - FAIL
flight 140739 qemu-mainline real [real] http://logs.test-lab.xenproject.org/osstest/logs/140739/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-amd64-xl-pvshim 12 guest-start fail REGR. vs. 140282 Tests which did not succeed, but are not blocking: test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 140282 test-armhf-armhf-libvirt 14 saverestore-support-checkfail like 140282 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 140282 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stopfail like 140282 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail like 140282 test-amd64-i386-xl-pvshim12 guest-start fail never pass test-amd64-i386-libvirt-xsm 13 migrate-support-checkfail never pass test-amd64-i386-libvirt 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt 13 migrate-support-checkfail never pass test-arm64-arm64-xl-seattle 13 migrate-support-checkfail never pass test-arm64-arm64-xl-seattle 14 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-arm64-arm64-xl-credit1 13 migrate-support-checkfail never pass test-arm64-arm64-xl-credit1 14 saverestore-support-checkfail never pass test-arm64-arm64-xl-xsm 13 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 14 saverestore-support-checkfail never pass test-arm64-arm64-xl 13 migrate-support-checkfail never pass test-arm64-arm64-xl 14 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit2 13 migrate-support-checkfail never pass test-arm64-arm64-xl-credit2 14 saverestore-support-checkfail never pass test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2 fail never pass test-arm64-arm64-xl-thunderx 13 migrate-support-checkfail never pass test-arm64-arm64-xl-thunderx 14 saverestore-support-checkfail never pass test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail never pass test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-arndale 13 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 14 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail never pass test-armhf-armhf-xl-credit1 13 migrate-support-checkfail never pass test-armhf-armhf-xl-credit1 14 saverestore-support-checkfail never pass test-armhf-armhf-xl 13 migrate-support-checkfail never pass test-armhf-armhf-xl 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-rtds 13 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 13 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 14 saverestore-support-checkfail never pass test-armhf-armhf-libvirt 13 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-vhd 12 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 13 saverestore-support-checkfail never pass test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop fail never pass test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail never pass test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass version targeted for testing: qemuu23919ddfd56135cad3cb468a8f54d5a595f024f4 baseline version: qemuuafd760539308a5524accf964107cdb1d54a059e3 Last test of basis 140282 2019-08-18 05:36:51 Z 11 days Failing since140361 2019-08-19 11:36:26 Z9 days 11 attempts Testing same since 140739 2019-08-28 07:28:02 Z1 days1 attempts People who touched revisions under test: Alberto Garcia Aleksandar Markovic Alex Bennée Alexey Kardashevskiy Andrey Shinkevich Anthony PERARD Aurelien Jarno BALATON Zoltan Bastian Koppelmann Carlo Marcelo Arenas Belón Catherine Ho
Re: [Xen-devel] [RFC Patch] xen/pt: Emulate FLR capability
On Thu, Aug 29, 2019 at 05:02:27PM +0800, Chao Gao wrote: > Currently, for a HVM on Xen, no reset method is virtualized. So in a VM's > perspective, assigned devices cannot be reset. But some devices rely on PCI > reset to recover from hardware hangs. When being assigned to a VM, those > devices cannot be reset and won't work any longer if a hardware hang occurs. > We have to reboot VM to trigger PCI reset on host to recover the device. > > This patch exposes FLR capability to VMs if the assigned device can be reset > on > host. When VM initiates an FLR to a device, qemu cleans up the device state, > (including disabling of intx and/or MSI and unmapping BARs from guest, > deleting > emulated registers), then initiate PCI reset through 'reset' knob under the > device's sysfs, finally initialize the device again. I think you likely need to deassign the device from the VM, perform the reset, and then assign the device again, so that there's no Xen internal state carried over prior to the reset? Thanks, Roger. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH V3 4/8] xen/common: Introduce xrealloc_flex_struct() helper macros
On 28.08.2019 20:23, Oleksandr wrote: > --- a/xen/include/xen/xmalloc.h > +++ b/xen/include/xen/xmalloc.h > @@ -35,6 +35,18 @@ > #define xzalloc_array(_type, _num) \ > ((_type *)_xzalloc_array(sizeof(_type), __alignof__(_type), _num)) > > +/* Allocate space for a structure with a flexible array of typed > objects. */ > +#define xmalloc_flex_struct(type, field, nr) \ > + ((type *)_xmalloc(offsetof(type, field[nr]), __alignof__(type))) > + > +/* Re-allocate space for a structure with a flexible array of typed > objects. */ > +#define xrealloc_flex_struct(ptr, type, field, nr) > ({ \ > + typeof(*(ptr)) *ptr_ = > (ptr); \ > + /* Type checking: make sure that incoming pointer is of correct > type */ \ > + (void)((ptr) == (type > *)0); \ > + (type *)_xrealloc(ptr_, offsetof(type, field[nr]), > __alignof__(type)); \ > +}) > + What about #define xrealloc_flex_struct(ptr, field, nr) \ (typeof(ptr))_xrealloc(ptr, offsetof(typeof(*(ptr)), field[nr]), \ __alignof__(typeof(*(ptr ? Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2] Partially revert "x86/mm: Clean IOMMU flags from p2m-pt code"
On Thu, Aug 29, 2019 at 11:33:11AM +0200, Jan Beulich wrote: > On 29.08.2019 10:49, Roger Pau Monne wrote: > > @@ -640,9 +670,17 @@ p2m_pt_set_entry(struct p2m_domain *p2m, gfn_t gfn_, > > mfn_t mfn, > > && (gfn + (1UL << page_order) - 1 > p2m->max_mapped_pfn) ) > > p2m->max_mapped_pfn = gfn + (1UL << page_order) - 1; > > > > +if ( need_iommu_pt_sync(p2m->domain) && > > + (iommu_old_flags != iommu_pte_flags || old_mfn != mfn_x(mfn)) ) > > +rc = iommu_pte_flags ? > > +iommu_legacy_map(d, _dfn(gfn), mfn, page_order, > > + iommu_pte_flags) : > > +iommu_legacy_unmap(d, _dfn(gfn), page_order); > > Indentation of the if() body is one level too deep. With this > corrected (easy enough to do while committing) > Reviewed-by: Jan Beulich Indeed, my editor seems to have done the wrong thing. Thanks. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [RFC Patch] xen/pt: Emulate FLR capability
On 29.08.2019 11:02, Chao Gao wrote: > Currently, for a HVM on Xen, no reset method is virtualized. So in a VM's > perspective, assigned devices cannot be reset. But some devices rely on PCI > reset to recover from hardware hangs. When being assigned to a VM, those > devices cannot be reset and won't work any longer if a hardware hang occurs. > We have to reboot VM to trigger PCI reset on host to recover the device. Did you consider a hot-unplug, reset (by host), hot-plug cycle instead? > +static int xen_pt_devctl_reg_write(XenPCIPassthroughState *s, > + XenPTReg *cfg_entry, uint16_t *val, > + uint16_t dev_value, uint16_t valid_mask) > +{ > +if (s->real_device.is_resetable && (*val & PCI_EXP_DEVCTL_BCR_FLR)) { > +xen_pt_reset(s); > +} > +return xen_pt_word_reg_write(s, cfg_entry, val, dev_value, valid_mask); I think you also need to clear the bit before handing on the request, such that reads will always observe it clear. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v8 01/28] linkage: new macros for assembler symbols
On 12. 08. 19, 19:06, Borislav Petkov wrote: >> + Again, every ``SYM_CODE_START*`` **shall** be coupled by ``SYM_CODE_END``. > > Btw, this coupling: I haven't gone through the whole patchset but do we > have automatic checking which makes sure a starting macro is coupled > with the correct ending macro? I do, but it's not part of this series. It's a pinch of link-time magic, but it works reliably (see e.g. 1cbec37b3f9c). I will post it if/after this gets merged. There were other approaches proposed in the past too -- using objtool for that (not implemented). >> +Overriding Macros >> +~ >> +Architecture can also override any of the macros in their own > > "Other architectures... " Not only "other", x86 can override the types if need be too. >> +``asm/linkage.h``, including macros specifying the type of a symbol >> +(``SYM_T_FUNC``, ``SYM_T_OBJECT``, and ``SYM_T_NONE``). As every macro >> +described in this file is surrounded by ``#ifdef`` + ``#endif``, it is >> enough >> +to define the macros differently in the aforementioned >> architecture-dependent >> +header. thanks, -- js suse labs ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2] Partially revert "x86/mm: Clean IOMMU flags from p2m-pt code"
On 8/29/19 9:49 AM, Roger Pau Monne wrote: > This partially reverts commit > 854a49a7486a02edae5b3e53617bace526e9c1b1 by re-adding the logic that > propagates changes to the domain physmap done by p2m_pt_set_entry into > the iommu page tables. Without this logic changes to the guest physmap > are not propagated to the iommu, leaving stale iommu entries that can > leak data, or failing to add new entries. > > Note that this commit doesn't re-introduce iommu flags to the cpu page > table entries, since the logic to add/remove entries to the iommu page > tables is based on the p2m type and the mfn. > > Fixes: 854a49a7486a02 ('x86/mm: Clean IOMMU flags from p2m-pt code') > Signed-off-by: Roger Pau Monné Sorry, the removal of the iommu update looked a bit suspicious; I should have looked a bit closer. Acked-by: George Dunlap ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH V3 8/8] iommu/arm: Add Renesas IPMMU-VMSA support
Hi, all + +static __init int ipmmu_init(struct dt_device_node *node, const void *data) +{ +int ret; + +/* + * Even if the device can't be initialized, we don't want to give + * the IPMMU device to dom0. + */ +dt_device_set_used_by(node, DOMID_XEN); + +if ( !iommu_hap_pt_share ) +{ +printk_once(XENLOG_ERR "ipmmu: P2M table must always be shared between the CPU and the IPMMU\n"); +return -EINVAL; +} + +if ( !ipmmu_stage2_supported() ) +{ +printk_once(XENLOG_ERR "ipmmu: P2M sharing is not supported in current SoC revision\n"); +return -ENODEV; +} +else +{ +/* + * As 4-level translation table is not supported in IPMMU, we need + * to check IPA size used for P2M table beforehand to be sure it is + * 3-level and the IPMMU will be able to use it. + * + * TODO: First initialize the IOMMU and gather the requirements and + * then initialize the P2M. In the P2M code, take into the account + * the IOMMU requirements and restrict "pa_range" if necessary. + */ +if ( IPMMU_MAX_P2M_IPA_BITS < p2m_ipa_bits ) +{ +printk_once(XENLOG_ERR "ipmmu: P2M IPA size is not supported (P2M=%u IPMMU=%u)!\n", +p2m_ipa_bits, IPMMU_MAX_P2M_IPA_BITS); +return -ENODEV; +} There is a patch in ML which is intended to address this TODO: https://lists.xenproject.org/archives/html/xen-devel/2019-08/msg02237.html -- Regards, Oleksandr Tyshchenko ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] x86/boot: Annotate pagetables with STT_OBJECT
On 28.08.2019 20:24, Andrew Cooper wrote: > On 27/08/2019 15:36, Jan Beulich wrote: >> On 14.08.2019 12:44, Andrew Cooper wrote: >>> Introduce a new ENDDATA() helper which sets type and size together. >> >> Except this isn't very natural: Setting the size late is quite >> common, to avoid the need for an "end" label. But the type would >> better be set together with the label definition, i.e. by a >> GLOBAL() counterpart (e.g. GLOBAL_DATA()). However, if despite >> this remark you think your approach is the way to go: > > Well - this way is fewer moving parts. > > GLOBAL and ENTRY are to do with visibility and alignment. While ENTRY > might not typically be used for data, both are commonly used for code. > > We can either have a proliferation of {GLOBAL,ENTRY}{,_DATA,_FUNC,etc} > and a single END, or we can have ENDDATA/ENDFUNC and need no changes to > the existing GLOBAL/ENTRY infrastructure. Yes, it's slightly more of a change the way I'd prefer, but as said, it would (imo) yield a more natural result. Omission of END() would then mean only the symbol's size to not be set, but all other attributes to be correct nevertheless. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v9 01/15] microcode/intel: extend microcode_update_match()
On 29.08.2019 09:15, Chao Gao wrote: > On Wed, Aug 28, 2019 at 05:12:34PM +0200, Jan Beulich wrote: >> On 19.08.2019 03:25, Chao Gao wrote: >>> --- a/xen/arch/x86/microcode_intel.c >>> +++ b/xen/arch/x86/microcode_intel.c >>> @@ -134,14 +134,39 @@ static int collect_cpu_info(unsigned int cpu_num, >>> struct cpu_signature *csig) >>> return 0; >>> } >>> >>> -static inline int microcode_update_match( >>> -unsigned int cpu_num, const struct microcode_header_intel *mc_header, >>> -int sig, int pf) >>> +/* Check an update against the CPU signature and current update revision */ >>> +static enum microcode_match_result microcode_update_match( >>> +const struct microcode_header_intel *mc_header, unsigned int cpu) >>> { >>> -struct ucode_cpu_info *uci = _cpu(ucode_cpu_info, cpu_num); >>> - >>> -return (sigmatch(sig, uci->cpu_sig.sig, pf, uci->cpu_sig.pf) && >>> -(mc_header->rev > uci->cpu_sig.rev)); >>> +const struct extended_sigtable *ext_header; >>> +const struct extended_signature *ext_sig; >>> +unsigned int i; >>> +struct ucode_cpu_info *uci = _cpu(ucode_cpu_info, cpu); >>> +unsigned int sig = uci->cpu_sig.sig; >>> +unsigned int pf = uci->cpu_sig.pf; >>> +unsigned int rev = uci->cpu_sig.rev; >>> +unsigned long data_size = get_datasize(mc_header); >>> +const void *end = (const void *)mc_header + get_totalsize(mc_header); >>> + >>> +if ( sigmatch(sig, mc_header->sig, pf, mc_header->pf) ) >>> +return (mc_header->rev > rev) ? NEW_UCODE : OLD_UCODE; >> >> Didn't you lose a range check against "end" ahead of this if()? >> get_totalsize() and get_datasize() aiui also would need to live >> after a range check, just a sizeof() (i.e. MC_HEADER_SIZE) based >> one. This would also affect the caller as it seems. > > I think microcode_sanity_check() is for this purpose. We can do > sanity check before the if(). Perhaps, we can just add an assertion > that sanity check won't fail. Because whenever sanity check failed > when pasing an ucode blob, we just drop the ucode; we won't pass an > broken ucode to this function. Well - that's the main question. The purpose of this patch, after all, is (aiui) to allow calling the function in more cases. If all callers are indeed supposed to check the basic properties, then yes, an ASSERT() would be fine. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v9 11/15] microcode: unify loading update during CPU resuming and AP wakeup
On Fri, Aug 23, 2019 at 11:09:07AM +0200, Roger Pau Monné wrote: >On Fri, Aug 23, 2019 at 12:44:34AM +0800, Chao Gao wrote: >> On Thu, Aug 22, 2019 at 04:10:46PM +0200, Roger Pau Monné wrote: >> >On Mon, Aug 19, 2019 at 09:25:24AM +0800, Chao Gao wrote: >> >> Both are loading the cached patch. Since APs call the unified function, >> >> microcode_update_one(), during wakeup, the 'start_update' parameter >> >> which originally used to distinguish BSP and APs is redundant. So remove >> >> this parameter. >> >> >> >> Signed-off-by: Chao Gao >> >> --- >> >> Note that here is a functional change: resuming a CPU would call >> >> ->end_update() now while previously it wasn't. Not quite sure >> >> whether it is correct. >> > >> >I guess that's required if it called start_update prior to calling >> >end_update? >> > >> >> >> >> Changes in v9: >> >> - return -EOPNOTSUPP rather than 0 if microcode_ops is NULL in >> >>microcode_update_one() >> >> - rebase and fix conflicts. >> >> >> >> Changes in v8: >> >> - split out from the previous patch >> >> --- >> >> xen/arch/x86/acpi/power.c | 2 +- >> >> xen/arch/x86/microcode.c| 90 >> >> ++--- >> >> xen/arch/x86/smpboot.c | 5 +-- >> >> xen/include/asm-x86/processor.h | 4 +- >> >> 4 files changed, 44 insertions(+), 57 deletions(-) >> >> >> >> diff --git a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c >> >> index 4f21903..24798d5 100644 >> >> --- a/xen/arch/x86/acpi/power.c >> >> +++ b/xen/arch/x86/acpi/power.c >> >> @@ -253,7 +253,7 @@ static int enter_state(u32 state) >> >> >> >> console_end_sync(); >> >> >> >> -microcode_resume_cpu(); >> >> +microcode_update_one(); >> >> >> >> if ( !recheck_cpu_features(0) ) >> >> panic("Missing previously available feature(s)\n"); >> >> diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c >> >> index a2febc7..bdd9c9f 100644 >> >> --- a/xen/arch/x86/microcode.c >> >> +++ b/xen/arch/x86/microcode.c >> >> @@ -203,24 +203,6 @@ static struct microcode_patch *parse_blob(const char >> >> *buf, uint32_t len) >> >> return NULL; >> >> } >> >> >> >> -int microcode_resume_cpu(void) >> >> -{ >> >> -int err; >> >> -struct cpu_signature *sig = _cpu(cpu_sig); >> >> - >> >> -if ( !microcode_ops ) >> >> -return 0; >> >> - >> >> -spin_lock(_mutex); >> >> - >> >> -err = microcode_ops->collect_cpu_info(sig); >> >> -if ( likely(!err) ) >> >> -err = microcode_ops->apply_microcode(microcode_cache); >> >> -spin_unlock(_mutex); >> >> - >> >> -return err; >> >> -} >> >> - >> >> void microcode_free_patch(struct microcode_patch *microcode_patch) >> >> { >> >> microcode_ops->free_patch(microcode_patch->mc); >> >> @@ -384,11 +366,29 @@ static int __init microcode_init(void) >> >> } >> >> __initcall(microcode_init); >> >> >> >> -int __init early_microcode_update_cpu(bool start_update) >> >> +/* Load a cached update to current cpu */ >> >> +int microcode_update_one(void) >> >> +{ >> >> +int rc; >> >> + >> >> +if ( !microcode_ops ) >> >> +return -EOPNOTSUPP; >> >> + >> >> +rc = microcode_update_cpu(NULL); >> >> + >> >> +if ( microcode_ops->end_update ) >> >> +microcode_ops->end_update(); >> > >> >Don't you need to call start_update before calling >> >microcode_update_cpu? >> >> No. On AMD side, osvw_status records the hardware erratum in the system. >> As we don't assume all CPUs have the same erratum, each cpu calls >> end_update to update osvw_status after ucode loading. >> start_update just resets osvw_status to 0. And it is called once prior >> to ucode loading on any CPU so that osvw_status can be recomputed. > >Oh, I think I understand it. start_update must only be called once >_before_ the sequence to update the microcode on all CPUs is >performed, while end_update needs to be called on _each_ CPU after the >update has been completed in order to account for any erratas. > >The name for those hooks should be improved, I guess renaming >end_update to end_update_each or end_update_percpu would be clearer in >order to make it clear that start_update is global, while end_update >is percpu. Anyway, I don't want to delay this series for a naming nit. > >I'm still unsure where start_update is called for the resume from >suspension case, I don't seem to see any call to start_update neither >in enter_state or microcode_update_one, hence I think this is missing? No. Actually, no call of start_update for resume case. > >I would expect you need to clean osvw_status also on resume from >suspension, in case microcode loading fails? Or else you will be >carrying a stale osvw_status. Then we need to send IPI to all other CPUs to recompute osvw_state. But I think it is not necessary. If ucode cache isn't changed during the CPU's suspension period, there is not stale osvw bit (assuming OSVW on the resuming CPU won't change). If the ucode cache is
Re: [Xen-devel] [PATCH v6 07/10] use is_iommu_enabled() where appropriate...
> -Original Message- > From: Roger Pau Monne > Sent: 23 August 2019 11:56 > To: Paul Durrant > Cc: xen-devel@lists.xenproject.org; Stefano Stabellini > ; Julien Grall > ; Volodymyr Babchuk ; Jan > Beulich > ; Andrew Cooper ; Wei Liu > ; Jun Nakajima > ; Kevin Tian ; George Dunlap > ; > Suravee Suthikulpanit ; Brian Woods > ; Daniel De > Graaf > Subject: Re: [PATCH v6 07/10] use is_iommu_enabled() where appropriate... > > On Fri, Aug 16, 2019 at 06:19:58PM +0100, Paul Durrant wrote: > > ...rather than testing the global iommu_enabled flag and ops pointer. > > > > Now that there is a per-domain flag indicating whether the domain is > > permitted to use the IOMMU (which determines whether the ops pointer will > > be set), many tests of the global iommu_enabled flag and ops pointer can > > be translated into tests of the per-domain flag. Some of the other tests of > > purely the global iommu_enabled flag can also be translated into tests of > > the per-domain flag. > > > > NOTE: The comment in iommu_share_p2m_table() is also fixed; need_iommu() > > disappeared some time ago. Also, whilst the style of the 'if' in > > flask_iommu_resource_use_perm() is fixed, I have not translated any > > instances of u32 into uint32_t to keep consistency. IMO such a > > translation would be better done globally for the source module in > > a separate patch. > > I've usually changed those instances as the line gets modified anyway, > but I'm not going to ask everyone to do this if they don't feel like > it. > > The problem with doing wholesale changes of u32 -> uint32_t is that > you introduce a lot of noise when trying to use git blame afterwards > for example. Doing it when touching the line for legitimate changes > avoids the noise. > > > The change in the initialization of the 'hd' variable in iommu_map() > > and iommu_unmap() is done to keep the PV shim build happy. Without > > this change it will fail to compile with errors of the form: > > > > iommu.c:361:32: error: unused variable ‘hd’ [-Werror=unused-variable] > > const struct domain_iommu *hd = dom_iommu(d); > > ^~ > > > > Signed-off-by: Paul Durrant > > Reviewed-by: Roger Pau Monné > Thanks. > I haven't looked however if there are further cases where > iommu_enabled should be changed into is_iommu_enabled. > Jan had clearly checked on a previous review iteration because he spotted a couple of places I had missed. I'm reasonably confident I've found them all now. Paul > > @@ -556,8 +560,8 @@ int iommu_do_domctl( > > { > > int ret = -ENODEV; > > > > -if ( !iommu_enabled ) > > -return -ENOSYS; > > +if ( !is_iommu_enabled(d) ) > > +return -EOPNOTSUPP; > > I would use ENOENT here, but I don't have a strong opinion. The > hypercall is supported, it's just that there's no iommu for this > domain. > > Thanks, Roger. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH] x86: properly gate clearing of PKU feature
setup_clear_cpu_cap() is __init and hence may not be called post-boot. Note that opt_pku nevertheless is not getting __initdata added - see e.g. commit 43fa95ae6a ("mm: make opt_bootscrub non-init"). Signed-off-by: Jan Beulich --- a/xen/arch/x86/cpu/common.c +++ b/xen/arch/x86/cpu/common.c @@ -466,7 +466,7 @@ void identify_cpu(struct cpuinfo_x86 *c) this_cpu->c_init(c); - if ( !opt_pku ) + if (c == _cpu_data && !opt_pku) setup_clear_cpu_cap(X86_FEATURE_PKU); /* ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v9 10/15] microcode: split out apply_microcode() from cpu_request_microcode()
On 22.08.2019 15:59, Roger Pau Monné wrote: > Seeing how this works I'm not sure what's the best option here. As > updating will be attempted on other CPUs, I'm not sure if it's OK to > return an error if the update succeed on some CPUs but failed on > others. The overall result of a partially successful update should be an error - mismatched ucode may, after all, be more of a problem than outdated ucode. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] p2m/ept: pass correct level to atomic_write_ept_entry in ept_invalidate_emt
On Tue, Aug 27, 2019 at 05:23:33PM +0200, Jan Beulich wrote: > On 23.08.2019 07:58, Tian, Kevin wrote: > > > From: Roger Pau Monne [mailto:roger@citrix.com] > > > Sent: Tuesday, August 20, 2019 11:38 PM > > > > > > The level passed to ept_invalidate_emt corresponds to the EPT entry > > > passed as the mfn parameter, which is a pointer to an EPT page table, > > > hence the entries in that page table will have one level less than the > > > parent. > > > > > > Fix the call to atomic_write_ept_entry to pass the correct level, ie: > > > one level less than the parent. > > > > > > Fixes: 50fe6e73705 ('pvh dom0: add and remove foreign pages') > > > Signed-off-by: Roger Pau Monné > > > --- > > > Cc: Jun Nakajima > > > Cc: Kevin Tian > > > Cc: George Dunlap > > > Cc: Jan Beulich > > > Cc: Andrew Cooper > > > Cc: Wei Liu > > > Cc: Sergey Dyasli > > > Cc: Paul Durrant > > > --- > > > xen/arch/x86/mm/p2m-ept.c | 2 +- > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c > > > index 6b8468c793..23ae6e9da3 100644 > > > --- a/xen/arch/x86/mm/p2m-ept.c > > > +++ b/xen/arch/x86/mm/p2m-ept.c > > > @@ -367,7 +367,7 @@ static bool_t ept_invalidate_emt(struct p2m_domain > > > *p2m, mfn_t mfn, > > > e.emt = MTRR_NUM_TYPES; > > > if ( recalc ) > > > e.recalc = 1; > > > -rc = atomic_write_ept_entry(p2m, [i], e, level); > > > +rc = atomic_write_ept_entry(p2m, [i], e, level - 1); > > > ASSERT(rc == 0); > > > changed = 1; > > > } > > > > Reviewed-by: Kevin Tian . > > > > One small comment about readability. What about renaming 'level' > > to 'parent_level' in this function? > > And also taking the opportunity and making it unsigned int? I've been thinking about this, and I'm not sure renaming level to parent_level is correct, level actually matches the level of the mfn also passed as a parameter, and hence it's correct. It's the function itself that actually iterates over the page table entries, and hence descends one level into the page tables. ie: I could add something like ASSERT(level) to make sure no level 0 entries are passed to the function, but renaming doesn't seem correct to me. Thanks, Roger. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 3/6] remove late (on-demand) construction of IOMMU page tables
On 29.08.2019 12:20, Paul Durrant wrote: >> -Original Message- >> From: Jan Beulich >> Sent: 29 August 2019 10:52 >> To: Paul Durrant >> Cc: 'JulienGrall' ; 'Alexandru Isaila' >> ; 'Petre >> Pircalabu' ; 'Razvan Cojocaru' >> ; Andrew Cooper >> ; George Dunlap ; Ian >> Jackson >> ; Roger Pau Monne ; >> 'VolodymyrBabchuk' >> ; 'Stefano Stabellini' ; >> 'xen- >> de...@lists.xenproject.org' ; 'Konrad >> Rzeszutek Wilk' >> ; 'Tamas K Lengyel' ; Tim >> (Xen.org) ; 'Wei >> Liu' >> Subject: Re: [Xen-devel] [PATCH 3/6] remove late (on-demand) construction of >> IOMMU page tables >> >> On 29.08.2019 11:33, Paul Durrant wrote: >>> TBH I've seen sufficient numbers of domain create failures when using >>> auto-ballooning that I stopped using it some time ago (besides the fact >>> that it's slow). If you're happy with the simplistic >>> double-the-page-table-reservation calculation then I can add that but >>> IMO it would be better to add another patch to just remove auto-ballooning. >> >> I'd not be overly happy, but I could live with this. But this needs >> consent by others, not the least the tool stack maintainers. >> > > Ok. I'll deal with that later. Looking again though, I'm not sure what > you mean by 'the amount reserved for page tables'? Do you mean > 'shadow_memkb' or something else? This one, I think, yes, or the value it gets defaulted to when not set in the guest config file. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v9 11/15] microcode: unify loading update during CPU resuming and AP wakeup
On Thu, Aug 29, 2019 at 03:37:47PM +0800, Chao Gao wrote: > On Fri, Aug 23, 2019 at 11:09:07AM +0200, Roger Pau Monné wrote: > >On Fri, Aug 23, 2019 at 12:44:34AM +0800, Chao Gao wrote: > >> On Thu, Aug 22, 2019 at 04:10:46PM +0200, Roger Pau Monné wrote: > >> >On Mon, Aug 19, 2019 at 09:25:24AM +0800, Chao Gao wrote: > >> >> Both are loading the cached patch. Since APs call the unified function, > >> >> microcode_update_one(), during wakeup, the 'start_update' parameter > >> >> which originally used to distinguish BSP and APs is redundant. So remove > >> >> this parameter. > >> >> > >> >> Signed-off-by: Chao Gao > >> >> --- > >> >> Note that here is a functional change: resuming a CPU would call > >> >> ->end_update() now while previously it wasn't. Not quite sure > >> >> whether it is correct. > >> > > >> >I guess that's required if it called start_update prior to calling > >> >end_update? > >> > > >> >> > >> >> Changes in v9: > >> >> - return -EOPNOTSUPP rather than 0 if microcode_ops is NULL in > >> >>microcode_update_one() > >> >> - rebase and fix conflicts. > >> >> > >> >> Changes in v8: > >> >> - split out from the previous patch > >> >> --- > >> >> xen/arch/x86/acpi/power.c | 2 +- > >> >> xen/arch/x86/microcode.c| 90 > >> >> ++--- > >> >> xen/arch/x86/smpboot.c | 5 +-- > >> >> xen/include/asm-x86/processor.h | 4 +- > >> >> 4 files changed, 44 insertions(+), 57 deletions(-) > >> >> > >> >> diff --git a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c > >> >> index 4f21903..24798d5 100644 > >> >> --- a/xen/arch/x86/acpi/power.c > >> >> +++ b/xen/arch/x86/acpi/power.c > >> >> @@ -253,7 +253,7 @@ static int enter_state(u32 state) > >> >> > >> >> console_end_sync(); > >> >> > >> >> -microcode_resume_cpu(); > >> >> +microcode_update_one(); > >> >> > >> >> if ( !recheck_cpu_features(0) ) > >> >> panic("Missing previously available feature(s)\n"); > >> >> diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c > >> >> index a2febc7..bdd9c9f 100644 > >> >> --- a/xen/arch/x86/microcode.c > >> >> +++ b/xen/arch/x86/microcode.c > >> >> @@ -203,24 +203,6 @@ static struct microcode_patch *parse_blob(const > >> >> char *buf, uint32_t len) > >> >> return NULL; > >> >> } > >> >> > >> >> -int microcode_resume_cpu(void) > >> >> -{ > >> >> -int err; > >> >> -struct cpu_signature *sig = _cpu(cpu_sig); > >> >> - > >> >> -if ( !microcode_ops ) > >> >> -return 0; > >> >> - > >> >> -spin_lock(_mutex); > >> >> - > >> >> -err = microcode_ops->collect_cpu_info(sig); > >> >> -if ( likely(!err) ) > >> >> -err = microcode_ops->apply_microcode(microcode_cache); > >> >> -spin_unlock(_mutex); > >> >> - > >> >> -return err; > >> >> -} > >> >> - > >> >> void microcode_free_patch(struct microcode_patch *microcode_patch) > >> >> { > >> >> microcode_ops->free_patch(microcode_patch->mc); > >> >> @@ -384,11 +366,29 @@ static int __init microcode_init(void) > >> >> } > >> >> __initcall(microcode_init); > >> >> > >> >> -int __init early_microcode_update_cpu(bool start_update) > >> >> +/* Load a cached update to current cpu */ > >> >> +int microcode_update_one(void) > >> >> +{ > >> >> +int rc; > >> >> + > >> >> +if ( !microcode_ops ) > >> >> +return -EOPNOTSUPP; > >> >> + > >> >> +rc = microcode_update_cpu(NULL); > >> >> + > >> >> +if ( microcode_ops->end_update ) > >> >> +microcode_ops->end_update(); > >> > > >> >Don't you need to call start_update before calling > >> >microcode_update_cpu? > >> > >> No. On AMD side, osvw_status records the hardware erratum in the system. > >> As we don't assume all CPUs have the same erratum, each cpu calls > >> end_update to update osvw_status after ucode loading. > >> start_update just resets osvw_status to 0. And it is called once prior > >> to ucode loading on any CPU so that osvw_status can be recomputed. > > > >Oh, I think I understand it. start_update must only be called once > >_before_ the sequence to update the microcode on all CPUs is > >performed, while end_update needs to be called on _each_ CPU after the > >update has been completed in order to account for any erratas. > > > >The name for those hooks should be improved, I guess renaming > >end_update to end_update_each or end_update_percpu would be clearer in > >order to make it clear that start_update is global, while end_update > >is percpu. Anyway, I don't want to delay this series for a naming nit. > > > >I'm still unsure where start_update is called for the resume from > >suspension case, I don't seem to see any call to start_update neither > >in enter_state or microcode_update_one, hence I think this is missing? > > No. Actually, no call of start_update for resume case. > > > > >I would expect you need to clean osvw_status also on resume from > >suspension, in case microcode loading fails? Or else you will be
[Xen-devel] [PATCH v2] Partially revert "x86/mm: Clean IOMMU flags from p2m-pt code"
This partially reverts commit 854a49a7486a02edae5b3e53617bace526e9c1b1 by re-adding the logic that propagates changes to the domain physmap done by p2m_pt_set_entry into the iommu page tables. Without this logic changes to the guest physmap are not propagated to the iommu, leaving stale iommu entries that can leak data, or failing to add new entries. Note that this commit doesn't re-introduce iommu flags to the cpu page table entries, since the logic to add/remove entries to the iommu page tables is based on the p2m type and the mfn. Fixes: 854a49a7486a02 ('x86/mm: Clean IOMMU flags from p2m-pt code') Signed-off-by: Roger Pau Monné --- Cc: Alexandru Stefan ISAILA --- Changes since v1: - Remove the share-pt branch, there's no sharing on AMD hardware. --- xen/arch/x86/mm/p2m-pt.c | 42 ++-- 1 file changed, 40 insertions(+), 2 deletions(-) diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c index 3a0a500d66..8483dc8142 100644 --- a/xen/arch/x86/mm/p2m-pt.c +++ b/xen/arch/x86/mm/p2m-pt.c @@ -508,7 +508,18 @@ p2m_pt_set_entry(struct p2m_domain *p2m, gfn_t gfn_, mfn_t mfn, l2_pgentry_t l2e_content; l3_pgentry_t l3e_content; int rc; -unsigned int flags; +unsigned int iommu_pte_flags = p2m_get_iommu_flags(p2mt, mfn); +/* + * old_mfn and iommu_old_flags control possible flush/update needs on the + * IOMMU: We need to flush when MFN or flags (i.e. permissions) change. + * iommu_old_flags being initialized to zero covers the case of the entry + * getting replaced being a non-present (leaf or intermediate) one. For + * present leaf entries the real value will get calculated below, while + * for present intermediate entries ~0 (guaranteed != iommu_pte_flags) + * will be used (to cover all cases of what the leaf entries underneath + * the intermediate one might be). + */ +unsigned int flags, iommu_old_flags = 0; unsigned long old_mfn = mfn_x(INVALID_MFN); if ( !sve ) @@ -556,9 +567,17 @@ p2m_pt_set_entry(struct p2m_domain *p2m, gfn_t gfn_, mfn_t mfn, if ( flags & _PAGE_PRESENT ) { if ( flags & _PAGE_PSE ) +{ old_mfn = l1e_get_pfn(*p2m_entry); +iommu_old_flags = +p2m_get_iommu_flags(p2m_flags_to_type(flags), +_mfn(old_mfn)); +} else +{ +iommu_old_flags = ~0; intermediate_entry = *p2m_entry; +} } check_entry(mfn, p2mt, p2m_flags_to_type(flags), page_order); @@ -594,6 +613,9 @@ p2m_pt_set_entry(struct p2m_domain *p2m, gfn_t gfn_, mfn_t mfn, 0, L1_PAGETABLE_ENTRIES); ASSERT(p2m_entry); old_mfn = l1e_get_pfn(*p2m_entry); +iommu_old_flags = +p2m_get_iommu_flags(p2m_flags_to_type(l1e_get_flags(*p2m_entry)), +_mfn(old_mfn)); if ( mfn_valid(mfn) || p2m_allows_invalid_mfn(p2mt) ) entry_content = p2m_l1e_from_pfn(mfn_x(mfn), @@ -617,9 +639,17 @@ p2m_pt_set_entry(struct p2m_domain *p2m, gfn_t gfn_, mfn_t mfn, if ( flags & _PAGE_PRESENT ) { if ( flags & _PAGE_PSE ) +{ old_mfn = l1e_get_pfn(*p2m_entry); +iommu_old_flags = +p2m_get_iommu_flags(p2m_flags_to_type(flags), +_mfn(old_mfn)); +} else +{ +iommu_old_flags = ~0; intermediate_entry = *p2m_entry; +} } check_entry(mfn, p2mt, p2m_flags_to_type(flags), page_order); @@ -640,9 +670,17 @@ p2m_pt_set_entry(struct p2m_domain *p2m, gfn_t gfn_, mfn_t mfn, && (gfn + (1UL << page_order) - 1 > p2m->max_mapped_pfn) ) p2m->max_mapped_pfn = gfn + (1UL << page_order) - 1; +if ( need_iommu_pt_sync(p2m->domain) && + (iommu_old_flags != iommu_pte_flags || old_mfn != mfn_x(mfn)) ) +rc = iommu_pte_flags ? +iommu_legacy_map(d, _dfn(gfn), mfn, page_order, + iommu_pte_flags) : +iommu_legacy_unmap(d, _dfn(gfn), page_order); + /* * Free old intermediate tables if necessary. This has to be the - * last thing we do so as to avoid a potential use-after-free. + * last thing we do, after removal from the IOMMU tables, so as to + * avoid a potential use-after-free. */ if ( l1e_get_flags(intermediate_entry) & _PAGE_PRESENT ) p2m_free_entry(p2m, _entry, page_order); -- 2.22.0 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [OSSTEST PATCH 2/3] host properties: Firmware: Move default to selecthost
Drop the explicit default from all the call sites. This centralises the default. This is going to be the new scheme for host properties in general. (Two of the call sites had a different default, "", which in their context was semantically equivalent to "bios".) Signed-off-by: Ian Jackson --- Osstest/Debian.pm | 4 ++-- Osstest/TestSupport.pm | 13 + ts-xen-install | 2 +- 3 files changed, 12 insertions(+), 7 deletions(-) diff --git a/Osstest/Debian.pm b/Osstest/Debian.pm index 911d8905..79aa2d24 100644 --- a/Osstest/Debian.pm +++ b/Osstest/Debian.pm @@ -443,7 +443,7 @@ sub setupboot_grub2 () { # Grub2 on jessie/stretch ARM* doesn't do multiboot, so we must chainload. my $need_uefi_chainload = -get_host_property($ho, "firmware", "") eq "uefi" && +get_host_property($ho, "firmware") eq "uefi" && $ho->{Suite} =~ m/jessie|stretch/ && $ho->{Arch} =~ m/^arm/; my $parsemenu= sub { @@ -1498,7 +1498,7 @@ END preseed_microcode($ho,$sfx); -if (get_host_property($ho, "firmware",'') eq "uefi") { +if (get_host_property($ho, "firmware") eq "uefi") { die unless $ho->{Suite} =~ m/jessie|stretch/; # Prevent grub-install from making a new Debian boot entry, so # we always reboot from the network. Debian bug #789798 proposes a diff --git a/Osstest/TestSupport.pm b/Osstest/TestSupport.pm index e554af38..b629fb7d 100644 --- a/Osstest/TestSupport.pm +++ b/Osstest/TestSupport.pm @@ -1218,7 +1218,10 @@ sub selecthost ($;$) { #- calculation of the host's properties - -$ho->{Properties} = { }; +# Firstly, hardcoded defaults +$ho->{Properties} = { +Firmware => 'bios', +}; my $setprop = sub { my ($pn,$val) = @_; $ho->{Properties}{$pn} = $val; @@ -1363,6 +1366,8 @@ sub propname_check ($) { # It is fine to call this on a guest object too, in which case it will # always return $defval. +# Ideally all uses of $defval would be replaced by defaults in the +# initial array in selecthost. sub get_host_property ($$;$) { my ($ho, $prop, $defval) = @_; return $defval unless $ho->{Properties}; @@ -2811,7 +2816,7 @@ sub host_netboot_file ($;$) { # in array context, returns (dir, pathtail) # where dir does not depend on $templatekeytail my %v = %r; -my $firmware = get_host_property($ho, "firmware", "bios"); +my $firmware = get_host_property($ho, "firmware"); my $templatekeybase = $firmware eq 'uefi' ? 'NetGrub' : 'Pxe'; $templatekeytail //= 'Templates'; my $templatekey = $templatekeybase.$templatekeytail; @@ -2937,7 +2942,7 @@ END sub setup_netboot_local ($) { my ($ho) = @_; -my $firmware = get_host_property($ho, "firmware", "bios"); +my $firmware = get_host_property($ho, "firmware"); $firmware =~ s/-/_/g; no strict qw(refs); return &{"setup_netboot_local_${firmware}"}($ho); @@ -2945,7 +2950,7 @@ sub setup_netboot_local ($) { sub setup_netboot_di ($;%) { my ($ho,$kern,$initrd,$dicmd,$hocmd,%xopts) = @_; -my $firmware = get_host_property($ho, "firmware", "bios"); +my $firmware = get_host_property($ho, "firmware"); $firmware =~ s/-/_/g; no strict qw(refs); return &{"setup_netboot_di_${firmware}"}($ho,$kern,$initrd, diff --git a/ts-xen-install b/ts-xen-install index 2d3c644d..154f78c7 100755 --- a/ts-xen-install +++ b/ts-xen-install @@ -114,7 +114,7 @@ sub extradebs () { some_extradebs([ 'DebianExtraPackages', $suite ]); # $c{ DebianExtraPackages___ } -my $firmware = get_host_property($ho, "firmware", "bios"); +my $firmware = get_host_property($ho, "firmware"); some_extradebs([ 'DebianExtraPackages', $firmware, $ho->{Arch}, $suite ]); } -- 2.11.0 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [OSSTEST PATCH 1/3] mg-hosts mknetbootdir: Improve sub-option parser
No semantic change. Signed-off-by: Ian Jackson --- mg-hosts | 9 - 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/mg-hosts b/mg-hosts index 58b4acc3..d68f7b4e 100755 --- a/mg-hosts +++ b/mg-hosts @@ -121,7 +121,14 @@ sub l ($) { return split /,/, $_[0]; } sub cmd_mknetbootdir () { my $dryrun = 0; -if (@ARGV && $ARGV[0] eq '-n') { shift @ARGV; $dryrun= 1; } +while (@ARGV && $ARGV[0] =~ m/^-/) { + $_ = shift @ARGV; + last if $_ =~ m/^--?$/; + while (m/^-./) { + if (s/^-n/-/) { $dryrun= 1; } + else { die "unknown mknetbootdir option $_"; } + } +} die unless @ARGV>=1; my $sudo = $ENV{'OSSTEST_SUDO'} // 'sudo'; foreach my $hn (@ARGV) { -- 2.11.0 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [OSSTEST PATCH 3/3] mg-hosts mknetbootdir: Introduce and require -F
If one runs ./mg-hosts mknetbootdir HOST before having sorted out all the host configuration, it uses the default configuration value for the host's firmware kind, which is "bios". If the configuration is then changed, things don't work. This is confusing. So ask the user to specify one or more -F, or -Fany. CC: Dominic Brekau Signed-off-by: Ian Jackson --- mg-hosts | 12 +++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/mg-hosts b/mg-hosts index d68f7b4e..14f816ae 100755 --- a/mg-hosts +++ b/mg-hosts @@ -19,7 +19,7 @@ # Usages: # -# ./mg-hosts mknetbootdir HOST... +# ./mg-hosts mknetbootdir -Fany | -F... HOST... # Create directories for netboot as expected by the rest # of osstest. Will use "sudo". The HOST(s) must be # allocated (via mg-allocate HOST). @@ -121,19 +121,29 @@ sub l ($) { return split /,/, $_[0]; } sub cmd_mknetbootdir () { my $dryrun = 0; +my %expect_firmware; while (@ARGV && $ARGV[0] =~ m/^-/) { $_ = shift @ARGV; last if $_ =~ m/^--?$/; while (m/^-./) { if (s/^-n/-/) { $dryrun= 1; } + elsif (s/^-F(.*)//) { $expect_firmware{$1} = 1 } else { die "unknown mknetbootdir option $_"; } } } +die "need at least one -F or -Fany option" + unless %expect_firmware; die unless @ARGV>=1; my $sudo = $ENV{'OSSTEST_SUDO'} // 'sudo'; foreach my $hn (@ARGV) { my $ho= selecthost("host=$hn"); my ($dir, $file) = host_netboot_file($ho); + if (!$expect_firmware{any}) { + my $got_firmware = get_host_property($ho, "firmware"); + die + "host $hn configuration says firmware \`$got_firmware', not as expected\n" + unless $expect_firmware{$got_firmware}; + } die unless defined $dir; my ($rdir, $realfile) = host_netboot_file($ho, 'TemplatesReal'); $realfile //= $file; -- 2.11.0 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v9 11/15] microcode: unify loading update during CPU resuming and AP wakeup
On 29.08.2019 09:37, Chao Gao wrote: > On Fri, Aug 23, 2019 at 11:09:07AM +0200, Roger Pau Monné wrote: >> On Fri, Aug 23, 2019 at 12:44:34AM +0800, Chao Gao wrote: >>> On Thu, Aug 22, 2019 at 04:10:46PM +0200, Roger Pau Monné wrote: On Mon, Aug 19, 2019 at 09:25:24AM +0800, Chao Gao wrote: > Both are loading the cached patch. Since APs call the unified function, > microcode_update_one(), during wakeup, the 'start_update' parameter > which originally used to distinguish BSP and APs is redundant. So remove > this parameter. > > Signed-off-by: Chao Gao > --- > Note that here is a functional change: resuming a CPU would call > ->end_update() now while previously it wasn't. Not quite sure > whether it is correct. I guess that's required if it called start_update prior to calling end_update? > > Changes in v9: > - return -EOPNOTSUPP rather than 0 if microcode_ops is NULL in >microcode_update_one() > - rebase and fix conflicts. > > Changes in v8: > - split out from the previous patch > --- > xen/arch/x86/acpi/power.c | 2 +- > xen/arch/x86/microcode.c| 90 > ++--- > xen/arch/x86/smpboot.c | 5 +-- > xen/include/asm-x86/processor.h | 4 +- > 4 files changed, 44 insertions(+), 57 deletions(-) > > diff --git a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c > index 4f21903..24798d5 100644 > --- a/xen/arch/x86/acpi/power.c > +++ b/xen/arch/x86/acpi/power.c > @@ -253,7 +253,7 @@ static int enter_state(u32 state) > > console_end_sync(); > > -microcode_resume_cpu(); > +microcode_update_one(); > > if ( !recheck_cpu_features(0) ) > panic("Missing previously available feature(s)\n"); > diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c > index a2febc7..bdd9c9f 100644 > --- a/xen/arch/x86/microcode.c > +++ b/xen/arch/x86/microcode.c > @@ -203,24 +203,6 @@ static struct microcode_patch *parse_blob(const char > *buf, uint32_t len) > return NULL; > } > > -int microcode_resume_cpu(void) > -{ > -int err; > -struct cpu_signature *sig = _cpu(cpu_sig); > - > -if ( !microcode_ops ) > -return 0; > - > -spin_lock(_mutex); > - > -err = microcode_ops->collect_cpu_info(sig); > -if ( likely(!err) ) > -err = microcode_ops->apply_microcode(microcode_cache); > -spin_unlock(_mutex); > - > -return err; > -} > - > void microcode_free_patch(struct microcode_patch *microcode_patch) > { > microcode_ops->free_patch(microcode_patch->mc); > @@ -384,11 +366,29 @@ static int __init microcode_init(void) > } > __initcall(microcode_init); > > -int __init early_microcode_update_cpu(bool start_update) > +/* Load a cached update to current cpu */ > +int microcode_update_one(void) > +{ > +int rc; > + > +if ( !microcode_ops ) > +return -EOPNOTSUPP; > + > +rc = microcode_update_cpu(NULL); > + > +if ( microcode_ops->end_update ) > +microcode_ops->end_update(); Don't you need to call start_update before calling microcode_update_cpu? >>> >>> No. On AMD side, osvw_status records the hardware erratum in the system. >>> As we don't assume all CPUs have the same erratum, each cpu calls >>> end_update to update osvw_status after ucode loading. >>> start_update just resets osvw_status to 0. And it is called once prior >>> to ucode loading on any CPU so that osvw_status can be recomputed. >> >> Oh, I think I understand it. start_update must only be called once >> _before_ the sequence to update the microcode on all CPUs is >> performed, while end_update needs to be called on _each_ CPU after the >> update has been completed in order to account for any erratas. >> >> The name for those hooks should be improved, I guess renaming >> end_update to end_update_each or end_update_percpu would be clearer in >> order to make it clear that start_update is global, while end_update >> is percpu. Anyway, I don't want to delay this series for a naming nit. >> >> I'm still unsure where start_update is called for the resume from >> suspension case, I don't seem to see any call to start_update neither >> in enter_state or microcode_update_one, hence I think this is missing? > > No. Actually, no call of start_update for resume case. > >> >> I would expect you need to clean osvw_status also on resume from >> suspension, in case microcode loading fails? Or else you will be >> carrying a stale osvw_status. > > Then we need to send IPI to all other CPUs to recompute osvw_state. But > I think it is not necessary. If ucode cache isn't changed during the > CPU's suspension period, there is not
[Xen-devel] [RFC Patch] xen/pt: Emulate FLR capability
Currently, for a HVM on Xen, no reset method is virtualized. So in a VM's perspective, assigned devices cannot be reset. But some devices rely on PCI reset to recover from hardware hangs. When being assigned to a VM, those devices cannot be reset and won't work any longer if a hardware hang occurs. We have to reboot VM to trigger PCI reset on host to recover the device. This patch exposes FLR capability to VMs if the assigned device can be reset on host. When VM initiates an FLR to a device, qemu cleans up the device state, (including disabling of intx and/or MSI and unmapping BARs from guest, deleting emulated registers), then initiate PCI reset through 'reset' knob under the device's sysfs, finally initialize the device again. Signed-off-by: Chao Gao --- Do we need to introduce an attribute, like "permissive" to explicitly enable FLR capability emulation? During PCI reset, interrupts and BARs are unmapped from the guest. It seems that guest cannot interact with the device directly except access to device's configuration space which is emulated by qemu. If proper method can be used to prevent qemu accessing the physical device there is no new security hole caused by the FLR emulation. VM's FLR may be backed by any reset function on host to the physical device, for example: FLR, D3softreset, secondary bus reset. Not sure it is fine to mix them. Given Linux kernel just uses an unified API to reset device and caller cannot choose a specific one, it might be OK. --- hw/xen/xen-host-pci-device.c | 30 ++ hw/xen/xen-host-pci-device.h | 3 +++ hw/xen/xen_pt.c | 9 + hw/xen/xen_pt.h | 1 + hw/xen/xen_pt_config_init.c | 30 +++--- 5 files changed, 70 insertions(+), 3 deletions(-) diff --git a/hw/xen/xen-host-pci-device.c b/hw/xen/xen-host-pci-device.c index 1b44dcafaf..d549656f42 100644 --- a/hw/xen/xen-host-pci-device.c +++ b/hw/xen/xen-host-pci-device.c @@ -198,6 +198,35 @@ static bool xen_host_pci_dev_is_virtfn(XenHostPCIDevice *d) return !stat(path, ); } +static bool xen_host_pci_resetable(XenHostPCIDevice *d) +{ +char path[PATH_MAX]; + +xen_host_pci_sysfs_path(d, "reset", path, sizeof(path)); + +return !access(path, W_OK); +} + +void xen_host_pci_reset(XenHostPCIDevice *d) +{ +char path[PATH_MAX]; +int fd; + +xen_host_pci_sysfs_path(d, "reset", path, sizeof(path)); + +fd = open(path, O_WRONLY); +if (fd == -1) { +XEN_HOST_PCI_LOG("Xen host pci reset: open error\n"); +return; +} + +if (write(fd, "1", 1) != 1) { +XEN_HOST_PCI_LOG("Xen host pci reset: write error\n"); +} + +return; +} + static void xen_host_pci_config_open(XenHostPCIDevice *d, Error **errp) { char path[PATH_MAX]; @@ -377,6 +406,7 @@ void xen_host_pci_device_get(XenHostPCIDevice *d, uint16_t domain, d->class_code = v; d->is_virtfn = xen_host_pci_dev_is_virtfn(d); +d->is_resetable = xen_host_pci_resetable(d); return; diff --git a/hw/xen/xen-host-pci-device.h b/hw/xen/xen-host-pci-device.h index 4d8d34ecb0..cacf9b3df8 100644 --- a/hw/xen/xen-host-pci-device.h +++ b/hw/xen/xen-host-pci-device.h @@ -32,6 +32,7 @@ typedef struct XenHostPCIDevice { XenHostPCIIORegion rom; bool is_virtfn; +bool is_resetable; int config_fd; } XenHostPCIDevice; @@ -55,4 +56,6 @@ int xen_host_pci_set_block(XenHostPCIDevice *d, int pos, uint8_t *buf, int xen_host_pci_find_ext_cap_offset(XenHostPCIDevice *s, uint32_t cap); +void xen_host_pci_reset(XenHostPCIDevice *d); + #endif /* XEN_HOST_PCI_DEVICE_H */ diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c index 8fbaf2eae9..d750367c0a 100644 --- a/hw/xen/xen_pt.c +++ b/hw/xen/xen_pt.c @@ -938,6 +938,15 @@ static void xen_pt_unregister_device(PCIDevice *d) xen_pt_destroy(d); } +void xen_pt_reset(XenPCIPassthroughState *s) +{ +PCIDevice *d = PCI_DEVICE(s); + +xen_pt_unregister_device(d); +xen_host_pci_reset(>real_device); +xen_pt_realize(d, NULL); +} + static Property xen_pci_passthrough_properties[] = { DEFINE_PROP_PCI_HOST_DEVADDR("hostaddr", XenPCIPassthroughState, hostaddr), DEFINE_PROP_BOOL("permissive", XenPCIPassthroughState, permissive, false), diff --git a/hw/xen/xen_pt.h b/hw/xen/xen_pt.h index 9167bbaf6d..ed05bc0d39 100644 --- a/hw/xen/xen_pt.h +++ b/hw/xen/xen_pt.h @@ -332,4 +332,5 @@ int xen_pt_register_vga_regions(XenHostPCIDevice *dev); int xen_pt_unregister_vga_regions(XenHostPCIDevice *dev); void xen_pt_setup_vga(XenPCIPassthroughState *s, XenHostPCIDevice *dev, Error **errp); +void xen_pt_reset(XenPCIPassthroughState *s); #endif /* XEN_PT_H */ diff --git a/hw/xen/xen_pt_config_init.c b/hw/xen/xen_pt_config_init.c index 31ec5add1d..435abd7286 100644 --- a/hw/xen/xen_pt_config_init.c +++ b/hw/xen/xen_pt_config_init.c @@ -852,6 +852,30 @@ static inline uint8_t get_device_type(XenPCIPassthroughState *s, return (flag &
Re: [Xen-devel] [PATCH] x86/mmcfg: add "force" option for MCFG
On 29.08.2019 10:00, Roger Pau Monné wrote: > On Wed, Aug 28, 2019 at 04:24:22PM +0100, Igor Druzhinin wrote: >> --- a/xen/arch/x86/x86_64/mmconfig-shared.c >> +++ b/xen/arch/x86/x86_64/mmconfig-shared.c >> @@ -26,33 +26,34 @@ >> >> #include "mmconfig.h" >> >> +static bool_t __read_mostly force_mmcfg = true; >> unsigned int pci_probe = PCI_PROBE_CONF1 | PCI_PROBE_MMCONF; >> >> static int __init parse_mmcfg(const char *s) >> { >> const char *ss; >> -int rc = 0; >> +int val, rc = 0; >> >> do { >> ss = strchr(s, ','); >> if ( !ss ) >> ss = strchr(s, '\0'); >> >> -switch ( parse_bool(s, ss) ) >> -{ >> -case 0: >> -pci_probe &= ~PCI_PROBE_MMCONF; >> -break; >> -case 1: >> -break; >> -default: >> -if ( !cmdline_strcmp(s, "amd_fam10") || >> - !cmdline_strcmp(s, "amd-fam10") ) >> +if ( (val = parse_bool(s, ss)) >= 0 ) { >> +if ( val ) >> + pci_probe |= PCI_PROBE_MMCONF; >> +else >> + pci_probe &= ~PCI_PROBE_MMCONF; >> +} else if ( (val = parse_boolean("amd_fam10", s, ss)) >= 0 || >> +(val = parse_boolean("amd-fam10", s, ss)) >= 0 ) { >> +if ( val ) >> pci_probe |= PCI_CHECK_ENABLE_AMD_MMCONF; >> else >> -rc = -EINVAL; >> -break; >> -} >> +pci_probe &= ~PCI_CHECK_ENABLE_AMD_MMCONF; >> +} else if ( (val = parse_boolean("force", s, ss)) >= 0) >> +force_mmcfg = val; > > You could also consider adding a new flag to pci_probe, ie: > PCI_PROBE_FORCE_MMCFG. Yes please, albeit to be in sync with the other flag perhaps better PCI_PROBE_FORCE_MMCONF or PCI_PROBE_MMCONF_FORCE. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH] x86: clear RDRAND CPUID bit on AMD family 15h/16h
Inspired by Linux commit c49a0a80137c7ca7d6ced4c812c9e07a949f6f24: There have been reports of RDRAND issues after resuming from suspend on some AMD family 15h and family 16h systems. This issue stems from a BIOS not performing the proper steps during resume to ensure RDRAND continues to function properly. Update the CPU initialization to clear the RDRAND CPUID bit for any family 15h and 16h processor that supports RDRAND. If it is known that the family 15h or family 16h system does not have an RDRAND resume issue or that the system will not be placed in suspend, the "cpuid=rdrand" kernel parameter can be used to stop the clearing of the RDRAND CPUID bit. Note, that clearing the RDRAND CPUID bit does not prevent a processor that normally supports the RDRAND instruction from executing it. So any code that determined the support based on family and model won't #UD. Signed-off-by: Jan Beulich --- Slightly RFC, in particular because of the change to parse_xen_cpuid(): Alternative approach suggestions are welcome. --- a/docs/misc/xen-command-line.pandoc +++ b/docs/misc/xen-command-line.pandoc @@ -488,6 +488,10 @@ The Speculation Control hardware feature be ignored, e.g. `no-ibrsb`, at which point Xen won't use them itself, and won't offer them to guests. +`rdrand` can be used to override the default disabling of the feature on certain +AMD systems. Its negative form can of course also be used to suppress use and +exposure of the feature. + ### cpuid_mask_cpu > `= fam_0f_rev_[cdefg] | fam_10_rev_[bc] | fam_11_rev_b` --- a/xen/arch/x86/cpu/amd.c +++ b/xen/arch/x86/cpu/amd.c @@ -641,6 +641,18 @@ static void init_amd(struct cpuinfo_x86 if (acpi_smi_cmd && (acpi_enable_value | acpi_disable_value)) amd_acpi_c1e_quirk = true; break; + + case 0x15: case 0x16: + /* +* There are too many Fam15/Fam16 systems where upon resume +* from S3 firmware fails to re-setup properly functioning +* RDRAND. Clear the feature unless force-enabled on the +* command line. +*/ + if (c == _cpu_data && + !is_forced_cpu_cap(X86_FEATURE_RDRAND)) + setup_clear_cpu_cap(X86_FEATURE_RDRAND); + break; } display_cacheinfo(c); --- a/xen/arch/x86/cpu/common.c +++ b/xen/arch/x86/cpu/common.c @@ -94,6 +94,11 @@ void __init setup_force_cpu_cap(unsigned __set_bit(cap, boot_cpu_data.x86_capability); } +bool __init is_forced_cpu_cap(unsigned int cap) +{ + return test_bit(cap, forced_caps); +} + static void default_init(struct cpuinfo_x86 * c) { /* Not much we can do here... */ --- a/xen/arch/x86/cpuid.c +++ b/xen/arch/x86/cpuid.c @@ -74,6 +74,13 @@ static int __init parse_xen_cpuid(const setup_clear_cpu_cap(X86_FEATURE_AMD_SSBD); } } +else if ( (val = parse_boolean("rdrand", s, ss)) >= 0 ) +{ +if ( !val ) +setup_clear_cpu_cap(X86_FEATURE_RDRAND); +else if ( cpuid_ecx(1) & cpufeat_mask(X86_FEATURE_RDRAND) ) +setup_force_cpu_cap(X86_FEATURE_RDRAND); +} else rc = -EINVAL; --- a/xen/include/asm-x86/processor.h +++ b/xen/include/asm-x86/processor.h @@ -166,6 +166,7 @@ extern const struct x86_cpu_id *x86_matc extern void identify_cpu(struct cpuinfo_x86 *); extern void setup_clear_cpu_cap(unsigned int); extern void setup_force_cpu_cap(unsigned int); +extern bool is_forced_cpu_cap(unsigned int); extern void print_cpu_info(unsigned int cpu); extern unsigned int init_intel_cacheinfo(struct cpuinfo_x86 *c); ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 3/6] remove late (on-demand) construction of IOMMU page tables
> -Original Message- > From: Jan Beulich > Sent: 27 August 2019 08:46 > To: Paul Durrant > Cc: JulienGrall ; Alexandru Isaila > ; Petre Pircalabu > ; Razvan Cojocaru ; > Andrew Cooper > ; George Dunlap ; Ian > Jackson > ; Roger Pau Monne ; > VolodymyrBabchuk > ; Stefano Stabellini ; > xen- > de...@lists.xenproject.org; Konrad Rzeszutek Wilk ; > Tamas K Lengyel > ; Tim (Xen.org) ; Wei Liu > Subject: Re: [Xen-devel] [PATCH 3/6] remove late (on-demand) construction of > IOMMU page tables > > On 12.08.2019 17:41, Paul Durrant wrote: > >> From: Jan Beulich > >> Sent: 07 August 2019 11:32 > >> > >> On 30.07.2019 15:44, Paul Durrant wrote: > >>> @@ -625,8 +548,7 @@ static void iommu_dump_p2m_table(unsigned char key) > >>>ops = iommu_get_ops(); > >>>for_each_domain(d) > >>>{ > >>> -if ( is_hardware_domain(d) || > >>> - dom_iommu(d)->status < IOMMU_STATUS_initialized ) > >>> +if ( !is_iommu_enabled(d) ) > >>>continue; > >> > >> Why do you drop the hwdom check here? > > > > Because is_iommu_enabled() for the h/w domain will always be true if > > iommu_enabled is true, so no need for a special case. > > But the effect of the extra check was to _skip_ Dom0. If you mean to > change this, then you should say so (and why) in the description. > Ah, yes, it does still need to remain. Paul > Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] x86/mmcfg: add "force" option for MCFG
On 28.08.2019 17:24, Igor Druzhinin wrote: > If MCFG area is not reserved in E820 Xen by default will defer its usage > until Dom0 registers it explicitly after ACPI parser recognizes it as > a reserved resource in DSDT. Having it reserved in E820 is not > mandatory according to "PCI Firmware Specification, rev 3.2" (par. 4.1.2) > and firmware is free to keep a hole E820 in that place. > > Unfortunately, keeping it disabled at this point makes Xen fail to > recognize many of PCIe extended capabilities early enough for newly added > devices. Namely, (1) some of VT-d quirks are not applied during Dom0 > device handoff, (2) currently MCFG reservation report comes > too late from Dom0 after some of PCI devices being registered in Xen > by Dom0 kernel that break initialization of a number of PCIe capabilities > (e.g. SR-IOV VF BAR sizing). > > Since introduction of ACPI parser in Xen is not possible add a "force" > option that will allow Xen to use MCFG area even it's not properly > reserved in E820. > > Signed-off-by: Igor Druzhinin > --- > > I think we need to have this option to at least have a way to workaround > problem (1). Problem (2) could be solved in Dom0 kernel by invoking > xen_mcfg_late() earlier but before the first PCI bus enumertaion which > currently happens somwhere during ACPI scan I think. Indeed (2) should be fixed as you say here. (1) should imo be fixed differently too: pci_vtd_quirk() (and anything else that relies on extended config space accesses) should be re-invoked once MCFG becomes available for a given bus (range). Nevertheless I think the command line option is a good thing to have. > --- a/docs/misc/xen-command-line.pandoc > +++ b/docs/misc/xen-command-line.pandoc > @@ -1424,11 +1424,13 @@ ordinary DomU, control domain, hardware domain, and - > when supported > by the platform - DomU with pass-through device assigned). > > ### mmcfg (x86) > -> `= [,amd-fam10]` > +> `= List of [ , force, amd-fam10 ]` > > -> Default: `1` > +> Default: `true,force` > > -Specify if the MMConfig space should be enabled. > +Specify if the MMConfig space should be enabled. If force option is specified > +(default) MMConfig space will be used by Xen early in boot even if it's > +not reserved by firmware in the E820 table. The term "force" is too heavy for my taste: You still require the range to be at least in an E820 hole. "allow-e820-hole" otoh is a little longish. Perhaps "relaxed"? > --- a/xen/arch/x86/x86_64/mmconfig-shared.c > +++ b/xen/arch/x86/x86_64/mmconfig-shared.c > @@ -26,33 +26,34 @@ > > #include "mmconfig.h" > > +static bool_t __read_mostly force_mmcfg = true; Just bool please. > unsigned int pci_probe = PCI_PROBE_CONF1 | PCI_PROBE_MMCONF; > > static int __init parse_mmcfg(const char *s) > { > const char *ss; > -int rc = 0; > +int val, rc = 0; > > do { > ss = strchr(s, ','); > if ( !ss ) > ss = strchr(s, '\0'); > > -switch ( parse_bool(s, ss) ) > -{ > -case 0: > -pci_probe &= ~PCI_PROBE_MMCONF; > -break; > -case 1: > -break; > -default: > -if ( !cmdline_strcmp(s, "amd_fam10") || > - !cmdline_strcmp(s, "amd-fam10") ) > +if ( (val = parse_bool(s, ss)) >= 0 ) { > +if ( val ) > + pci_probe |= PCI_PROBE_MMCONF; > +else > + pci_probe &= ~PCI_PROBE_MMCONF; > +} else if ( (val = parse_boolean("amd_fam10", s, ss)) >= 0 || > +(val = parse_boolean("amd-fam10", s, ss)) >= 0 ) { > +if ( val ) > pci_probe |= PCI_CHECK_ENABLE_AMD_MMCONF; > else > -rc = -EINVAL; > -break; > -} > +pci_probe &= ~PCI_CHECK_ENABLE_AMD_MMCONF; > +} else if ( (val = parse_boolean("force", s, ss)) >= 0) > +force_mmcfg = val; > +else > +rc = -EINVAL; Brace placement is wrong in your additions. I also don't really see why you do away with the switch(). > @@ -355,6 +356,11 @@ static int __init is_mmconf_reserved( > (unsigned int)cfg->start_bus_number, > (unsigned int)cfg->end_bus_number); > } > +} else if (!e820_any_mapped(addr, addr + old_size - 1, 0)) { > +if (!force_mmcfg) > +printk(KERN_WARNING "PCI: MCFG area at %lx is not reserved in > E820, " > + "use mmcfg=force option\n", addr); I don't think saying "use" is a good idea. If anything, make it less strict - either by attaching "if ..." or by saying "consider using". Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] x86: properly gate clearing of PKU feature
On 29/08/2019 10:27, Jan Beulich wrote: > setup_clear_cpu_cap() is __init and hence may not be called post-boot. > Note that opt_pku nevertheless is not getting __initdata added - see > e.g. commit 43fa95ae6a ("mm: make opt_bootscrub non-init"). > > Signed-off-by: Jan Beulich Acked-by: Andrew Cooper However, I'm tempted to suggest that this logic disappears entirely. Given that we don't support it for PV guests, and can't without taking a CR4 write on toggle_kernel_mode(), all this actually controls is whether the bit finds its way into the HVM max policy. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v9 12/15] microcode: reduce memory allocation and copy when creating a patch
On 19.08.2019 03:25, Chao Gao wrote: > @@ -542,29 +505,21 @@ static struct microcode_patch > *cpu_request_microcode(const void *buf, > while ( (error = get_ucode_from_buffer_amd(mc_amd, buf, bufsize, > )) == 0 ) > { > -struct microcode_patch *new_patch = alloc_microcode_patch(mc_amd); > - > -if ( IS_ERR(new_patch) ) > -{ > -error = PTR_ERR(new_patch); > -break; > -} > - > /* > - * If the new patch covers current CPU, compare patches and store the > + * If the new ucode covers current CPU, compare ucodes and store the > * one with higher revision. > */ > -if ( (microcode_fits(new_patch->mc_amd) != MIS_UCODE) && > - (!patch || (compare_patch(new_patch, patch) == NEW_UCODE)) ) > +#define REV_ID(mpb) (((struct microcode_header_amd > *)(mpb))->processor_rev_id) > +if ( (microcode_fits(mc_amd) != MIS_UCODE) && > + (!saved || (REV_ID(mc_amd->mpb) > REV_ID(saved))) ) > +#undef REV_ID I'm not happy with this helper #define, the more that "saved" already is of the correct type. compare_patch() in reality only acts on the header, so I'd suggest having that function forward to a new compare_header() (or some other suitable name) and use that new function here as well. > @@ -379,47 +360,47 @@ static struct microcode_patch > *cpu_request_microcode(const void *buf, > { > long offset = 0; > int error = 0; > -void *mc; > +struct microcode_intel *mc, *saved = NULL; > struct microcode_patch *patch = NULL; > > -while ( (offset = get_next_ucode_from_buffer(, buf, size, offset)) > > 0 ) > +while ( (offset = get_next_ucode_from_buffer((void **), buf, Casts like this make me rather nervous. Please see about getting rid of it (by using a union or a 2nd local variable). Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v9 01/15] microcode/intel: extend microcode_update_match()
On Wed, Aug 28, 2019 at 05:12:34PM +0200, Jan Beulich wrote: >On 19.08.2019 03:25, Chao Gao wrote: >> to a more generic function. So that it can be used alone to check >> an update against the CPU signature and current update revision. >> >> Note that enum microcode_match_result will be used in common code >> (aka microcode.c), it has been placed in the common header. >> >> Signed-off-by: Chao Gao >> Reviewed-by: Roger Pau Monné >> Reviewed-by: Jan Beulich > >I don't think these can be legitimately retained with ... > >> Changes in v9: >> - microcode_update_match() doesn't accept (sig, pf, rev) any longer. >> Hence, it won't be used to compare two arbitrary updates. > >... this kind of a change. Will drop RBs. > >> --- a/xen/arch/x86/microcode_intel.c >> +++ b/xen/arch/x86/microcode_intel.c >> @@ -134,14 +134,39 @@ static int collect_cpu_info(unsigned int cpu_num, >> struct cpu_signature *csig) >> return 0; >> } >> >> -static inline int microcode_update_match( >> -unsigned int cpu_num, const struct microcode_header_intel *mc_header, >> -int sig, int pf) >> +/* Check an update against the CPU signature and current update revision */ >> +static enum microcode_match_result microcode_update_match( >> +const struct microcode_header_intel *mc_header, unsigned int cpu) >> { >> -struct ucode_cpu_info *uci = _cpu(ucode_cpu_info, cpu_num); >> - >> -return (sigmatch(sig, uci->cpu_sig.sig, pf, uci->cpu_sig.pf) && >> -(mc_header->rev > uci->cpu_sig.rev)); >> +const struct extended_sigtable *ext_header; >> +const struct extended_signature *ext_sig; >> +unsigned int i; >> +struct ucode_cpu_info *uci = _cpu(ucode_cpu_info, cpu); >> +unsigned int sig = uci->cpu_sig.sig; >> +unsigned int pf = uci->cpu_sig.pf; >> +unsigned int rev = uci->cpu_sig.rev; >> +unsigned long data_size = get_datasize(mc_header); >> +const void *end = (const void *)mc_header + get_totalsize(mc_header); >> + >> +if ( sigmatch(sig, mc_header->sig, pf, mc_header->pf) ) >> +return (mc_header->rev > rev) ? NEW_UCODE : OLD_UCODE; > >Didn't you lose a range check against "end" ahead of this if()? >get_totalsize() and get_datasize() aiui also would need to live >after a range check, just a sizeof() (i.e. MC_HEADER_SIZE) based >one. This would also affect the caller as it seems. I think microcode_sanity_check() is for this purpose. We can do sanity check before the if(). Perhaps, we can just add an assertion that sanity check won't fail. Because whenever sanity check failed when pasing an ucode blob, we just drop the ucode; we won't pass an broken ucode to this function. Thanks Chao ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 2/3] build: allow picking the env values for compiler variables
On Tue, Aug 27, 2019 at 11:59:33AM +0100, Ian Jackson wrote: > Jan Beulich writes ("Re: [PATCH 2/3] build: allow picking the env values for > compiler variables"): > > On 20.08.2019 09:58, Roger Pau Monné wrote: > > > I don't have things 'left' in the environment, neither have most build > > > systems AFAIK. I do have things set that I want the build to > > > acknowledge, instead of fight against it. > > > > My view is this: XEN_* things coming from the environment are fine. > > Generic (i.e. project agnostic) variables (like CC) otoh would > > better be ignored, as it's not clear for what purpose they've been > > set. (Istr a case where some FORTIFY option was set by RPM build > > environments, breaking our build.) They may well have been meant > > for some other project. > > CC is set *specifically to cause build systems's like Xen's to use > that compiler*. That is its purpose. It should be honoured, not > ignored. > > Likewise FORTIFY, even though it broke something. FORTIFY_SOURCE was > set specifically to cause the changes it did. The people who set it > didn't see all the implications, but that change *was* their design > intent and the bugs were real bugs in what they were trying to do. > > > Question is whether to take the above just for the hypervisor part > > of the build, or also the tool stack etc ones. > > *Definitely* for the toolstack build, we should honour CC et al. > > The hypervisor is a more subtle question because people who set these > variables often forget to think about kernel code, embedded code, > etc. That's what happened with FORTIFY_SOURCE. However, I would > argue that it is better, in such a situation, to honour the variable > and break the build, than it is to simply ignore it. I fully agree. It's true that some build systems will likely break bulding Xen, but that's a build system issue, and we shouldn't try to work around this in Xen. As said before, what build systems (like travis or gitlab for instance) set in the environment should be acknowledged by Xen, or else we are forcing everyone to have custom workarounds for building Xen. Thanks, Roger. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH V3 8/8] iommu/arm: Add Renesas IPMMU-VMSA support
Hi Oleksandr-san, > From: Oleksandr Tyshchenko, Sent: Wednesday, August 21, 2019 3:10 AM > > From: Oleksandr Tyshchenko > > The IPMMU-VMSA is VMSA-compatible I/O Memory Management Unit (IOMMU) > which provides address translation and access protection functionalities > to processing units and interconnect networks. > > Please note, current driver is supposed to work only with newest > Gen3 SoCs revisions which IPMMU hardware supports stage 2 translation This is still "Gen3", so that please replace it with "R-Car Gen3"... > table format and is able to use CPU's P2M table as is if one is > 3-level page table (up to 40 bit IPA). > > The major differences compare to the Linux driver are: > > 1. Stage 1/Stage 2 translation. Linux driver supports Stage 1 > translation only (with Stage 1 translation table format). It manages > page table by itself. But Xen driver supports Stage 2 translation > (with Stage 2 translation table format) to be able to share the P2M > with the CPU. Stage 1 translation is always bypassed in Xen driver. > > So, Xen driver is supposed to be used with newest R-Car Gen3 SoC revisions > only (H3 ES3.0, M3-W+, etc.) which IPMMU H/W supports stage 2 translation > table format. > > 2. AArch64 support. Linux driver uses VMSAv8-32 mode, while Xen driver > enables Armv8 VMSAv8-64 mode to cover up to 40 bit input address. > > 3. Context bank (sets of page table) usage. In Xen, each context bank is > mapped to one Xen domain. So, all devices being pass throughed to the > same Xen domain share the same context bank. > > 4. IPMMU device tracking. In Xen, all IOMMU devices are managed > by single driver instance. So, driver uses global list to keep track > of registered IPMMU devices. > > Signed-off-by: Oleksandr Tyshchenko > CC: Julien Grall > CC: Yoshihiro Shimoda About this hardware handling, this patch seems good to me. But, since I'm not familiar about Xen passthrough framework, I think I cannot add my Reviewed-by tag into this patch. What do you think? Best regards, Yoshihiro Shimoda ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 3/6] remove late (on-demand) construction of IOMMU page tables
> -Original Message- > From: Jan Beulich > Sent: 27 August 2019 08:49 > To: Paul Durrant > Cc: 'JulienGrall' ; 'Alexandru Isaila' > ; 'Petre > Pircalabu' ; 'Razvan Cojocaru' > ; Andrew Cooper > ; George Dunlap ; Ian > Jackson > ; Roger Pau Monne ; > 'VolodymyrBabchuk' > ; 'Stefano Stabellini' ; > 'xen- > de...@lists.xenproject.org' ; 'Konrad > Rzeszutek Wilk' > ; 'Tamas K Lengyel' ; Tim > (Xen.org) ; 'Wei > Liu' > Subject: Re: [Xen-devel] [PATCH 3/6] remove late (on-demand) construction of > IOMMU page tables > > On 14.08.2019 11:39, Paul Durrant wrote: > >> -Original Message- > >> From: Xen-devel On Behalf Of Paul > >> Durrant > >> Sent: 12 August 2019 17:26 > >> To: 'Jan Beulich' > >> Cc: 'Petre Pircalabu' ; 'Stefano Stabellini' > >> ; > >> 'Wei Liu' ; 'Razvan Cojocaru' ; > >> 'Konrad Rzeszutek Wilk' > >> ; Andrew Cooper ; Tim > >> (Xen.org) ; > >> George Dunlap ; 'Julien Grall' > >> ; 'Tamas K Lengyel' > >> ; Ian Jackson ; 'Alexandru > >> Isaila' > >> ; 'xen-devel@lists.xenproject.org' > >> ; > >> 'VolodymyrBabchuk' ; Roger Pau Monne > >> > >> Subject: Re: [Xen-devel] [PATCH 3/6] remove late (on-demand) construction > >> of IOMMU page tables > >> > >>> -Original Message- > >> [snip] > > On 30.07.2019 15:44, Paul Durrant wrote: > > NOTE: This patch will cause a small amount of extra resource to be used > > to accommodate IOMMU page tables that may never be used, since > > the > > per-domain IOMMU flag enable flag is currently set to the value > > of the global iommu_enable flag. A subsequent patch will add an > > option to the toolstack to allow it to be turned off if there is > > no intention to assign passthrough hardware to the domain. > > In particular if the default of this is going to be "true" (I > didn't look at that patch yet, but the wording above makes me > assume so), in auto-ballooning mode without shared page tables > more memory should imo be ballooned out of Dom0 now. It has > always been a bug that IOMMU page tables weren't accounted for, > but it would become even more prominent then. > >>> > >>> Ultimately, once the whole series is applied, then nothing much should > >>> change for those specifying > >>> passthrough h/w in an xl.cfg. The main difference will be that h/w cannot > >>> be passed through to a > >>> domain that was not originally created with IOMMU pagetables. > >>> With patches applied up to this point then, yes, every domain will get > >>> IOMMU page tables. I guess > >> I'll > >>> take a look at the auto-ballooning code and see what needs to be done. > >>> > >> > >> Ok, I've had a look... > >> > >> I could make a rough calculation in libxl_domain_need_memory() based on > >> the domain's max_memkb and > an > >> assumption of a 4 level translation with 512 PTEs per level, or would > >> prefer such guestimation to > be > >> overridable using an xl.cfg parameter in a broadly similar way to > >> shadow_memkb? > >> > > > > I think I'm going to say no to this anyway since, as you say, the overhead > > as never been accounted > for and having to make assumptions about the IOMMU table structure is not > very attractive. Given that > any issue is going to be transient (i.e. until patch #6 is committed) I don't > think fixing auto- > ballooning ought to be in scope for this series. > > I'm afraid I disagree: The series extends a pre-existing problem > affecting some guests to all ones (at least by default). TBH I've seen sufficient numbers of domain create failures when using auto-ballooning that I stopped using it some time ago (besides the fact that it's slow). If you're happy with the simplistic double-the-page-table-reservation calculation then I can add that but IMO it would be better to add another patch to just remove auto-ballooning. Paul > > Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2] Partially revert "x86/mm: Clean IOMMU flags from p2m-pt code"
On 29.08.2019 10:49, Roger Pau Monne wrote: > @@ -640,9 +670,17 @@ p2m_pt_set_entry(struct p2m_domain *p2m, gfn_t gfn_, > mfn_t mfn, > && (gfn + (1UL << page_order) - 1 > p2m->max_mapped_pfn) ) > p2m->max_mapped_pfn = gfn + (1UL << page_order) - 1; > > +if ( need_iommu_pt_sync(p2m->domain) && > + (iommu_old_flags != iommu_pte_flags || old_mfn != mfn_x(mfn)) ) > +rc = iommu_pte_flags ? > +iommu_legacy_map(d, _dfn(gfn), mfn, page_order, > + iommu_pte_flags) : > +iommu_legacy_unmap(d, _dfn(gfn), page_order); Indentation of the if() body is one level too deep. With this corrected (easy enough to do while committing) Reviewed-by: Jan Beulich Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 3/6] remove late (on-demand) construction of IOMMU page tables
On 29.08.2019 11:33, Paul Durrant wrote: > TBH I've seen sufficient numbers of domain create failures when using > auto-ballooning that I stopped using it some time ago (besides the fact > that it's slow). If you're happy with the simplistic > double-the-page-table-reservation calculation then I can add that but > IMO it would be better to add another patch to just remove auto-ballooning. I'd not be overly happy, but I could live with this. But this needs consent by others, not the least the tool stack maintainers. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 3/6] remove late (on-demand) construction of IOMMU page tables
> -Original Message- > From: Jan Beulich > Sent: 29 August 2019 10:52 > To: Paul Durrant > Cc: 'JulienGrall' ; 'Alexandru Isaila' > ; 'Petre > Pircalabu' ; 'Razvan Cojocaru' > ; Andrew Cooper > ; George Dunlap ; Ian > Jackson > ; Roger Pau Monne ; > 'VolodymyrBabchuk' > ; 'Stefano Stabellini' ; > 'xen- > de...@lists.xenproject.org' ; 'Konrad > Rzeszutek Wilk' > ; 'Tamas K Lengyel' ; Tim > (Xen.org) ; 'Wei > Liu' > Subject: Re: [Xen-devel] [PATCH 3/6] remove late (on-demand) construction of > IOMMU page tables > > On 29.08.2019 11:33, Paul Durrant wrote: > > TBH I've seen sufficient numbers of domain create failures when using > > auto-ballooning that I stopped using it some time ago (besides the fact > > that it's slow). If you're happy with the simplistic > > double-the-page-table-reservation calculation then I can add that but > > IMO it would be better to add another patch to just remove auto-ballooning. > > I'd not be overly happy, but I could live with this. But this needs > consent by others, not the least the tool stack maintainers. > Ok. I'll deal with that later. Looking again though, I'm not sure what you mean by 'the amount reserved for page tables'? Do you mean 'shadow_memkb' or something else? Paul ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH v3 1/5] xen/spinlocks: in debug builds store cpu holding the lock
Add the cpu currently holding the lock to struct lock_debug. This makes analysis of locking errors easier and it can be tested whether the correct cpu is releasing a lock again. Signed-off-by: Juergen Gross --- V2: adjust types (Jan Beulich) --- xen/common/spinlock.c | 33 ++--- xen/include/xen/spinlock.h | 17 - 2 files changed, 38 insertions(+), 12 deletions(-) diff --git a/xen/common/spinlock.c b/xen/common/spinlock.c index 6bc52d70c0..1be1b5ebe6 100644 --- a/xen/common/spinlock.c +++ b/xen/common/spinlock.c @@ -13,9 +13,9 @@ static atomic_t spin_debug __read_mostly = ATOMIC_INIT(0); -static void check_lock(struct lock_debug *debug) +static void check_lock(union lock_debug *debug) { -int irq_safe = !local_irq_is_enabled(); +bool irq_safe = !local_irq_is_enabled(); if ( unlikely(atomic_read(_debug) <= 0) ) return; @@ -43,18 +43,21 @@ static void check_lock(struct lock_debug *debug) */ if ( unlikely(debug->irq_safe != irq_safe) ) { -int seen = cmpxchg(>irq_safe, -1, irq_safe); +union lock_debug seen, new = { 0 }; -if ( seen == !irq_safe ) +new.irq_safe = irq_safe; +seen.val = cmpxchg(>val, LOCK_DEBUG_INITVAL, new.val); + +if ( !seen.unseen && seen.irq_safe == !irq_safe ) { printk("CHECKLOCK FAILURE: prev irqsafe: %d, curr irqsafe %d\n", - seen, irq_safe); + seen.irq_safe, irq_safe); BUG(); } } } -static void check_barrier(struct lock_debug *debug) +static void check_barrier(union lock_debug *debug) { if ( unlikely(atomic_read(_debug) <= 0) ) return; @@ -70,7 +73,18 @@ static void check_barrier(struct lock_debug *debug) * However, if we spin on an IRQ-unsafe lock with IRQs disabled then that * is clearly wrong, for the same reason outlined in check_lock() above. */ -BUG_ON(!local_irq_is_enabled() && (debug->irq_safe == 0)); +BUG_ON(!local_irq_is_enabled() && !debug->irq_safe); +} + +static void got_lock(union lock_debug *debug) +{ +debug->cpu = smp_processor_id(); +} + +static void rel_lock(union lock_debug *debug) +{ +ASSERT(debug->cpu == smp_processor_id()); +debug->cpu = SPINLOCK_NO_CPU; } void spin_debug_enable(void) @@ -87,6 +101,8 @@ void spin_debug_disable(void) #define check_lock(l) ((void)0) #define check_barrier(l) ((void)0) +#define got_lock(l) ((void)0) +#define rel_lock(l) ((void)0) #endif @@ -150,6 +166,7 @@ void inline _spin_lock_cb(spinlock_t *lock, void (*cb)(void *), void *data) cb(data); arch_lock_relax(); } +got_lock(>debug); LOCK_PROFILE_GOT; preempt_disable(); arch_lock_acquire_barrier(); @@ -181,6 +198,7 @@ void _spin_unlock(spinlock_t *lock) arch_lock_release_barrier(); preempt_enable(); LOCK_PROFILE_REL; +rel_lock(>debug); add_sized(>tickets.head, 1); arch_lock_signal(); } @@ -224,6 +242,7 @@ int _spin_trylock(spinlock_t *lock) if ( cmpxchg(>tickets.head_tail, old.head_tail, new.head_tail) != old.head_tail ) return 0; +got_lock(>debug); #ifdef CONFIG_LOCK_PROFILE if (lock->profile) lock->profile->time_locked = NOW(); diff --git a/xen/include/xen/spinlock.h b/xen/include/xen/spinlock.h index 2c7415e23a..88017e0a89 100644 --- a/xen/include/xen/spinlock.h +++ b/xen/include/xen/spinlock.h @@ -6,14 +6,21 @@ #include #ifndef NDEBUG -struct lock_debug { -s16 irq_safe; /* +1: IRQ-safe; 0: not IRQ-safe; -1: don't know yet */ +union lock_debug { +uint16_t val; +#define LOCK_DEBUG_INITVAL 0x +struct { +uint16_t cpu:12; +uint16_t pad:2; +bool irq_safe:1; +bool unseen:1; +}; }; -#define _LOCK_DEBUG { -1 } +#define _LOCK_DEBUG { LOCK_DEBUG_INITVAL } void spin_debug_enable(void); void spin_debug_disable(void); #else -struct lock_debug { }; +union lock_debug { }; #define _LOCK_DEBUG { } #define spin_debug_enable() ((void)0) #define spin_debug_disable() ((void)0) @@ -142,7 +149,7 @@ typedef struct spinlock { #define SPINLOCK_NO_CPU 0xfffu u16 recurse_cnt:4; #define SPINLOCK_MAX_RECURSE 0xfu -struct lock_debug debug; +union lock_debug debug; #ifdef CONFIG_LOCK_PROFILE struct lock_profile *profile; #endif -- 2.16.4 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH v3 3/5] xen: print lock profile info in panic()
Print the lock profile data when the system crashes and add some more information for each lock data (lock address, cpu holding the lock). While at it use the PRI_stime format specifier for printing time data. This is especially beneficial for watchdog triggered crashes in case of deadlocks. In order to have the cpu holding the lock available let the lock profile config option select DEBUG_LOCKS. As printing the lock profile data will make use of locking, too, we need to disable spinlock debugging before calling spinlock_profile_printall() from panic(). While at it remove a superfluous #ifdef CONFIG_LOCK_PROFILE and rename CONFIG_LOCK_PROFILE to CONFIG_DEBUG_LOCK_PROFILE. Also move the .lockprofile.data section to init area in linker scripts as the data is no longer needed after boot. Signed-off-by: Juergen Gross --- V2: - rename CONFIG_LOCK_PROFILE to CONFIG_DEBUG_LOCK_PROFILE (Jan Beulich) - move .lockprofile.data section to init area in linker scripts - use PRI_stime (Andrew Cooper) - don't print "cpu=4095", but "not locked" (Andrew Cooper) --- xen/Kconfig.debug | 3 ++- xen/arch/arm/xen.lds.S | 13 +++-- xen/arch/x86/domain.c | 2 +- xen/arch/x86/xen.lds.S | 13 +++-- xen/common/keyhandler.c| 2 +- xen/common/spinlock.c | 33 ++--- xen/common/sysctl.c| 2 +- xen/drivers/char/console.c | 4 +++- xen/include/xen/spinlock.h | 12 +++- 9 files changed, 47 insertions(+), 37 deletions(-) diff --git a/xen/Kconfig.debug b/xen/Kconfig.debug index 1faaa3ba6a..22573e74db 100644 --- a/xen/Kconfig.debug +++ b/xen/Kconfig.debug @@ -44,8 +44,9 @@ config COVERAGE If unsure, say N here. -config LOCK_PROFILE +config DEBUG_LOCK_PROFILE bool "Lock Profiling" + select DEBUG_LOCKS ---help--- Lock profiling allows you to see how often locks are taken and blocked. You can use serial console to print (and reset) using 'l' and 'L' diff --git a/xen/arch/arm/xen.lds.S b/xen/arch/arm/xen.lds.S index 16ce1dd01e..a497f6a48d 100644 --- a/xen/arch/arm/xen.lds.S +++ b/xen/arch/arm/xen.lds.S @@ -54,12 +54,6 @@ SECTIONS *(.data.rel.ro) *(.data.rel.ro.*) -#ifdef CONFIG_LOCK_PROFILE - . = ALIGN(POINTER_ALIGN); - __lock_profile_start = .; - *(.lockprofile.data) - __lock_profile_end = .; -#endif . = ALIGN(POINTER_ALIGN); __param_start = .; *(.data.param) @@ -173,6 +167,13 @@ SECTIONS . = ALIGN(4); *(.altinstr_replacement) +#ifdef CONFIG_DEBUG_LOCK_PROFILE + . = ALIGN(POINTER_ALIGN); + __lock_profile_start = .; + *(.lockprofile.data) + __lock_profile_end = .; +#endif + *(.init.data) *(.init.data.rel) *(.init.data.rel.*) diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c index bc082ed827..b1c67c2d28 100644 --- a/xen/arch/x86/domain.c +++ b/xen/arch/x86/domain.c @@ -305,7 +305,7 @@ struct domain *alloc_domain_struct(void) #endif -#ifndef CONFIG_LOCK_PROFILE +#ifndef CONFIG_DEBUG_LOCK_PROFILE BUILD_BUG_ON(sizeof(*d) > PAGE_SIZE); #endif d = alloc_xenheap_pages(order, MEMF_bits(bits)); diff --git a/xen/arch/x86/xen.lds.S b/xen/arch/x86/xen.lds.S index 87fa02b9b5..111edb5360 100644 --- a/xen/arch/x86/xen.lds.S +++ b/xen/arch/x86/xen.lds.S @@ -128,12 +128,6 @@ SECTIONS *(.ex_table.pre) __stop___pre_ex_table = .; -#ifdef CONFIG_LOCK_PROFILE - . = ALIGN(POINTER_ALIGN); - __lock_profile_start = .; - *(.lockprofile.data) - __lock_profile_end = .; -#endif . = ALIGN(POINTER_ALIGN); __param_start = .; *(.data.param) @@ -251,6 +245,13 @@ SECTIONS *(.altinstructions) __alt_instructions_end = .; +#ifdef CONFIG_DEBUG_LOCK_PROFILE + . = ALIGN(POINTER_ALIGN); + __lock_profile_start = .; + *(.lockprofile.data) + __lock_profile_end = .; +#endif + . = ALIGN(8); __ctors_start = .; *(.ctors) diff --git a/xen/common/keyhandler.c b/xen/common/keyhandler.c index 57b360ee4b..c36baa4dff 100644 --- a/xen/common/keyhandler.c +++ b/xen/common/keyhandler.c @@ -62,7 +62,7 @@ static struct keyhandler { KEYHANDLER('P', perfc_reset, "reset performance counters", 0), #endif -#ifdef CONFIG_LOCK_PROFILE +#ifdef CONFIG_DEBUG_LOCK_PROFILE KEYHANDLER('l', spinlock_profile_printall, "print lock profile info", 1), KEYHANDLER('L', spinlock_profile_reset, "reset lock profile info", 0), #endif diff --git a/xen/common/spinlock.c b/xen/common/spinlock.c index 79e70a9947..c4f706c627 100644 --- a/xen/common/spinlock.c +++ b/xen/common/spinlock.c @@ -106,7 +106,7 @@ void spin_debug_disable(void) #endif -#ifdef CONFIG_LOCK_PROFILE +#ifdef CONFIG_DEBUG_LOCK_PROFILE #define LOCK_PROFILE_REL \ if (lock->profile) \ @@
Re: [Xen-devel] [PATCH v9 10/15] microcode: split out apply_microcode() from cpu_request_microcode()
On 19.08.2019 03:25, Chao Gao wrote: > @@ -300,32 +322,44 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) > buf, unsigned long len) > if ( microcode_ops == NULL ) > return -EINVAL; > > -info = xmalloc_bytes(sizeof(*info) + len); > -if ( info == NULL ) > +buffer = xmalloc_bytes(len); > +if ( !buffer ) > return -ENOMEM; > > -ret = copy_from_guest(info->buffer, buf, len); > -if ( ret != 0 ) > +if ( copy_from_guest(buffer, buf, len) ) > { > -xfree(info); > -return ret; > +ret = -EFAULT; > +goto free; > } > > -info->buffer_size = len; > -info->error = 0; > -info->cpu = cpumask_first(_online_map); > - > if ( microcode_ops->start_update ) > { > ret = microcode_ops->start_update(); > if ( ret != 0 ) > -{ > -xfree(info); > -return ret; > -} > +goto free; > } > > -return continue_hypercall_on_cpu(info->cpu, do_microcode_update, info); > +patch = parse_blob(buffer, len); > +if ( IS_ERR(patch) ) > +{ > +ret = PTR_ERR(patch); > +printk(XENLOG_INFO "Parsing microcode blob error %d\n", ret); I think this wants to be at least XENLOG_WARNING. > @@ -372,23 +406,46 @@ int __init early_microcode_update_cpu(bool start_update) > > microcode_ops->collect_cpu_info(_cpu(cpu_sig)); > > -if ( data ) > +if ( !data ) > +return -ENOMEM; > + > +if ( start_update ) > { > -if ( start_update && microcode_ops->start_update ) > +struct microcode_patch *patch; > + > +if ( microcode_ops->start_update ) > rc = microcode_ops->start_update(); > > if ( rc ) > return rc; > > -rc = microcode_update_cpu(data, len); > +patch = parse_blob(data, len); > +if ( IS_ERR(patch) ) > +{ > +printk(XENLOG_INFO "Parsing microcode blob error %ld\n", Same here. > + PTR_ERR(patch)); > +return PTR_ERR(patch); > +} > + > +if ( !patch ) > +{ > +printk(XENLOG_INFO "No ucode found. Update aborted!\n"); Here I'm not sure the message is worthwhile to have. > @@ -41,8 +42,6 @@ struct cpu_signature { > DECLARE_PER_CPU(struct cpu_signature, cpu_sig); > extern const struct microcode_ops *microcode_ops; > > -const struct microcode_patch *microcode_get_cache(void); > -bool microcode_update_cache(struct microcode_patch *patch); If you remove the declaration but not the definition, then the latter should become static. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v9 04/15] microcode: introduce a global cache of ucode patch
On 19.08.2019 03:25, Chao Gao wrote: > +static enum microcode_match_result compare_patch( > +const struct microcode_patch *new, const struct microcode_patch *old) > +{ > +return (new->mc_intel->hdr.rev > old->mc_intel->hdr.rev) ? NEW_UCODE : > +OLD_UCODE; There's one too many blank after the ? here. Also we commonly align the : under the ? in cases like this one. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH v3 2/5] xen: add new CONFIG_DEBUG_LOCKS option
Instead of enabling debugging for debug builds only add a dedicated Kconfig option for that purpose which defaults to DEBUG. Signed-off-by: Juergen Gross --- V2: - rename to CONFIG_DEBUG_LOCKS (Jan Beulich) --- xen/Kconfig.debug | 7 +++ xen/common/spinlock.c | 4 ++-- xen/include/xen/spinlock.h | 2 +- 3 files changed, 10 insertions(+), 3 deletions(-) diff --git a/xen/Kconfig.debug b/xen/Kconfig.debug index e10e314e25..1faaa3ba6a 100644 --- a/xen/Kconfig.debug +++ b/xen/Kconfig.debug @@ -51,6 +51,13 @@ config LOCK_PROFILE You can use serial console to print (and reset) using 'l' and 'L' respectively, or the 'xenlockprof' tool. +config DEBUG_LOCKS + bool "Lock debugging" + default DEBUG + ---help--- + Enable debugging features of lock handling. Some additional + checks will be performed when acquiring and releasing locks. + config PERF_COUNTERS bool "Performance Counters" ---help--- diff --git a/xen/common/spinlock.c b/xen/common/spinlock.c index 1be1b5ebe6..79e70a9947 100644 --- a/xen/common/spinlock.c +++ b/xen/common/spinlock.c @@ -9,7 +9,7 @@ #include #include -#ifndef NDEBUG +#ifdef CONFIG_DEBUG_LOCKS static atomic_t spin_debug __read_mostly = ATOMIC_INIT(0); @@ -97,7 +97,7 @@ void spin_debug_disable(void) atomic_dec(_debug); } -#else /* defined(NDEBUG) */ +#else /* CONFIG_DEBUG_LOCKS */ #define check_lock(l) ((void)0) #define check_barrier(l) ((void)0) diff --git a/xen/include/xen/spinlock.h b/xen/include/xen/spinlock.h index 88017e0a89..736490f52b 100644 --- a/xen/include/xen/spinlock.h +++ b/xen/include/xen/spinlock.h @@ -5,7 +5,7 @@ #include #include -#ifndef NDEBUG +#ifdef CONFIG_DEBUG_LOCKS union lock_debug { uint16_t val; #define LOCK_DEBUG_INITVAL 0x -- 2.16.4 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] p2m/ept: pass correct level to atomic_write_ept_entry in ept_invalidate_emt
On 29.08.2019 12:26, Roger Pau Monné wrote: > On Tue, Aug 27, 2019 at 05:23:33PM +0200, Jan Beulich wrote: >> On 23.08.2019 07:58, Tian, Kevin wrote: From: Roger Pau Monne [mailto:roger@citrix.com] Sent: Tuesday, August 20, 2019 11:38 PM The level passed to ept_invalidate_emt corresponds to the EPT entry passed as the mfn parameter, which is a pointer to an EPT page table, hence the entries in that page table will have one level less than the parent. Fix the call to atomic_write_ept_entry to pass the correct level, ie: one level less than the parent. Fixes: 50fe6e73705 ('pvh dom0: add and remove foreign pages') Signed-off-by: Roger Pau Monné --- Cc: Jun Nakajima Cc: Kevin Tian Cc: George Dunlap Cc: Jan Beulich Cc: Andrew Cooper Cc: Wei Liu Cc: Sergey Dyasli Cc: Paul Durrant --- xen/arch/x86/mm/p2m-ept.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c index 6b8468c793..23ae6e9da3 100644 --- a/xen/arch/x86/mm/p2m-ept.c +++ b/xen/arch/x86/mm/p2m-ept.c @@ -367,7 +367,7 @@ static bool_t ept_invalidate_emt(struct p2m_domain *p2m, mfn_t mfn, e.emt = MTRR_NUM_TYPES; if ( recalc ) e.recalc = 1; -rc = atomic_write_ept_entry(p2m, [i], e, level); +rc = atomic_write_ept_entry(p2m, [i], e, level - 1); ASSERT(rc == 0); changed = 1; } >>> >>> Reviewed-by: Kevin Tian . >>> >>> One small comment about readability. What about renaming 'level' >>> to 'parent_level' in this function? >> >> And also taking the opportunity and making it unsigned int? > > I've been thinking about this, and I'm not sure renaming level to > parent_level is correct, level actually matches the level of the mfn > also passed as a parameter, and hence it's correct. It's the function > itself that actually iterates over the page table entries, and hence > descends one level into the page tables. > > ie: I could add something like ASSERT(level) to make sure no level 0 > entries are passed to the function, but renaming doesn't seem correct > to me. Hmm, I'm afraid I've made the change as requested by Kevin while committing. Personally I think either name is fine, but if Kevin agrees with your response, then maybe we should undo that adjustment. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] x86: properly gate clearing of PKU feature
On 29.08.2019 12:46, Andrew Cooper wrote: > On 29/08/2019 10:27, Jan Beulich wrote: >> setup_clear_cpu_cap() is __init and hence may not be called post-boot. >> Note that opt_pku nevertheless is not getting __initdata added - see >> e.g. commit 43fa95ae6a ("mm: make opt_bootscrub non-init"). >> >> Signed-off-by: Jan Beulich > > Acked-by: Andrew Cooper > > However, I'm tempted to suggest that this logic disappears entirely. > Given that we don't support it for PV guests, and can't without taking a > CR4 write on toggle_kernel_mode(), all this actually controls is whether > the bit finds its way into the HVM max policy. Well, if you mean replacing opt_pku by an addition to parse_xen_cpuid(), then I'll be happy to do that as a follow-on. If there is another direction you've been considering, then you'd have to at least outline it. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v9 15/15] microcode: block #NMI handling when loading an ucode
On 27.08.2019 06:52, Chao Gao wrote: > On Mon, Aug 26, 2019 at 04:07:59PM +0800, Chao Gao wrote: >> On Fri, Aug 23, 2019 at 09:46:37AM +0100, Sergey Dyasli wrote: >>> On 19/08/2019 02:25, Chao Gao wrote: register an nmi callback. And this callback does busy-loop on threads which are waiting for loading completion. Control threads send NMI to slave threads to prevent NMI acceptance during ucode loading. Signed-off-by: Chao Gao --- Changes in v9: - control threads send NMI to all other threads. Slave threads will stay in the NMI handling to prevent NMI acceptance during ucode loading. Note that self-nmi is invalid according to SDM. >>> >>> To me this looks like a half-measure: why keep only slave threads in >>> the NMI handler, when master threads can update the microcode from >>> inside the NMI handler as well? >> >> No special reason. Because the issue we want to address is that slave >> threads might go to handle NMI and access MSRs when master thread is >> loading ucode. So we only keep slave threads in the NMI handler. >> >>> >>> You mention that self-nmi is invalid, but Xen has self_nmi() which is >>> used for apply_alternatives() during boot, so can be trusted to work. >> >> Sorry, I meant using self shorthand to send self-nmi. I tried to use >> self shorthand but got APIC error. And I agree that it is better to >> make slave thread call self_nmi() itself. >> >>> >>> I experimented a bit with the following approach: after loading_state >>> becomes LOADING_CALLIN, each cpu issues a self_nmi() and rendezvous >>> via cpu_callin_map into LOADING_ENTER to do a ucode update directly in >>> the NMI handler. And it seems to work. >>> >>> Separate question is about the safety of this approach: can we be sure >>> that a ucode update would not reset the status of the NMI latch? I.e. >>> can it cause another NMI to be delivered while Xen already handles one? >> >> Ashok, what's your opinion on Sergey's approach and his concern? > > I talked with Ashok. We think your approach is better. I will follow > your approach in v10. It would be much helpful if you post your patch > so that I can just rebase it onto other patches. Doing the actual ucode update inside an NMI handler seems rather risky to me. Even if Ashok confirmed it would not be an issue on past and current Intel CPUs - what about future ones, or ones from other vendors? Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v9 15/15] microcode: block #NMI handling when loading an ucode
On 19.08.2019 03:25, Chao Gao wrote: > @@ -481,12 +478,28 @@ static int do_microcode_update(void *patch) > return ret; > } > > +static int microcode_nmi_callback(const struct cpu_user_regs *regs, int cpu) > +{ > +/* The first thread of a core is to load an update. Don't block it. */ > +if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) || > + loading_state != LOADING_CALLIN ) > +return 0; > + > +cpumask_set_cpu(cpu, _callin_map); > + > +while ( loading_state != LOADING_EXIT ) > +cpu_relax(); > + > +return 0; > +} By returning 0 you tell do_nmi() to continue processing the NMI. Since you can't tell whether a non-IPI NMI has surfaced at about the same time this is generally the right thing imo, but how do you prevent unknown_nmi_error() from getting entered when do_nmi() ends up setting handle_unknown to true? (The question is mostly rhetorical, but there's a disconnect between do_nmi() checking "cpu == 0" and the control thread running on cpumask_first(_online_map), i.e. you introduce a well hidden dependency on CPU 0 never going offline. IOW my request is to at least make this less well hidden, such that it can be noticed if and when someone endeavors to remove said limitation.) Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] x86: clear RDRAND CPUID bit on AMD family 15h/16h
On 29.08.19 14:27, Jan Beulich wrote: On 29.08.2019 14:01, Juergen Gross wrote: On 29.08.19 13:42, Andrew Cooper wrote: On 29/08/2019 10:28, Jan Beulich wrote: Inspired by Linux commit c49a0a80137c7ca7d6ced4c812c9e07a949f6f24: There have been reports of RDRAND issues after resuming from suspend on some AMD family 15h and family 16h systems. This issue stems from a BIOS not performing the proper steps during resume to ensure RDRAND continues to function properly. Update the CPU initialization to clear the RDRAND CPUID bit for any family 15h and 16h processor that supports RDRAND. If it is known that the family 15h or family 16h system does not have an RDRAND resume issue or that the system will not be placed in suspend, the "cpuid=rdrand" kernel parameter can be used to stop the clearing of the RDRAND CPUID bit. Note, that clearing the RDRAND CPUID bit does not prevent a processor that normally supports the RDRAND instruction from executing it. So any code that determined the support based on family and model won't #UD. Yeah, but it will no longer see random numbers after resume. That's the implication; note that I've taken the text from the Linux commit. Signed-off-by: Jan Beulich --- Slightly RFC, in particular because of the change to parse_xen_cpuid(): Alternative approach suggestions are welcome. This issue was on my radar, but I hadn't got around to starting a patch yet. I was wondering if we could perhaps default it to whether S3 was available, but looking at the code which prints "ACPI sleep modes: S3", it doesn't appear to be related to any real ACPI information. Wouldn't it make more sense to inhibit suspend/resume instead? That's an alternative option. But if anything I'd see us providing both, not the one you suggest instead of what the patch here does. I'm quite sure a machine running Xen is very rarely put to S3. I'm not at all sure about this. Suspend/resume in Xen was broken for more than 3 months in the late 4.12 development phase and nobody noticed it. Only when I started my core scheduling series I came across the issue. Juergen ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] x86: properly gate clearing of PKU feature
On 29/08/2019 14:13, Jan Beulich wrote: > On 29.08.2019 12:46, Andrew Cooper wrote: >> On 29/08/2019 10:27, Jan Beulich wrote: >>> setup_clear_cpu_cap() is __init and hence may not be called post-boot. >>> Note that opt_pku nevertheless is not getting __initdata added - see >>> e.g. commit 43fa95ae6a ("mm: make opt_bootscrub non-init"). >>> >>> Signed-off-by: Jan Beulich >> Acked-by: Andrew Cooper >> >> However, I'm tempted to suggest that this logic disappears entirely. >> Given that we don't support it for PV guests, and can't without taking a >> CR4 write on toggle_kernel_mode(), all this actually controls is whether >> the bit finds its way into the HVM max policy. > Well, if you mean replacing opt_pku by an addition to > parse_xen_cpuid(), then I'll be happy to do that as a follow-on. > If there is another direction you've been considering, then > you'd have to at least outline it. Well - I did say "disappear entirely". I don't see it as having, or liable to gain, any useful purpose. In reality, it was subsumed into cpuid= by my patch to allow all bits to be tweak-able. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 1/3] x86/ACPI: re-park previously parked CPUs upon resume from S3
On 14/06/2019 12:37, Jan Beulich wrote: > Aiui when resuming from S3, CPUs come back out of RESET/INIT. From a programmers point of view, all APs are in Wait-for-SIPI after resume, and this is confirmed by several of the BWG flowcharts. This is a logical consequence of the CPU losing all power, meaning that its startup sequence will be the same whether it is booting from S5 or S3. > Therefore > they need to undergo the same procedure as was added elsewhere by > commits d8f974f1a6 ("x86: command line option to avoid use of secondary > hyper-threads") and 8797d20a6e ("x86: possibly bring up all CPUs even > if not all are supposed to be used"). > > Signed-off-by: Jan Beulich Acked-by: Andrew Cooper ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2] x86/mm: Add mem access rights to NPT
On 22.08.2019 16:02, Alexandru Stefan ISAILA wrote: > This patch adds access control for NPT mode. > > The access rights are stored in the NPT p2m table 56:53. Why starting from bit 53? I can't seem to find any use of bit 52. > The bits are free after clearing the IOMMU flags [1]. > > Modify p2m_type_to_flags() to accept and interpret an access value, > parallel to the ept code. > > Add a set_default_access() method to the p2m-pt and p2m-ept versions > of the p2m rather than setting it directly, to deal with different > default permitted access values. I think this would better be a separate change. > --- a/xen/arch/x86/mm/p2m-pt.c > +++ b/xen/arch/x86/mm/p2m-pt.c > @@ -63,10 +63,13 @@ > #define needs_recalc(level, ent) _needs_recalc(level##e_get_flags(ent)) > #define valid_recalc(level, ent) (!(level##e_get_flags(ent) & > _PAGE_ACCESSED)) > > +#define _PAGE_ACCESS_BITS (0x1e00) /* mem_access rights 16:13 */ I guess this is too disconnected from the two page.h headers where the correlation between bit positions gets explained, so I guess you want to extend the comment and either refer there, or replicate some of it to make understandable why 16:13 matches 56:53. I'm also concerned how easy it'll be for someone to find this definition when wanting to use other of the available bits. > @@ -104,8 +112,32 @@ static unsigned long p2m_type_to_flags(const struct > p2m_domain *p2m, > flags |= _PAGE_PWT; > ASSERT(!level); > } > -return flags | P2M_BASE_FLAGS | _PAGE_PCD; > +flags |= P2M_BASE_FLAGS | _PAGE_PCD; > +break; > } > + > +switch ( access ) > +{ > +case p2m_access_r: > +flags |= _PAGE_NX_BIT; > +flags &= ~_PAGE_RW; > +break; > +case p2m_access_rw: > +flags |= _PAGE_NX_BIT; > +break; > +case p2m_access_rx: > +case p2m_access_rx2rw: > +flags &= ~(_PAGE_NX_BIT | _PAGE_RW); > +break; > +case p2m_access_x: > +flags &= ~_PAGE_RW; > +break; I can't seem to be able to follow you here. In fact I don't see how you would be able to express execute-only with NPT. If this is really needed for some reason, then a justifying comment should be added. > @@ -152,6 +184,17 @@ p2m_free_entry(struct p2m_domain *p2m, l1_pgentry_t > *p2m_entry, int page_order) > p2m_free_ptp(p2m, l1e_get_page(*p2m_entry)); > } > > +static void p2m_set_access(intpte_t *entry, p2m_access_t access) > +{ > +*entry |= put_pte_flags((get_pte_flags(*entry) & ~_PAGE_ACCESS_BITS) | > +(MASK_INSR(access, _PAGE_ACCESS_BITS))); > +} > + > +static p2m_access_t p2m_get_access(intpte_t entry) > +{ > +return (p2m_access_t)(MASK_EXTR(get_pte_flags(entry), > _PAGE_ACCESS_BITS)); Is the cast really needed here? > @@ -226,6 +269,7 @@ p2m_next_level(struct p2m_domain *p2m, void **table, > { > new_entry = l1e_from_pfn(pfn | (i << ((level - 1) * > PAGETABLE_ORDER)), > flags); > +p2m_set_access(_entry.l1, p2m->default_access); > rc = p2m->write_p2m_entry(p2m, gfn, l1_entry + i, new_entry, > level); > if ( rc ) > { > @@ -237,6 +281,7 @@ p2m_next_level(struct p2m_domain *p2m, void **table, > unmap_domain_page(l1_entry); > > new_entry = l1e_from_mfn(mfn, P2M_BASE_FLAGS | _PAGE_RW); > +p2m_set_access(_entry.l1, p2m->default_access); > rc = p2m->write_p2m_entry(p2m, gfn, p2m_entry, new_entry, >level + 1); > if ( rc ) Is it really intended to insert the access bits also into non-leaf entries (which may be what is being processed here)? (May also apply further down.) > @@ -474,6 +520,32 @@ int p2m_pt_handle_deferred_changes(uint64_t gpa) > return rc; > } > > +static int p2m_pt_check_access(p2m_access_t p2ma) > +{ > +switch ( p2ma ) > +{ > +case p2m_access_n: > +case p2m_access_w: > +case p2m_access_wx: > +case p2m_access_n2rwx: > +return -EINVAL; I'm not convinced EINVAL is appropriate here - the argument isn't invalid, it's just that there's no way to represent it. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] x86: clear RDRAND CPUID bit on AMD family 15h/16h
On 29.08.2019 14:01, Juergen Gross wrote: > On 29.08.19 13:42, Andrew Cooper wrote: >> On 29/08/2019 10:28, Jan Beulich wrote: >>> Inspired by Linux commit c49a0a80137c7ca7d6ced4c812c9e07a949f6f24: >>> >>> There have been reports of RDRAND issues after resuming from suspend on >>> some AMD family 15h and family 16h systems. This issue stems from a >>> BIOS >>> not performing the proper steps during resume to ensure RDRAND >>> continues >>> to function properly. >>> >>> Update the CPU initialization to clear the RDRAND CPUID bit for any >>> family >>> 15h and 16h processor that supports RDRAND. If it is known that the >>> family >>> 15h or family 16h system does not have an RDRAND resume issue or that >>> the >>> system will not be placed in suspend, the "cpuid=rdrand" kernel >>> parameter >>> can be used to stop the clearing of the RDRAND CPUID bit. >>> >>> Note, that clearing the RDRAND CPUID bit does not prevent a processor >>> that normally supports the RDRAND instruction from executing it. So any >>> code that determined the support based on family and model won't #UD. > > Yeah, but it will no longer see random numbers after resume. That's the implication; note that I've taken the text from the Linux commit. >>> Signed-off-by: Jan Beulich >>> --- >>> Slightly RFC, in particular because of the change to parse_xen_cpuid(): >>> Alternative approach suggestions are welcome. >> >> This issue was on my radar, but I hadn't got around to starting a patch yet. >> >> I was wondering if we could perhaps default it to whether S3 was >> available, but looking at the code which prints "ACPI sleep modes: S3", >> it doesn't appear to be related to any real ACPI information. > > Wouldn't it make more sense to inhibit suspend/resume instead? That's an alternative option. But if anything I'd see us providing both, not the one you suggest instead of what the patch here does. > I'm quite sure a machine running Xen is very rarely put to S3. I'm not at all sure about this. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 2/2] x86/AMD: Fix handling of x87 exception pointers on Fam17h hardware
On 19.08.2019 20:26, Andrew Cooper wrote: > AMD Pre-Fam17h CPUs "optimise" {F,}X{SAVE,RSTOR} by not saving/restoring > FOP/FIP/FDP if an x87 exception isn't pending. This causes an information > leak, CVE-2006-1056, and worked around by several OSes, including Xen. AMD > Fam17h CPUs no longer have this leak, and advertise so in a CPUID bit. > > Introduce the RSTR_FP_ERR_PTRS feature, as specified by AMD, and expose to all > guests by default. While adjusting libxl's cpuid table, add CLZERO which > looks to have been omitted previously. > > Also introduce an X86_BUG bit to trigger the (F)XRSTOR workaround, and set it > on AMD hardware where RSTR_FP_ERR_PTRS is not advertised. Optimise the > workaround path by dropping the data-dependent unpredictable conditions which > will evalute to true for all 64bit OSes and most 32bit ones. I definitely don't buy the "all 64bit OSes" part here: Anyone doing full 80-bit FP operations will have to use the FPU, and hence may want to have some unmasked exceptions. I'm also not sure why you call them "unpredictable": If all (or most) cases match, the branch there could be pretty well predicted (subject of course to capacity). All in all I'd prefer if the conditions remained in place; my minimal request would be for there to be a comment why there's no evaluation of FSW/FCW. > --- a/xen/arch/x86/i387.c > +++ b/xen/arch/x86/i387.c > @@ -43,20 +43,17 @@ static inline void fpu_fxrstor(struct vcpu *v) > const typeof(v->arch.xsave_area->fpu_sse) *fpu_ctxt = v->arch.fpu_ctxt; > > /* > - * AMD CPUs don't save/restore FDP/FIP/FOP unless an exception > + * Some CPUs don't save/restore FDP/FIP/FOP unless an exception Are there any non-AMD CPUs known to have this issue? If not, is there a particular reason you don't say "Some AMD CPUs ..."? > * is pending. Clear the x87 state here by setting it to fixed > * values. The hypervisor data segment can be sometimes 0 and > * sometimes new user value. Both should be ok. Use the FPU saved > * data block as a safe address because it should be in L1. > */ > -if ( !(fpu_ctxt->fsw & ~fpu_ctxt->fcw & 0x003f) && > - boot_cpu_data.x86_vendor == X86_VENDOR_AMD ) > -{ > +if ( cpu_bug_fpu_ptr_leak ) > asm volatile ( "fnclex\n\t" > "ffree %%st(7)\n\t" /* clear stack tag */ > "fildl %0" /* load to clear state */ > : : "m" (*fpu_ctxt) ); If here and in the respective xsave instance you'd use alternatives patching, I wouldn't mind the use of a X86_BUG_* for this (as made possible by patch 1). But as said before, just like for synthetic features I strongly think we should use simple boolean variables when using them only in if()-s. Use of the feature(/bug) machinery is needed only to not further complicate alternatives patching. > @@ -169,11 +166,10 @@ static inline void fpu_fxsave(struct vcpu *v) > : "=m" (*fpu_ctxt) : "R" (fpu_ctxt) ); > > /* > - * AMD CPUs don't save/restore FDP/FIP/FOP unless an exception > - * is pending. > + * Some CPUs don't save/restore FDP/FIP/FOP unless an exception is > + * pending. The restore code fills in suitable defaults. > */ > -if ( !(fpu_ctxt->fsw & 0x0080) && > - boot_cpu_data.x86_vendor == X86_VENDOR_AMD ) > +if ( cpu_bug_fpu_ptr_leak && !(fpu_ctxt->fsw & 0x0080) ) > return; The comment addition seems a little unmotivated: The code here isn't about leaking data, but about having valid data to consume (down from here). With this, keying the return to cpu_bug_* also doesn't look very nice, but I admit I can't suggest a better alternative (other than leaving the vendor check in place and checking the X86_FEATURE_RSTR_FP_ERR_PTRS bit). An option might be to give the construct a different name, without "leak" in it (NO_FP_ERR_PTRS?). Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 3/3] x86: a little bit of 16-bit video mode setting code cleanup
On 14/06/2019 12:38, Jan Beulich wrote: > To "compensate" for the code size growth by an earlier change: > - drop "trampoline" labels (in almost all cases the target label is > reachable with an 8-bit-displacement branch anyway, and a single 16- > bit-displacement branch is still better than a pair of two branches) Do you happen to know why we any to start with? It can't plausibly be for manual code alignment reasons. (probably) whatever the reason, good riddance. > - drop an entirely dead insn > - reduce code size in a few other (obvious I hope) cases, by more > suitable insn/operands selection I'm afraid these are rather hard to identify, given no hints. Comments in line. > Also drop redundant #define-s (move suitable #include a little earlier > instead) and add two alignment directives. > > Signed-off-by: Jan Beulich > > --- a/xen/arch/x86/boot/trampoline.S > +++ b/xen/arch/x86/boot/trampoline.S > @@ -176,6 +176,7 @@ start64: > > jmpq*%rdi > > +#include "video.h" > #include "wakeup.S" > > .balign 8 > @@ -283,8 +284,6 @@ trampoline_boot_cpu_entry: > /* Jump to the common bootstrap entry point. */ > jmp trampoline_protmode_entry > > -#include "video.h" > - > .align 2 > /* Keep in sync with cmdline.c:early_boot_opts_t type! */ > early_boot_opts: > --- a/xen/arch/x86/boot/video.S > +++ b/xen/arch/x86/boot/video.S > @@ -384,9 +384,6 @@ lmbad: leawbootsym(unknt), %si > jmp mode_menu > lmdef: ret > > -_setrec:jmp setrec # Ugly... > -_set_80x25: jmp set_80x25 > - > # Setting of user mode (AX=mode ID) => CF=success > mode_set: > movw%ax, bootsym(boot_vid_mode) > @@ -396,7 +393,7 @@ mode_set: > je setvesabysize > > testb $VIDEO_RECALC>>8, %ah > -jnz _setrec > +jnz setrec > > cmpb$VIDEO_FIRST_SPECIAL>>8, %ah > jz setspc > @@ -421,7 +418,7 @@ setspc: xorb%bh, %bh > > setmenu: > orb %al, %al# 80x25 is an exception > -jz _set_80x25 > +jz set_80x25 > > pushw %bx # Set mode chosen from menu > callmode_table # Build the mode table > @@ -441,36 +438,32 @@ check_vesa: > cmpw$0x004f, %ax > jnz setbad > > -leawvesa_mode_info, %di > -subb$VIDEO_FIRST_VESA>>8, %bh > -movw%bx, %cx# Get mode information structure > +leawvesa_mode_info, %di # Get mode information structure > +leaw-VIDEO_FIRST_VESA(%bx), %cx > movw$0x4f01, %ax > int $0x10 > -addb$VIDEO_FIRST_VESA>>8, %bh Is this the redundant instruction you are talking about, or ... (near the end)? I think I follow this as "no logical change", by leaving %bx intact throughout, and only clearing %ch as part of the %bx=>%cx copy. > cmpw$0x004f, %ax > jnz setbad > > movb(%di), %al # Check mode attributes. > andb$0x99, %al > cmpb$0x99, %al > -jnz _setbad # Doh! No linear frame buffer. > +jnz setbad # Doh! No linear frame buffer. > > pushw %bx > subb$VIDEO_FIRST_VESA>>8, %bh > -orw $0x4000, %bx# Use linear frame buffer > +orb $0x40, %bh # Use linear frame buffer > movw$0x4f02, %ax# VESA BIOS mode set call > int $0x10 > popw%bx > cmpw$0x004f, %ax# AL=4f if implemented > -jnz _setbad # AH=0 if OK > +jnz setbad # AH=0 if OK > > movb$1, bootsym(graphic_mode) # flag graphic mode > movw%bx, bootsym(video_mode) > stc > ret > > -_setbad: jmpsetbad # Ugly... > - > # Recalculate vertical display end registers -- this fixes various > # inconsistencies of extended modes on many adapters. Called when > # the VIDEO_RECALC flag is set in the mode ID. > @@ -515,7 +508,7 @@ setvesabysize: > leawmodelist,%si > 1: add $8,%si > cmpw$ASK_VGA,-8(%si)# End? > -je _setbad > +je setbad > movw-6(%si),%ax > cmpw%ax,bootsym(vesa_size)+0 > jne 1b > @@ -948,6 +941,7 @@ store_edid: > #endif > ret > > +.p2align 1 > mt_end: .word 0 # End of video mode table if built > edit_buf: .space 6 # Line editor buffer > card_name: .word 0 # Pointer to adapter name > @@ -991,6 +985,7 @@ vesa_name: .asciz "VESA" > > name_bann: .asciz "Video adapter: " > > +.p2align 1 > force_size: .word 0 # Use this size
Re: [Xen-devel] [PATCH v6 10/10] introduce a 'passthrough' configuration option to xl.cfg...
On 16.08.2019 19:20, Paul Durrant wrote: > ...and hence the ability to disable IOMMU mappings, and control EPT > sharing. > > This patch introduces a new 'libxl_passthrough' enumeration into > libxl_domain_create_info. The value will be set by xl either when it parses > a new 'passthrough' option in xl.cfg, or implicitly if there is passthrough > hardware specified for the domain. > > If the value of the passthrough configuration option is 'disabled' then > the XEN_DOMCTL_CDF_iommu flag will be clear in the xen_domctl_createdomain > flags, thus allowing the toolstack to control whether the domain gets > IOMMU mappings or not (where previously they were globally set). > > If the value of the passthrough configuration option is 'sync_pt' then > a new 'iommu_opts' field in xen_domctl_createdomain will be set with the > value XEN_DOMCTL_IOMMU_no_sharept. This will override the global default > set in iommu_hap_pt_share, thus allowing the toolstack to control whether > EPT sharing is used for the domain. > > Signed-off-by: Paul Durrant Reviewed-by: Jan Beulich with a question/suggestion and a nit: > --- a/xen/common/domain.c > +++ b/xen/common/domain.c > @@ -308,6 +308,13 @@ static int sanitise_domain_config(struct > xen_domctl_createdomain *config) > return -EINVAL; > } > > +if ( !(config->flags & XEN_DOMCTL_CDF_iommu) && config->iommu_opts ) > +{ > +dprintk(XENLOG_INFO, > +"IOMMU options specified but IOMMU not enabled\n"); > +return -EINVAL; > +} Seeing this I wonder whether XEN_DOMCTL_CDF_iommu wouldn't better be bit 0 of iommu_opts. > @@ -70,6 +70,10 @@ struct xen_domctl_createdomain { > > uint32_t flags; > > +#define _XEN_DOMCTL_IOMMU_no_sharept 0 > +#define XEN_DOMCTL_IOMMU_no_sharept (1U<<_XEN_DOMCTL_IOMMU_no_sharept) Please can you add the missing blanks around << ? Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2 08/12] livepatch: Add support for inline asm hotpatching expectations
> diff --git a/xen/test/livepatch/Makefile b/xen/test/livepatch/Makefile > index 23113d3418..067861903f 100644 > --- a/xen/test/livepatch/Makefile > +++ b/xen/test/livepatch/Makefile > @@ -27,6 +27,8 @@ LIVEPATCH_ACTION_HOOKS_NOFUNC := > xen_action_hooks_nofunc.livepatch > LIVEPATCH_ACTION_HOOKS_MARKER:= xen_action_hooks_marker.livepatch > LIVEPATCH_ACTION_HOOKS_NOAPPLY:= xen_action_hooks_noapply.livepatch > LIVEPATCH_ACTION_HOOKS_NOREVERT:= xen_action_hooks_norevert.livepatch > +LIVEPATCH_EXPECTATIONS:= xen_expectations.livepatch > +LIVEPATCH_EXPECTATIONS_FAIL:= xen_expectations_fail.livepatch > > LIVEPATCHES += $(LIVEPATCH) > LIVEPATCHES += $(LIVEPATCH_BYE) > @@ -40,6 +42,8 @@ LIVEPATCHES += $(LIVEPATCH_ACTION_HOOKS_NOFUNC) > LIVEPATCHES += $(LIVEPATCH_ACTION_HOOKS_MARKER) > LIVEPATCHES += $(LIVEPATCH_ACTION_HOOKS_NOAPPLY) > LIVEPATCHES += $(LIVEPATCH_ACTION_HOOKS_NOREVERT) > +LIVEPATCHES += $(LIVEPATCH_EXPECTATIONS) > +LIVEPATCHES += $(LIVEPATCH_EXPECTATIONS_FAIL) > > LIVEPATCH_DEBUG_DIR ?= $(DEBUG_DIR)/xen-livepatch > > @@ -54,7 +58,7 @@ uninstall: > > .PHONY: clean > clean:: > - rm -f *.o .*.o.d *.livepatch config.h > + rm -f *.o .*.o.d *.livepatch config.h expect_config.h > > # > # To compute these values we need the binary files: xen-syms > @@ -182,8 +186,27 @@ xen_actions_hooks_norevert.o: config.h > $(LIVEPATCH_ACTION_HOOKS_NOREVERT): xen_action_hooks_marker.o > xen_hello_world_func.o note.o xen_note.o > $(LD) $(LDFLAGS) $(build_id_linker) -r -o > $(LIVEPATCH_ACTION_HOOKS_NOREVERT) $^ > > +CODE_GET_EXPECT=$(shell objdump -d --insn-width=1 $(1) | grep -A6 -E > '<'$(2)'>:' | tail -n +2 | awk 'BEGIN {printf "{"} {printf "0x%s,", $$2}' | > sed 's/,$$/}/g') > +.PHONY: expect_config.h > +expect_config.h: EXPECT_BYTES=$(call > CODE_GET_EXPECT,$(BASEDIR)/xen-syms,xen_extra_version) > +expect_config.h: EXPECT_BYTES_COUNT=6 > +expect_config.h: xen_expectations.o > + (set -e; \ > + echo "#define EXPECT_BYTES $(EXPECT_BYTES)"; \ > + echo "#define EXPECT_BYTES_COUNT $(EXPECT_BYTES_COUNT)") > $@ > + > +xen_expectations.o: expect_config.h > + > +.PHONY: $(LIVEPATCH_EXPECTATIONS) > +$(LIVEPATCH_EXPECTATIONS): xen_expectations.o xen_hello_world_func.o note.o > xen_note.o > + $(LD) $(LDFLAGS) $(build_id_linker) -r -o $(LIVEPATCH_EXPECTATIONS) $^ > + > +.PHONY: $(LIVEPATCH_EXPECTATIONS_FAIL) > +$(LIVEPATCH_EXPECTATIONS_FAIL): xen_expectations_fail.o > xen_hello_world_func.o note.o xen_note.o > + $(LD) $(LDFLAGS) $(build_id_linker) -r -o > $(LIVEPATCH_EXPECTATIONS_FAIL) $^ > + > .PHONY: livepatch > livepatch: $(LIVEPATCH) $(LIVEPATCH_BYE) $(LIVEPATCH_REPLACE) > $(LIVEPATCH_NOP) $(LIVEPATCH_NO_XEN_BUILDID) \ > $(LIVEPATCH_PREPOST_HOOKS) $(LIVEPATCH_PREPOST_HOOKS_FAIL) > $(LIVEPATCH_ACTION_HOOKS) \ > $(LIVEPATCH_ACTION_HOOKS_NOFUNC) $(LIVEPATCH_ACTION_HOOKS_MARKER) > $(LIVEPATCH_ACTION_HOOKS_NOAPPLY) \ > - $(LIVEPATCH_ACTION_HOOKS_NOREVERT) > + $(LIVEPATCH_ACTION_HOOKS_NOREVERT) $(LIVEPATCH_EXPECTATIONS) > $(LIVEPATCH_EXPECTATIONS_FAIL) > diff --git a/xen/test/livepatch/xen_expectations.c > b/xen/test/livepatch/xen_expectations.c > new file mode 100644 > index 00..c8175a458b > --- /dev/null > +++ b/xen/test/livepatch/xen_expectations.c > @@ -0,0 +1,41 @@ > +/* > + * Copyright (c) 2019 Amazon.com, Inc. or its affiliates. All rights > reserved. > + * > + */ > + > +#include "expect_config.h" > +#include > +#include > +#include > +#include > +#include > + > +#include > + > +static const char livepatch_exceptions_str[] = "xen_extra_version"; > +extern const char *xen_hello_world(void); > + > +struct livepatch_func __section(".livepatch.funcs") livepatch_exceptions = { > +.version = LIVEPATCH_PAYLOAD_VERSION, > +.name = livepatch_exceptions_str, > +.new_addr = xen_hello_world, > +.old_addr = xen_extra_version, > +.new_size = EXPECT_BYTES_COUNT, > +.old_size = EXPECT_BYTES_COUNT, > +.expect = { > +.enabled = 1, > +.len = EXPECT_BYTES_COUNT, > +.data = EXPECT_BYTES > +}, > + > +}; When I compile with 32-bit ARM 'make tests' I get: arm-eabi-ld-EL -EL --build-id=sha1 -r -o xen_action_hooks_norevert.livepatch xen_action_hooks_marker.o xen_hello_world_func.o note.o xen_note.o make[3]: Circular expect_config.h <- xen_expectations.o dependency dropped. objdump: can't disassemble for architecture UNKNOWN! (set -e; \ echo "#define EXPECT_BYTES {"; \ echo "#define EXPECT_BYTES_COUNT 6") > expect_config.h arm-eabi-gcc -marm -DBUILD_ID -fno-strict-aliasing -std=gnu99 -Wall -Wstrict-prototypes -Wdeclaration-after-statement -Wno-unused-but-set-variable -Wno-unused-local-typedefs -O1 -fno-omit-frame-pointer -nostdinc -fno-builtin -fno-common -Werror -Wredundant-decls -Wno-pointer-arith -Wvla -pipe -D__XEN__ -include /home/konrad/A20/xen.git/xen/include/xen/config.h '-D__OBJECT_FILE__="xen_expectations.o"'
Re: [Xen-devel] [PATCH] x86: clear RDRAND CPUID bit on AMD family 15h/16h
On 29.08.19 13:42, Andrew Cooper wrote: On 29/08/2019 10:28, Jan Beulich wrote: Inspired by Linux commit c49a0a80137c7ca7d6ced4c812c9e07a949f6f24: There have been reports of RDRAND issues after resuming from suspend on some AMD family 15h and family 16h systems. This issue stems from a BIOS not performing the proper steps during resume to ensure RDRAND continues to function properly. Update the CPU initialization to clear the RDRAND CPUID bit for any family 15h and 16h processor that supports RDRAND. If it is known that the family 15h or family 16h system does not have an RDRAND resume issue or that the system will not be placed in suspend, the "cpuid=rdrand" kernel parameter can be used to stop the clearing of the RDRAND CPUID bit. Note, that clearing the RDRAND CPUID bit does not prevent a processor that normally supports the RDRAND instruction from executing it. So any code that determined the support based on family and model won't #UD. Yeah, but it will no longer see random numbers after resume. Signed-off-by: Jan Beulich --- Slightly RFC, in particular because of the change to parse_xen_cpuid(): Alternative approach suggestions are welcome. This issue was on my radar, but I hadn't got around to starting a patch yet. I was wondering if we could perhaps default it to whether S3 was available, but looking at the code which prints "ACPI sleep modes: S3", it doesn't appear to be related to any real ACPI information. Wouldn't it make more sense to inhibit suspend/resume instead? I'm quite sure a machine running Xen is very rarely put to S3. Juergen ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [OSSTEST PATCH] make-hosts-flight: Prefer stretch to jessie
The list of suites in ALL_SUITES is in decreasing order of preference: it gets passed to cs-hosts-list, where the order is significant. Without this patch, we try to commission hosts by running jessie if the host flags seem to say jessie would be supported. That's not really sensible. CC: Dominic Brekau Signed-off-by: Ian Jackson --- make-hosts-flight | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/make-hosts-flight b/make-hosts-flight index f7422932..92da1c7c 100755 --- a/make-hosts-flight +++ b/make-hosts-flight @@ -26,7 +26,8 @@ blessing=$4 buildflight=$5 : ${ALL_ARCHES:=amd64 i386 arm64 armhf} -: ${ALL_SUITES:=jessie stretch} +: ${ALL_SUITES:=stretch jessie} +# ^ most preferred suite first : ${PERHOST_MAXWAIT:=2} # seconds -- 2.11.0 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v6 07/10] use is_iommu_enabled() where appropriate...
On 16.08.2019 19:19, Paul Durrant wrote: > ...rather than testing the global iommu_enabled flag and ops pointer. > > Now that there is a per-domain flag indicating whether the domain is > permitted to use the IOMMU (which determines whether the ops pointer will > be set), many tests of the global iommu_enabled flag and ops pointer can > be translated into tests of the per-domain flag. Some of the other tests of > purely the global iommu_enabled flag can also be translated into tests of > the per-domain flag. > > NOTE: The comment in iommu_share_p2m_table() is also fixed; need_iommu() > disappeared some time ago. Also, whilst the style of the 'if' in > flask_iommu_resource_use_perm() is fixed, I have not translated any > instances of u32 into uint32_t to keep consistency. IMO such a > translation would be better done globally for the source module in > a separate patch. > The change in the initialization of the 'hd' variable in iommu_map() > and iommu_unmap() is done to keep the PV shim build happy. Without > this change it will fail to compile with errors of the form: > > iommu.c:361:32: error: unused variable ‘hd’ [-Werror=unused-variable] > const struct domain_iommu *hd = dom_iommu(d); > ^~ Why would that be? Afaict there's no short-circuiting of is_iommu_enabled() (albeit there could be, as we don't mean the shim to have/use an IOMMU). Oh - I guess I see it: Instead of this change what I think we want is for x86's iommu_call() to evaluate its 1st argument. Otherwise we're liable to run into the same issue elsewhere again. > --- a/xen/xsm/flask/hooks.c > +++ b/xen/xsm/flask/hooks.c > @@ -886,7 +886,7 @@ static int flask_map_domain_msi (struct domain *d, int > irq, const void *data, > #endif > } > > -static u32 flask_iommu_resource_use_perm(void) > +static u32 flask_iommu_resource_use_perm(struct domain *d) const? With these adjustments Reviewed-by: Jan Beulich Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v6 08/10] remove late (on-demand) construction of IOMMU page tables
On 16.08.2019 19:19, Paul Durrant wrote: > --- a/xen/drivers/passthrough/iommu.c > +++ b/xen/drivers/passthrough/iommu.c > @@ -146,6 +146,17 @@ static int __init parse_dom0_iommu_param(const char *s) > } > custom_param("dom0-iommu", parse_dom0_iommu_param); > > +static void __hwdom_init check_hwdom_reqs(struct domain *d) This really should have const, but I realize ... > +{ > +if ( iommu_hwdom_none || !paging_mode_translate(d) ) > +return; > + > +arch_iommu_check_autotranslated_hwdom(d); ... this one wants non-const (for - afaict - no reason). > @@ -159,129 +170,44 @@ int iommu_domain_init(struct domain *d) > return ret; > > hd->platform_ops = iommu_get_ops(); > -return hd->platform_ops->init(d); > -} > +ret = hd->platform_ops->init(d); > +if ( ret ) > +return ret; > > -static void __hwdom_init check_hwdom_reqs(struct domain *d) > -{ > -if ( iommu_hwdom_none || !paging_mode_translate(d) ) > -return; > +/* > + * NB: 'relaxed' h/w domains don't need the IOMMU mappings to be kept > + * in-sync with their assigned pages because all host RAM will be > + * mapped during hwdom_init(). > + */ Doesn't this comment belong to ... > +if ( is_hardware_domain(d) ) > +check_hwdom_reqs(d); /* may modify iommu_hwdom_strict */ > > -arch_iommu_check_autotranslated_hwdom(d); > +if ( !is_hardware_domain(d) || iommu_hwdom_strict ) > +hd->need_sync = !iommu_use_hap_pt(d); ... this if()? > @@ -629,8 +552,7 @@ static void iommu_dump_p2m_table(unsigned char key) > ops = iommu_get_ops(); > for_each_domain(d) > { > -if ( is_hardware_domain(d) || > - dom_iommu(d)->status < IOMMU_STATUS_initialized ) > +if ( !is_iommu_enabled(d) ) > continue; Didn't you agree to retain the hwdom part of the condition here? Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v9 13/15] x86/microcode: Synchronize late microcode loading
On 19.08.2019 03:25, Chao Gao wrote: > @@ -232,6 +276,34 @@ bool microcode_update_cache(struct microcode_patch > *patch) > return true; > } > > +/* Wait for a condition to be met with a timeout (us). */ > +static int wait_for_condition(int (*func)(void *data), void *data, > + unsigned int timeout) > +{ > +while ( !func(data) ) > +{ > +if ( !timeout-- ) > +{ > +printk("CPU%u: Timeout in %pS\n", > + smp_processor_id(), __builtin_return_address(0)); > +return -EBUSY; > +} > +udelay(1); > +} > + > +return 0; > +} > + > +static int wait_cpu_callin(void *nr) > +{ > +return cpumask_weight(_callin_map) >= (unsigned long)nr; > +} > + > +static int wait_cpu_callout(void *nr) > +{ > +return atomic_read(_out) >= (unsigned long)nr; > +} Since wait_for_condition() is used with only these two functions as callbacks, they should imo return bool and take const void *. > @@ -265,37 +337,155 @@ static int microcode_update_cpu(const struct > microcode_patch *patch) > return err; > } > > -static long do_microcode_update(void *patch) > +static int slave_thread_fn(void) > +{ > +unsigned int cpu = smp_processor_id(); > +unsigned int master = cpumask_first(this_cpu(cpu_sibling_mask)); > + > +while ( loading_state != LOADING_CALLIN ) > +cpu_relax(); > + > +cpumask_set_cpu(cpu, _callin_map); > + > +while ( loading_state != LOADING_EXIT ) > +cpu_relax(); > + > +/* Copy update revision from the "master" thread. */ > +this_cpu(cpu_sig).rev = per_cpu(cpu_sig, master).rev; > + > +return 0; > +} > + > +static int master_thread_fn(const struct microcode_patch *patch) > +{ > +unsigned int cpu = smp_processor_id(); > +int ret = 0; > + > +while ( loading_state != LOADING_CALLIN ) > +cpu_relax(); > + > +cpumask_set_cpu(cpu, _callin_map); > + > +while ( loading_state != LOADING_ENTER ) > +cpu_relax(); > + > +/* > + * If an error happened, control thread would set 'loading_state' > + * to LOADING_EXIT. Don't perform ucode loading for this case > + */ > +if ( loading_state == LOADING_EXIT ) > +return ret; Even if the producer transitions this through ENTER to EXIT, the observer here may never get to see the ENTER state, and hence never exit the loop above. You want either < ENTER or == CALLIN. > +ret = microcode_ops->apply_microcode(patch); > +if ( !ret ) > +atomic_inc(_updated); > +atomic_inc(_out); > + > +while ( loading_state != LOADING_EXIT ) > +cpu_relax(); > + > +return ret; > +} As a cosmetic remark, I don't think "master" and "slave" are suitable terms here. "primary" and "secondary" would imo come closer to what the threads' relationship is. > +static int control_thread_fn(const struct microcode_patch *patch) > { > -unsigned int cpu; > +unsigned int cpu = smp_processor_id(), done; > +unsigned long tick; > +int ret; > > -/* Store the patch after a successful loading */ > -if ( !microcode_update_cpu(patch) && patch ) > +/* Allow threads to call in */ > +loading_state = LOADING_CALLIN; > +smp_mb(); Why not just smp_wmb()? (Same further down then.) > +cpumask_set_cpu(cpu, _callin_map); > + > +/* Waiting for all threads calling in */ > +ret = wait_for_condition(wait_cpu_callin, > + (void *)(unsigned long)num_online_cpus(), > + MICROCODE_CALLIN_TIMEOUT_US); > +if ( ret ) { Misplaced brace. > +static int do_microcode_update(void *patch) const? > @@ -326,19 +523,67 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) > buf, unsigned long len) > { > ret = PTR_ERR(patch); > printk(XENLOG_INFO "Parsing microcode blob error %d\n", ret); > -goto free; > +goto put; > } > > if ( !patch ) > { > printk(XENLOG_INFO "No ucode found. Update aborted!\n"); > ret = -EINVAL; > -goto free; > +goto put; > +} > + > +cpumask_clear(_callin_map); > +atomic_set(_out, 0); > +atomic_set(_updated, 0); > +loading_state = LOADING_PREPARE; > + > +/* Calculate the number of online CPU core */ > +nr_cores = 0; > +for_each_online_cpu(cpu) > +if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) ) > +nr_cores++; > + > +printk(XENLOG_INFO "%u cores are to update their microcode\n", nr_cores); > + > +/* > + * We intend to disable interrupt for long time, which may lead to > + * watchdog timeout. > + */ > +watchdog_disable(); > +/* > + * Late loading dance. Why the heavy-handed stop_machine effort? > + * > + * - HT siblings must be idle and not execute other code while the other > + * sibling is loading microcode in order to avoid any negative > + * interactions cause by the
Re: [Xen-devel] [PATCH] x86: properly gate clearing of PKU feature
On 29.08.2019 15:21, Andrew Cooper wrote: > On 29/08/2019 14:13, Jan Beulich wrote: >> On 29.08.2019 12:46, Andrew Cooper wrote: >>> On 29/08/2019 10:27, Jan Beulich wrote: setup_clear_cpu_cap() is __init and hence may not be called post-boot. Note that opt_pku nevertheless is not getting __initdata added - see e.g. commit 43fa95ae6a ("mm: make opt_bootscrub non-init"). Signed-off-by: Jan Beulich >>> Acked-by: Andrew Cooper >>> >>> However, I'm tempted to suggest that this logic disappears entirely. >>> Given that we don't support it for PV guests, and can't without taking a >>> CR4 write on toggle_kernel_mode(), all this actually controls is whether >>> the bit finds its way into the HVM max policy. >> Well, if you mean replacing opt_pku by an addition to >> parse_xen_cpuid(), then I'll be happy to do that as a follow-on. >> If there is another direction you've been considering, then >> you'd have to at least outline it. > > Well - I did say "disappear entirely". I don't see it as having, or > liable to gain, any useful purpose. > > In reality, it was subsumed into cpuid= by my patch to allow all bits to > be tweak-able. Did you post this patch already (I don't seem to recall; I only recall there being the plan to do this)? Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v6 09/10] iommu: tidy up iommu_use_hap_pt() and need_iommu_pt_sync() macros
On 16.08.2019 19:20, Paul Durrant wrote: > --- a/xen/drivers/passthrough/iommu.c > +++ b/xen/drivers/passthrough/iommu.c > @@ -102,8 +102,10 @@ static int __init parse_iommu_param(const char *s) > iommu_hwdom_passthrough = val; > else if ( (val = parse_boolean("dom0-strict", s, ss)) >= 0 ) > iommu_hwdom_strict = val; > +#ifndef CONFIG_ARM > else if ( (val = parse_boolean("sharept", s, ss)) >= 0 ) > iommu_hap_pt_share = val; > +#endif > else > rc = -EINVAL; I think you/we should go further here: Arm should #define this to true, and here we should have "#ifndef iommu_hap_pt_share". > --- a/xen/include/xen/iommu.h > +++ b/xen/include/xen/iommu.h > @@ -268,6 +268,17 @@ struct domain_iommu { > #define iommu_set_feature(d, f) set_bit(f, dom_iommu(d)->features) > #define iommu_clear_feature(d, f) clear_bit(f, dom_iommu(d)->features) > > +/* Are we using the domain P2M table as its IOMMU pagetable? */ > +#define iommu_use_hap_pt(d) \ > +(hap_enabled(d) && is_iommu_enabled(d) && iommu_hap_pt_share) > + > +/* Does the IOMMU pagetable need to be kept synchronized with the P2M */ > +#ifdef CONFIG_HAS_PASSTHROUGH > +#define need_iommu_pt_sync(d) (dom_iommu(d)->need_sync) > +#else > +#define need_iommu_pt_sync(d) false I think you'd better evaluate d here; one (somewhat in risk of opposition) variant would be #define need_iommu_pt_sync(d) (!(d)) Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 3/3] x86: a little bit of 16-bit video mode setting code cleanup
On 29.08.2019 16:08, Andrew Cooper wrote: > On 14/06/2019 12:38, Jan Beulich wrote: >> To "compensate" for the code size growth by an earlier change: >> - drop "trampoline" labels (in almost all cases the target label is >> reachable with an 8-bit-displacement branch anyway, and a single 16- >> bit-displacement branch is still better than a pair of two branches) > > Do you happen to know why we any to start with? It can't plausibly be > for manual code alignment reasons. I have no idea - my guess is that some pre-386 code was cloned, or it was written by someone used to the pre-386 limitations. >> @@ -421,7 +418,7 @@ setspc: xorb%bh, %bh >> >> setmenu: >> orb %al, %al# 80x25 is an exception >> -jz _set_80x25 >> +jz set_80x25 >> >> pushw %bx # Set mode chosen from menu >> callmode_table # Build the mode table >> @@ -441,36 +438,32 @@ check_vesa: >> cmpw$0x004f, %ax >> jnz setbad >> >> -leawvesa_mode_info, %di >> -subb$VIDEO_FIRST_VESA>>8, %bh >> -movw%bx, %cx# Get mode information structure >> +leawvesa_mode_info, %di # Get mode information structure >> +leaw-VIDEO_FIRST_VESA(%bx), %cx >> movw$0x4f01, %ax >> int $0x10 >> -addb$VIDEO_FIRST_VESA>>8, %bh > > Is this the redundant instruction you are talking about, or ... (near > the end)? No, here it's simply ... > I think I follow this as "no logical change", by leaving %bx intact > throughout, and only clearing %ch as part of the %bx=>%cx copy. ... as you say. >> --- a/xen/arch/x86/boot/wakeup.S >> +++ b/xen/arch/x86/boot/wakeup.S >> @@ -30,7 +30,7 @@ ENTRY(wakeup_start) >> jne bogus_real_magic >> >> # for acpi_sleep=s3_bios >> -testl $1, wakesym(video_flags) >> +testb $1, wakesym(video_flags) > > video_flags is currently .long, and, AFAICT, uses 2 bits even after this > series. We could get better code reduction by shrinking it to .byte. Well, the goal of this patch is to play with the assembly code. To do what you suggest I'd have to touch a C (or really .h) file as well. I'm fine doing this change though, but preferably in a separate patch. >> @@ -38,9 +38,9 @@ ENTRY(wakeup_start) >> movw%ax, %ss# Need this? How to ret if clobbered? >> >> 1: # for acpi_sleep=s3_mode >> -testl $2, wakesym(video_flags) >> +testb $2, wakesym(video_flags) >> jz 1f >> -movlwakesym(video_mode), %eax >> +movwwakesym(video_mode), %ax > > Similarly, video_mode can become .word, I think. But a word is less efficient to access (because of the operand size override), so I'd prefer to not shrink this one. Just let me know whether you agree, and I'll cook up a patch accordingly. >> @@ -55,48 +55,26 @@ ENTRY(wakeup_start) >> lmsw%ax # Turn on CR0.PE >> ljmpl $BOOT_CS32, $bootsym_rel(wakeup_32, 6) >> >> -/* This code uses an extended set of video mode numbers. These include: >> - * Aliases for standard modes >> - * NORMAL_VGA (-1) >> - * EXTENDED_VGA (-2) >> - * ASK_VGA (-3) >> - * Video modes numbered by menu position -- NOT RECOMMENDED because of lack >> - * of compatibility when extending the table. These are between 0x00 and >> 0xff. >> - */ >> -#define VIDEO_FIRST_MENU 0x >> - >> -/* Standard BIOS video modes (BIOS number + 0x0100) */ >> -#define VIDEO_FIRST_BIOS 0x0100 >> - >> -/* VESA BIOS video modes (VESA number + 0x0200) */ >> -#define VIDEO_FIRST_VESA 0x0200 >> - >> -/* Video7 special modes (BIOS number + 0x0900) */ >> -#define VIDEO_FIRST_V7 0x0900 >> - >> # Setting of user mode (AX=mode ID) => CF=success >> mode_setw: >> movw%ax, %bx >> cmpb$VIDEO_FIRST_VESA>>8, %ah >> jnc check_vesaw >> -decb%ah > > ... or is this the no functional change? If so, I'm not sure I agree, > given the clc below. How does the CLC matter? CF, as the comment says, indicates success. Whether or not there's a DEC ahead of it (which doesn't even alter CF) doesn't matter, does it? In any event - yes, that's the dead insn. I'll mention the function name in the description. Jan >> setbadw: clc >> ret ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH RFC 2/3] x86/ACPI: restore VESA mode upon resume from S3
On 14/06/2019 12:37, Jan Beulich wrote: > In order for "acpi_sleep=s3_mode" to have any effect, we should record > the video mode we switched to during boot. Since right now there's mode > setting code for VESA modes only in the resume case, record the mode > just in that one case. > > Signed-off-by: Jan Beulich > --- > RFC: On the box that I've been trying to test this on this didn't really > make a difference (in the random cases where resume works in the > first place there): The graphics card looks to remain powered off > even after the Dom0 kernel has resumed. Additionally using > "acpi_sleep=s3_bios" didn't make a difference either. Furthermore > it looks like the serial console (connected via PCI card) doesn't > work (yet) immediately after resume (I suppose it too is powered > down), and resume hangs altogether with it in use. Hence it's sort > of difficult to actually debug anything here. While you were away, I had an awful time debugging c/s 7aae9c1c91, even with COM1. I think we should take some proactive steps to try and make the serial settings more robust, so printk() does continue to function before the console irq gets restored. In the case of legacy COM1/COM2, this should be falling back to polled mode which at least lets the characters get out, and for PCI serial cards, we should probably make an attempt to bring it out of D3 ahead of relying on dom0 to do this. > --- > I'm wondering actually whether the user having to explicitly request the > mode restoration is a good model: Why would we _not_ want to restore the > mode we've set during boot? In the worst case Dom0 kernel or X will > change the mode another time. I think I agree. I can't see anything good which can come from offering a choice here, and I am all for reducing the quantity of 16bit VGA code we have. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [linux-next test] 140745: regressions - FAIL
flight 140745 linux-next real [real] http://logs.test-lab.xenproject.org/osstest/logs/140745/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-arm64-arm64-examine11 examine-serial/bootloader fail REGR. vs. 140676 Tests which did not succeed, but are not blocking: test-armhf-armhf-libvirt 14 saverestore-support-check fail blocked in 140676 test-armhf-armhf-libvirt-raw 13 saverestore-support-check fail blocked in 140676 test-amd64-i386-examine 8 reboot fail like 140676 test-amd64-i386-libvirt 7 xen-boot fail like 140676 test-amd64-i386-qemut-rhel6hvm-intel 7 xen-boot fail like 140676 test-amd64-i386-libvirt-pair 10 xen-boot/src_hostfail like 140676 test-amd64-i386-libvirt-pair 11 xen-boot/dst_hostfail like 140676 test-amd64-i386-xl-qemuu-win10-i386 7 xen-boot fail like 140676 test-amd64-i386-qemut-rhel6hvm-amd 7 xen-bootfail like 140676 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 7 xen-boot fail like 140676 test-amd64-i386-xl-shadow 7 xen-boot fail like 140676 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 7 xen-boot fail like 140676 test-amd64-i386-xl-qemuu-debianhvm-amd64 7 xen-boot fail like 140676 test-amd64-i386-xl-xsm7 xen-boot fail like 140676 test-amd64-i386-xl7 xen-boot fail like 140676 test-amd64-i386-qemuu-rhel6hvm-amd 7 xen-bootfail like 140676 test-amd64-i386-xl-qemut-debianhvm-amd64 7 xen-boot fail like 140676 test-amd64-i386-freebsd10-i386 7 xen-bootfail like 140676 test-amd64-i386-xl-qemuu-debianhvm-i386-xsm 7 xen-boot fail like 140676 test-amd64-i386-xl-raw7 xen-boot fail like 140676 test-amd64-i386-pair 10 xen-boot/src_hostfail like 140676 test-amd64-i386-pair 11 xen-boot/dst_hostfail like 140676 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 7 xen-boot fail like 140676 test-amd64-i386-xl-qemut-win10-i386 7 xen-boot fail like 140676 test-amd64-i386-xl-qemut-ws16-amd64 7 xen-boot fail like 140676 test-amd64-i386-libvirt-xsm 7 xen-boot fail like 140676 test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow 7 xen-boot fail like 140676 test-amd64-i386-qemuu-rhel6hvm-intel 7 xen-boot fail like 140676 test-amd64-i386-xl-qemuu-ws16-amd64 7 xen-boot fail like 140676 test-amd64-i386-xl-qemut-debianhvm-i386-xsm 7 xen-boot fail like 140676 test-amd64-i386-xl-qemut-win7-amd64 7 xen-boot fail like 140676 test-amd64-i386-xl-pvshim 7 xen-boot fail like 140676 test-amd64-i386-freebsd10-amd64 7 xen-boot fail like 140676 test-amd64-i386-xl-qemuu-win7-amd64 7 xen-boot fail like 140676 test-amd64-i386-xl-qemuu-ovmf-amd64 7 xen-boot fail like 140676 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stopfail like 140676 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stopfail like 140676 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 140676 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stopfail like 140676 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail never pass test-arm64-arm64-xl-seattle 13 migrate-support-checkfail never pass test-arm64-arm64-xl-seattle 14 saverestore-support-checkfail never pass test-amd64-amd64-libvirt 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2 fail never pass test-arm64-arm64-xl 13 migrate-support-checkfail never pass test-arm64-arm64-xl 14 saverestore-support-checkfail never pass test-arm64-arm64-xl-thunderx 13 migrate-support-checkfail never pass test-arm64-arm64-xl-credit1 13 migrate-support-checkfail never pass test-arm64-arm64-xl-thunderx 14 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit1 14 saverestore-support-checkfail never pass test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail never pass test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-arndale 13 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 14 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit2 13 migrate-support-checkfail never pass test-arm64-arm64-xl-credit2 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit1 13 migrate-support-checkfail never pass
Re: [Xen-devel] [PATCH] x86: clear RDRAND CPUID bit on AMD family 15h/16h
On 29/08/2019 13:39, Juergen Gross wrote: > On 29.08.19 14:27, Jan Beulich wrote: >> On 29.08.2019 14:01, Juergen Gross wrote: >>> On 29.08.19 13:42, Andrew Cooper wrote: On 29/08/2019 10:28, Jan Beulich wrote: > Inspired by Linux commit c49a0a80137c7ca7d6ced4c812c9e07a949f6f24: > > There have been reports of RDRAND issues after resuming from > suspend on > some AMD family 15h and family 16h systems. This issue stems > from a BIOS > not performing the proper steps during resume to ensure > RDRAND continues > to function properly. > > Update the CPU initialization to clear the RDRAND CPUID bit > for any family > 15h and 16h processor that supports RDRAND. If it is known > that the family > 15h or family 16h system does not have an RDRAND resume > issue or that the > system will not be placed in suspend, the "cpuid=rdrand" > kernel parameter > can be used to stop the clearing of the RDRAND CPUID bit. > > Note, that clearing the RDRAND CPUID bit does not prevent a > processor > that normally supports the RDRAND instruction from executing > it. So any > code that determined the support based on family and model > won't #UD. >>> >>> Yeah, but it will no longer see random numbers after resume. >> >> That's the implication; note that I've taken the text from the >> Linux commit. >> > Signed-off-by: Jan Beulich > --- > Slightly RFC, in particular because of the change to > parse_xen_cpuid(): > Alternative approach suggestions are welcome. This issue was on my radar, but I hadn't got around to starting a patch yet. I was wondering if we could perhaps default it to whether S3 was available, but looking at the code which prints "ACPI sleep modes: S3", it doesn't appear to be related to any real ACPI information. >>> >>> Wouldn't it make more sense to inhibit suspend/resume instead? >> >> That's an alternative option. But if anything I'd see us providing >> both, not the one you suggest instead of what the patch here does. >> >>> I'm quite sure a machine running Xen is very rarely put to S3. >> >> I'm not at all sure about this. > > Suspend/resume in Xen was broken for more than 3 months in the late 4.12 > development phase and nobody noticed it. Only when I started my core > scheduling series I came across the issue. OpenXT and Qubes will come after you with sticks... For traditional server scenarios, S3 is rare to non-existent, but Xen also runs in a number of well known cases where S3 is very important. The reason S3 breaks frequently (see also Jan's IRQ adjustments broke it for a while in staging), is because none of the developers who work in these areas use it. I now have a test scenario using a CoffeeLake system sat on my desk, but it only gets testing when I'm poking the low level code, or when I suspect a change might have an impact. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v6 08/10] remove late (on-demand) construction of IOMMU page tables
> -Original Message- > From: Jan Beulich > Sent: 29 August 2019 14:39 > To: Paul Durrant > Cc: xen-devel@lists.xenproject.org; Julien Grall ; > Alexandru Isaila > ; Petre Pircalabu ; > Razvan Cojocaru > ; Andrew Cooper ; Roger > Pau Monne > ; Volodymyr Babchuk ; > George Dunlap > ; Ian Jackson ; Stefano > Stabellini > ; KonradRzeszutek Wilk ; > Tamas K Lengyel > ; Tim (Xen.org) ; Wei Liu > Subject: Re: [PATCH v6 08/10] remove late (on-demand) construction of IOMMU > page tables > > On 16.08.2019 19:19, Paul Durrant wrote: > > --- a/xen/drivers/passthrough/iommu.c > > +++ b/xen/drivers/passthrough/iommu.c > > @@ -146,6 +146,17 @@ static int __init parse_dom0_iommu_param(const char *s) > > } > > custom_param("dom0-iommu", parse_dom0_iommu_param); > > > > +static void __hwdom_init check_hwdom_reqs(struct domain *d) > > This really should have const, but I realize ... > > > +{ > > +if ( iommu_hwdom_none || !paging_mode_translate(d) ) > > +return; > > + > > +arch_iommu_check_autotranslated_hwdom(d); > > ... this one wants non-const (for - afaict - no reason). > > > @@ -159,129 +170,44 @@ int iommu_domain_init(struct domain *d) > > return ret; > > > > hd->platform_ops = iommu_get_ops(); > > -return hd->platform_ops->init(d); > > -} > > +ret = hd->platform_ops->init(d); > > +if ( ret ) > > +return ret; > > > > -static void __hwdom_init check_hwdom_reqs(struct domain *d) > > -{ > > -if ( iommu_hwdom_none || !paging_mode_translate(d) ) > > -return; > > +/* > > + * NB: 'relaxed' h/w domains don't need the IOMMU mappings to be kept > > + * in-sync with their assigned pages because all host RAM will be > > + * mapped during hwdom_init(). > > + */ > > Doesn't this comment belong to ... > > > +if ( is_hardware_domain(d) ) > > +check_hwdom_reqs(d); /* may modify iommu_hwdom_strict */ > > > > -arch_iommu_check_autotranslated_hwdom(d); > > +if ( !is_hardware_domain(d) || iommu_hwdom_strict ) > > +hd->need_sync = !iommu_use_hap_pt(d); > > ... this if()? Yes, it's probably adrift from where it should be now. > > > @@ -629,8 +552,7 @@ static void iommu_dump_p2m_table(unsigned char key) > > ops = iommu_get_ops(); > > for_each_domain(d) > > { > > -if ( is_hardware_domain(d) || > > - dom_iommu(d)->status < IOMMU_STATUS_initialized ) > > +if ( !is_iommu_enabled(d) ) > > continue; > > Didn't you agree to retain the hwdom part of the condition here? > Indeed I have as of today, but this was posted 2 weeks ago :-) Paul > Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 3/3] x86: a little bit of 16-bit video mode setting code cleanup
On 29/08/2019 15:23, Jan Beulich wrote: > On 29.08.2019 16:08, Andrew Cooper wrote: >> On 14/06/2019 12:38, Jan Beulich wrote: >>> To "compensate" for the code size growth by an earlier change: >>> - drop "trampoline" labels (in almost all cases the target label is >>> reachable with an 8-bit-displacement branch anyway, and a single 16- >>> bit-displacement branch is still better than a pair of two branches) >> Do you happen to know why we any to start with? It can't plausibly be >> for manual code alignment reasons. > I have no idea - my guess is that some pre-386 code was cloned, or > it was written by someone used to the pre-386 limitations. > >>> @@ -421,7 +418,7 @@ setspc: xorb%bh, %bh >>> >>> setmenu: >>> orb %al, %al# 80x25 is an exception >>> -jz _set_80x25 >>> +jz set_80x25 >>> >>> pushw %bx # Set mode chosen from menu >>> callmode_table # Build the mode table >>> @@ -441,36 +438,32 @@ check_vesa: >>> cmpw$0x004f, %ax >>> jnz setbad >>> >>> -leawvesa_mode_info, %di >>> -subb$VIDEO_FIRST_VESA>>8, %bh >>> -movw%bx, %cx# Get mode information structure >>> +leawvesa_mode_info, %di # Get mode information structure >>> +leaw-VIDEO_FIRST_VESA(%bx), %cx >>> movw$0x4f01, %ax >>> int $0x10 >>> -addb$VIDEO_FIRST_VESA>>8, %bh >> Is this the redundant instruction you are talking about, or ... (near >> the end)? > No, here it's simply ... > >> I think I follow this as "no logical change", by leaving %bx intact >> throughout, and only clearing %ch as part of the %bx=>%cx copy. > ... as you say. > >>> --- a/xen/arch/x86/boot/wakeup.S >>> +++ b/xen/arch/x86/boot/wakeup.S >>> @@ -30,7 +30,7 @@ ENTRY(wakeup_start) >>> jne bogus_real_magic >>> >>> # for acpi_sleep=s3_bios >>> -testl $1, wakesym(video_flags) >>> +testb $1, wakesym(video_flags) >> video_flags is currently .long, and, AFAICT, uses 2 bits even after this >> series. We could get better code reduction by shrinking it to .byte. > Well, the goal of this patch is to play with the assembly code. To > do what you suggest I'd have to touch a C (or really .h) file as > well. I'm fine doing this change though, but preferably in a separate > patch. This is fine, and may indeed want to wait until David has finished trampoline work. > >>> @@ -38,9 +38,9 @@ ENTRY(wakeup_start) >>> movw%ax, %ss# Need this? How to ret if clobbered? >>> >>> 1: # for acpi_sleep=s3_mode >>> -testl $2, wakesym(video_flags) >>> +testb $2, wakesym(video_flags) >>> jz 1f >>> -movlwakesym(video_mode), %eax >>> +movwwakesym(video_mode), %ax >> Similarly, video_mode can become .word, I think. > But a word is less efficient to access (because of the operand size > override), so I'd prefer to not shrink this one. Just let me know > whether you agree, and I'll cook up a patch accordingly. This is 16 bit code so it is movl which has the operand size prefix, not movw. It is extern'd in C, but not wrapped in bootsym(), and I can't see it being read anywhere. > >>> @@ -55,48 +55,26 @@ ENTRY(wakeup_start) >>> lmsw%ax # Turn on CR0.PE >>> ljmpl $BOOT_CS32, $bootsym_rel(wakeup_32, 6) >>> >>> -/* This code uses an extended set of video mode numbers. These include: >>> - * Aliases for standard modes >>> - * NORMAL_VGA (-1) >>> - * EXTENDED_VGA (-2) >>> - * ASK_VGA (-3) >>> - * Video modes numbered by menu position -- NOT RECOMMENDED because of lack >>> - * of compatibility when extending the table. These are between 0x00 and >>> 0xff. >>> - */ >>> -#define VIDEO_FIRST_MENU 0x >>> - >>> -/* Standard BIOS video modes (BIOS number + 0x0100) */ >>> -#define VIDEO_FIRST_BIOS 0x0100 >>> - >>> -/* VESA BIOS video modes (VESA number + 0x0200) */ >>> -#define VIDEO_FIRST_VESA 0x0200 >>> - >>> -/* Video7 special modes (BIOS number + 0x0900) */ >>> -#define VIDEO_FIRST_V7 0x0900 >>> - >>> # Setting of user mode (AX=mode ID) => CF=success >>> mode_setw: >>> movw%ax, %bx >>> cmpb$VIDEO_FIRST_VESA>>8, %ah >>> jnc check_vesaw >>> -decb%ah >> ... or is this the no functional change? If so, I'm not sure I agree, >> given the clc below. > How does the CLC matter? CF, as the comment says, indicates success. > Whether or not there's a DEC ahead of it (which doesn't even alter > CF) doesn't matter, does it? I'd forgotten that dec left CF intact, and was thinking more like sub $1, where it would matter. > In any event - yes, that's the dead insn. I'll mention the function > name in the description. Please do. ~Andrew ___ Xen-devel mailing list
Re: [Xen-devel] [PATCH 3/3] x86: a little bit of 16-bit video mode setting code cleanup
On 29.08.2019 16:38, Andrew Cooper wrote: > On 29/08/2019 15:23, Jan Beulich wrote: >> On 29.08.2019 16:08, Andrew Cooper wrote: >>> On 14/06/2019 12:38, Jan Beulich wrote: @@ -38,9 +38,9 @@ ENTRY(wakeup_start) movw%ax, %ss# Need this? How to ret if clobbered? 1: # for acpi_sleep=s3_mode -testl $2, wakesym(video_flags) +testb $2, wakesym(video_flags) jz 1f -movlwakesym(video_mode), %eax +movwwakesym(video_mode), %ax >>> Similarly, video_mode can become .word, I think. >> But a word is less efficient to access (because of the operand size >> override), so I'd prefer to not shrink this one. Just let me know >> whether you agree, and I'll cook up a patch accordingly. > > This is 16 bit code so it is movl which has the operand size prefix, not > movw. > > It is extern'd in C, but not wrapped in bootsym(), and I can't see it > being read anywhere. Oh, indeed. I'll ditch the extern then. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2 08/12] livepatch: Add support for inline asm hotpatching expectations
On 29. Aug 2019, at 16:34, Konrad Rzeszutek Wilk mailto:kon...@darnok.org>> wrote: …snip... + +struct livepatch_func __section(".livepatch.funcs") livepatch_exceptions = { +.version = LIVEPATCH_PAYLOAD_VERSION, +.name = livepatch_exceptions_str, +.new_addr = xen_hello_world, +.old_addr = xen_extra_version, +.new_size = EXPECT_BYTES_COUNT, +.old_size = EXPECT_BYTES_COUNT, +.expect = { +.enabled = 1, +.len = EXPECT_BYTES_COUNT, +.data = EXPECT_BYTES +}, + +}; When I compile with 32-bit ARM 'make tests' I get: arm-eabi-ld-EL -EL --build-id=sha1 -r -o xen_action_hooks_norevert.livepatch xen_action_hooks_marker.o xen_hello_world_func.o note.o xen_note.o make[3]: Circular expect_config.h <- xen_expectations.o dependency dropped. objdump: can't disassemble for architecture UNKNOWN! (set -e; \ echo "#define EXPECT_BYTES {"; \ echo "#define EXPECT_BYTES_COUNT 6") > expect_config.h arm-eabi-gcc -marm -DBUILD_ID -fno-strict-aliasing -std=gnu99 -Wall -Wstrict-prototypes -Wdeclaration-after-statement -Wno-unused-but-set-variable -Wno-unused-local-typedefs -O1 -fno-omit-frame-pointer -nostdinc -fno-builtin -fno-common -Werror -Wredundant-decls -Wno-pointer-arith -Wvla -pipe -D__XEN__ -include /home/konrad/A20/xen.git/xen/include/xen/config.h '-D__OBJECT_FILE__="xen_expectations.o"' -Wa,--strip-local-absolute -g -MMD -MF ./.xen_expectations.o.d -msoft-float -mcpu=cortex-a15 -DCONFIG_EARLY_PRINTK -DEARLY_PRINTK_INC=\"debug-8250.inc\" -DEARLY_PRINTK_BAUD= -DEARLY_UART_BASE_ADDRESS=0x01c28000 -DEARLY_UART_REG_SHIFT=2 -I/home/konrad/A20/xen.git/xen/include -fno-stack-protector -fno-exceptions -Wnested-externs -DGCC_HAS_VISIBILITY_ATTRIBUTE -marm -DBUILD_ID -fno-strict-aliasing -std=gnu99 -Wall -Wstrict-prototypes -Wdeclaration-after-statement -Wno-unused-but-set-variable -Wno-unused-local-typedefs -c xen_expectations.c -o xen_expectations.o xen_expectations.c:31:2: error: expected '}' before ';' token }; ^ xen_expectations.c:18:76: note: to match this '{' struct livepatch_func __section(".livepatch.funcs") livepatch_exceptions = { ^ make[3]: *** [/home/konrad/A20/xen.git/xen/Rules.mk:202: xen_expectations.o] Error 1 make[3]: Leaving directory '/home/konrad/A20/xen.git/xen/test/livepatch' And this is what expect_config.h looks like: #define EXPECT_BYTES { #define EXPECT_BYTES_COUNT 6 Hi Konrad, I think I fixed it with the following change on top: [DIFF START] diff --git a/xen/test/livepatch/Makefile b/xen/test/livepatch/Makefile index 067861903f..20bdeb6c6d 100644 --- a/xen/test/livepatch/Makefile +++ b/xen/test/livepatch/Makefile @@ -186,10 +186,17 @@ xen_actions_hooks_norevert.o: config.h $(LIVEPATCH_ACTION_HOOKS_NOREVERT): xen_action_hooks_marker.o xen_hello_world_func.o note.o xen_note.o $(LD) $(LDFLAGS) $(build_id_linker) -r -o $(LIVEPATCH_ACTION_HOOKS_NOREVERT) $^ -CODE_GET_EXPECT=$(shell objdump -d --insn-width=1 $(1) | grep -A6 -E '<'$(2)'>:' | tail -n +2 | awk 'BEGIN {printf "{"} {printf "0x%s,", $$2}' | sed 's/,$$/}/g') +ifeq ($(findstring arm, $(TARGET_ARCH) $(ARCH)),arm) +INSN_WIDTH := 4 +INSN_PER_LINE := 2 +else +INSN_WIDTH := 1 +INSN_PER_LINE := 8 +endif +CODE_GET_EXPECT=$(shell $(OBJDUMP) -d --insn-width=$(INSN_WIDTH) $(1) | sed -n -e '/<'$(2)'>:$$/,/^$$/ p' | tail -n +2 | head -n $(INSN_PER_LINE) | awk '{printf "%s", $$2}' | sed 's/.\{2\}/0x&,/g' | sed 's/^/{/;s/,$$/}/g') .PHONY: expect_config.h expect_config.h: EXPECT_BYTES=$(call CODE_GET_EXPECT,$(BASEDIR)/xen-syms,xen_extra_version) -expect_config.h: EXPECT_BYTES_COUNT=6 +expect_config.h: EXPECT_BYTES_COUNT=8 expect_config.h: xen_expectations.o (set -e; \ echo "#define EXPECT_BYTES $(EXPECT_BYTES)"; \ [DIFF END] I will test that some more and add it to the v3 of the patchset. Best Regards, Pawel Wieczorkiewicz Amazon Development Center Germany GmbH Krausenstr. 38 10117 Berlin Geschaeftsfuehrung: Christian Schlaeger, Ralf Herbrich Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B Sitz: Berlin Ust-ID: DE 289 237 879 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v4 1/2] xen: Switch parameter in get_page_from_gfn to use typesafe gfn
On 19.08.2019 16:26, Julien Grall wrote: > No functional change intended. > > Only reasonable clean-ups are done in this patch. The rest will use _gfn > for the time being. > > The code in get_page_from_gfn is slightly reworked to also handle a bad > behavior because it is not safe to use mfn_to_page(...) because > mfn_valid(...) succeeds. I guess the 2nd "because" is meant to be "before", but even then I don't think I can easily agree: mfn_to_page() is just calculations, no dereference. > --- a/xen/arch/x86/hvm/domain.c > +++ b/xen/arch/x86/hvm/domain.c > @@ -296,8 +296,10 @@ int arch_set_info_hvm_guest(struct vcpu *v, const > vcpu_hvm_context_t *ctx) > if ( hvm_paging_enabled(v) && !paging_mode_hap(v->domain) ) > { > /* Shadow-mode CR3 change. Check PDBR and update refcounts. */ > -struct page_info *page = get_page_from_gfn(v->domain, > - v->arch.hvm.guest_cr[3] >> PAGE_SHIFT, > +struct page_info *page; > + > +page = get_page_from_gfn(v->domain, > + gaddr_to_gfn(v->arch.hvm.guest_cr[3]), > NULL, P2M_ALLOC); Iirc I've said so before: I consider use of gaddr_to_gfn() here more misleading than a plain shift by PAGE_SHIFT. Neither is really correct, but in no event does CR3 as a whole hold an address. (Same elsewhere then, and sadly in quite a few places.) > --- a/xen/common/event_fifo.c > +++ b/xen/common/event_fifo.c > @@ -361,7 +361,7 @@ static const struct evtchn_port_ops evtchn_port_ops_fifo = > .print_state = evtchn_fifo_print_state, > }; > > -static int map_guest_page(struct domain *d, uint64_t gfn, void **virt) > +static int map_guest_page(struct domain *d, gfn_t gfn, void **virt) > { > struct page_info *p; > > @@ -422,7 +422,7 @@ static int setup_control_block(struct vcpu *v) > return 0; > } > > -static int map_control_block(struct vcpu *v, uint64_t gfn, uint32_t offset) > +static int map_control_block(struct vcpu *v, gfn_t gfn, uint32_t offset) > { > void *virt; > unsigned int i; Just as a remark (no action expected) - this makes a truncation issue pretty apparent: On 32-bit platforms the upper 32 bits of a passed in GFN get ignored. Correct (imo) behavior would be to reject the upper bits being non-zero. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [xen-unstable-smoke test] 140800: tolerable all pass - PUSHED
flight 140800 xen-unstable-smoke real [real] http://logs.test-lab.xenproject.org/osstest/logs/140800/ Failures :-/ but no regressions. Tests which did not succeed, but are not blocking: test-amd64-amd64-libvirt 13 migrate-support-checkfail never pass test-armhf-armhf-xl 13 migrate-support-checkfail never pass test-armhf-armhf-xl 14 saverestore-support-checkfail never pass test-arm64-arm64-xl-xsm 13 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 14 saverestore-support-checkfail never pass version targeted for testing: xen 41c7700a00011ad08be3c9d71126b67e08e58ac3 baseline version: xen f1a94202128e677be5c14f84492f80a12dba9571 Last test of basis 140776 2019-08-28 21:01:11 Z0 days Testing same since 140800 2019-08-29 14:00:34 Z0 days1 attempts People who touched revisions under test: Andrew Cooper George Dunlap Jan Beulich Roger Pau Monné jobs: build-arm64-xsm pass build-amd64 pass build-armhf pass build-amd64-libvirt pass test-armhf-armhf-xl pass test-arm64-arm64-xl-xsm pass test-amd64-amd64-xl-qemuu-debianhvm-amd64pass test-amd64-amd64-libvirt pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : To xenbits.xen.org:/home/xen/git/xen.git f1a9420212..41c7700a00 41c7700a00011ad08be3c9d71126b67e08e58ac3 -> smoke ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH RFC 2/3] x86/ACPI: restore VESA mode upon resume from S3
On 29.08.2019 16:45, Andrew Cooper wrote: > On 14/06/2019 12:37, Jan Beulich wrote: >> --- >> I'm wondering actually whether the user having to explicitly request the >> mode restoration is a good model: Why would we _not_ want to restore the >> mode we've set during boot? In the worst case Dom0 kernel or X will >> change the mode another time. > > I think I agree. I can't see anything good which can come from offering > a choice here, and I am all for reducing the quantity of 16bit VGA code > we have. Well, mode restoration may of course hang due to BIOS issues. But in such a case the question is how sensible it is to use S3 on such a system. Also note that adjusting this isn't going to reduce the quantity of 16-bit code; it would only be a change to what video_flags defaults to. Right now, without the command line option provided, mode restoration would happen only if reset_videomode_after_s3() gets invoked, i.e. just on a single Toshiba laptop model which I guess didn't even have 64-bit capable CPUs. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [RFC Patch] xen/pt: Emulate FLR capability
On Thu, Aug 29, 2019 at 12:03:44PM +0200, Jan Beulich wrote: >On 29.08.2019 11:02, Chao Gao wrote: >> Currently, for a HVM on Xen, no reset method is virtualized. So in a VM's >> perspective, assigned devices cannot be reset. But some devices rely on PCI >> reset to recover from hardware hangs. When being assigned to a VM, those >> devices cannot be reset and won't work any longer if a hardware hang occurs. >> We have to reboot VM to trigger PCI reset on host to recover the device. > >Did you consider a hot-unplug, reset (by host), hot-plug cycle instead? Yes. I considered this means. But it needs host to initiate this action. However, when a device needs reset is determined by the device driver in VM. So in practice, VM still needs a way to notify host to do unplug/reset/plug. As the standard FLR capability can meet the requirement, I don't try to invent one. > >> +static int xen_pt_devctl_reg_write(XenPCIPassthroughState *s, >> + XenPTReg *cfg_entry, uint16_t *val, >> + uint16_t dev_value, uint16_t valid_mask) >> +{ >> +if (s->real_device.is_resetable && (*val & PCI_EXP_DEVCTL_BCR_FLR)) { >> +xen_pt_reset(s); >> +} >> +return xen_pt_word_reg_write(s, cfg_entry, val, dev_value, valid_mask); > >I think you also need to clear the bit before handing on the request, >such that reads will always observe it clear. Will do. Thanks Chao ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH v3 01/11] checkpatch: check for nested (un)?likely() calls
IS_ERR(), IS_ERR_OR_NULL(), IS_ERR_VALUE() and WARN*() already contain unlikely() optimization internally. Thus, there is no point in calling these functions and defines under likely()/unlikely(). This check is based on the coccinelle rule developed by Enrico Weigelt https://lore.kernel.org/lkml/1559767582-11081-1-git-send-email-i...@metux.net/ Signed-off-by: Denis Efremov Cc: Joe Perches Cc: Andrew Morton Cc: Andy Whitcroft --- scripts/checkpatch.pl | 6 ++ 1 file changed, 6 insertions(+) diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl index 93a7edfe0f05..56969ce06df4 100755 --- a/scripts/checkpatch.pl +++ b/scripts/checkpatch.pl @@ -6480,6 +6480,12 @@ sub process { "Using $1 should generally have parentheses around the comparison\n" . $herecurr); } +# nested likely/unlikely calls + if ($line =~ /\b(?:(?:un)?likely)\s*\(\s*!?\s*(IS_ERR(?:_OR_NULL|_VALUE)?|WARN)/) { + WARN("LIKELY_MISUSE", +"nested (un)?likely() calls, $1 already uses unlikely() internally\n" . $herecurr); + } + # whine mightly about in_atomic if ($line =~ /\bin_atomic\s*\(/) { if ($realfile =~ m@^drivers/@) { -- 2.21.0 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v6 10/10] introduce a 'passthrough' configuration option to xl.cfg...
> -Original Message- > From: Roger Pau Monne > Sent: 23 August 2019 15:17 > To: Paul Durrant > Cc: xen-devel@lists.xenproject.org; Ian Jackson ; Wei > Liu ; Andrew > Cooper ; George Dunlap ; > Jan Beulich > ; Julien Grall ; Konrad Rzeszutek > Wilk > ; Stefano Stabellini ; Tim > (Xen.org) ; > Anthony Perard ; Volodymyr Babchuk > > Subject: Re: [PATCH v6 10/10] introduce a 'passthrough' configuration option > to xl.cfg... > > On Fri, Aug 16, 2019 at 06:20:01PM +0100, Paul Durrant wrote: > > ...and hence the ability to disable IOMMU mappings, and control EPT > > sharing. > > > > This patch introduces a new 'libxl_passthrough' enumeration into > > libxl_domain_create_info. The value will be set by xl either when it parses > > a new 'passthrough' option in xl.cfg, or implicitly if there is passthrough > > hardware specified for the domain. > > > > If the value of the passthrough configuration option is 'disabled' then > > the XEN_DOMCTL_CDF_iommu flag will be clear in the xen_domctl_createdomain > > flags, thus allowing the toolstack to control whether the domain gets > > IOMMU mappings or not (where previously they were globally set). > > > > If the value of the passthrough configuration option is 'sync_pt' then > > a new 'iommu_opts' field in xen_domctl_createdomain will be set with the > > value XEN_DOMCTL_IOMMU_no_sharept. This will override the global default > > set in iommu_hap_pt_share, thus allowing the toolstack to control whether > > EPT sharing is used for the domain. > > > > Signed-off-by: Paul Durrant > > --- > > Cc: Ian Jackson > > Cc: Wei Liu > > Cc: Andrew Cooper > > Cc: George Dunlap > > Cc: Jan Beulich > > Cc: Julien Grall > > Cc: Konrad Rzeszutek Wilk > > Cc: Stefano Stabellini > > Cc: Tim Deegan > > Cc: Anthony PERARD > > Cc: Volodymyr Babchuk > > Cc: "Roger Pau Monné" > > > > Previously part of series > > https://lists.xenproject.org/archives/html/xen-devel/2019-07/msg02267.html > > > > v6: > > - Remove the libxl_physinfo() call since it's usefulness is limited to x86 > > > > v5: > > - Expand xen_domctl_createdomain flags field and hence bump interface > >version > > - Fix spelling mistakes in context line > > --- > > docs/man/xl.cfg.5.pod.in| 52 + > > tools/libxl/libxl.h | 5 > > tools/libxl/libxl_create.c | 9 ++ > > tools/libxl/libxl_types.idl | 7 + > > tools/xl/xl_parse.c | 38 > > xen/arch/arm/domain.c | 10 ++- > > xen/arch/x86/domain.c | 2 +- > > xen/common/domain.c | 7 + > > xen/drivers/passthrough/iommu.c | 13 - > > xen/include/public/domctl.h | 6 +++- > > xen/include/xen/iommu.h | 19 > > 11 files changed, 158 insertions(+), 10 deletions(-) > > > > diff --git a/docs/man/xl.cfg.5.pod.in b/docs/man/xl.cfg.5.pod.in > > index c99d40307e..fd35685e9e 100644 > > --- a/docs/man/xl.cfg.5.pod.in > > +++ b/docs/man/xl.cfg.5.pod.in > > @@ -605,6 +605,58 @@ option should only be used with a trusted device tree. > > Note that the partial device tree should avoid using the phandle 65000 > > which is reserved by the toolstack. > > > > +=item B > > + > > +Specify whether IOMMU mappings are enabled for the domain and hence whether > > +it will be enabled for passthrough hardware. Valid values for this option > > +are: > > + > > +=over 4 > > + > > +=item B > > + > > +IOMMU mappings are disabled for the domain and so hardware may not be > > +passed through. > > + > > +This option is the default if no passthrough hardware is specified > > +in the domain's configuration. > > + > > +=item B > > I would maybe name this exclusive_pt, but historically it's been named > sync_pt. > Yes, I prefer to stick with the incumbent name. [snip] > > diff --git a/tools/xl/xl_parse.c b/tools/xl/xl_parse.c > > index e105bda2bb..c904604008 100644 > > --- a/tools/xl/xl_parse.c > > +++ b/tools/xl/xl_parse.c > > @@ -2326,6 +2326,44 @@ skip_vfb: > > } > > } > > > > +if (!xlu_cfg_get_string(config, "passthrough", , 0)) { > > +libxl_passthrough o; > > + > > +e = libxl_passthrough_from_string(buf, ); > > +if (e) { > > +fprintf(stderr, > > +"ERROR: unknown passthrough option '%s'\n", > > +buf); > > +exit(-ERROR_FAIL); > > +} > > + > > +switch (o) { > > +case LIBXL_PASSTHROUGH_DISABLED: > > +if (d_config->num_pcidevs || d_config->num_dtdevs) { > > +fprintf(stderr, > > +"ERROR: passthrough disabled but devices are > > specified\n"); > > +exit(-ERROR_FAIL); > > +} > > Don't you need a break here? > > > +case LIBXL_PASSTHROUGH_SHARE_PT: > > +if (c_info->type == LIBXL_DOMAIN_TYPE_PV) { > > +fprintf(stderr, > > +"ERROR:
Re: [Xen-devel] [PATCH v4 2/2] xen/domain: Use typesafe gfn in map_vcpu_info
On 23.08.2019 16:16, Andrew Cooper wrote: > On 19/08/2019 15:26, Julien Grall wrote: >> diff --git a/xen/include/public/vcpu.h b/xen/include/public/vcpu.h >> index 3623af932f..dc4c6a72a0 100644 >> --- a/xen/include/public/vcpu.h >> +++ b/xen/include/public/vcpu.h >> @@ -182,7 +182,7 @@ DEFINE_XEN_GUEST_HANDLE(vcpu_set_singleshot_timer_t); >> */ >> #define VCPUOP_register_vcpu_info 10 /* arg == vcpu_register_vcpu_info_t >> */ >> struct vcpu_register_vcpu_info { >> -uint64_t mfn;/* mfn of page to place vcpu_info */ >> +uint64_t mfn;/* gfn of page to place vcpu_info */ > > I wonder if we can do some __XEN_INTERFACE_VERSION__ magic here to > properly change the name to gfn. > > Leaving it as mfn is going to be an ongoing source of confusion. This would be pretty nice indeed. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2 08/12] livepatch: Add support for inline asm hotpatching expectations
> +CODE_GET_EXPECT=$(shell objdump -d --insn-width=1 $(1) | grep -A6 -E > '<'$(2)'>:' | tail -n +2 | awk 'BEGIN {printf "{"} {printf "0x%s,", $$2}' | > sed 's/,$$/}/g') Ony my Hikey 960 when I compile using an native compiler I get: gcc -DBUILD_ID -fno-strict-aliasing -std=gnu99 -Wall -Wstrict-prototypes -Wdeclaration-after-statement -Wno-unused-but-set-variable -Wno-unused-local-typedefs -O1 -fno-omit-frame-pointer -nostdinc -fno-builtin -fno-common -Werror -Wredundant-decls -Wno-pointer-arith -Wvla -pipe -D__XEN__ -include /home/xen.git/xen/include/xen/config.h '-D__OBJECT_FILE__="xen_expectations.o"' -Wa,--strip-local-absolute -g -MMD -MF ./.xen_expectations.o.d -mcpu=generic -mgeneral-regs-only -I/home/xen.git/xen/include -fno-stack-protector -fno-exceptions -Wnested-externs -DGCC_HAS_VISIBILITY_ATTRIBUTE -DBUILD_ID -fno-strict-aliasing -std=gnu99 -Wall -Wstrict-prototypes -Wdeclaration-after-statement -Wno-unused-but-set-variable -Wno-unused-local-typedefs -c xen_expectations.c -o xen_expectations.o /home/xen.git/xen/Rules.mk:202: recipe for target 'xen_expectations.o' failed make[3]: Circular expect_config.h <- xen_expectations.o dependency dropped. In file included from xen_expectations.c:6:0: expect_config.h:1:23: error: large integer implicitly truncated to unsigned type [-Werror=overflow] #define EXPECT_BYTES {0xf260,0x00f2,0xe000f000,0x12e000f0,0x9112e000,0x c09112e0} ^ xen_expectations.c:28:17: note: in expansion of macro ‘EXPECT_BYTES’ .data = EXPECT_BYTES ^~~~ expect_config.h:1:34: error: large integer implicitly truncated to unsigned type [-Werror=overflow] #define EXPECT_BYTES {0xf260,0x00f2,0xe000f000,0x12e000f0,0x9112e000,0x c09112e0} ^ xen_expectations.c:28:17: note: in expansion of macro ‘EXPECT_BYTES’ .data = EXPECT_BYTES ^~~~ expect_config.h:1:45: error: large integer implicitly truncated to unsigned type [-Werror=overflow] #define EXPECT_BYTES {0xf260,0x00f2,0xe000f000,0x12e000f0,0x9112e000,0x c09112e0} … make[3]: Leaving directory '/home/xen.git/xen/test/livepatch' Makefile:11: recipe for target 'build' failed make[2]: Leaving directory '/home/xen.git/xen/test' Makefile:85: recipe for target '_tests' failed make[1]: Leaving directory '/home/xen.git/xen' Makefile:45: recipe for target 'tests' failed root@hikey960:/home/xen.git/xen# cat test/livepatch/expect_config.h #define EXPECT_BYTES {0xf260,0x00f2,0xe000f000,0x12e000f0,0x9112e000,0xc09112e0} #define EXPECT_BYTES_COUNT 6 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2 08/12] livepatch: Add support for inline asm hotpatching expectations
On 29. Aug 2019, at 17:58, Konrad Rzeszutek Wilk mailto:konrad.w...@oracle.com>> wrote: +CODE_GET_EXPECT=$(shell objdump -d --insn-width=1 $(1) | grep -A6 -E '<'$(2)'>:' | tail -n +2 | awk 'BEGIN {printf "{"} {printf "0x%s,", $$2}' | sed 's/,$$/}/g') Ony my Hikey 960 when I compile using an native compiler I get: gcc -DBUILD_ID -fno-strict-aliasing -std=gnu99 -Wall -Wstrict-prototypes -Wdeclaration-after-statement -Wno-unused-but-set-variable -Wno-unused-local-typedefs -O1 -fno-omit-frame-pointer -nostdinc -fno-builtin -fno-common -Werror -Wredundant-decls -Wno-pointer-arith -Wvla -pipe -D__XEN__ -include /home/xen.git/xen/include/xen/config.h '-D__OBJECT_FILE__="xen_expectations.o"' -Wa,--strip-local-absolute -g -MMD -MF ./.xen_expectations.o.d -mcpu=generic -mgeneral-regs-only -I/home/xen.git/xen/include -fno-stack-protector -fno-exceptions -Wnested-externs -DGCC_HAS_VISIBILITY_ATTRIBUTE -DBUILD_ID -fno-strict-aliasing -std=gnu99 -Wall -Wstrict-prototypes -Wdeclaration-after-statement -Wno-unused-but-set-variable -Wno-unused-local-typedefs -c xen_expectations.c -o xen_expectations.o /home/xen.git/xen/Rules.mk:202: recipe for target 'xen_expectations.o' failed make[3]: Circular expect_config.h <- xen_expectations.o dependency dropped. In file included from xen_expectations.c:6:0: expect_config.h:1:23: error: large integer implicitly truncated to unsigned type [-Werror=overflow] #define EXPECT_BYTES {0xf260,0x00f2,0xe000f000,0x12e000f0,0x9112e000,0x c09112e0} ^ xen_expectations.c:28:17: note: in expansion of macro ‘EXPECT_BYTES’ .data = EXPECT_BYTES ^~~~ expect_config.h:1:34: error: large integer implicitly truncated to unsigned type [-Werror=overflow] #define EXPECT_BYTES {0xf260,0x00f2,0xe000f000,0x12e000f0,0x9112e000,0x c09112e0} ^ xen_expectations.c:28:17: note: in expansion of macro ‘EXPECT_BYTES’ .data = EXPECT_BYTES ^~~~ expect_config.h:1:45: error: large integer implicitly truncated to unsigned type [-Werror=overflow] #define EXPECT_BYTES {0xf260,0x00f2,0xe000f000,0x12e000f0,0x9112e000,0x c09112e0} … make[3]: Leaving directory '/home/xen.git/xen/test/livepatch' Makefile:11: recipe for target 'build' failed make[2]: Leaving directory '/home/xen.git/xen/test' Makefile:85: recipe for target '_tests' failed make[1]: Leaving directory '/home/xen.git/xen' Makefile:45: recipe for target 'tests' failed root@hikey960:/home/xen.git/xen# cat test/livepatch/expect_config.h #define EXPECT_BYTES {0xf260,0x00f2,0xe000f000,0x12e000f0,0x9112e000,0xc09112e0} #define EXPECT_BYTES_COUNT 6 Could you please try with the patch attached? Amazon Development Center Germany GmbH Krausenstr. 38 10117 Berlin Geschaeftsfuehrung: Christian Schlaeger, Ralf Herbrich Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B Sitz: Berlin Ust-ID: DE 289 237 879 0001-fixup-livepatch-Add-support-for-inline-asm-hotpatchi.patch Description: 0001-fixup-livepatch-Add-support-for-inline-asm-hotpatchi.patch If it still fails, could you please send me the output for the following command with this build?objdump -d xen-syms | sed -n -e '/:$/,/^$/ p' | tail -n +2 Best Regards,Pawel Wieczorkiewicz ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v4 1/2] xen: Switch parameter in get_page_from_gfn to use typesafe gfn
Hi, On 29/08/2019 17:41, Jan Beulich wrote: > On 19.08.2019 16:26, Julien Grall wrote: >> No functional change intended. >> >> Only reasonable clean-ups are done in this patch. The rest will use _gfn >> for the time being. >> >> The code in get_page_from_gfn is slightly reworked to also handle a bad >> behavior because it is not safe to use mfn_to_page(...) because >> mfn_valid(...) succeeds. > > I guess the 2nd "because" is meant to be "before", but even then I > don't think I can easily agree: mfn_to_page() is just calculations, > no dereference. Regardless the s/because/before/. Do you disagree on the wording or the change? If the former, I just paraphrased what Andrew wrote in the previous version: "This unfortunately propagates some bad behaviour, because it is not safe to use mfn_to_page(mfn); before mfn_valid(mfn) succeeds. (In practice it works because mfn_to_page() is just pointer arithmetic.)" This is x86 code, so please agree with Andrew on the approach/wording. > >> --- a/xen/arch/x86/hvm/domain.c >> +++ b/xen/arch/x86/hvm/domain.c >> @@ -296,8 +296,10 @@ int arch_set_info_hvm_guest(struct vcpu *v, const >> vcpu_hvm_context_t *ctx) >> if ( hvm_paging_enabled(v) && !paging_mode_hap(v->domain) ) >> { >> /* Shadow-mode CR3 change. Check PDBR and update refcounts. */ >> -struct page_info *page = get_page_from_gfn(v->domain, >> - v->arch.hvm.guest_cr[3] >> PAGE_SHIFT, >> +struct page_info *page; >> + >> +page = get_page_from_gfn(v->domain, >> + gaddr_to_gfn(v->arch.hvm.guest_cr[3]), >>NULL, P2M_ALLOC); > > Iirc I've said so before: I consider use of gaddr_to_gfn() here more > misleading than a plain shift by PAGE_SHIFT. Neither is really correct, > but in no event does CR3 as a whole hold an address. (Same elsewhere > then, and sadly in quite a few places.) Well, this code has not changed since v3 and you acked it... I only dropped the ack here because the previous version was sent 9 months ago. I also can't see such comment made on any version of this series. Anyway, I am more than happy to switch to _gfn((v->arch.hvm.guest_cr[3] >> PAGE_SHIFT)) if you prefer it. > >> --- a/xen/common/event_fifo.c >> +++ b/xen/common/event_fifo.c >> @@ -361,7 +361,7 @@ static const struct evtchn_port_ops evtchn_port_ops_fifo >> = >> .print_state = evtchn_fifo_print_state, >> }; >> >> -static int map_guest_page(struct domain *d, uint64_t gfn, void **virt) >> +static int map_guest_page(struct domain *d, gfn_t gfn, void **virt) >> { >> struct page_info *p; >> >> @@ -422,7 +422,7 @@ static int setup_control_block(struct vcpu *v) >> return 0; >> } >> >> -static int map_control_block(struct vcpu *v, uint64_t gfn, uint32_t offset) >> +static int map_control_block(struct vcpu *v, gfn_t gfn, uint32_t offset) >> { >> void *virt; >> unsigned int i; > > Just as a remark (no action expected) - this makes a truncation issue > pretty apparent: On 32-bit platforms the upper 32 bits of a passed in > GFN get ignored. Correct (imo) behavior would be to reject the upper > bits being non-zero. This is not only here but on pretty much all the hypercalls taking a gfn (on Arm it is 64-bit regardless the bitness). I agree this is not nice, but I am afraid this is likely another can of worm that I am not to open it yet. Cheers, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v6 10/10] introduce a 'passthrough' configuration option to xl.cfg...
> -Original Message- > From: Jan Beulich > Sent: 29 August 2019 15:07 > To: Paul Durrant > Cc: xen-devel@lists.xenproject.org; Julien Grall ; > Andrew Cooper > ; Anthony Perard ; > Roger Pau Monne > ; Volodymyr Babchuk ; > George Dunlap > ; Ian Jackson ; Stefano > Stabellini > ; Konrad Rzeszutek Wilk ; Tim > (Xen.org) ; > Wei Liu > Subject: Re: [PATCH v6 10/10] introduce a 'passthrough' configuration option > to xl.cfg... > > On 16.08.2019 19:20, Paul Durrant wrote: > > ...and hence the ability to disable IOMMU mappings, and control EPT > > sharing. > > > > This patch introduces a new 'libxl_passthrough' enumeration into > > libxl_domain_create_info. The value will be set by xl either when it parses > > a new 'passthrough' option in xl.cfg, or implicitly if there is passthrough > > hardware specified for the domain. > > > > If the value of the passthrough configuration option is 'disabled' then > > the XEN_DOMCTL_CDF_iommu flag will be clear in the xen_domctl_createdomain > > flags, thus allowing the toolstack to control whether the domain gets > > IOMMU mappings or not (where previously they were globally set). > > > > If the value of the passthrough configuration option is 'sync_pt' then > > a new 'iommu_opts' field in xen_domctl_createdomain will be set with the > > value XEN_DOMCTL_IOMMU_no_sharept. This will override the global default > > set in iommu_hap_pt_share, thus allowing the toolstack to control whether > > EPT sharing is used for the domain. > > > > Signed-off-by: Paul Durrant > > Reviewed-by: Jan Beulich Thanks. > with a question/suggestion and a nit: > > > --- a/xen/common/domain.c > > +++ b/xen/common/domain.c > > @@ -308,6 +308,13 @@ static int sanitise_domain_config(struct > > xen_domctl_createdomain *config) > > return -EINVAL; > > } > > > > +if ( !(config->flags & XEN_DOMCTL_CDF_iommu) && config->iommu_opts ) > > +{ > > +dprintk(XENLOG_INFO, > > +"IOMMU options specified but IOMMU not enabled\n"); > > +return -EINVAL; > > +} > > Seeing this I wonder whether XEN_DOMCTL_CDF_iommu wouldn't better be > bit 0 of iommu_opts. > I debated this with myself and I prefer to stick with separate flag and options. > > @@ -70,6 +70,10 @@ struct xen_domctl_createdomain { > > > > uint32_t flags; > > > > +#define _XEN_DOMCTL_IOMMU_no_sharept 0 > > +#define XEN_DOMCTL_IOMMU_no_sharept (1U<<_XEN_DOMCTL_IOMMU_no_sharept) > > Please can you add the missing blanks around << ? > Sure. Paul > Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [linux-4.19 test] 140757: regressions - FAIL
flight 140757 linux-4.19 real [real] http://logs.test-lab.xenproject.org/osstest/logs/140757/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: build-amd64-xsm 6 xen-buildfail REGR. vs. 129313 build-armhf-pvops 6 kernel-build fail REGR. vs. 129313 Tests which are failing intermittently (not blocking): test-amd64-amd64-xl-pvshim 20 guest-start/debian.repeat fail in 140708 pass in 140638 test-amd64-i386-qemut-rhel6hvm-amd 12 guest-start/redhat.repeat fail in 140708 pass in 140757 test-amd64-amd64-xl-pvshim 12 guest-startfail pass in 140708 test-arm64-arm64-examine 11 examine-serial/bootloader fail pass in 140708 Tests which did not succeed, but are not blocking: test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 1 build-check(1) blocked n/a test-amd64-amd64-xl-qemuu-debianhvm-i386-xsm 1 build-check(1) blocked n/a test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 1 build-check(1) blocked n/a test-armhf-armhf-xl-arndale 1 build-check(1) blocked n/a test-armhf-armhf-xl-multivcpu 1 build-check(1) blocked n/a test-armhf-armhf-xl-cubietruck 1 build-check(1) blocked n/a test-armhf-armhf-xl-rtds 1 build-check(1) blocked n/a test-amd64-amd64-xl-xsm 1 build-check(1) blocked n/a test-amd64-i386-xl-xsm1 build-check(1) blocked n/a test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a test-armhf-armhf-xl 1 build-check(1) blocked n/a test-armhf-armhf-xl-credit2 1 build-check(1) blocked n/a test-armhf-armhf-xl-vhd 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-debianhvm-i386-xsm 1 build-check(1) blocked n/a test-amd64-amd64-xl-qemut-debianhvm-i386-xsm 1 build-check(1) blocked n/a test-amd64-i386-xl-qemut-debianhvm-i386-xsm 1 build-check(1) blocked n/a test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a test-armhf-armhf-xl-credit1 1 build-check(1) blocked n/a test-armhf-armhf-examine 1 build-check(1) blocked n/a test-armhf-armhf-libvirt 1 build-check(1) blocked n/a test-armhf-armhf-libvirt-raw 1 build-check(1) blocked n/a test-amd64-i386-libvirt-xsm 1 build-check(1) blocked n/a test-amd64-amd64-libvirt-xsm 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-ovmf-amd64 18 guest-start/debianhvm.repeat fail in 140638 blocked in 129313 test-amd64-i386-libvirt-xsm 13 migrate-support-check fail in 140708 never pass test-amd64-amd64-libvirt-xsm 13 migrate-support-check fail in 140708 never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail in 140708 never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail in 140708 never pass test-amd64-i386-libvirt 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt 13 migrate-support-checkfail never pass test-amd64-i386-xl-pvshim12 guest-start fail never pass test-arm64-arm64-xl-seattle 13 migrate-support-checkfail never pass test-arm64-arm64-xl-seattle 14 saverestore-support-checkfail never pass test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail never pass test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail never pass test-arm64-arm64-xl-thunderx 13 migrate-support-checkfail never pass test-arm64-arm64-xl-thunderx 14 saverestore-support-checkfail never pass test-arm64-arm64-xl-xsm 13 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 14 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit1 13 migrate-support-checkfail never pass test-arm64-arm64-xl-credit1 14 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit2 13 migrate-support-checkfail never pass test-arm64-arm64-xl-credit2 14 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail never pass test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2 fail never pass test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop fail never pass test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail never pass test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stop fail never pass test-arm64-arm64-xl 13 migrate-support-checkfail never pass test-arm64-arm64-xl 14 saverestore-support-checkfail never pass test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop fail never pass test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop fail never pass
[Xen-devel] [PATCH v3 04/11] xen/events: Remove unlikely() from WARN() condition
"unlikely(WARN(x))" is excessive. WARN() already uses unlikely() internally. Signed-off-by: Denis Efremov Cc: Boris Ostrovsky Cc: Juergen Gross Cc: Joe Perches Cc: Andrew Morton Cc: xen-devel@lists.xenproject.org --- drivers/xen/events/events_base.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c index 2e8570c09789..6c8843968a52 100644 --- a/drivers/xen/events/events_base.c +++ b/drivers/xen/events/events_base.c @@ -247,7 +247,7 @@ static void xen_irq_info_cleanup(struct irq_info *info) */ unsigned int evtchn_from_irq(unsigned irq) { - if (unlikely(WARN(irq >= nr_irqs, "Invalid irq %d!\n", irq))) + if (WARN(irq >= nr_irqs, "Invalid irq %d!\n", irq)) return 0; return info_for_irq(irq)->evtchn; -- 2.21.0 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] x86/mmcfg: add "force" option for MCFG
On Wed, Aug 28, 2019 at 04:24:22PM +0100, Igor Druzhinin wrote: > If MCFG area is not reserved in E820 Xen by default will defer its usage > until Dom0 registers it explicitly after ACPI parser recognizes it as > a reserved resource in DSDT. Having it reserved in E820 is not > mandatory according to "PCI Firmware Specification, rev 3.2" (par. 4.1.2) > and firmware is free to keep a hole E820 in that place. > > Unfortunately, keeping it disabled at this point makes Xen fail to > recognize many of PCIe extended capabilities early enough for newly added > devices. Namely, (1) some of VT-d quirks are not applied during Dom0 > device handoff, (2) currently MCFG reservation report comes > too late from Dom0 after some of PCI devices being registered in Xen > by Dom0 kernel that break initialization of a number of PCIe capabilities > (e.g. SR-IOV VF BAR sizing). > > Since introduction of ACPI parser in Xen is not possible add a "force" > option that will allow Xen to use MCFG area even it's not properly > reserved in E820. > > Signed-off-by: Igor Druzhinin Thanks! I think this is the best way to solve the issue. > --- > > I think we need to have this option to at least have a way to workaround > problem (1). Problem (2) could be solved in Dom0 kernel by invoking > xen_mcfg_late() earlier but before the first PCI bus enumertaion which > currently happens somwhere during ACPI scan I think. > > The question is what the defult value for this option should be? Have you tested this against a variety of hardware? Have you found any box that has a wrong MMCFG area in the MCFG static table? (ie: one that doesn't match the motherboard resource reservation in the dynamic ACPI tables?) > > --- > docs/misc/xen-command-line.pandoc | 8 +--- > xen/arch/x86/e820.c | 20 > xen/arch/x86/x86_64/mmconfig-shared.c | 34 -- > xen/include/asm-x86/e820.h| 1 + > 4 files changed, 46 insertions(+), 17 deletions(-) > > diff --git a/docs/misc/xen-command-line.pandoc > b/docs/misc/xen-command-line.pandoc > index 7c72e31..f13b10c 100644 > --- a/docs/misc/xen-command-line.pandoc > +++ b/docs/misc/xen-command-line.pandoc > @@ -1424,11 +1424,13 @@ ordinary DomU, control domain, hardware domain, and - > when supported > by the platform - DomU with pass-through device assigned). > > ### mmcfg (x86) > -> `= [,amd-fam10]` > +> `= List of [ , force, amd-fam10 ]` > > -> Default: `1` > +> Default: `true,force` > > -Specify if the MMConfig space should be enabled. > +Specify if the MMConfig space should be enabled. If force option is specified > +(default) MMConfig space will be used by Xen early in boot even if it's > +not reserved by firmware in the E820 table. > > ### mmio-relax (x86) > > `= | all` > diff --git a/xen/arch/x86/e820.c b/xen/arch/x86/e820.c > index 8e8a2c4..edef72a 100644 > --- a/xen/arch/x86/e820.c > +++ b/xen/arch/x86/e820.c > @@ -37,6 +37,26 @@ struct e820map e820; > struct e820map __initdata e820_raw; > > /* > + * This function checks if any part of the range is mapped > + * with type. > + */ > +int __init e820_any_mapped(u64 start, u64 end, unsigned type) Please use uint64_t and unsigned int. Also it looks like this function wants to return a bool instead of an int. > +{ > + int i; unsigned int. > + > + for (i = 0; i < e820.nr_map; i++) { Coding style. Some parts of this file are using the Linux coding style I think, but new additions should be using the Xen coding style, see e820_change_range_type for example. > + struct e820entry *ei = [i]; const. > + > + if (type && ei->type != type) > + continue; > + if (ei->addr >= end || ei->addr + ei->size <= start) > + continue; > + return 1; > + } > + return 0; > +} > + > +/* > * This function checks if the entire range is mapped with type. > * > * Note: this function only works correct if the e820 table is sorted and > diff --git a/xen/arch/x86/x86_64/mmconfig-shared.c > b/xen/arch/x86/x86_64/mmconfig-shared.c > index cc08b52..9fc0865 100644 > --- a/xen/arch/x86/x86_64/mmconfig-shared.c > +++ b/xen/arch/x86/x86_64/mmconfig-shared.c > @@ -26,33 +26,34 @@ > > #include "mmconfig.h" > > +static bool_t __read_mostly force_mmcfg = true; > unsigned int pci_probe = PCI_PROBE_CONF1 | PCI_PROBE_MMCONF; > > static int __init parse_mmcfg(const char *s) > { > const char *ss; > -int rc = 0; > +int val, rc = 0; > > do { > ss = strchr(s, ','); > if ( !ss ) > ss = strchr(s, '\0'); > > -switch ( parse_bool(s, ss) ) > -{ > -case 0: > -pci_probe &= ~PCI_PROBE_MMCONF; > -break; > -case 1: > -break; > -default: > -if ( !cmdline_strcmp(s, "amd_fam10") || > - !cmdline_strcmp(s,
Re: [Xen-devel] [PATCH v6 06/10] domain: introduce XEN_DOMCTL_CDF_iommu flag
> -Original Message- > From: Roger Pau Monne > Sent: 23 August 2019 11:33 > To: Paul Durrant > Cc: xen-devel@lists.xenproject.org; Ian Jackson ; Wei > Liu ; > Anthony Perard ; Andrew Cooper > ; George Dunlap > ; Jan Beulich ; Julien Grall > ; > Konrad Rzeszutek Wilk ; Stefano Stabellini > ; Tim > (Xen.org) ; Volodymyr Babchuk > Subject: Re: [PATCH v6 06/10] domain: introduce XEN_DOMCTL_CDF_iommu flag > > On Fri, Aug 16, 2019 at 06:19:57PM +0100, Paul Durrant wrote: > > This patch introduces a common domain creation flag to determine whether > > the domain is permitted to make use of the IOMMU. Currently the flag is > > always set (for both dom0 and domU) if the IOMMU is globally enabled > > (i.e. iommu_enabled == 1). sanitise_domain_config() is modified to reject > > the flag if !iommu_enabled. > > > > A new helper function, is_iommu_enabled(), is added to test the flag and > > iommu_domain_init() will return immediately if !is_iommu_enabled(). This is > > slightly different to the previous behaviour based on !iommu_enabled where > > the call to arch_iommu_domain_init() was made regardless, however it appears > > that this call was only necessary to initialize the dt_devices list for ARM > > such that iommu_release_dt_devices() can be called unconditionally by > > domain_relinquish_resources(). Adding a simple check of is_iommu_enabled() > > into iommu_release_dt_devices() keeps this unconditional call working. > > > > No functional change should be observed with this patch applied. > > > > Subsequent patches will allow the toolstack to control whether use of the > > IOMMU is enabled for a domain. > > > > NOTE: The introduction of the is_iommu_enabled() helper function might > > seem excessive but its use is expected to increase with subsequent > > patches. Also, having iommu_domain_init() bail before calling > > arch_iommu_domain_init() is not strictly necessary, but I think the > > consequent addition of the call to is_iommu_enabled() in > > iommu_release_dt_devices() makes the code clearer. > > > > Signed-off-by: Paul Durrant > > I have one ARM-related question and one 'nice to have', but the code > LGTM: > > Reviewed-by: Roger Pau Monné > > > --- > > Cc: Ian Jackson > > Cc: Wei Liu > > Cc: Anthony PERARD > > Cc: Andrew Cooper > > Cc: George Dunlap > > Cc: Jan Beulich > > Cc: Julien Grall > > Cc: Konrad Rzeszutek Wilk > > Cc: Stefano Stabellini > > Cc: Tim Deegan > > Cc: Volodymyr Babchuk > > Cc: "Roger Pau Monné" > > > > Previously part of series > > https://lists.xenproject.org/archives/html/xen-devel/2019-07/msg02267.html > > > > v6: > > - Remove the toolstack parts as there's no nice method of testing whether > >the IOMMU is enabled in an architecture-neutral way > > > > v5: > > - Move is_iommu_enabled() check into iommu_domain_init() > > - Reject XEN_DOMCTL_CDF_iommu in sanitise_domain_config() if !iommu_enabled > > - Use evaluate_nospec() in defintion of is_iommu_enabled() > > --- > > xen/arch/arm/setup.c | 3 +++ > > xen/arch/x86/setup.c | 3 +++ > > xen/common/domain.c | 9 - > > xen/common/domctl.c | 8 > > xen/drivers/passthrough/device_tree.c | 3 +++ > > xen/drivers/passthrough/iommu.c | 6 +++--- > > xen/include/public/domctl.h | 4 > > xen/include/xen/sched.h | 5 + > > 8 files changed, 37 insertions(+), 4 deletions(-) > > > > diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c > > index 2c5d1372c0..20021ee0ca 100644 > > --- a/xen/arch/arm/setup.c > > +++ b/xen/arch/arm/setup.c > > @@ -914,6 +914,9 @@ void __init start_xen(unsigned long boot_phys_offset, > > dom0_cfg.arch.tee_type = tee_get_type(); > > dom0_cfg.max_vcpus = dom0_max_vcpus(); > > > > +if ( iommu_enabled ) > > +dom0_cfg.flags |= XEN_DOMCTL_CDF_iommu; > > + > > dom0 = domain_create(0, _cfg, true); > > if ( IS_ERR(dom0) || (alloc_dom0_vcpu0(dom0) == NULL) ) > > panic("Error creating domain 0\n"); > > diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c > > index d0b35b0ce2..fa226a2bab 100644 > > --- a/xen/arch/x86/setup.c > > +++ b/xen/arch/x86/setup.c > > @@ -1733,6 +1733,9 @@ void __init noreturn __start_xen(unsigned long mbi_p) > > } > > dom0_cfg.max_vcpus = dom0_max_vcpus(); > > > > +if ( iommu_enabled ) > > +dom0_cfg.flags |= XEN_DOMCTL_CDF_iommu; > > + > > /* Create initial domain 0. */ > > dom0 = domain_create(get_initial_domain_id(), _cfg, !pv_shim); > > if ( IS_ERR(dom0) || (alloc_dom0_vcpu0(dom0) == NULL) ) > > diff --git a/xen/common/domain.c b/xen/common/domain.c > > index 76e6976617..e832a5c4aa 100644 > > --- a/xen/common/domain.c > > +++ b/xen/common/domain.c > > @@ -301,7 +301,8 @@ static int sanitise_domain_config(struct > > xen_domctl_createdomain *config) > > XEN_DOMCTL_CDF_hap | > >
Re: [Xen-devel] [PATCH v6 08/10] remove late (on-demand) construction of IOMMU page tables
> -Original Message- > From: Roger Pau Monne > Sent: 23 August 2019 12:34 > To: Paul Durrant > Cc: xen-devel@lists.xenproject.org; Alexandru Isaila > ; Stefano Stabellini > ; Julien Grall ; Volodymyr > Babchuk > ; Andrew Cooper ; > George Dunlap > ; Ian Jackson ; Jan Beulich > ; > Konrad Rzeszutek Wilk ; Tim (Xen.org) ; > Wei Liu ; > Tamas K Lengyel ; Razvan Cojocaru > ; Petre Pircalabu > > Subject: Re: [PATCH v6 08/10] remove late (on-demand) construction of IOMMU > page tables > > On Fri, Aug 16, 2019 at 06:19:59PM +0100, Paul Durrant wrote: > > Now that there is a per-domain IOMMU enable flag, which should be enabled if > > any device is going to be passed through, stop deferring page table > > construction until the assignment is done. Also don't tear down the tables > > again when the last device is de-assigned; defer that task until domain > > destruction. > > > > This allows the has_iommu_pt() helper and iommu_status enumeration to be > > removed. Calls to has_iommu_pt() are simply replaced by calls to > > is_iommu_enabled(). Remaining open-code tests of iommu_hap_pt_share can also > > be replaced by calls to iommu_use_hap_pt(). > > The arch_iommu_populate_page_table() and iommu_construct() functions become > > redundant, as does the 'strict mode' dom0 page_list mapping code in > > iommu_hwdom_init(), and iommu_teardown() can be made static is its only > > remaining caller, iommu_domain_destroy(), is within the same source > > module. > > > > All in all, about 220 lines of code are removed. > > > > NOTE: This patch will cause a small amount of extra resource to be used > > to accommodate IOMMU page tables that may never be used, since the > > per-domain IOMMU flag enable flag is currently set to the value > > of the global iommu_enable flag. A subsequent patch will add an > > option to the toolstack to allow it to be turned off if there is > > no intention to assign passthrough hardware to the domain. > > > > Signed-off-by: Paul Durrant > > Reviewed-by: Alexandru Isaila > > --- > > Cc: Stefano Stabellini > > Cc: Julien Grall > > Cc: Volodymyr Babchuk > > Cc: Andrew Cooper > > Cc: George Dunlap > > Cc: Ian Jackson > > Cc: Jan Beulich > > Cc: Konrad Rzeszutek Wilk > > Cc: Tim Deegan > > Cc: Wei Liu > > Cc: "Roger Pau Monné" > > Cc: Tamas K Lengyel > > Cc: George Dunlap > > Cc: Razvan Cojocaru > > Cc: Petre Pircalabu > > > > Previously part of series > > https://lists.xenproject.org/archives/html/xen-devel/2019-07/msg02267.html > > > > v5: > > - Minor style fixes > > --- > > xen/arch/arm/p2m.c| 2 +- > > xen/arch/x86/dom0_build.c | 2 +- > > xen/arch/x86/hvm/mtrr.c | 5 +- > > xen/arch/x86/mm/mem_sharing.c | 2 +- > > xen/arch/x86/mm/paging.c | 2 +- > > xen/arch/x86/x86_64/mm.c | 2 +- > > xen/common/memory.c | 4 +- > > xen/common/vm_event.c | 2 +- > > xen/drivers/passthrough/device_tree.c | 11 --- > > xen/drivers/passthrough/iommu.c | 134 ++ > > xen/drivers/passthrough/pci.c | 12 --- > > xen/drivers/passthrough/vtd/iommu.c | 10 +- > > xen/drivers/passthrough/x86/iommu.c | 95 -- > > xen/include/asm-arm/iommu.h | 2 +- > > xen/include/asm-x86/iommu.h | 2 +- > > xen/include/xen/iommu.h | 16 --- > > xen/include/xen/sched.h | 2 - > > 17 files changed, 42 insertions(+), 263 deletions(-) > > > > diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c > > index 7f1442932a..692565757e 100644 > > --- a/xen/arch/arm/p2m.c > > +++ b/xen/arch/arm/p2m.c > > @@ -1056,7 +1056,7 @@ static int __p2m_set_entry(struct p2m_domain *p2m, > > !mfn_eq(lpae_get_mfn(*entry), lpae_get_mfn(orig_pte)) ) > > p2m_free_entry(p2m, orig_pte, level); > > > > -if ( has_iommu_pt(p2m->domain) && > > +if ( is_iommu_enabled(p2m->domain) && > > (lpae_is_valid(orig_pte) || lpae_is_valid(*entry)) ) > > { > > unsigned int flush_flags = 0; > > diff --git a/xen/arch/x86/dom0_build.c b/xen/arch/x86/dom0_build.c > > index d381784edd..7cfab2dc25 100644 > > --- a/xen/arch/x86/dom0_build.c > > +++ b/xen/arch/x86/dom0_build.c > > @@ -365,7 +365,7 @@ unsigned long __init dom0_compute_nr_pages( > > } > > > > need_paging = is_hvm_domain(d) && > > -(!iommu_hap_pt_share || !paging_mode_hap(d)); > > +(!iommu_use_hap_pt(d) || !paging_mode_hap(d)); > > I'm not sure this is correct, at the point where dom0_compute_nr_pages > gets called the iommu has not been initialized yet (the call to > iommu_hwdom_init is done afterwards), so the iommu status field which > is used by iommu_use_hap_pt is not yet initialized. Note that this patch removes the iommu status field. > > > for ( ; ; need_paging = false ) > > { > > nr_pages = get_memsize(_size,
Re: [Xen-devel] [PATCH v2] Partially revert "x86/mm: Clean IOMMU flags from p2m-pt code"
On 29.08.2019 11:49, Roger Pau Monne wrote: > This partially reverts commit > 854a49a7486a02edae5b3e53617bace526e9c1b1 by re-adding the logic that > propagates changes to the domain physmap done by p2m_pt_set_entry into > the iommu page tables. Without this logic changes to the guest physmap > are not propagated to the iommu, leaving stale iommu entries that can > leak data, or failing to add new entries. > > Note that this commit doesn't re-introduce iommu flags to the cpu page > table entries, since the logic to add/remove entries to the iommu page > tables is based on the p2m type and the mfn. > > Fixes: 854a49a7486a02 ('x86/mm: Clean IOMMU flags from p2m-pt code') > Signed-off-by: Roger Pau Monné > --- > Cc: Alexandru Stefan ISAILA > --- > Changes since v1: > - Remove the share-pt branch, there's no sharing on AMD hardware. > --- Reviewed-by: Alexandru Isaila ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v9 11/15] microcode: unify loading update during CPU resuming and AP wakeup
On 19.08.2019 03:25, Chao Gao wrote: > Both are loading the cached patch. Since APs call the unified function, > microcode_update_one(), during wakeup, the 'start_update' parameter > which originally used to distinguish BSP and APs is redundant. So remove > this parameter. > > Signed-off-by: Chao Gao > --- > Note that here is a functional change: resuming a CPU would call > ->end_update() now while previously it wasn't. Not quite sure > whether it is correct. I think it is, as implied by the other response I've sent. But it should then (as also said) include calling ->start_update() as well. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH V3 8/8] iommu/arm: Add Renesas IPMMU-VMSA support
On 29.08.19 11:37, Yoshihiro Shimoda wrote: Hi Oleksandr-san, Hi Shimoda-san. From: Oleksandr Tyshchenko, Sent: Wednesday, August 21, 2019 3:10 AM From: Oleksandr Tyshchenko The IPMMU-VMSA is VMSA-compatible I/O Memory Management Unit (IOMMU) which provides address translation and access protection functionalities to processing units and interconnect networks. Please note, current driver is supposed to work only with newest Gen3 SoCs revisions which IPMMU hardware supports stage 2 translation This is still "Gen3", so that please replace it with "R-Car Gen3"... Will do. table format and is able to use CPU's P2M table as is if one is 3-level page table (up to 40 bit IPA). The major differences compare to the Linux driver are: 1. Stage 1/Stage 2 translation. Linux driver supports Stage 1 translation only (with Stage 1 translation table format). It manages page table by itself. But Xen driver supports Stage 2 translation (with Stage 2 translation table format) to be able to share the P2M with the CPU. Stage 1 translation is always bypassed in Xen driver. So, Xen driver is supposed to be used with newest R-Car Gen3 SoC revisions only (H3 ES3.0, M3-W+, etc.) which IPMMU H/W supports stage 2 translation table format. 2. AArch64 support. Linux driver uses VMSAv8-32 mode, while Xen driver enables Armv8 VMSAv8-64 mode to cover up to 40 bit input address. 3. Context bank (sets of page table) usage. In Xen, each context bank is mapped to one Xen domain. So, all devices being pass throughed to the same Xen domain share the same context bank. 4. IPMMU device tracking. In Xen, all IOMMU devices are managed by single driver instance. So, driver uses global list to keep track of registered IPMMU devices. Signed-off-by: Oleksandr Tyshchenko CC: Julien Grall CC: Yoshihiro Shimoda About this hardware handling, this patch seems good to me. But, since I'm not familiar about Xen passthrough framework, I think I cannot add my Reviewed-by tag into this patch. What do you think? I am not completely sure regarding that, but I have seen cases when the people give their R-b not for the entire patch, but for a part of it. Especially, when patch touches many sub-systems/archs. Would you mind if I specify what your R-b covers? -> [for the IPMMU H/W bits] Thank you for the review. Best regards, Yoshihiro Shimoda -- Regards, Oleksandr Tyshchenko ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] x86: clear RDRAND CPUID bit on AMD family 15h/16h
On 29/08/2019 10:28, Jan Beulich wrote: > Inspired by Linux commit c49a0a80137c7ca7d6ced4c812c9e07a949f6f24: > > There have been reports of RDRAND issues after resuming from suspend on > some AMD family 15h and family 16h systems. This issue stems from a BIOS > not performing the proper steps during resume to ensure RDRAND continues > to function properly. > > Update the CPU initialization to clear the RDRAND CPUID bit for any family > 15h and 16h processor that supports RDRAND. If it is known that the family > 15h or family 16h system does not have an RDRAND resume issue or that the > system will not be placed in suspend, the "cpuid=rdrand" kernel parameter > can be used to stop the clearing of the RDRAND CPUID bit. > > Note, that clearing the RDRAND CPUID bit does not prevent a processor > that normally supports the RDRAND instruction from executing it. So any > code that determined the support based on family and model won't #UD. > > Signed-off-by: Jan Beulich > --- > Slightly RFC, in particular because of the change to parse_xen_cpuid(): > Alternative approach suggestions are welcome. This issue was on my radar, but I hadn't got around to starting a patch yet. I was wondering if we could perhaps default it to whether S3 was available, but looking at the code which prints "ACPI sleep modes: S3", it doesn't appear to be related to any real ACPI information. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 06/11] swiotlb-xen: always use dma-direct helpers to alloc coherent pages
+ Boris, Juergen On Mon, 26 Aug 2019, Christoph Hellwig wrote: > x86 currently calls alloc_pages, but using dma-direct works as well > there, with the added benefit of using the CMA pool if available. > The biggest advantage is of course to remove a pointless bit of > architecture specific code. > > Signed-off-by: Christoph Hellwig > --- > arch/x86/include/asm/xen/page-coherent.h | 16 > drivers/xen/swiotlb-xen.c| 7 +++ > include/xen/arm/page-coherent.h | 12 > 3 files changed, 3 insertions(+), 32 deletions(-) > > diff --git a/arch/x86/include/asm/xen/page-coherent.h > b/arch/x86/include/asm/xen/page-coherent.h > index 116777e7f387..8ee33c5edded 100644 > --- a/arch/x86/include/asm/xen/page-coherent.h > +++ b/arch/x86/include/asm/xen/page-coherent.h > @@ -5,22 +5,6 @@ > #include > #include > > -static inline void *xen_alloc_coherent_pages(struct device *hwdev, size_t > size, > - dma_addr_t *dma_handle, gfp_t flags, > - unsigned long attrs) > -{ > - void *vstart = (void*)__get_free_pages(flags, get_order(size)); > - *dma_handle = virt_to_phys(vstart); This is where we need Boris and Juergen's opinion. From an ARM POV it looks OK. > - return vstart; > -} > - > -static inline void xen_free_coherent_pages(struct device *hwdev, size_t size, > - void *cpu_addr, dma_addr_t dma_handle, > - unsigned long attrs) > -{ > - free_pages((unsigned long) cpu_addr, get_order(size)); > -} > - > static inline void xen_dma_map_page(struct device *hwdev, struct page *page, >dma_addr_t dev_addr, unsigned long offset, size_t size, >enum dma_data_direction dir, unsigned long attrs) { } > diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c > index b8808677ae1d..f9dd4cb6e4b3 100644 > --- a/drivers/xen/swiotlb-xen.c > +++ b/drivers/xen/swiotlb-xen.c > @@ -299,8 +299,7 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t > size, >* address. In fact on ARM virt_to_phys only works for kernel direct >* mapped RAM memory. Also see comment below. >*/ > - ret = xen_alloc_coherent_pages(hwdev, size, dma_handle, flags, attrs); > - > + ret = dma_direct_alloc(hwdev, size, dma_handle, flags, attrs); > if (!ret) > return ret; > > @@ -319,7 +318,7 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t > size, > else { > if (xen_create_contiguous_region(phys, order, >fls64(dma_mask), dma_handle) > != 0) { > - xen_free_coherent_pages(hwdev, size, ret, > (dma_addr_t)phys, attrs); > + dma_direct_free(hwdev, size, ret, (dma_addr_t)phys, > attrs); > return NULL; > } > SetPageXenRemapped(virt_to_page(ret)); > @@ -351,7 +350,7 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t > size, void *vaddr, > TestClearPageXenRemapped(virt_to_page(vaddr))) > xen_destroy_contiguous_region(phys, order); > > - xen_free_coherent_pages(hwdev, size, vaddr, (dma_addr_t)phys, attrs); > + dma_direct_free(hwdev, size, vaddr, (dma_addr_t)phys, attrs); > } > > /* > diff --git a/include/xen/arm/page-coherent.h b/include/xen/arm/page-coherent.h > index a840d6949a87..0e244f4fec1a 100644 > --- a/include/xen/arm/page-coherent.h > +++ b/include/xen/arm/page-coherent.h > @@ -16,18 +16,6 @@ void __xen_dma_sync_single_for_cpu(struct device *hwdev, > void __xen_dma_sync_single_for_device(struct device *hwdev, > dma_addr_t handle, size_t size, enum dma_data_direction dir); > > -static inline void *xen_alloc_coherent_pages(struct device *hwdev, size_t > size, > - dma_addr_t *dma_handle, gfp_t flags, unsigned long attrs) > -{ > - return dma_direct_alloc(hwdev, size, dma_handle, flags, attrs); > -} > - > -static inline void xen_free_coherent_pages(struct device *hwdev, size_t size, > - void *cpu_addr, dma_addr_t dma_handle, unsigned long attrs) > -{ > - dma_direct_free(hwdev, size, cpu_addr, dma_handle, attrs); > -} > - > static inline void xen_dma_sync_single_for_cpu(struct device *hwdev, > dma_addr_t handle, size_t size, enum dma_data_direction dir) > { > -- > 2.20.1 > ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 03/11] xen/arm: simplify dma_cache_maint
On Tue, 27 Aug 2019, Christoph Hellwig wrote: > And this was still buggy I think, it really needs some real Xen/Arm > testing which I can't do. Hopefully better version below: > > -- > >From 5ad4b6e291dbb49f65480c9b769414931cbd485a Mon Sep 17 00:00:00 2001 > From: Christoph Hellwig > Date: Wed, 24 Jul 2019 15:26:08 +0200 > Subject: xen/arm: simplify dma_cache_maint > > Calculate the required operation in the caller, and pass it directly > instead of recalculating it for each page, and use simple arithmetics > to get from the physical address to Xen page size aligned chunks. > > Signed-off-by: Christoph Hellwig > --- > arch/arm/xen/mm.c | 61 --- > 1 file changed, 21 insertions(+), 40 deletions(-) > > diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c > index 90574d89d0d4..2fde161733b0 100644 > --- a/arch/arm/xen/mm.c > +++ b/arch/arm/xen/mm.c > @@ -35,64 +35,45 @@ unsigned long xen_get_swiotlb_free_pages(unsigned int > order) > return __get_free_pages(flags, order); > } > > -enum dma_cache_op { > - DMA_UNMAP, > - DMA_MAP, > -}; > static bool hypercall_cflush = false; > > -/* functions called by SWIOTLB */ > - > -static void dma_cache_maint(dma_addr_t handle, unsigned long offset, > - size_t size, enum dma_data_direction dir, enum dma_cache_op op) > +/* buffers in highmem or foreign pages cannot cross page boundaries */ > +static void dma_cache_maint(dma_addr_t handle, size_t size, u32 op) > { > struct gnttab_cache_flush cflush; > - unsigned long xen_pfn; > - size_t left = size; > > - xen_pfn = (handle >> XEN_PAGE_SHIFT) + offset / XEN_PAGE_SIZE; > - offset %= XEN_PAGE_SIZE; > + cflush.a.dev_bus_addr = handle & XEN_PAGE_MASK; > + cflush.offset = xen_offset_in_page(handle); > + cflush.op = op; > > do { > - size_t len = left; > - > - /* buffers in highmem or foreign pages cannot cross page > - * boundaries */ > - if (len + offset > XEN_PAGE_SIZE) > - len = XEN_PAGE_SIZE - offset; > - > - cflush.op = 0; > - cflush.a.dev_bus_addr = xen_pfn << XEN_PAGE_SHIFT; > - cflush.offset = offset; > - cflush.length = len; > - > - if (op == DMA_UNMAP && dir != DMA_TO_DEVICE) > - cflush.op = GNTTAB_CACHE_INVAL; > - if (op == DMA_MAP) { > - if (dir == DMA_FROM_DEVICE) > - cflush.op = GNTTAB_CACHE_INVAL; > - else > - cflush.op = GNTTAB_CACHE_CLEAN; > - } > - if (cflush.op) > - HYPERVISOR_grant_table_op(GNTTABOP_cache_flush, > , 1); > + if (size + cflush.offset > XEN_PAGE_SIZE) > + cflush.length = XEN_PAGE_SIZE - cflush.offset; > + else > + cflush.length = size; > + > + HYPERVISOR_grant_table_op(GNTTABOP_cache_flush, , 1); > > - offset = 0; > - xen_pfn++; > - left -= len; > - } while (left); > + cflush.offset = 0; > + cflush.a.dev_bus_addr += cflush.length; > + size -= cflush.length; Yes that's better Reviewed-by: Stefano Stabellini > + } while (size); > } > > static void __xen_dma_page_dev_to_cpu(struct device *hwdev, dma_addr_t > handle, > size_t size, enum dma_data_direction dir) > { > - dma_cache_maint(handle & PAGE_MASK, handle & ~PAGE_MASK, size, dir, > DMA_UNMAP); > + if (dir != DMA_TO_DEVICE) > + dma_cache_maint(handle, size, GNTTAB_CACHE_INVAL); > } > > static void __xen_dma_page_cpu_to_dev(struct device *hwdev, dma_addr_t > handle, > size_t size, enum dma_data_direction dir) > { > - dma_cache_maint(handle & PAGE_MASK, handle & ~PAGE_MASK, size, dir, > DMA_MAP); > + if (dir == DMA_FROM_DEVICE) > + dma_cache_maint(handle, size, GNTTAB_CACHE_INVAL); > + else > + dma_cache_maint(handle, size, GNTTAB_CACHE_CLEAN); > } > > void __xen_dma_map_page(struct device *hwdev, struct page *page, > -- > 2.20.1 > ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2 08/12] livepatch: Add support for inline asm hotpatching expectations
> Ah, I forgot about endianness of course... > I am sending an improved patch. I hope it works this time as expected. Works nicely! (Tested only on ARM64, hadn't tried ARM32 yet). ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 07/11] swiotlb-xen: use the same foreign page check everywhere
On Mon, 26 Aug 2019, Christoph Hellwig wrote: > xen_dma_map_page uses a different and more complicated check for foreign > pages than the other three cache maintainance helpers. Switch it to the > simpler pfn_valid method a well, and document the scheme with a single > improved comment in xen_dma_map_page. > > Signed-off-by: Christoph Hellwig Reviewed-by: Stefano Stabellini > --- > include/xen/arm/page-coherent.h | 31 +-- > 1 file changed, 9 insertions(+), 22 deletions(-) > > diff --git a/include/xen/arm/page-coherent.h b/include/xen/arm/page-coherent.h > index 0e244f4fec1a..07c104dbc21f 100644 > --- a/include/xen/arm/page-coherent.h > +++ b/include/xen/arm/page-coherent.h > @@ -41,23 +41,17 @@ static inline void xen_dma_map_page(struct device *hwdev, > struct page *page, >dma_addr_t dev_addr, unsigned long offset, size_t size, >enum dma_data_direction dir, unsigned long attrs) > { > - unsigned long page_pfn = page_to_xen_pfn(page); > - unsigned long dev_pfn = XEN_PFN_DOWN(dev_addr); > - unsigned long compound_pages = > - (1< - bool local = (page_pfn <= dev_pfn) && > - (dev_pfn - page_pfn < compound_pages); > + unsigned long pfn = PFN_DOWN(dev_addr); > > /* > - * Dom0 is mapped 1:1, while the Linux page can span across > - * multiple Xen pages, it's not possible for it to contain a > - * mix of local and foreign Xen pages. So if the first xen_pfn > - * == mfn the page is local otherwise it's a foreign page > - * grant-mapped in dom0. If the page is local we can safely > - * call the native dma_ops function, otherwise we call the xen > - * specific function. > + * Dom0 is mapped 1:1, and while the Linux page can span across multiple > + * Xen pages, it is not possible for it to contain a mix of local and > + * foreign Xen pages. Calling pfn_valid on a foreign mfn will always > + * return false, so if pfn_valid returns true the pages is local and we > + * can use the native dma-direct functions, otherwise we call the Xen > + * specific version. >*/ > - if (local) > + if (pfn_valid(pfn)) > dma_direct_map_page(hwdev, page, offset, size, dir, attrs); > else > __xen_dma_map_page(hwdev, page, dev_addr, offset, size, dir, > attrs); > @@ -67,14 +61,7 @@ static inline void xen_dma_unmap_page(struct device > *hwdev, dma_addr_t handle, > size_t size, enum dma_data_direction dir, unsigned long attrs) > { > unsigned long pfn = PFN_DOWN(handle); > - /* > - * Dom0 is mapped 1:1, while the Linux page can be spanned accross > - * multiple Xen page, it's not possible to have a mix of local and > - * foreign Xen page. Dom0 is mapped 1:1, so calling pfn_valid on a > - * foreign mfn will always return false. If the page is local we can > - * safely call the native dma_ops function, otherwise we call the xen > - * specific function. > - */ > + > if (pfn_valid(pfn)) > dma_direct_unmap_page(hwdev, handle, size, dir, attrs); > else > -- > 2.20.1 > ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 09/11] swiotlb-xen: remove page-coherent.h
On Mon, 26 Aug 2019, Christoph Hellwig wrote: > The only thing left of page-coherent.h is two functions implemented by > the architecture for non-coherent DMA support that are never called for > fully coherent architectures. Just move the prototypes for those to > swiotlb-xen.h instead. > > Signed-off-by: Christoph Hellwig Reviewed-by: Stefano Stabellini > --- > arch/arm/include/asm/xen/page-coherent.h | 2 -- > arch/arm64/include/asm/xen/page-coherent.h | 2 -- > arch/x86/include/asm/xen/page-coherent.h | 11 --- > drivers/xen/swiotlb-xen.c | 3 --- > include/Kbuild | 1 - > include/xen/arm/page-coherent.h| 10 -- > include/xen/swiotlb-xen.h | 6 ++ > 7 files changed, 6 insertions(+), 29 deletions(-) > delete mode 100644 arch/arm/include/asm/xen/page-coherent.h > delete mode 100644 arch/arm64/include/asm/xen/page-coherent.h > delete mode 100644 arch/x86/include/asm/xen/page-coherent.h > delete mode 100644 include/xen/arm/page-coherent.h > > diff --git a/arch/arm/include/asm/xen/page-coherent.h > b/arch/arm/include/asm/xen/page-coherent.h > deleted file mode 100644 > index 27e984977402.. > --- a/arch/arm/include/asm/xen/page-coherent.h > +++ /dev/null > @@ -1,2 +0,0 @@ > -/* SPDX-License-Identifier: GPL-2.0 */ > -#include > diff --git a/arch/arm64/include/asm/xen/page-coherent.h > b/arch/arm64/include/asm/xen/page-coherent.h > deleted file mode 100644 > index 27e984977402.. > --- a/arch/arm64/include/asm/xen/page-coherent.h > +++ /dev/null > @@ -1,2 +0,0 @@ > -/* SPDX-License-Identifier: GPL-2.0 */ > -#include > diff --git a/arch/x86/include/asm/xen/page-coherent.h > b/arch/x86/include/asm/xen/page-coherent.h > deleted file mode 100644 > index c9c8398a31ff.. > --- a/arch/x86/include/asm/xen/page-coherent.h > +++ /dev/null > @@ -1,11 +0,0 @@ > -/* SPDX-License-Identifier: GPL-2.0 */ > -#ifndef _ASM_X86_XEN_PAGE_COHERENT_H > -#define _ASM_X86_XEN_PAGE_COHERENT_H > - > -static inline void xen_dma_sync_single_for_cpu(struct device *hwdev, > - dma_addr_t handle, size_t size, enum dma_data_direction dir) { } > - > -static inline void xen_dma_sync_single_for_device(struct device *hwdev, > - dma_addr_t handle, size_t size, enum dma_data_direction dir) { } > - > -#endif /* _ASM_X86_XEN_PAGE_COHERENT_H */ > diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c > index a642e284f1e2..95911ff9c11c 100644 > --- a/drivers/xen/swiotlb-xen.c > +++ b/drivers/xen/swiotlb-xen.c > @@ -35,9 +35,6 @@ > #include > #include > > -#include > -#include > - > #include > /* > * Used to do a quick range check in swiotlb_tbl_unmap_single and > diff --git a/include/Kbuild b/include/Kbuild > index c38f0d46b267..cce5cf6abf89 100644 > --- a/include/Kbuild > +++ b/include/Kbuild > @@ -1189,7 +1189,6 @@ header-test-+= video/vga.h > header-test- += video/w100fb.h > header-test- += xen/acpi.h > header-test- += xen/arm/hypercall.h > -header-test- += xen/arm/page-coherent.h > header-test- += xen/arm/page.h > header-test- += xen/balloon.h > header-test- += xen/events.h > diff --git a/include/xen/arm/page-coherent.h b/include/xen/arm/page-coherent.h > deleted file mode 100644 > index 635492d41ebe.. > --- a/include/xen/arm/page-coherent.h > +++ /dev/null > @@ -1,10 +0,0 @@ > -/* SPDX-License-Identifier: GPL-2.0 */ > -#ifndef _XEN_ARM_PAGE_COHERENT_H > -#define _XEN_ARM_PAGE_COHERENT_H > - > -void xen_dma_sync_for_cpu(struct device *dev, dma_addr_t handle, > - phys_addr_t paddr, size_t size, enum dma_data_direction dir); > -void xen_dma_sync_for_device(struct device *dev, dma_addr_t handle, > - phys_addr_t paddr, size_t size, enum dma_data_direction dir); > - > -#endif /* _XEN_ARM_PAGE_COHERENT_H */ > diff --git a/include/xen/swiotlb-xen.h b/include/xen/swiotlb-xen.h > index 5e4b83f83dbc..a7c642872568 100644 > --- a/include/xen/swiotlb-xen.h > +++ b/include/xen/swiotlb-xen.h > @@ -2,8 +2,14 @@ > #ifndef __LINUX_SWIOTLB_XEN_H > #define __LINUX_SWIOTLB_XEN_H > > +#include > #include > > +void xen_dma_sync_for_cpu(struct device *dev, dma_addr_t handle, > + phys_addr_t paddr, size_t size, enum dma_data_direction dir); > +void xen_dma_sync_for_device(struct device *dev, dma_addr_t handle, > + phys_addr_t paddr, size_t size, enum dma_data_direction dir); > + > extern int xen_swiotlb_init(int verbose, bool early); > extern const struct dma_map_ops xen_swiotlb_dma_ops; > > -- > 2.20.1 > ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 08/11] swiotlb-xen: simplify cache maintainance
On Mon, 26 Aug 2019, Christoph Hellwig wrote: > Now that we know we always have the dma-noncoherent.h helpers available > if we are on an architecture with support for non-coherent devices, > we can just call them directly, and remove the calls to the dma-direct > routines, including the fact that we call the dma_direct_map_page > routines but ignore the value returned from it. Instead we now have > Xen wrappers for the arch_sync_dma_for_{device,cpu} helpers that call > the special Xen versions of those routines for foreign pages. > > Note that the new helpers get the physical address passed in addition > to the dma address to avoid another translation for the local cache > maintainance. The pfn_valid checks remain on the dma address as in > the old code, even if that looks a little funny. > > Signed-off-by: Christoph Hellwig > > --- > arch/arm/xen/mm.c| 64 ++ > arch/x86/include/asm/xen/page-coherent.h | 11 > drivers/xen/swiotlb-xen.c| 20 +++ > include/xen/arm/page-coherent.h | 69 ++-- > 4 files changed, 31 insertions(+), 133 deletions(-) WOW nice! Now I really can see why this series was worth doing :-) Reviewed-by: Stefano Stabellini > diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c > index b7d53415532b..7096652f5a1e 100644 > --- a/arch/arm/xen/mm.c > +++ b/arch/arm/xen/mm.c > @@ -61,63 +61,33 @@ static void dma_cache_maint(dma_addr_t handle, size_t > size, u32 op) > } while (size); > } > > -static void __xen_dma_page_dev_to_cpu(struct device *hwdev, dma_addr_t > handle, > - size_t size, enum dma_data_direction dir) > +/* > + * Dom0 is mapped 1:1, and while the Linux page can span across multiple Xen > + * pages, it is not possible for it to contain a mix of local and foreign Xen > + * pages. Calling pfn_valid on a foreign mfn will always return false, so if > + * pfn_valid returns true the pages is local and we can use the native > + * dma-direct functions, otherwise we call the Xen specific version. > + */ > +void xen_dma_sync_for_cpu(struct device *dev, dma_addr_t handle, > + phys_addr_t paddr, size_t size, enum dma_data_direction dir) > { > - if (dir != DMA_TO_DEVICE) > + if (pfn_valid(PFN_DOWN(handle))) > + arch_sync_dma_for_cpu(dev, paddr, size, dir); > + else if (dir != DMA_TO_DEVICE) > dma_cache_maint(handle, size, GNTTAB_CACHE_INVAL); > } > > -static void __xen_dma_page_cpu_to_dev(struct device *hwdev, dma_addr_t > handle, > - size_t size, enum dma_data_direction dir) > +void xen_dma_sync_for_device(struct device *dev, dma_addr_t handle, > + phys_addr_t paddr, size_t size, enum dma_data_direction dir) > { > - if (dir == DMA_FROM_DEVICE) > + if (pfn_valid(PFN_DOWN(handle))) > + arch_sync_dma_for_device(dev, paddr, size, dir); > + else if (dir == DMA_FROM_DEVICE) > dma_cache_maint(handle, size, GNTTAB_CACHE_INVAL); > else > dma_cache_maint(handle, size, GNTTAB_CACHE_CLEAN); > } > > -void __xen_dma_map_page(struct device *hwdev, struct page *page, > - dma_addr_t dev_addr, unsigned long offset, size_t size, > - enum dma_data_direction dir, unsigned long attrs) > -{ > - if (dev_is_dma_coherent(hwdev)) > - return; > - if (attrs & DMA_ATTR_SKIP_CPU_SYNC) > - return; > - > - __xen_dma_page_cpu_to_dev(hwdev, dev_addr, size, dir); > -} > - > -void __xen_dma_unmap_page(struct device *hwdev, dma_addr_t handle, > - size_t size, enum dma_data_direction dir, > - unsigned long attrs) > - > -{ > - if (dev_is_dma_coherent(hwdev)) > - return; > - if (attrs & DMA_ATTR_SKIP_CPU_SYNC) > - return; > - > - __xen_dma_page_dev_to_cpu(hwdev, handle, size, dir); > -} > - > -void __xen_dma_sync_single_for_cpu(struct device *hwdev, > - dma_addr_t handle, size_t size, enum dma_data_direction dir) > -{ > - if (dev_is_dma_coherent(hwdev)) > - return; > - __xen_dma_page_dev_to_cpu(hwdev, handle, size, dir); > -} > - > -void __xen_dma_sync_single_for_device(struct device *hwdev, > - dma_addr_t handle, size_t size, enum dma_data_direction dir) > -{ > - if (dev_is_dma_coherent(hwdev)) > - return; > - __xen_dma_page_cpu_to_dev(hwdev, handle, size, dir); > -} > - > bool xen_arch_need_swiotlb(struct device *dev, > phys_addr_t phys, > dma_addr_t dev_addr) > diff --git a/arch/x86/include/asm/xen/page-coherent.h > b/arch/x86/include/asm/xen/page-coherent.h > index 8ee33c5edded..c9c8398a31ff 100644 > --- a/arch/x86/include/asm/xen/page-coherent.h > +++ b/arch/x86/include/asm/xen/page-coherent.h > @@ -2,17 +2,6 @@ > #ifndef _ASM_X86_XEN_PAGE_COHERENT_H > #define _ASM_X86_XEN_PAGE_COHERENT_H > > -#include > -#include > - > -static
Re: [Xen-devel] [PATCH 10/11] swiotlb-xen: merge xen_unmap_single into xen_swiotlb_unmap_page
On Mon, 26 Aug 2019, Christoph Hellwig wrote: > No need for a no-op wrapper. > > Signed-off-by: Christoph Hellwig Reviewed-by: Stefano Stabellini > --- > drivers/xen/swiotlb-xen.c | 15 --- > 1 file changed, 4 insertions(+), 11 deletions(-) > > diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c > index 95911ff9c11c..384304a77020 100644 > --- a/drivers/xen/swiotlb-xen.c > +++ b/drivers/xen/swiotlb-xen.c > @@ -414,9 +414,8 @@ static dma_addr_t xen_swiotlb_map_page(struct device > *dev, struct page *page, > * After this call, reads by the cpu to the buffer are guaranteed to see > * whatever the device wrote there. > */ > -static void xen_unmap_single(struct device *hwdev, dma_addr_t dev_addr, > - size_t size, enum dma_data_direction dir, > - unsigned long attrs) > +static void xen_swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr, > + size_t size, enum dma_data_direction dir, unsigned long attrs) > { > phys_addr_t paddr = xen_bus_to_phys(dev_addr); > > @@ -430,13 +429,6 @@ static void xen_unmap_single(struct device *hwdev, > dma_addr_t dev_addr, > swiotlb_tbl_unmap_single(hwdev, paddr, size, dir, attrs); > } > > -static void xen_swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr, > - size_t size, enum dma_data_direction dir, > - unsigned long attrs) > -{ > - xen_unmap_single(hwdev, dev_addr, size, dir, attrs); > -} > - > static void > xen_swiotlb_sync_single_for_cpu(struct device *dev, dma_addr_t dma_addr, > size_t size, enum dma_data_direction dir) > @@ -477,7 +469,8 @@ xen_swiotlb_unmap_sg(struct device *hwdev, struct > scatterlist *sgl, int nelems, > BUG_ON(dir == DMA_NONE); > > for_each_sg(sgl, sg, nelems, i) > - xen_unmap_single(hwdev, sg->dma_address, sg_dma_len(sg), dir, > attrs); > + xen_swiotlb_unmap_page(hwdev, sg->dma_address, sg_dma_len(sg), > + dir, attrs); > > } > > -- > 2.20.1 > ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 11/11] arm64: use asm-generic/dma-mapping.h
On Mon, 26 Aug 2019, Christoph Hellwig wrote: > Now that the Xen special cases are gone nothing worth mentioning is > left in the arm64 file, so switch to use the > asm-generic version instead. > > Signed-off-by: Christoph Hellwig > Acked-by: Will Deacon Reviewed-by: Stefano Stabellini > --- > arch/arm64/include/asm/Kbuild| 1 + > arch/arm64/include/asm/dma-mapping.h | 22 -- > arch/arm64/mm/dma-mapping.c | 1 + > 3 files changed, 2 insertions(+), 22 deletions(-) > delete mode 100644 arch/arm64/include/asm/dma-mapping.h > > diff --git a/arch/arm64/include/asm/Kbuild b/arch/arm64/include/asm/Kbuild > index c52e151afab0..98a5405c8558 100644 > --- a/arch/arm64/include/asm/Kbuild > +++ b/arch/arm64/include/asm/Kbuild > @@ -4,6 +4,7 @@ generic-y += delay.h > generic-y += div64.h > generic-y += dma.h > generic-y += dma-contiguous.h > +generic-y += dma-mapping.h > generic-y += early_ioremap.h > generic-y += emergency-restart.h > generic-y += hw_irq.h > diff --git a/arch/arm64/include/asm/dma-mapping.h > b/arch/arm64/include/asm/dma-mapping.h > deleted file mode 100644 > index 67243255a858.. > --- a/arch/arm64/include/asm/dma-mapping.h > +++ /dev/null > @@ -1,22 +0,0 @@ > -/* SPDX-License-Identifier: GPL-2.0-only */ > -/* > - * Copyright (C) 2012 ARM Ltd. > - */ > -#ifndef __ASM_DMA_MAPPING_H > -#define __ASM_DMA_MAPPING_H > - > -#ifdef __KERNEL__ > - > -#include > -#include > - > -#include > -#include > - > -static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type > *bus) > -{ > - return NULL; > -} > - > -#endif /* __KERNEL__ */ > -#endif /* __ASM_DMA_MAPPING_H */ > diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c > index 4b244a037349..6578abcfbbc7 100644 > --- a/arch/arm64/mm/dma-mapping.c > +++ b/arch/arm64/mm/dma-mapping.c > @@ -8,6 +8,7 @@ > #include > #include > #include > +#include > #include > > #include > -- > 2.20.1 > ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [ovmf test] 140769: regressions - FAIL
flight 140769 ovmf real [real] http://logs.test-lab.xenproject.org/osstest/logs/140769/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-amd64-xl-qemuu-ovmf-amd64 18 guest-start/debianhvm.repeat fail REGR. vs. 140559 version targeted for testing: ovmf 30b4abc6e9141356639207d1cc280058a0229a36 baseline version: ovmf abc0155b034230128ad4aaa51ac05a315acfa7c1 Last test of basis 140559 2019-08-23 04:52:13 Z6 days Testing same since 140684 2019-08-26 22:09:16 Z2 days3 attempts People who touched revisions under test: Michael D Kinney Tim Lewis jobs: build-amd64-xsm pass build-i386-xsm pass build-amd64 pass build-i386 pass build-amd64-libvirt pass build-i386-libvirt pass build-amd64-pvopspass build-i386-pvops pass test-amd64-amd64-xl-qemuu-ovmf-amd64 fail test-amd64-i386-xl-qemuu-ovmf-amd64 pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Not pushing. commit 30b4abc6e9141356639207d1cc280058a0229a36 Author: Michael D Kinney Date: Thu Aug 22 10:36:10 2019 +0800 EmulatorPkg/Win/Host: Fix SecPrint() log line endings Update use of SecPrint() to consistently use \n\r for line endings to fix formatting issues in the debug log. Cc: Jordan Justen Reviewed-by: Ray Ni Cc: Andrew Fish Tested-by: Tim Lewis Signed-off-by: Michael D Kinney commit 10ccc27c9548ab2550689e68f6f45dc48120bc79 Author: Michael D Kinney Date: Thu Aug 22 10:36:09 2019 +0800 EmulatorPkg/Win/Host: Fix image unload regression https://bugzilla.tianocore.org/show_bug.cgi?id=2104 When UEFI Applications or UEFI Drivers are unloaded, the PeCoffLoaderUnloadImageExtraAction() needs to unload the image using FreeLibrary() if the image was successfully loaded using LoadLibrrayEx(). This is a regression from the Nt32Pkg that supported unloading applications and drivers as well as loading the same application or driver multiple times. Cc: Jordan Justen Reviewed-by: Ray Ni Cc: Andrew Fish Tested-by: Tim Lewis Signed-off-by: Michael D Kinney ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2 11/12] livepatch: Add metadata runtime retrieval mechanism
On Tue, Aug 27, 2019 at 08:46:23AM +, Pawel Wieczorkiewicz wrote: > Extend the livepatch list operation to fetch also payloads' metadata. > This is achieved by extending the sysctl list interface with 2 extra > guest handles: > * metadata - an array of arbitrary size strings > * metadata_len - an array of metadata strings' lengths (uin32_t each) > > Payloads' metadata is a string of arbitrary size and does not have an > upper bound limit. It may also vary in size between payloads. > > In order to let the userland allocate enough space for the incoming > data add a metadata total size field to the list sysctl operation and > fill it with total size of all payloads' metadata. > > Extend the libxc to handle the metadata back-to-back data transfers > as well as metadata length array data transfers. > > The xen-livepatch userland tool is extended to always display the > metadata for each received module. The metadata is received with the > following format: key=value\0key=value\0...key=value\0. The format is > modified to the following one: key=value;key=value;...key=value. > The new format allows to easily parse the metadata for a given module > by a machine. > > Signed-off-by: Pawel Wieczorkiewicz > Reviewed-by: Andra-Irina Paraschiv > Reviewed-by: Martin Pohlack > Reviewed-by: Norbert Manthey > --- > Changed since v1: > * added corresponding documentation > * make metadata optional (do not display it when given payload > does not have it) > > docs/misc/livepatch.pandoc| 31 - > tools/libxc/include/xenctrl.h | 22 +++ > tools/libxc/xc_misc.c | 64 > +++ > tools/misc/xen-livepatch.c| 43 +++-- > xen/common/livepatch.c| 22 +++ > xen/include/public/sysctl.h | 19 + > 6 files changed, 157 insertions(+), 44 deletions(-) And it may be worth folding this in: diff --git a/xen/test/livepatch/Makefile b/xen/test/livepatch/Makefile index a7857d3a2e..3f088e74b2 100644 --- a/xen/test/livepatch/Makefile +++ b/xen/test/livepatch/Makefile @@ -79,9 +79,17 @@ config.h: xen_hello_world_func.o xen_hello_world.o: config.h .PHONY: $(LIVEPATCH) -$(LIVEPATCH): xen_hello_world_func.o xen_hello_world.o note.o xen_note.o +$(LIVEPATCH): xen_hello_world_func.o xen_hello_world.o note.o xen_note.o modinfo.o $(LD) $(LDFLAGS) $(build_id_linker) -r -o $(LIVEPATCH) $^ +.PHONY: modinfo.o +modinfo.o: + (set -e; \ +printf "LIVEPATCH_RULEZ\0") > $@.bin + $(OBJCOPY) $(OBJCOPY_MAGIC) \ + --rename-section=.data=.modinfo,alloc,load,readonly,data,contents -S $@.bin $@ + #rm -f $@.bin + # # This target is only accessible if CONFIG_LIVEPATCH is defined, which # depends on $(build_id_linker) being available. Hence we do not .. which the test-case can test for. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel