[Xen-devel] /sys/hypervisor entries for Xen (Domain-0, PV, PVH and HVM)
Hi all, I'm working on fixing up the grub packages for Fedora in deducing the new BLS logic in Fedora and disabling it in non-compatible environments. BZ Report: https://bugzilla.redhat.com/show_bug.cgi?id=1703700 Currently, it seems that we can deduce the following two scenarios: in /sys/hypervisor: 1) type == xen && uuid == all zeros, then this is BLS safe (the Domain-0). 2) type == xen && uuid != all zeros, then this is BLS *unsafe* (covers PV, HVM and PVH guests). Is there any other variables that come into effect that could cause a variation in the above checks as to enable or disable BLS? Right now, I'm proposing that we try to disable the new BLS behaviour in Fedora for PV, HVM and PVH guests - as pygrub is not up to the task of booting them. We included HVM as it may be common for users to switch between HVM and PVH configurations for the same installed VM. Any comments either here or via the BZ report above would be welcome. Steven Haigh net...@crc.id.au https://www.crc.id.au +613 9001 6090 +614 1293 5897 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [xen-unstable-smoke test] 142459: tolerable all pass - PUSHED
flight 142459 xen-unstable-smoke real [real] http://logs.test-lab.xenproject.org/osstest/logs/142459/ Failures :-/ but no regressions. Tests which did not succeed, but are not blocking: test-amd64-amd64-libvirt 13 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 13 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 14 saverestore-support-checkfail never pass test-armhf-armhf-xl 13 migrate-support-checkfail never pass test-armhf-armhf-xl 14 saverestore-support-checkfail never pass version targeted for testing: xen 98d1dac88f82c2b79d528faabe5e3fda8133e8bb baseline version: xen f8abe4fe3c247b069daa59d84d479e42822d93de Last test of basis 142403 2019-10-07 15:01:16 Z1 days Testing same since 142459 2019-10-08 22:01:41 Z0 days1 attempts People who touched revisions under test: Julien Grall Stefano Stabellini Stefano Stabellini jobs: build-arm64-xsm pass build-amd64 pass build-armhf pass build-amd64-libvirt pass test-armhf-armhf-xl pass test-arm64-arm64-xl-xsm pass test-amd64-amd64-xl-qemuu-debianhvm-amd64pass test-amd64-amd64-libvirt pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : To xenbits.xen.org:/home/xen/git/xen.git f8abe4fe3c..98d1dac88f 98d1dac88f82c2b79d528faabe5e3fda8133e8bb -> smoke ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [linux-4.4 test] 142430: regressions - FAIL
flight 142430 linux-4.4 real [real] http://logs.test-lab.xenproject.org/osstest/logs/142430/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 7 xen-boot fail REGR. vs. 139698 test-amd64-amd64-xl-qemuu-ovmf-amd64 7 xen-boot fail REGR. vs. 139698 test-amd64-amd64-xl-pvshim 20 guest-start/debian.repeat fail REGR. vs. 139698 Tests which did not succeed, but are not blocking: test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 10 debian-hvm-install fail never pass test-amd64-i386-xl-pvshim12 guest-start fail never pass test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 10 debian-hvm-install fail never pass test-amd64-amd64-xl-pvhv2-amd 12 guest-start fail never pass test-arm64-arm64-xl-credit1 7 xen-boot fail never pass test-arm64-arm64-xl-xsm 7 xen-boot fail never pass test-arm64-arm64-xl-credit2 7 xen-boot fail never pass test-arm64-arm64-libvirt-xsm 7 xen-boot fail never pass test-arm64-arm64-xl-seattle 7 xen-boot fail never pass test-arm64-arm64-xl 7 xen-boot fail never pass test-amd64-amd64-libvirt 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail never pass test-amd64-i386-libvirt 13 migrate-support-checkfail never pass test-amd64-amd64-xl-pvhv2-intel 12 guest-start fail never pass test-amd64-i386-libvirt-xsm 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-arm64-arm64-xl-thunderx 7 xen-boot fail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2 fail never pass test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 13 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 14 saverestore-support-checkfail never pass test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail never pass test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop fail never pass test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stop fail never pass test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit1 13 migrate-support-checkfail never pass test-armhf-armhf-xl-credit1 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 13 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 14 saverestore-support-checkfail never pass test-armhf-armhf-xl 13 migrate-support-checkfail never pass test-armhf-armhf-xl 14 saverestore-support-checkfail never pass test-armhf-armhf-libvirt 13 migrate-support-checkfail never pass test-armhf-armhf-libvirt 14 saverestore-support-checkfail never pass test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail never pass test-arm64-arm64-examine 8 reboot fail never pass test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stop fail never pass test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail never pass test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop fail never pass test-armhf-armhf-xl-rtds 13 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail never pass test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop fail never pass test-armhf-armhf-xl-vhd 12 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 13 saverestore-support-checkfail never pass test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop fail never pass test-amd64-amd64-xl-qemut-win10-i386 10 windows-installfail never pass test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass test-amd64-i386-xl-qemut-win10-i386 10 windows-install fail never pass test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass version targeted for testing: linuxc61ebb668f2ce3c22d1cfe6df28bd3198eabbdd7 baseline version: linuxdc16a7e5f36d65b25a1b66ade14356773ed52875 Last test of basis 139698
[Xen-devel] [PATCH v1 2/2] xen/efi: optionally call SetVirtualAddressMap()
Some UEFI implementations are not happy about running in 1:1 addressing, but really virtual address space. Specifically, some access EfiBootServices{Code,Data}, or even totally unmapped areas. Example crash of GetVariable() call on Thinkpad W540: Xen call trace: [<0080>] 0080 [<8c2b0398edaa>] 8c2b0398edaa Pagetable walk from 858483a1: L4[0x1ff] = Panic on CPU 0: FATAL PAGE FAULT [error_code=0002] Faulting linear address: 858483a1 Fix this by calling SetVirtualAddressMap() runtime service, giving it 1:1 map for areas marked as needed during runtime. The address space in which EFI runtime services are called is unchanged, but UEFI view of it may be. SetVirtualAddressMap() can be called only once, hence it's incompatible with kexec. For this reason, make it an optional feature, depending on !KEXEC. And to not inflate number of supported configurations, hide it behind EXPERT. Signed-off-by: Marek Marczykowski-Górecki --- xen/common/Kconfig| 13 + xen/common/efi/boot.c | 28 ++-- 2 files changed, 39 insertions(+), 2 deletions(-) diff --git a/xen/common/Kconfig b/xen/common/Kconfig index 16829f6..fe98f8a 100644 --- a/xen/common/Kconfig +++ b/xen/common/Kconfig @@ -88,6 +88,19 @@ config KEXEC If unsure, say Y. +config SET_VIRTUAL_ADDRESS_MAP +bool "EFI: call SetVirtualAddressMap()" if EXPERT = "y" +default n +depends on !KEXEC +---help--- + Call EFI SetVirtualAddressMap() runtime service to setup memory map for + further runtime services. According to UEFI spec, it isn't strictly + necessary, but many UEFI implementations misbehave when this call is + missing. On the other hand, this call can be made only once, which makes + it incompatible with kexec (kexec-ing this Xen from other Xen or Linux). + + If unsuser, say N. + config XENOPROF def_bool y prompt "Xen Oprofile Support" if EXPERT = "y" diff --git a/xen/common/efi/boot.c b/xen/common/efi/boot.c index cddf3de..5c187db 100644 --- a/xen/common/efi/boot.c +++ b/xen/common/efi/boot.c @@ -1056,11 +1056,17 @@ static void __init efi_set_gop_mode(EFI_GRAPHICS_OUTPUT_PROTOCOL *gop, UINTN gop efi_arch_video_init(gop, info_size, mode_info); } +#define INVALID_VIRTUAL_ADDRESS (0xBAAADUL << \ + (EFI_PAGE_SHIFT + BITS_PER_LONG - 32)) + static void __init efi_exit_boot(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable) { EFI_STATUS status; UINTN info_size = 0, map_key; bool retry; +#ifdef CONFIG_SET_VIRTUAL_ADDRESS_MAP +unsigned int i; +#endif efi_bs->GetMemoryMap(_size, NULL, _key, _mdesc_size, _ver); @@ -1098,6 +1104,26 @@ static void __init efi_exit_boot(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *Syste efi_ct = (void *)efi_ct + DIRECTMAP_VIRT_START; efi_memmap = (void *)efi_memmap + DIRECTMAP_VIRT_START; efi_fw_vendor = (void *)efi_fw_vendor + DIRECTMAP_VIRT_START; + +#ifdef CONFIG_SET_VIRTUAL_ADDRESS_MAP +for ( i = 0; i < efi_memmap_size; i += efi_mdesc_size ) +{ +EFI_MEMORY_DESCRIPTOR *desc = efi_memmap + i; + +if ( desc->Attribute & EFI_MEMORY_RUNTIME ) +desc->VirtualStart = desc->PhysicalStart; +else +desc->VirtualStart = INVALID_VIRTUAL_ADDRESS; +} +status = efi_rs->SetVirtualAddressMap(efi_memmap_size, efi_mdesc_size, + mdesc_ver, efi_memmap); +if ( status != EFI_SUCCESS ) +{ +printk(XENLOG_ERR "EFI: SetVirtualAddressMap() failed (%#lx), disabling runtime services\n", + status); +__clear_bit(EFI_RS, _flags); +} +#endif } static int __init __maybe_unused set_color(u32 mask, int bpp, u8 *pos, u8 *sz) @@ -1460,8 +1486,6 @@ static bool __init rt_range_valid(unsigned long smfn, unsigned long emfn) return true; } -#define INVALID_VIRTUAL_ADDRESS (0xBAAADUL << \ - (EFI_PAGE_SHIFT + BITS_PER_LONG - 32)) void __init efi_init_memory(void) { -- git-series 0.9.1 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH v1 0/2] Optionally call EFI SetVirtualAddressMap()
Workaround buggy UEFI accessing boot services memory after ExitBootServices(). Patches discussed here: https://lists.xenproject.org/archives/html/xen-devel/2019-08/msg00701.html Cc: Juergen Gross Cc: Andrew Cooper Cc: George Dunlap Cc: Ian Jackson Cc: Jan Beulich Cc: Julien Grall Cc: Konrad Rzeszutek Wilk Cc: Stefano Stabellini Cc: Tim Deegan Cc: Wei Liu Marek Marczykowski-Górecki (2): efi: remove old SetVirtualAddressMap() arrangement xen/efi: optionally call SetVirtualAddressMap() xen/common/Kconfig| 13 - xen/common/efi/boot.c | 48 +++- 2 files changed, 39 insertions(+), 22 deletions(-) base-commit: 7a4e674905b3cbbe48e81c3222361a7f3579 -- git-series 0.9.1 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH v1 1/2] efi: remove old SetVirtualAddressMap() arrangement
Remove unused (#ifdef-ed out) code. Reviving it in its current shape won't fly because: - SetVirtualAddressMap() needs to be mapped with 1:1 mapping, which isn't the case at this time - it uses directmap, which is going away soon - it uses directmap, which is mapped with NX, breaking EfiRuntimeServicesCode No functional change. Signed-off-by: Marek Marczykowski-Górecki --- xen/common/efi/boot.c | 20 1 file changed, 20 deletions(-) diff --git a/xen/common/efi/boot.c b/xen/common/efi/boot.c index 7919378..cddf3de 100644 --- a/xen/common/efi/boot.c +++ b/xen/common/efi/boot.c @@ -29,9 +29,6 @@ #undef __ASSEMBLY__ #endif -/* Using SetVirtualAddressMap() is incompatible with kexec: */ -#undef USE_SET_VIRTUAL_ADDRESS_MAP - #define EFI_REVISION(major, minor) (((major) << 16) | (minor)) #define SMBIOS3_TABLE_GUID \ @@ -1099,9 +1096,6 @@ static void __init efi_exit_boot(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *Syste /* Adjust pointers into EFI. */ efi_ct = (void *)efi_ct + DIRECTMAP_VIRT_START; -#ifdef USE_SET_VIRTUAL_ADDRESS_MAP -efi_rs = (void *)efi_rs + DIRECTMAP_VIRT_START; -#endif efi_memmap = (void *)efi_memmap + DIRECTMAP_VIRT_START; efi_fw_vendor = (void *)efi_fw_vendor + DIRECTMAP_VIRT_START; } @@ -1422,7 +1416,6 @@ static int __init parse_efi_param(const char *s) } custom_param("efi", parse_efi_param); -#ifndef USE_SET_VIRTUAL_ADDRESS_MAP static __init void copy_mapping(unsigned long mfn, unsigned long end, bool (*is_valid)(unsigned long smfn, unsigned long emfn)) @@ -1466,7 +1459,6 @@ static bool __init rt_range_valid(unsigned long smfn, unsigned long emfn) { return true; } -#endif #define INVALID_VIRTUAL_ADDRESS (0xBAAADUL << \ (EFI_PAGE_SHIFT + BITS_PER_LONG - 32)) @@ -1474,13 +1466,11 @@ static bool __init rt_range_valid(unsigned long smfn, unsigned long emfn) void __init efi_init_memory(void) { unsigned int i; -#ifndef USE_SET_VIRTUAL_ADDRESS_MAP struct rt_extra { struct rt_extra *next; unsigned long smfn, emfn; unsigned int prot; } *extra, *extra_head = NULL; -#endif free_ebmalloc_unused_mem(); @@ -1563,7 +1553,6 @@ void __init efi_init_memory(void) printk(XENLOG_ERR "Could not map MFNs %#lx-%#lx\n", smfn, emfn - 1); } -#ifndef USE_SET_VIRTUAL_ADDRESS_MAP else if ( !((desc->PhysicalStart + len - 1) >> (VADDR_BITS - 1)) && (extra = xmalloc(struct rt_extra)) != NULL ) { @@ -1574,12 +1563,8 @@ void __init efi_init_memory(void) extra_head = extra; desc->VirtualStart = desc->PhysicalStart; } -#endif else { -#ifdef USE_SET_VIRTUAL_ADDRESS_MAP -/* XXX allocate e.g. down from FIXADDR_START */ -#endif printk(XENLOG_ERR "No mapping for MFNs %#lx-%#lx\n", smfn, emfn - 1); } @@ -1591,10 +1576,6 @@ void __init efi_init_memory(void) return; } -#ifdef USE_SET_VIRTUAL_ADDRESS_MAP -efi_rs->SetVirtualAddressMap(efi_memmap_size, efi_mdesc_size, - mdesc_ver, efi_memmap); -#else /* Set up 1:1 page tables to do runtime calls in "physical" mode. */ efi_l4_pgtable = alloc_xen_pagetable(); BUG_ON(!efi_l4_pgtable); @@ -1680,6 +1661,5 @@ void __init efi_init_memory(void) for ( i = l4_table_offset(HYPERVISOR_VIRT_START); i < l4_table_offset(DIRECTMAP_VIRT_END); ++i ) efi_l4_pgtable[i] = idle_pg_table[i]; -#endif } #endif -- git-series 0.9.1 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] Xen 4.12 panic on Thinkpad W540 with UEFI mutiboot2, efi=no-rs workarounds it
On Tue, Oct 08, 2019 at 06:29:27PM +0200, Marek Marczykowski-Górecki wrote: > On Tue, Oct 08, 2019 at 04:19:13PM +0200, Jan Beulich wrote: > > On 08.10.2019 15:52, Marek Marczykowski-Górecki wrote: > > > Regardless of SetVirtualAddressMap() discussion, I propose to > > > automatically map boot services code/data, to make Xen work on more > > > machines (even if _we_ consider those buggy). > > > > I remain on my prior position: Adding command line triggerable > > workarounds for such cases is fine. Defaulting to assume buggy > > firmware is acceptable only if this means no extra penalty to > > systems with conforming firmware. Hence, for the case at hand, > > I object to doing this automatically; we already have the > > /mapbs workaround in place to deal with the case when xen.efi > > is used. Judging from the title here there may need to be an > > addition to also allow triggering this from the MB2 boot path. > > What about mirroring Linux behavior? I.e. mapping those regions for the > SetVirtualAddressMap() time (when enabled) and unmapping after - unless > tagged with EFI_MEMORY_RUNTIME? Oh, even better: I can call SetVirtualAddressMap() while still in 1:1, just after ExitBootServices(), giving it 1:1-like map (for areas marked with EFI_MEMORY_RUNTIME). This way I don't need to really map BootServices areas (and exclude from Xen memory allocator), so there is no penalty for systems with conforming firmware. And indeed calling SetVirtualAddressMap() is enough for other runtime services (like GetVariable*) to behave correctly, even if boot services are not mapped anymore. So, the only downside is incompatibility with kexec. > Similarly to Andrew, I'd really prefer for Xen to work out of the box, > with as little as possible manual tweaks needed. > > Allowing SetVirtualAddressMap() when !KEXEC would be fine with me. > > The fly in the ointment here is that we'd prefer not to have such > > Kconfig options (at least not without EXPERT qualifier), as > > (security) supporting all the possible combinations would be a > > nightmare. If an EXPERT dependency is okay with you, then I'll be > > looking forward to your patch. > > EXPERT is fine with me. -- Best Regards, Marek Marczykowski-Górecki Invisible Things Lab A: Because it messes up the order in which people normally read text. Q: Why is top-posting such a bad thing? signature.asc Description: PGP signature ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH for-4.13 v3] xen/arm: fix buf size in make_cpus_node
The size of buf is calculated wrongly: the number is printed as a hexadecimal number, so we need 8 bytes for 32bit, not 10 bytes. As a result, it should be sizeof("cpu@") + 8 bytes for a 32-bit number + 1 byte for \0. Total = 13. mpidr_aff is 64-bit, however, only bits [0-23] are used. Add a check for that. Fixes: c81a791d34 (xen/arm: Set 'reg' of cpu node for dom0 to match MPIDR's affinity) Signed-off-by: Stefano Stabellini Release-acked-by: Juergen Gross --- Changes in v3: - make sure only [23:0] bits are used in mpidr_aff - clarify that we only need 32bit for buf writes Changes in v2: - patch added --- xen/arch/arm/domain_build.c | 12 +++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c index 921b054520..d5ee639548 100644 --- a/xen/arch/arm/domain_build.c +++ b/xen/arch/arm/domain_build.c @@ -789,7 +789,7 @@ static int __init make_cpus_node(const struct domain *d, void *fdt) const void *compatible = NULL; u32 len; /* Placeholder for cpu@ + a 32-bit number + \0 */ -char buf[15]; +char buf[13]; u32 clock_frequency; bool clock_valid; uint64_t mpidr_aff; @@ -847,8 +847,18 @@ static int __init make_cpus_node(const struct domain *d, void *fdt) * the MPIDR's affinity bits. We will use AFF0 and AFF1 when * constructing the reg value of the guest at the moment, for it * is enough for the current max vcpu number. + * + * We only deal with AFF{0, 1, 2} stored in bits [23:0] at the + * moment. */ mpidr_aff = vcpuid_to_vaffinity(cpu); +if ( (mpidr_aff & ~GENMASK_ULL(23, 0)) != 0 ) +{ +printk(XENLOG_ERR "Unable to handle MPIDR AFFINITY 0x%"PRIx64"\n", + mpidr_aff); +return -EINVAL; +} + dt_dprintk("Create cpu@%"PRIx64" (logical CPUID: %d) node\n", mpidr_aff, cpu); -- 2.17.1 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [xen-unstable test] 142417: regressions - FAIL
flight 142417 xen-unstable real [real] http://logs.test-lab.xenproject.org/osstest/logs/142417/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-amd64-libvirt-pair 22 guest-migrate/src_host/dst_host fail REGR. vs. 141822 test-amd64-i386-libvirt-pair 22 guest-migrate/src_host/dst_host fail REGR. vs. 141822 test-amd64-amd64-xl-pvshim 16 guest-localmigrate fail REGR. vs. 141822 Regressions which are regarded as allowable (not blocking): test-armhf-armhf-xl-rtds16 guest-start/debian.repeat fail REGR. vs. 141822 Tests which did not succeed, but are not blocking: test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stopfail like 141822 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 141822 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 141822 test-armhf-armhf-libvirt 14 saverestore-support-checkfail like 141822 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 141822 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail like 141822 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stopfail like 141822 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stopfail like 141822 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop fail like 141822 test-amd64-amd64-libvirt 13 migrate-support-checkfail never pass test-amd64-i386-libvirt-xsm 13 migrate-support-checkfail never pass test-amd64-i386-libvirt 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail never pass test-amd64-i386-xl-pvshim12 guest-start fail never pass test-arm64-arm64-xl-seattle 13 migrate-support-checkfail never pass test-arm64-arm64-xl-seattle 14 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2 fail never pass test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail never pass test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail never pass test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit1 13 migrate-support-checkfail never pass test-arm64-arm64-xl-credit1 14 saverestore-support-checkfail never pass test-arm64-arm64-xl 13 migrate-support-checkfail never pass test-arm64-arm64-xl 14 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit2 13 migrate-support-checkfail never pass test-arm64-arm64-xl-credit2 14 saverestore-support-checkfail never pass test-arm64-arm64-xl-thunderx 13 migrate-support-checkfail never pass test-arm64-arm64-xl-thunderx 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-arndale 13 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit1 13 migrate-support-checkfail never pass test-armhf-armhf-xl-credit1 14 saverestore-support-checkfail never pass test-armhf-armhf-libvirt 13 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 13 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 14 saverestore-support-checkfail never pass test-armhf-armhf-xl 13 migrate-support-checkfail never pass test-armhf-armhf-xl 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-rtds 13 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail never pass test-arm64-arm64-xl-xsm 13 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 14 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 12 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 13 saverestore-support-checkfail never pass test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop fail never pass test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass test-amd64-amd64-xl-qemut-win10-i386 10 windows-installfail never pass test-amd64-i386-xl-qemut-win10-i386 10 windows-install fail never pass test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail never pass
Re: [Xen-devel] [[PATCH for-4.13]] xen/arm: mm: Allow generic xen page-tables helpers to be called early
On Tue, 8 Oct 2019, Julien Grall wrote: > On 10/8/19 1:18 AM, Stefano Stabellini wrote: > > On Mon, 7 Oct 2019, Julien Grall wrote: > > > Hi, > > > > > > On 03/10/2019 02:02, Stefano Stabellini wrote: > > > > On Fri, 20 Sep 2019, Julien Grall wrote: > > > > > That's not correct. alloc_boot_pages() is actually here to allow > > > > > dynamic > > > > > allocation before the memory subsystem (and therefore the runtime > > > > > allocator) > > > > > is initialized. > > > > > > > > Let me change the question then: is the system_state == > > > > SYS_STATE_early_boot check strictly necessary? It looks like it is not: > > > > the patch would work even if it was just: > > > > > > I had a few thoughts about it. On Arm32, this only really works for > > > 32-bits machine address (it can go up to 40-bits). I haven't really > > > fully investigated what could go wrong, but it would be best to keep it > > > only for early boot. > > > > > > Also, I don't really want to rely on this "workaround" after boot. Maybe > > > we would want to keep them unmapped in the future. > > > > Yes, no problems, we agree on that. I am not asking in regards to the > > check system_state == SYS_STATE_early_boot with the goal of asking you > > to get rid of it. I am fine with keeping the check. (Maybe we want to add > > an `unlikely()' around the check.) > > > > I am trying to understand whether the code actually relies on > > system_state == SYS_STATE_early_boot, and, if so, why. The goal is to > > make sure that if there are some limitations that they are documented, > > or just to double-check that there are no limitations. > > The check is not strictly necessary. > > > > > In regards to your comment about only working for 32-bit addresses on > > Arm32, you have a point. At least we should be careful with the mfn to > > vaddr conversion because mfn_to_maddr returns a paddr_t which is 64-bit > > and vaddr_t is 32-bit. I imagine that theoretically, even with > > system_state == SYS_STATE_early_boot, it could get truncated with the > > wrong combination of mfn and phys_offset. > > > > If nothing else, maybe we should add a truncation check for safety? > > Except that phys_offset is not defined correctly, so your check below will > break some setup :(. Let's take the following example: > >Xen is loaded at PA 0x10 > > The boot offset is computed using 32-bit address (see head.S): > PA - VA = 0x10 - 0x20 > = 0xfff0 > > This value will be passed to C code as an unsigned long. But then we will > store it in a uint64_t/paddr_t (see phys_offset which is set in > setup_page_tables). Because this is a conversion from unsigned to unsigned, > the "sign bit" will not be propagated. > > This means that phys_offset will be equal to 0xfff0 and not > 0xfff0! > > Therefore if we try to convert 0x10 (where Xen has been loaded) back to > its VA, the resulting value will be 0x00200100. > > Looking at the code, I think pte_of_xenaddr() has also the exact problem. :( One way to fix it would be to change phys_offset to register_t (or just declare it as unsigned long on arm32 and unsigned long long on arm64): diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c index be23acfe26..c7ead39654 100644 --- a/xen/arch/arm/mm.c +++ b/xen/arch/arm/mm.c @@ -170,7 +170,7 @@ static DEFINE_PAGE_TABLE(xen_xenmap); /* Non-boot CPUs use this to find the correct pagetables. */ uint64_t init_ttbr; -static paddr_t phys_offset; +static register_t phys_offset; /* Limits of the Xen heap */ mfn_t xenheap_mfn_start __read_mostly = INVALID_MFN_INITIALIZER; It would work OK for virtual to physical conversions. Physical to virtual is more challenging because physical could go up to 40bits. Fortunately, it doesn't look like we are using phys_offset to do many phys-to-virt conversions. The only example is the one in this patch, and dump_hyp_walk. > I guess nobody tried to load Xen that low in memory on Arm32? But that's going > to be definitely an issues with the memory rework I have in mind. > > I have some other important work to finish for Xen 4.13. So I am thinking to > defer the problem I mention above for post Xen 4.13. Although, the GRUB issues > would still need to be fix. Any opinions? I think for Xen 4.13 I would like to get in your patch to fix GRUB booting, with a little bit more attention to integer issues. Something like the following: paddr_t pa_end = _end + phys_offset; paddr_t pa_start = _start + phys_offset; paddr_t pa = mfn_to_maddr(mfn); if ( pa < pa_end && pa >= pa_start ) return (lpae_t *)(vaddr_t)(pa - pa_start + XEN_VIRT_ADDR); I think it should work if phys_offset is register_t. Or we could have a check on pa >= 4G, something like: paddr_t pa = (register_t)mfn_to_maddr(mfn) - phys_offset; if ( mfn_to_maddr(mfn) <= UINT32_MAX && pa < _end && is_kernel((vaddr_t)pa) ) return (lpae_t *)(vaddr_t)pa; > Note that this is
Re: [Xen-devel] [PATCH v2 1/3] xen/arm: fix buf size in make_cpus_node
Hi Stefano, On 08/10/2019 22:18, Stefano Stabellini wrote: > On Tue, 8 Oct 2019, Julien Grall wrote: >> On 10/8/19 2:14 AM, Stefano Stabellini wrote: >>> The size of buf is calculated wrongly: the number is 64bit, not 32bit. >> >> While the variable mpdir_aff is 64-bit, we only write the first 32-bit in the >> property reg (#address-cells == 1 and fdt_property_cell()). So what needs to >> be modified is the format here. >> >> Also, looking the CPU bindings (see >> linux/Documentation/devicetree/bindings/arm/cpus.yaml), technically only the >> bits [23:0] of the mpidr should be used. The rest is zeroed. >> >> This is ok because vcpuid_to_vaffinity() is always returning a value >> following >> the requirements above. However, for correctness, this may want to be fixed. > > It looks like it would be best to change mpdir_aff to uint32_t and > change vcpuid_to_vaffinity to return a uint32_t. vcpuid_to_vaffinity is meant to return the AFFx bits of the MIDR (so 32-bit on Arm32 and 64-bit on Arm64). We are only using AFF0 and AFF1, so the rest is zeroed. But this does not mean we should switch to 32-bit. If we want to change the interface then, it should be register_t and not 32-bit. However, I didn't suggest to switch to 32-bit but to transfer the bits [23:0] to a variable and possibly check that the rest is 0. Maybe something like: uint32_t reg; reg = mpidr_aff & GENMASK(23, 0); /* We only are able to deal with AFF{0, 1, 2} stored in bits [23:0] at the moment */ if ( reg != mpidr_aff ) { printk(XENLOG_ERR "Unable to handle MPIDR AFFINITY 0x%"PRIx64"\n", mpidr_aff); return -EINVAL; } Cheers, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2 1/3] xen/arm: fix buf size in make_cpus_node
On Tue, 8 Oct 2019, Julien Grall wrote: > On 10/8/19 2:14 AM, Stefano Stabellini wrote: > > The size of buf is calculated wrongly: the number is 64bit, not 32bit. > > While the variable mpdir_aff is 64-bit, we only write the first 32-bit in the > property reg (#address-cells == 1 and fdt_property_cell()). So what needs to > be modified is the format here. > > Also, looking the CPU bindings (see > linux/Documentation/devicetree/bindings/arm/cpus.yaml), technically only the > bits [23:0] of the mpidr should be used. The rest is zeroed. > > This is ok because vcpuid_to_vaffinity() is always returning a value following > the requirements above. However, for correctness, this may want to be fixed. It looks like it would be best to change mpdir_aff to uint32_t and change vcpuid_to_vaffinity to return a uint32_t. Then of course the buf allocation would be buf[13]. Is that what you have in mind? > > Also the number is printed as a hexadecimal number, so we need 8 bytes > > for 32bit, not 10 bytes. > > > > As a result, it should be sizeof("cpu@") + 16 bytes for a 64-bit number > > + 1 byte for \0. Total = 21. > > > > Fixes: fafd682c3e (xen/arm: Create a fake cpus node in dom0 device tree) > > I am afraid this is not fixing this patch: > > snprintf(buf, sizeof(buf), "cpu@%u", cpu); > > So the 10 bytes were actually correct back then. > > The problem was introduced by commit c81a791d34 "xen/arm: Set 'reg' of cpu > node for dom0 to match MPIDR's affinity". Yes, I'll change it > > Signed-off-by: Stefano Stabellini > > --- > > Changes in v2: > > - patch added > > --- > > xen/arch/arm/domain_build.c | 4 ++-- > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c > > index 921b054520..60923a7051 100644 > > --- a/xen/arch/arm/domain_build.c > > +++ b/xen/arch/arm/domain_build.c > > @@ -788,8 +788,8 @@ static int __init make_cpus_node(const struct domain *d, > > void *fdt) > > unsigned int cpu; > > const void *compatible = NULL; > > u32 len; > > -/* Placeholder for cpu@ + a 32-bit number + \0 */ > > -char buf[15]; > > +/* Placeholder for cpu@ + a 64-bit number + \0 */ > > +char buf[21]; > > u32 clock_frequency; > > bool clock_valid; > > uint64_t mpidr_aff; ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [libvirt test] 142427: regressions - FAIL
flight 142427 libvirt real [real] http://logs.test-lab.xenproject.org/osstest/logs/142427/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-arm64-arm64-libvirt-qcow2 16 guest-start.2 fail REGR. vs. 142252 Tests which did not succeed, but are not blocking: test-armhf-armhf-libvirt 14 saverestore-support-checkfail like 142252 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail like 142252 test-amd64-amd64-libvirt 13 migrate-support-checkfail never pass test-amd64-i386-libvirt-xsm 13 migrate-support-checkfail never pass test-amd64-i386-libvirt 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail never pass test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail never pass test-arm64-arm64-libvirt 13 migrate-support-checkfail never pass test-arm64-arm64-libvirt 14 saverestore-support-checkfail never pass test-arm64-arm64-libvirt-qcow2 12 migrate-support-checkfail never pass test-arm64-arm64-libvirt-qcow2 13 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt 13 migrate-support-checkfail never pass test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail never pass version targeted for testing: libvirt 23605f58bfd5c47d0f6fbd2aa57a8bda15e720df baseline version: libvirt 2346b2f6564ae9f7ba35bc863cb0fab39cadeb12 Last test of basis 142252 2019-10-04 04:23:56 Z4 days Failing since142345 2019-10-06 04:19:12 Z2 days3 attempts Testing same since 142427 2019-10-08 04:18:58 Z0 days1 attempts People who touched revisions under test: Collin Walling Daniel P. Berrangé Daniel Veillard Fabiano Fidêncio Ján Tomko Pavel Mores Peter Krempa jobs: build-amd64-xsm pass build-arm64-xsm pass build-i386-xsm pass build-amd64 pass build-arm64 pass build-armhf pass build-i386 pass build-amd64-libvirt pass build-arm64-libvirt pass build-armhf-libvirt pass build-i386-libvirt pass build-amd64-pvopspass build-arm64-pvopspass build-armhf-pvopspass build-i386-pvops pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsmpass test-amd64-amd64-libvirt-xsm pass test-arm64-arm64-libvirt-xsm pass test-amd64-i386-libvirt-xsm pass test-amd64-amd64-libvirt pass test-arm64-arm64-libvirt pass test-armhf-armhf-libvirt pass test-amd64-i386-libvirt pass test-amd64-amd64-libvirt-pairpass test-amd64-i386-libvirt-pair pass test-arm64-arm64-libvirt-qcow2 fail test-armhf-armhf-libvirt-raw pass test-amd64-amd64-libvirt-vhd pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Not pushing. (No revision log;
[Xen-devel] [PATCH v2] xen: Stop abusing DT of_dma_configure API
As the removed comments say, these aren't DT based devices. of_dma_configure() is going to stop allowing a NULL DT node and calling it will no longer work. The comment is also now out of date as of commit 9ab91e7c5c51 ("arm64: default to the direct mapping in get_arch_dma_ops"). Direct mapping is now the default rather than dma_dummy_ops. According to Stefano and Oleksandr, the only other part needed is setting the DMA masks and there's no reason to restrict the masks to 32-bits. So set the masks to 64 bits. Cc: Robin Murphy Cc: Julien Grall Cc: Nicolas Saenz Julienne Cc: Oleksandr Andrushchenko Cc: Boris Ostrovsky Cc: Juergen Gross Cc: Stefano Stabellini Cc: Christoph Hellwig Cc: xen-devel@lists.xenproject.org Signed-off-by: Rob Herring --- v2: - Setup dma masks - Also fix xen_drm_front.c This can now be applied to the Xen tree independent of the coming of_dma_configure() changes. Rob drivers/gpu/drm/xen/xen_drm_front.c | 12 ++-- drivers/xen/gntdev.c| 13 ++--- 2 files changed, 4 insertions(+), 21 deletions(-) diff --git a/drivers/gpu/drm/xen/xen_drm_front.c b/drivers/gpu/drm/xen/xen_drm_front.c index ba1828acd8c9..4be49c1aef51 100644 --- a/drivers/gpu/drm/xen/xen_drm_front.c +++ b/drivers/gpu/drm/xen/xen_drm_front.c @@ -718,17 +718,9 @@ static int xen_drv_probe(struct xenbus_device *xb_dev, struct device *dev = _dev->dev; int ret; - /* -* The device is not spawn from a device tree, so arch_setup_dma_ops -* is not called, thus leaving the device with dummy DMA ops. -* This makes the device return error on PRIME buffer import, which -* is not correct: to fix this call of_dma_configure() with a NULL -* node to set default DMA ops. -*/ - dev->coherent_dma_mask = DMA_BIT_MASK(32); - ret = of_dma_configure(dev, NULL, true); + ret = dma_coerce_mask_and_coherent(dev, DMA_BIT_MASK(64)); if (ret < 0) { - DRM_ERROR("Cannot setup DMA ops, ret %d", ret); + DRM_ERROR("Cannot setup DMA mask, ret %d", ret); return ret; } diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c index a446a7221e13..81401f386c9c 100644 --- a/drivers/xen/gntdev.c +++ b/drivers/xen/gntdev.c @@ -22,6 +22,7 @@ #define pr_fmt(fmt) "xen:" KBUILD_MODNAME ": " fmt +#include #include #include #include @@ -34,9 +35,6 @@ #include #include #include -#ifdef CONFIG_XEN_GRANT_DMA_ALLOC -#include -#endif #include #include @@ -625,14 +623,7 @@ static int gntdev_open(struct inode *inode, struct file *flip) flip->private_data = priv; #ifdef CONFIG_XEN_GRANT_DMA_ALLOC priv->dma_dev = gntdev_miscdev.this_device; - - /* -* The device is not spawn from a device tree, so arch_setup_dma_ops -* is not called, thus leaving the device with dummy DMA ops. -* Fix this by calling of_dma_configure() with a NULL node to set -* default DMA ops. -*/ - of_dma_configure(priv->dma_dev, NULL, true); + dma_coerce_mask_and_coherent(priv->dma_dev, DMA_BIT_MASK(64)); #endif pr_debug("priv %p\n", priv); -- 2.20.1 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH for-4.13] x86/microcode: Drop trailing whitespace in printk()
This has actually been present since c/s bd7c09c0 in 2008, and survived through all of the recent microcode refactoring. Signed-off-by: Andrew Cooper --- CC: Jan Beulich CC: Wei Liu CC: Roger Pau Monné CC: Chao Gao CC: Juergen Gross This is a cosmetic nice-to-have which has no risk for 4.13. --- xen/arch/x86/microcode_intel.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/xen/arch/x86/microcode_intel.c b/xen/arch/x86/microcode_intel.c index 9ededcc73a..9f66057aad 100644 --- a/xen/arch/x86/microcode_intel.c +++ b/xen/arch/x86/microcode_intel.c @@ -319,7 +319,7 @@ static int apply_microcode(const struct microcode_patch *patch) return -EIO; } printk(KERN_INFO "microcode: CPU%d updated from revision " - "%#x to %#x, date = %04x-%02x-%02x \n", + "%#x to %#x, date = %04x-%02x-%02x\n", cpu_num, sig->rev, val[1], mc_intel->hdr.year, mc_intel->hdr.month, mc_intel->hdr.day); sig->rev = val[1]; -- 2.11.0 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [ovmf test] 142423: all pass - PUSHED
flight 142423 ovmf real [real] http://logs.test-lab.xenproject.org/osstest/logs/142423/ Perfect :-) All tests in this flight passed as required version targeted for testing: ovmf 410c4d00d9f7e369d1ce183e9e8de98cb59c4d20 baseline version: ovmf d19040804afb2bdd60f18e8aef7da78028575fe6 Last test of basis 142274 2019-10-04 14:39:34 Z4 days Testing same since 142423 2019-10-08 01:39:34 Z0 days1 attempts People who touched revisions under test: Liming Gao jobs: build-amd64-xsm pass build-i386-xsm pass build-amd64 pass build-i386 pass build-amd64-libvirt pass build-i386-libvirt pass build-amd64-pvopspass build-i386-pvops pass test-amd64-amd64-xl-qemuu-ovmf-amd64 pass test-amd64-i386-xl-qemuu-ovmf-amd64 pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : To xenbits.xen.org:/home/xen/git/osstest/ovmf.git d19040804a..410c4d00d9 410c4d00d9f7e369d1ce183e9e8de98cb59c4d20 -> xen-tested-master ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] HPET interrupt remapping during boot
Hello, I have no idea if this is a regression or not. I suspect it might not be, and has always been broken. Either way, I'm seeing occasional single interrupt remapping errors when booting a range of Intel systems (XEN) x2APIC mode is already enabled by BIOS. (XEN) Using APIC driver x2apic_cluster ... (XEN) Platform timer is 23.999MHz HPET (XEN) Detected 2194.922 MHz processor. ... (XEN) HVM: HAP page sizes: 4kB, 2MB, 1GB (XEN) alt table 82d08047a070 -> 82d080486c6c (XEN) [VT-D]INTR-REMAP: Request device [:f0:1f.0] fault index 0, iommu reg = 82c00072d000 (XEN) [VT-D]INTR-REMAP: reason 22 - Present field in the IRTE entry is clear (XEN) microcode: CPU2 updated from revision 0x521 to 0x52b, date = 2019-08-12 From other debugging, I know that this happens after CPU 1 (which is a hyperthread) has passed through start_secondary(). f0:1f.0 is one of the IO-APICs, and if I've cross referenced the DMAR and APIC tables properly, is the IO-APIC on the PCH, making the problematic IRQ GSI0. This suggests that we have an error setting up the timer IRQ (as the HPET isn't MSI-capable), but we have already allegedly used it successfully earlier on boot. I haven't investigated further yet, but it is an intermittent issue (i.e. doesn't reproduce on each boot). My gut feeling is that we have something which corrects itself as a side effect of a later action. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH for Xen 4.13] iommu/arm: Remove arch_iommu_populate_page_table() completely
Hi, all Gentle reminder... -- Regards, Oleksandr Tyshchenko ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [qemu-mainline test] 142415: regressions - FAIL
flight 142415 qemu-mainline real [real] http://logs.test-lab.xenproject.org/osstest/logs/142415/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-amd64-xl-qemuu-ovmf-amd64 10 debian-hvm-install fail REGR. vs. 140282 test-amd64-i386-xl-qemuu-debianhvm-amd64 10 debian-hvm-install fail REGR. vs. 140282 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 10 debian-hvm-install fail REGR. vs. 140282 test-amd64-amd64-xl-qemuu-debianhvm-amd64-shadow 10 debian-hvm-install fail REGR. vs. 140282 test-amd64-amd64-xl-qemuu-win7-amd64 10 windows-install fail REGR. vs. 140282 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 debian-hvm-install fail REGR. vs. 140282 test-amd64-amd64-qemuu-nested-intel 10 debian-hvm-install fail REGR. vs. 140282 test-amd64-amd64-xl-qemuu-debianhvm-amd64 10 debian-hvm-install fail REGR. vs. 140282 test-amd64-amd64-qemuu-nested-amd 10 debian-hvm-install fail REGR. vs. 140282 test-amd64-i386-qemuu-rhel6hvm-intel 10 redhat-install fail REGR. vs. 140282 test-amd64-amd64-xl-qemuu-ws16-amd64 10 windows-install fail REGR. vs. 140282 test-amd64-amd64-xl-qemuu-debianhvm-i386-xsm 10 debian-hvm-install fail REGR. vs. 140282 test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow 10 debian-hvm-install fail REGR. vs. 140282 test-amd64-i386-qemuu-rhel6hvm-amd 10 redhat-install fail REGR. vs. 140282 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 10 debian-hvm-install fail REGR. vs. 140282 test-amd64-i386-freebsd10-amd64 11 guest-start fail REGR. vs. 140282 test-amd64-i386-xl-qemuu-ws16-amd64 10 windows-install fail REGR. vs. 140282 test-amd64-i386-freebsd10-i386 11 guest-startfail REGR. vs. 140282 test-amd64-i386-xl-qemuu-debianhvm-i386-xsm 10 debian-hvm-install fail REGR. vs. 140282 test-amd64-i386-xl-qemuu-win7-amd64 10 windows-install fail REGR. vs. 140282 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 debian-hvm-install fail REGR. vs. 140282 test-amd64-i386-xl-qemuu-ovmf-amd64 10 debian-hvm-install fail REGR. vs. 140282 Tests which did not succeed, but are not blocking: test-armhf-armhf-libvirt 14 saverestore-support-checkfail like 140282 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail like 140282 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass test-amd64-i386-xl-pvshim12 guest-start fail never pass test-amd64-amd64-libvirt 13 migrate-support-checkfail never pass test-amd64-i386-libvirt 13 migrate-support-checkfail never pass test-amd64-i386-libvirt-xsm 13 migrate-support-checkfail never pass test-arm64-arm64-xl-seattle 13 migrate-support-checkfail never pass test-arm64-arm64-xl-seattle 14 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail never pass test-arm64-arm64-xl 13 migrate-support-checkfail never pass test-arm64-arm64-xl 14 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit2 13 migrate-support-checkfail never pass test-arm64-arm64-xl-credit2 14 saverestore-support-checkfail never pass test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail never pass test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail never pass test-arm64-arm64-xl-thunderx 13 migrate-support-checkfail never pass test-arm64-arm64-xl-thunderx 14 saverestore-support-checkfail never pass test-arm64-arm64-xl-xsm 13 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 14 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit1 13 migrate-support-checkfail never pass test-arm64-arm64-xl-credit1 14 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 13 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-rtds 13 migrate-support-checkfail never pass test-armhf-armhf-xl 13 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail never pass test-armhf-armhf-xl 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 13 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit1 13 migrate-support-checkfail never pass test-armhf-armhf-xl-credit1 14
Re: [Xen-devel] [PATCH v3] x86/mm: don't needlessly veto migration
On 08/10/2019 17:10, Jan Beulich wrote: > From: Paul Durrant > > Now that xl.cfg has an option to explicitly enable IOMMU mappings for a > domain, migration may be needlessly vetoed due to the check of > is_iommu_enabled() in paging_log_dirty_enable(). > There is actually no need to prevent logdirty from being enabled unless > devices are assigned to a domain. > > NOTE: While in the neighbourhood, the bool_t parameter type in > paging_log_dirty_enable() is replaced with a bool and the format > of the comment in assign_device() is fixed. > > Signed-off-by: Paul Durrant > Signed-off-by: Jan Beulich > Release-acked-by: Juergen Gross Seriously FFS. Why am I having to repeat myself? What if any way unclear on the previous threads? NACK NACK NACK. Xen is, and has always been, the wrong place to have any logic, because IT DOESN'T HAVE ENOUGH INFORMATION TO MAKE THE DECISION CORRECTLY. The toolstack does. Therefore, the toolstack is the only level capable decide whether it is safe to migration/suspend/resume/checkpoint the VM. If I have to write the patches myself, I will, but this patch in this form is frankly unacceptable. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] Xen 4.12 panic on Thinkpad W540 with UEFI mutiboot2, efi=no-rs workarounds it
On Tue, Oct 08, 2019 at 04:19:13PM +0200, Jan Beulich wrote: > On 08.10.2019 15:52, Marek Marczykowski-Górecki wrote: > > On Tue, Oct 08, 2019 at 03:08:29PM +0200, Jan Beulich wrote: > >> On 08.10.2019 13:50, Marek Marczykowski-Górecki wrote: > >>> I explored it a bit more and talked with a few people doing firmware > >>> development and few conclusions: > >>> 1. Not calling SetVirtualAddressMap(), while technically legal, is > >>> pretty uncommon and not recommended if you want to avoid less tested > >>> (aka buggy) UEFI code paths. > >>> 2. Every UEFI call before SetVirtualAddressMap() call should be done > >>> with flat physical memory. This include SetVirtualAddressMap() call > >>> itself. Implicitly this means such calls can legally access memory areas > >>> not marked with EFI_MEMORY_RUNTIME. > >> > >> I don't think this is quite right - whether non-runtime memory may > >> be touched depends exclusively on ExitBootServices() (not) having > >> got called (yet). > > > > That would be logical. In practice however we have evidences firmware > > vendors have different opinion... A comment from Linux (already quoted > > here 2 months ago): > > > > /* > > * The UEFI specification makes it clear that the operating system is > > * free to do whatever it wants with boot services code after > > * ExitBootServices() has been called. Ignoring this recommendation a > > * significant bunch of EFI implementations continue calling into boot > > * services code (SetVirtualAddressMap). In order to work around such > > * buggy implementations we reserve boot services region during EFI > > * init and make sure it stays executable. Then, after > > * SetVirtualAddressMap(), it is discarded. > > * > > * However, some boot services regions contain data that is required > > * by drivers, so we need to track which memory ranges can never be > > * freed. This is done by tagging those regions with the > > * EFI_MEMORY_RUNTIME attribute. > > * > > * Any driver that wants to mark a region as reserved must use > > * efi_mem_reserve() which will insert a new EFI memory descriptor > > * into efi.memmap (splitting existing regions if necessary) and tag > > * it with EFI_MEMORY_RUNTIME. > > */ > > But you realize that the comment specifically talks about > the call tree underneath SetVirtualAddressMap() being the violator. > As long as we don't call this function, we're unaffected as far as > this comment goes. Well, this very thread proves it isn't only about SetVirtualAddressMap(). I _guess_ it's about calls before SetVirtualAddressMap() returns (which supposedly do some cleanups). In Linux case, it is only that one call. > > Regardless of SetVirtualAddressMap() discussion, I propose to > > automatically map boot services code/data, to make Xen work on more > > machines (even if _we_ consider those buggy). > > I remain on my prior position: Adding command line triggerable > workarounds for such cases is fine. Defaulting to assume buggy > firmware is acceptable only if this means no extra penalty to > systems with conforming firmware. Hence, for the case at hand, > I object to doing this automatically; we already have the > /mapbs workaround in place to deal with the case when xen.efi > is used. Judging from the title here there may need to be an > addition to also allow triggering this from the MB2 boot path. What about mirroring Linux behavior? I.e. mapping those regions for the SetVirtualAddressMap() time (when enabled) and unmapping after - unless tagged with EFI_MEMORY_RUNTIME? Similarly to Andrew, I'd really prefer for Xen to work out of the box, with as little as possible manual tweaks needed. > >>> Then I've tried a different approach: call SetVirtualAddressMap(), but > >>> with an address map that tries to pretend physical addressing (the code > >>> under #ifndef USE_SET_VIRTUAL_ADDRESS_MAP). This mostly worked, I needed > >>> only few changes: > >>> - set VirtualStart back to PhysicalStart in that memory map (it was set > >>>to directmap) > >>> - map boot services (at least for the SetVirtualAddressMap() call time, > >>>but haven't tried unmapping it later) > >>> - call SetVirtualAddressMap() with that "1:1" map in place, using > >>>efi_rs_enter/efi_rs_leave. > >>> > >>> This fixed the issue for me, now runtime services do work even without > >>> disabling ExitBootServices() call. And without any extra > >>> platform-specific command line arguments. And I think it also shouldn't > >>> break > >>> kexec, since it uses 1:1-like map, but I haven't tried. One should > >>> simply ignore EFI_UNSUPPORTED return code (I don't know how to avoid the > >>> call at all after kexec). > >> > >> That's the point - it can't be avoided, and hence it failing is not > >> an option. Or else there needs to be a protocol telling kexec what > >> it is to expect, and that it in particular can't change the virtual > >>
Re: [Xen-devel] [xen-unstable test] 142383: regressions - FAIL
On 08.10.2019 13:31, Jan Beulich wrote: > On 08.10.2019 13:15, Roger Pau Monné wrote: >> On Tue, Oct 08, 2019 at 12:42:25PM +0200, Jürgen Groß wrote: >>> On 07.10.19 22:56, osstest service owner wrote: flight 142383 xen-unstable real [real] http://logs.test-lab.xenproject.org/osstest/logs/142383/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-amd64-libvirt-pair 22 guest-migrate/src_host/dst_host fail REGR. vs. 141822 test-amd64-i386-libvirt-pair 22 guest-migrate/src_host/dst_host fail REGR. vs. 141822 >>> >>> Hmm, test log says the guest didn't suspend. >>> >>> Could that be related to commit b183e180bce9303 ? >> >> The libvirt libxl daemon spits: >> >> 2019-10-07 12:31:09.953+: libxl-save-helper: starting save: Success >> 2019-10-07 12:31:09.953+: xc: fd 40, dom 1, flags 1, hvm 0 >> 2019-10-07 12:31:09.954+: xc: Saving domain 1, type x86 PV >> 2019-10-07 12:31:09.954+: xc: 64 bits, 4 levels >> 2019-10-07 12:31:09.959+: xc: max_mfn 0x28 >> 2019-10-07 12:31:09.959+: xc: p2m list from 0xc900 to >> 0xc90f, root at 0x27de0a >> 2019-10-07 12:31:09.959+: xc: max_pfn 0x1, p2m_frames 256 >> 2019-10-07 12:31:09.960+: xc: Failed to enable logdirty: 22,0,22 (22 = >> Invalid argument): Internal error >> 2019-10-07 12:31:09.960+: xc: Save failed (22 = Invalid argument): >> Internal error >> 2019-10-07 12:31:09.981+: libxl-save-helper: complete r=-1: Invalid >> argument >> 2019-10-07 12:31:09.983+: libxl: libxl.c:752:libxl__fd_flags_restore: >> fnctl F_SETFL of fd 40 to 0x2 >> 2019-10-07 12:31:09.983+: libxl: libxl_event.c:1873:libxl__ao_complete: >> ao 0x7f760c002110: complete, rc=-8 >> 2019-10-07 12:31:09.983+: libxl: libxl_event.c:1842:libxl__ao__destroy: >> ao 0x7f760c002110: destroy >> >> Which seem to be related to the recent iommu changes, I would say it's >> likely hitting the is_iommu_enabled check in paging_log_dirty_enable >> and hence returning EINVAL. >> >> Adding Paul and Jan who worked on the series. > > Well, as mentioned in reply to Paul's patch earlier today, I intend > to commit the agreed upon half of it later today. Actually I can't - the patch wants an ack from George first. I've submitted v3 for this reason. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH v3] x86/mm: don't needlessly veto migration
From: Paul Durrant Now that xl.cfg has an option to explicitly enable IOMMU mappings for a domain, migration may be needlessly vetoed due to the check of is_iommu_enabled() in paging_log_dirty_enable(). There is actually no need to prevent logdirty from being enabled unless devices are assigned to a domain. NOTE: While in the neighbourhood, the bool_t parameter type in paging_log_dirty_enable() is replaced with a bool and the format of the comment in assign_device() is fixed. Signed-off-by: Paul Durrant Signed-off-by: Jan Beulich Release-acked-by: Juergen Gross --- v3: - reduce to iommu_use_hap_pt()-less variant v2: - expand commit comment --- a/xen/arch/x86/mm/hap/hap.c +++ b/xen/arch/x86/mm/hap/hap.c @@ -71,7 +71,7 @@ int hap_track_dirty_vram(struct domain * if ( !paging_mode_log_dirty(d) ) { -rc = paging_log_dirty_enable(d, 0); +rc = paging_log_dirty_enable(d, false); if ( rc ) goto out; } --- a/xen/arch/x86/mm/paging.c +++ b/xen/arch/x86/mm/paging.c @@ -209,15 +209,15 @@ static int paging_free_log_dirty_bitmap( return rc; } -int paging_log_dirty_enable(struct domain *d, bool_t log_global) +int paging_log_dirty_enable(struct domain *d, bool log_global) { int ret; -if ( is_iommu_enabled(d) && log_global ) +if ( has_arch_pdevs(d) && log_global ) { /* * Refuse to turn on global log-dirty mode - * if the domain is using the IOMMU. + * if the domain is sharing the P2M with the IOMMU. */ return -EINVAL; } @@ -727,7 +727,7 @@ int paging_domctl(struct domain *d, stru break; /* Else fall through... */ case XEN_DOMCTL_SHADOW_OP_ENABLE_LOGDIRTY: -return paging_log_dirty_enable(d, 1); +return paging_log_dirty_enable(d, true); case XEN_DOMCTL_SHADOW_OP_OFF: if ( (rc = paging_log_dirty_disable(d, resuming)) != 0 ) --- a/xen/include/asm-x86/paging.h +++ b/xen/include/asm-x86/paging.h @@ -157,7 +157,7 @@ void paging_log_dirty_range(struct domai uint8_t *dirty_bitmap); /* enable log dirty */ -int paging_log_dirty_enable(struct domain *d, bool_t log_global); +int paging_log_dirty_enable(struct domain *d, bool log_global); /* log dirty initialization */ void paging_log_dirty_init(struct domain *d, const struct log_dirty_ops *ops); ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 2/2] xen/arm: domain_build: Don't expose IOMMU specific properties to the guest
On 08.10.19 00:02, Julien Grall wrote: Hi, Hi Julien But, this is clearly what happen in current Xen setup if the driver is not enabled. What I want to avoid is exposing an half complete bindings to the guest (you don't know how it will behave). So we either remove all the properties and node related to the IOMMUs or nothing. I think, I got it. Our target is *not* to add a way for Dom0 to use IOMMU, but to be consistent in removing IOMMU node/master device properties. Now, we remove the IOMMU node if Xen identifies it (the IOMMU driver is present in Xen), so looks like we have to remove master device properties only if this master device is behind the IOMMU which node is removed. This, I hope, will avoid exposing an half complete bindings to guest. Am I right? [1] https://lists.xenproject.org/archives/html/xen-devel/2019-08/msg00858.html -- If you happy with that logic I will craft a proper patch. The logic looks alright to me. One comment below, I will look at the rest once this is formally sent. ok diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c index 67021d9..6d45e55 100644 --- a/xen/arch/arm/domain_build.c +++ b/xen/arch/arm/domain_build.c @@ -480,10 +480,26 @@ static int __init write_properties(struct domain *d, struct kernel_info *kinfo, const struct dt_property *prop, *status = NULL; int res = 0; int had_dom0_bootargs = 0; + struct dt_device_node *iommu_node; if ( kinfo->cmdline && kinfo->cmdline[0] ) bootargs = >cmdline[0]; + /* + * If we skip the IOMMU device when creating DT for Dom0 (even if I would prefer if we use "hwdom" over "Dom0". They are both the same on Arm, but this may change in the future (we may actually drop Dom0 ;)). Already sent v2, please see: https://lists.xenproject.org/archives/html/xen-devel/2019-10/msg00673.html -- Regards, Oleksandr Tyshchenko ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH for-4.13 v2] xen/arm: domain_build: Don't expose IOMMU specific properties to hwdom
From: Oleksandr Tyshchenko We don't passthrough IOMMU device to hwdom even if it is not used by Xen. Therefore exposing the properties that describe relationship between master devices and IOMMUs does not make any sense. According to the: 1. Documentation/devicetree/bindings/iommu/iommu.txt 2. Documentation/devicetree/bindings/pci/pci-iommu.txt Signed-off-by: Oleksandr Tyshchenko --- Changes V1 [1] -> V2: - Only skip IOMMU specific properties of the master device if we skip the corresponding IOMMU device - Use "hwdom" over "Dom0" [1] https://lists.xenproject.org/archives/html/xen-devel/2019-10/msg00104.html --- xen/arch/arm/domain_build.c | 29 + 1 file changed, 29 insertions(+) diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c index 6d6dd52..a7321b8 100644 --- a/xen/arch/arm/domain_build.c +++ b/xen/arch/arm/domain_build.c @@ -480,10 +480,26 @@ static int __init write_properties(struct domain *d, struct kernel_info *kinfo, const struct dt_property *prop, *status = NULL; int res = 0; int had_dom0_bootargs = 0; +struct dt_device_node *iommu_node; if ( kinfo->cmdline && kinfo->cmdline[0] ) bootargs = >cmdline[0]; +/* + * If we skip the IOMMU device when creating DT for hwdom (even if + * the IOMMU device is not used by Xen), we should also skip the IOMMU + * specific properties of the master device behind it in order to avoid + * exposing an half complete IOMMU bindings to hwdom. + * Use "iommu_node" as an indicator of the master device which properties + * should be skipped. + */ +iommu_node = dt_parse_phandle(node, "iommus", 0); +if ( iommu_node ) +{ +if ( device_get_class(iommu_node) != DEVICE_IOMMU ) +iommu_node = NULL; +} + dt_for_each_property_node (node, prop) { const void *prop_data = prop->value; @@ -540,6 +556,19 @@ static int __init write_properties(struct domain *d, struct kernel_info *kinfo, continue; } +if ( iommu_node ) +{ +/* Don't expose IOMMU specific properties to hwdom */ +if ( dt_property_name_is_equal(prop, "iommus") ) +continue; + +if ( dt_property_name_is_equal(prop, "iommu-map") ) +continue; + +if ( dt_property_name_is_equal(prop, "iommu-map-mask") ) +continue; +} + res = fdt_property(kinfo->fdt, prop->name, prop_data, prop_len); if ( res ) -- 2.7.4 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] pci: clear host_maskall field on assign
On 08.10.2019 16:16, Roger Pau Monné wrote: > On Tue, Oct 08, 2019 at 03:32:25PM +0200, Jan Beulich wrote: >> On 08.10.2019 15:14, Roger Pau Monné wrote: >>> On Tue, Oct 08, 2019 at 01:28:49PM +0200, Jan Beulich wrote: On 08.10.2019 13:09, Roger Pau Monné wrote: > Given that as you correctly point out maskall is unset after device > reset, I feel that option 4 is the best one since it matches the state > of the hardware after reset. Right, that's the variant coming closest to what hardware state ought to be at that point. We'd need to double check that the per-entry mask bits are all set at that point. >>> >>> I'm not saying such check is not worth doing, but why do it in this >>> case but not when also clearing the maskall (in msix_capability_init) >>> when called from prepare_msix? >> >> By "double check" I meant inspect the source, not to add checking logic. > > Oh, I implied you wanted to iterate over all entries and check that > the mask bit is set for each. > > It's my understanding that Xen relies on dom0 having done a device > reset before it being assigned, which masks all entries. I've checked > the pciback code and the reset is performed when the device is > assigned to dom0 (ie: guest shutdown or hot-unplug), and hence when > the device is assigned to a different domain the state of it should be > the after reset one. > > I can add a comment in assign_device that Xen expects the device state > to be the after reset one, and hence host_maskall = guest_maskall = > false and all entries should have the mask bit set. Yes please. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] Xen 4.12 panic on Thinkpad W540 with UEFI mutiboot2, efi=no-rs workarounds it
On 08.10.2019 15:52, Marek Marczykowski-Górecki wrote: > On Tue, Oct 08, 2019 at 03:08:29PM +0200, Jan Beulich wrote: >> On 08.10.2019 13:50, Marek Marczykowski-Górecki wrote: >>> On Thu, Aug 08, 2019 at 08:03:49AM +0200, Jan Beulich wrote: On 08.08.2019 04:53, Marek Marczykowski-Górecki wrote: > On Wed, Aug 07, 2019 at 09:26:00PM +0200, Marek Marczykowski-Górecki > wrote: >> Ok, regardless of adding proper option for that, I've hardcoded map_bs=1 >> and it still crashes, just slightly differently: >> >> Xen call trace: >> [<0080>] 0080 >> [<8c2b0398edaa>] 8c2b0398edaa >> >> Pagetable walk from 858483a1: >> L4[0x1ff] = >> >> >> Panic on CPU 0: >> FATAL PAGE FAULT >> [error_code=0002] >> Faulting linear address: 858483a1 >> >> >> Full message attached. > > After playing more with it and also know workarounds for various EFI > issues, I've found a way to boot it: avoid calling Exit BootServices. > There was a patch from Konrad adding /noexit option, that never get > committed. Similar to efi=mapbs option, I'd add efi=no-exitboot too > (once efi=mapbs patch is accepted). > > Anyway, I'm curious what exactly is wrong here. Is it that the firmware > is not happy about lack of SetVirtualAddressMap call? FWIW, the crash is > during GetVariable RS call. I've verified that the function itself is > within EfiRuntimeServicesCode, but I don't feel like tracing Lenovo > UEFI... This suggests that the firmware zaps a few too many pointers during ExitBootServices(). Perhaps internally they check whether pointers point into BootServices* memory, and hence the wrong marking in the memory map has consequences beyond the OS re-using such memory? A proper answer to your question can of course only be given by someone knowing this specific firmware version. >>> >>> I explored it a bit more and talked with a few people doing firmware >>> development and few conclusions: >>> 1. Not calling SetVirtualAddressMap(), while technically legal, is >>> pretty uncommon and not recommended if you want to avoid less tested >>> (aka buggy) UEFI code paths. >>> 2. Every UEFI call before SetVirtualAddressMap() call should be done >>> with flat physical memory. This include SetVirtualAddressMap() call >>> itself. Implicitly this means such calls can legally access memory areas >>> not marked with EFI_MEMORY_RUNTIME. >> >> I don't think this is quite right - whether non-runtime memory may >> be touched depends exclusively on ExitBootServices() (not) having >> got called (yet). > > That would be logical. In practice however we have evidences firmware > vendors have different opinion... A comment from Linux (already quoted > here 2 months ago): > > /* > * The UEFI specification makes it clear that the operating system is > * free to do whatever it wants with boot services code after > * ExitBootServices() has been called. Ignoring this recommendation a > * significant bunch of EFI implementations continue calling into boot > * services code (SetVirtualAddressMap). In order to work around such > * buggy implementations we reserve boot services region during EFI > * init and make sure it stays executable. Then, after > * SetVirtualAddressMap(), it is discarded. > * > * However, some boot services regions contain data that is required > * by drivers, so we need to track which memory ranges can never be > * freed. This is done by tagging those regions with the > * EFI_MEMORY_RUNTIME attribute. > * > * Any driver that wants to mark a region as reserved must use > * efi_mem_reserve() which will insert a new EFI memory descriptor > * into efi.memmap (splitting existing regions if necessary) and tag > * it with EFI_MEMORY_RUNTIME. > */ But you realize that the comment specifically talks about the call tree underneath SetVirtualAddressMap() being the violator. As long as we don't call this function, we're unaffected as far as this comment goes. > Regardless of SetVirtualAddressMap() discussion, I propose to > automatically map boot services code/data, to make Xen work on more > machines (even if _we_ consider those buggy). I remain on my prior position: Adding command line triggerable workarounds for such cases is fine. Defaulting to assume buggy firmware is acceptable only if this means no extra penalty to systems with conforming firmware. Hence, for the case at hand, I object to doing this automatically; we already have the /mapbs workaround in place to deal with the case when xen.efi is used. Judging from the title here there may need to be an
Re: [Xen-devel] [PATCH] pci: clear host_maskall field on assign
On Tue, Oct 08, 2019 at 03:32:25PM +0200, Jan Beulich wrote: > On 08.10.2019 15:14, Roger Pau Monné wrote: > > On Tue, Oct 08, 2019 at 01:28:49PM +0200, Jan Beulich wrote: > >> On 08.10.2019 13:09, Roger Pau Monné wrote: > >>> Given that as you correctly point out maskall is unset after device > >>> reset, I feel that option 4 is the best one since it matches the state > >>> of the hardware after reset. > >> > >> Right, that's the variant coming closest to what hardware state > >> ought to be at that point. We'd need to double check that the > >> per-entry mask bits are all set at that point. > > > > I'm not saying such check is not worth doing, but why do it in this > > case but not when also clearing the maskall (in msix_capability_init) > > when called from prepare_msix? > > By "double check" I meant inspect the source, not to add checking logic. Oh, I implied you wanted to iterate over all entries and check that the mask bit is set for each. It's my understanding that Xen relies on dom0 having done a device reset before it being assigned, which masks all entries. I've checked the pciback code and the reset is performed when the device is assigned to dom0 (ie: guest shutdown or hot-unplug), and hence when the device is assigned to a different domain the state of it should be the after reset one. I can add a comment in assign_device that Xen expects the device state to be the after reset one, and hence host_maskall = guest_maskall = false and all entries should have the mask bit set. Roger. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH v1 2/2] libxl: add removing XS backend path for PV devices on domain destroy
From: Oleksandr Grytsov Currently backend path is remove for specific devices: VBD, VIF, QDISK. This commit adds removing backend path for: VDISPL, VSND, VINPUT. Signed-off-by: Oleksandr Grytsov --- tools/libxl/libxl_device.c | 14 +- 1 file changed, 5 insertions(+), 9 deletions(-) diff --git a/tools/libxl/libxl_device.c b/tools/libxl/libxl_device.c index 1402b61a81..8ce70661e5 100644 --- a/tools/libxl/libxl_device.c +++ b/tools/libxl/libxl_device.c @@ -1477,7 +1477,7 @@ typedef struct libxl__ddomain_device { */ typedef struct libxl__ddomain_guest { uint32_t domid; -int num_vifs, num_vbds, num_qdisks; +int num_qdisks; LIBXL_SLIST_HEAD(, struct libxl__ddomain_device) devices; LIBXL_SLIST_ENTRY(struct libxl__ddomain_guest) next; } libxl__ddomain_guest; @@ -1530,8 +1530,7 @@ static void check_and_maybe_remove_guest(libxl__gc *gc, { assert(ddomain); -if (dguest != NULL && -dguest->num_vifs + dguest->num_vbds + dguest->num_qdisks == 0) { +if (dguest != NULL && LIBXL_SLIST_FIRST(>devices) == NULL) { LIBXL_SLIST_REMOVE(>guests, dguest, libxl__ddomain_guest, next); LOGD(DEBUG, dguest->domid, "Removed domain from the list of active guests"); @@ -1573,9 +1572,6 @@ static int add_device(libxl__egc *egc, libxl__ao *ao, switch(dev->backend_kind) { case LIBXL__DEVICE_KIND_VBD: case LIBXL__DEVICE_KIND_VIF: -if (dev->backend_kind == LIBXL__DEVICE_KIND_VBD) dguest->num_vbds++; -if (dev->backend_kind == LIBXL__DEVICE_KIND_VIF) dguest->num_vifs++; - GCNEW(aodev); libxl__prepare_ao_device(ao, aodev); /* @@ -1619,11 +1615,11 @@ static int remove_device(libxl__egc *egc, libxl__ao *ao, int rc = 0; switch(ddev->dev->backend_kind) { +case LIBXL__DEVICE_KIND_VDISPL: +case LIBXL__DEVICE_KIND_VSND: +case LIBXL__DEVICE_KIND_VINPUT: case LIBXL__DEVICE_KIND_VBD: case LIBXL__DEVICE_KIND_VIF: -if (dev->backend_kind == LIBXL__DEVICE_KIND_VBD) dguest->num_vbds--; -if (dev->backend_kind == LIBXL__DEVICE_KIND_VIF) dguest->num_vifs--; - GCNEW(aodev); libxl__prepare_ao_device(ao, aodev); /* -- 2.17.1 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH v1 1/2] libxl: introduce new backend type VINPUT
From: Oleksandr Grytsov There are two kind of VKBD devices: with QEMU backend and user space backend. In current implementation they can't be distinguished as both use VKBD backend type. As result, user space KBD backend is started and stopped as QEMU backend. This commit adds new device kind VINPUT to be used as backend type for user space KBD backend. Signed-off-by: Oleksandr Grytsov --- tools/libxl/libxl_types_internal.idl | 1 + tools/libxl/libxl_vkb.c | 29 ++-- 2 files changed, 20 insertions(+), 10 deletions(-) diff --git a/tools/libxl/libxl_types_internal.idl b/tools/libxl/libxl_types_internal.idl index cb85c3b37f..3593e21dbb 100644 --- a/tools/libxl/libxl_types_internal.idl +++ b/tools/libxl/libxl_types_internal.idl @@ -31,6 +31,7 @@ libxl__device_kind = Enumeration("device_kind", [ (13, "VUART"), (14, "PVCALLS"), (15, "VSND"), +(16, "VINPUT"), ]) libxl__console_backend = Enumeration("console_backend", [ diff --git a/tools/libxl/libxl_vkb.c b/tools/libxl/libxl_vkb.c index 26376a7eef..4c44a813c1 100644 --- a/tools/libxl/libxl_vkb.c +++ b/tools/libxl/libxl_vkb.c @@ -38,9 +38,6 @@ static int libxl__set_xenstore_vkb(libxl__gc *gc, uint32_t domid, flexarray_t *back, flexarray_t *front, flexarray_t *ro_front) { -flexarray_append_pair(back, "backend-type", - (char *)libxl_vkb_backend_to_string(vkb->backend_type)); - if (vkb->unique_id) { flexarray_append_pair(back, XENKBD_FIELD_UNIQUE_ID, vkb->unique_id); } @@ -93,7 +90,8 @@ static int libxl__vkb_from_xenstore(libxl__gc *gc, const char *libxl_path, libxl_devid devid, libxl_device_vkb *vkb) { -const char *be_path, *be_type, *fe_path, *tmp; +const char *be_path, *fe_path, *tmp; +libxl__device dev; int rc; vkb->devid = devid; @@ -111,13 +109,11 @@ static int libxl__vkb_from_xenstore(libxl__gc *gc, const char *libxl_path, rc = libxl__backendpath_parse_domid(gc, be_path, >backend_domid); if (rc) goto out; -rc = libxl__xs_read_mandatory(gc, XBT_NULL, - GCSPRINTF("%s/backend-type", be_path), - _type); +rc = libxl__parse_backend_path(gc, be_path, ); if (rc) goto out; -rc = libxl_vkb_backend_from_string(be_type, >backend_type); -if (rc) goto out; +vkb->backend_type = dev.backend_kind == LIBXL__DEVICE_KIND_VINPUT ? +LIBXL_VKB_BACKEND_LINUX : LIBXL_VKB_BACKEND_QEMU; vkb->unique_id = xs_read(CTX->xsh, XBT_NULL, GCSPRINTF("%s/"XENKBD_FIELD_UNIQUE_ID, be_path), NULL); @@ -218,6 +214,20 @@ out: return rc; } +static int libxl__device_from_vkb(libxl__gc *gc, uint32_t domid, + libxl_device_vkb *type, libxl__device *device) +{ +device->backend_devid = type->devid; +device->backend_domid = type->backend_domid; +device->backend_kind= type->backend_type == LIBXL_VKB_BACKEND_LINUX ? + LIBXL__DEVICE_KIND_VINPUT : LIBXL__DEVICE_KIND_VKBD; +device->devid = type->devid; +device->domid = domid; +device->kind= LIBXL__DEVICE_KIND_VKBD; + +return 0; +} + int libxl_device_vkb_add(libxl_ctx *ctx, uint32_t domid, libxl_device_vkb *vkb, const libxl_asyncop_how *ao_how) { @@ -318,7 +328,6 @@ out: return rc; } -static LIBXL_DEFINE_DEVICE_FROM_TYPE(vkb) static LIBXL_DEFINE_UPDATE_DEVID(vkb) #define libxl__add_vkbs NULL -- 2.17.1 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH v1 0/2] Remove backend xen store entry on domain destroy
From: Oleksandr Grytsov We have added VKBD device with user space PV backend. But libxl can't differentiate domain kind for this device. As result, it performs QEMU procedure for adding and removing VKB device with user space backend. To fix this issue, new device kind VINPUT is introduced. It is used as backend kind in case of user space VKBD backend. Another issue addresses in this patch series is error timeout on guest domain destroy in case using user space PV backends. We have a driver domain with user space backends (VDISPL, VSND, VKBD). When a guest domain destroyed, we see following error: "timed out while waiting for ... to be removed" xl expects that PV device backend entries is removed. xl devd removes these entries for specific devices only: VBD, VIF, QDISK and then deletes the guest domain from the list. The fix is to delete guest domain only after all devices are removed and performs generic device remove procedure for following device: VINPUT, VDISPL, VSND. Oleksandr Grytsov (2): libxl: introduce new backend type VINPUT libxl: add removing XS backend path for PV devices on domain destroy tools/libxl/libxl_device.c | 14 +- tools/libxl/libxl_types_internal.idl | 1 + tools/libxl/libxl_vkb.c | 29 ++-- 3 files changed, 25 insertions(+), 19 deletions(-) -- 2.17.1 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [linux-4.9 test] 142412: regressions - FAIL
flight 142412 linux-4.9 real [real] http://logs.test-lab.xenproject.org/osstest/logs/142412/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-amd64-xl-qemuu-debianhvm-i386-xsm 7 xen-boot fail REGR. vs. 142318 test-armhf-armhf-libvirt16 guest-start/debian.repeat fail REGR. vs. 142318 Tests which did not succeed, but are not blocking: test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stopfail like 142318 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 142318 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 142318 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 142318 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stopfail like 142318 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 10 debian-hvm-install fail never pass test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 10 debian-hvm-install fail never pass test-amd64-amd64-xl-pvhv2-amd 12 guest-start fail never pass test-amd64-amd64-xl-pvhv2-intel 12 guest-start fail never pass test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail never pass test-amd64-i386-xl-pvshim12 guest-start fail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-amd64-amd64-libvirt 13 migrate-support-checkfail never pass test-arm64-arm64-xl 13 migrate-support-checkfail never pass test-arm64-arm64-xl 14 saverestore-support-checkfail never pass test-arm64-arm64-xl-seattle 13 migrate-support-checkfail never pass test-arm64-arm64-xl-seattle 14 saverestore-support-checkfail never pass test-amd64-i386-libvirt-xsm 13 migrate-support-checkfail never pass test-arm64-arm64-xl-credit1 13 migrate-support-checkfail never pass test-arm64-arm64-xl-credit1 14 saverestore-support-checkfail never pass test-amd64-i386-libvirt 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2 fail never pass test-arm64-arm64-xl-credit2 13 migrate-support-checkfail never pass test-arm64-arm64-xl-credit2 14 saverestore-support-checkfail never pass test-arm64-arm64-xl-xsm 13 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 14 saverestore-support-checkfail never pass test-arm64-arm64-xl-thunderx 13 migrate-support-checkfail never pass test-arm64-arm64-xl-thunderx 14 saverestore-support-checkfail never pass test-armhf-armhf-xl 13 migrate-support-checkfail never pass test-armhf-armhf-xl 14 saverestore-support-checkfail never pass test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail never pass test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail never pass test-armhf-armhf-libvirt 13 migrate-support-checkfail never pass test-armhf-armhf-libvirt 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-arndale 13 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 14 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-vhd 12 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit1 13 migrate-support-checkfail never pass test-armhf-armhf-xl-credit1 14 saverestore-support-checkfail never pass test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stop fail never pass test-armhf-armhf-xl-rtds 13 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 13 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 14 saverestore-support-checkfail never pass test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop fail never pass test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail never pass test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop fail never pass test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass test-amd64-i386-xl-qemut-win10-i386 10 windows-install
Re: [Xen-devel] [[PATCH for-4.13]] xen/arm: mm: Allow generic xen page-tables helpers to be called early
(+ Juergen) Hi Stefano, On 10/8/19 1:18 AM, Stefano Stabellini wrote: On Mon, 7 Oct 2019, Julien Grall wrote: Hi, On 03/10/2019 02:02, Stefano Stabellini wrote: On Fri, 20 Sep 2019, Julien Grall wrote: That's not correct. alloc_boot_pages() is actually here to allow dynamic allocation before the memory subsystem (and therefore the runtime allocator) is initialized. Let me change the question then: is the system_state == SYS_STATE_early_boot check strictly necessary? It looks like it is not: the patch would work even if it was just: I had a few thoughts about it. On Arm32, this only really works for 32-bits machine address (it can go up to 40-bits). I haven't really fully investigated what could go wrong, but it would be best to keep it only for early boot. Also, I don't really want to rely on this "workaround" after boot. Maybe we would want to keep them unmapped in the future. Yes, no problems, we agree on that. I am not asking in regards to the check system_state == SYS_STATE_early_boot with the goal of asking you to get rid of it. I am fine with keeping the check. (Maybe we want to add an `unlikely()' around the check.) I am trying to understand whether the code actually relies on system_state == SYS_STATE_early_boot, and, if so, why. The goal is to make sure that if there are some limitations that they are documented, or just to double-check that there are no limitations. The check is not strictly necessary. In regards to your comment about only working for 32-bit addresses on Arm32, you have a point. At least we should be careful with the mfn to vaddr conversion because mfn_to_maddr returns a paddr_t which is 64-bit and vaddr_t is 32-bit. I imagine that theoretically, even with system_state == SYS_STATE_early_boot, it could get truncated with the wrong combination of mfn and phys_offset. If nothing else, maybe we should add a truncation check for safety? Except that phys_offset is not defined correctly, so your check below will break some setup :(. Let's take the following example: Xen is loaded at PA 0x10 The boot offset is computed using 32-bit address (see head.S): PA - VA = 0x10 - 0x20 = 0xfff0 This value will be passed to C code as an unsigned long. But then we will store it in a uint64_t/paddr_t (see phys_offset which is set in setup_page_tables). Because this is a conversion from unsigned to unsigned, the "sign bit" will not be propagated. This means that phys_offset will be equal to 0xfff0 and not 0xfff0! Therefore if we try to convert 0x10 (where Xen has been loaded) back to its VA, the resulting value will be 0x00200100. Looking at the code, I think pte_of_xenaddr() has also the exact problem. :( I guess nobody tried to load Xen that low in memory on Arm32? But that's going to be definitely an issues with the memory rework I have in mind. I have some other important work to finish for Xen 4.13. So I am thinking to defer the problem I mention above for post Xen 4.13. Although, the GRUB issues would still need to be fix. Any opinions? Note that this is also more reasons to limit the use of "MA - phys_offset". So the mess is kept to boot code. Something like the following but that ideally would be applicable to arm64 too without having to add an #ifdef: paddr_t pa = mfn_to_maddr(mfn) - phys_offset; if ( pa < _end && is_kernel((vaddr_t)pa) ) return (lpae_t *)(vaddr_t)pa; Cheers, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] Xen 4.12 panic on Thinkpad W540 with UEFI mutiboot2, efi=no-rs workarounds it
On Tue, Oct 08, 2019 at 03:08:29PM +0200, Jan Beulich wrote: > On 08.10.2019 13:50, Marek Marczykowski-Górecki wrote: > > On Thu, Aug 08, 2019 at 08:03:49AM +0200, Jan Beulich wrote: > >> On 08.08.2019 04:53, Marek Marczykowski-Górecki wrote: > >>> On Wed, Aug 07, 2019 at 09:26:00PM +0200, Marek Marczykowski-Górecki > >>> wrote: > Ok, regardless of adding proper option for that, I've hardcoded map_bs=1 > and it still crashes, just slightly differently: > > Xen call trace: > [<0080>] 0080 > [<8c2b0398edaa>] 8c2b0398edaa > > Pagetable walk from 858483a1: > L4[0x1ff] = > > > Panic on CPU 0: > FATAL PAGE FAULT > [error_code=0002] > Faulting linear address: 858483a1 > > > Full message attached. > >>> > >>> After playing more with it and also know workarounds for various EFI > >>> issues, I've found a way to boot it: avoid calling Exit BootServices. > >>> There was a patch from Konrad adding /noexit option, that never get > >>> committed. Similar to efi=mapbs option, I'd add efi=no-exitboot too > >>> (once efi=mapbs patch is accepted). > >>> > >>> Anyway, I'm curious what exactly is wrong here. Is it that the firmware > >>> is not happy about lack of SetVirtualAddressMap call? FWIW, the crash is > >>> during GetVariable RS call. I've verified that the function itself is > >>> within EfiRuntimeServicesCode, but I don't feel like tracing Lenovo > >>> UEFI... > >> > >> This suggests that the firmware zaps a few too many pointers > >> during ExitBootServices(). Perhaps internally they check > >> whether pointers point into BootServices* memory, and hence the > >> wrong marking in the memory map has consequences beyond the OS > >> re-using such memory? > >> > >> A proper answer to your question can of course only be given > >> by someone knowing this specific firmware version. > > > > I explored it a bit more and talked with a few people doing firmware > > development and few conclusions: > > 1. Not calling SetVirtualAddressMap(), while technically legal, is > > pretty uncommon and not recommended if you want to avoid less tested > > (aka buggy) UEFI code paths. > > 2. Every UEFI call before SetVirtualAddressMap() call should be done > > with flat physical memory. This include SetVirtualAddressMap() call > > itself. Implicitly this means such calls can legally access memory areas > > not marked with EFI_MEMORY_RUNTIME. > > I don't think this is quite right - whether non-runtime memory may > be touched depends exclusively on ExitBootServices() (not) having > got called (yet). That would be logical. In practice however we have evidences firmware vendors have different opinion... A comment from Linux (already quoted here 2 months ago): /* * The UEFI specification makes it clear that the operating system is * free to do whatever it wants with boot services code after * ExitBootServices() has been called. Ignoring this recommendation a * significant bunch of EFI implementations continue calling into boot * services code (SetVirtualAddressMap). In order to work around such * buggy implementations we reserve boot services region during EFI * init and make sure it stays executable. Then, after * SetVirtualAddressMap(), it is discarded. * * However, some boot services regions contain data that is required * by drivers, so we need to track which memory ranges can never be * freed. This is done by tagging those regions with the * EFI_MEMORY_RUNTIME attribute. * * Any driver that wants to mark a region as reserved must use * efi_mem_reserve() which will insert a new EFI memory descriptor * into efi.memmap (splitting existing regions if necessary) and tag * it with EFI_MEMORY_RUNTIME. */ Regardless of SetVirtualAddressMap() discussion, I propose to automatically map boot services code/data, to make Xen work on more machines (even if _we_ consider those buggy). > > Then I've tried a different approach: call SetVirtualAddressMap(), but > > with an address map that tries to pretend physical addressing (the code > > under #ifndef USE_SET_VIRTUAL_ADDRESS_MAP). This mostly worked, I needed > > only few changes: > > - set VirtualStart back to PhysicalStart in that memory map (it was set > >to directmap) > > - map boot services (at least for the SetVirtualAddressMap() call time, > >but haven't tried unmapping it later) > > - call SetVirtualAddressMap() with that "1:1" map in place, using > >efi_rs_enter/efi_rs_leave. > > > > This fixed the issue for me, now runtime services do work even without > > disabling ExitBootServices() call. And without any extra > > platform-specific command line arguments.
Re: [Xen-devel] [PATCH] pci: clear host_maskall field on assign
On 08.10.2019 15:14, Roger Pau Monné wrote: > On Tue, Oct 08, 2019 at 01:28:49PM +0200, Jan Beulich wrote: >> On 08.10.2019 13:09, Roger Pau Monné wrote: >>> Given that as you correctly point out maskall is unset after device >>> reset, I feel that option 4 is the best one since it matches the state >>> of the hardware after reset. >> >> Right, that's the variant coming closest to what hardware state >> ought to be at that point. We'd need to double check that the >> per-entry mask bits are all set at that point. > > I'm not saying such check is not worth doing, but why do it in this > case but not when also clearing the maskall (in msix_capability_init) > when called from prepare_msix? By "double check" I meant inspect the source, not to add checking logic. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [linux-linus bisection] complete test-amd64-i386-qemuu-rhel6hvm-amd
On 08.10.2019 15:06, osstest service owner wrote: > branch xen-unstable > xenbranch xen-unstable > job test-amd64-i386-qemuu-rhel6hvm-amd > testid xen-boot > > Tree: linux > git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git > Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git > Tree: ovmf git://xenbits.xen.org/osstest/ovmf.git > Tree: qemu git://xenbits.xen.org/qemu-xen-traditional.git > Tree: qemuu git://xenbits.xen.org/qemu-xen.git > Tree: seabios git://xenbits.xen.org/osstest/seabios.git > Tree: xen git://xenbits.xen.org/xen.git > > *** Found and reproduced problem changeset *** > > Bug is in tree: linux > git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git > Bug introduced: da0c9ea146cbe92b832f1b0f694840ea8eb33cce > Bug not present: 46713c3d2f8da5e3d8ddd2249bcb1d9974fb5d28 > Last fail repro: http://logs.test-lab.xenproject.org/osstest/logs/142439/ I did send a patch for this issue yesterday: https://lkml.org/lkml/2019/10/7/220 (can't seem to find it on patchwork). Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] pci: clear host_maskall field on assign
On Tue, Oct 08, 2019 at 01:28:49PM +0200, Jan Beulich wrote: > On 08.10.2019 13:09, Roger Pau Monné wrote: > > Given that as you correctly point out maskall is unset after device > > reset, I feel that option 4 is the best one since it matches the state > > of the hardware after reset. > > Right, that's the variant coming closest to what hardware state > ought to be at that point. We'd need to double check that the > per-entry mask bits are all set at that point. I'm not saying such check is not worth doing, but why do it in this case but not when also clearing the maskall (in msix_capability_init) when called from prepare_msix? Thanks, Roger. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 1/4] docs/sphinx: License content with CC-BY-4.0
On 03/10/2019, 21:56, "Andrew Cooper" wrote: Creative Commons is a more common license than GPL for documentation purposes. Switch to using CC-BY-4.0 to explicitly permit re-purposing and remixing of the content. Signed-off-by: Andrew Cooper Reviewed-by: Lars Kurth Also, I need to clarify that one of the statements I made yesterday is wrong CC-BY-4 is compatible with all versions with of the GPL, see https://www.gnu.org/licenses/license-list.en.html#OtherLicenses I mixed up CC-BY-4 with CC-BY-SA-4 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] Xen 4.12 panic on Thinkpad W540 with UEFI mutiboot2, efi=no-rs workarounds it
On 08.10.2019 13:50, Marek Marczykowski-Górecki wrote: > On Thu, Aug 08, 2019 at 08:03:49AM +0200, Jan Beulich wrote: >> On 08.08.2019 04:53, Marek Marczykowski-Górecki wrote: >>> On Wed, Aug 07, 2019 at 09:26:00PM +0200, Marek Marczykowski-Górecki wrote: Ok, regardless of adding proper option for that, I've hardcoded map_bs=1 and it still crashes, just slightly differently: Xen call trace: [<0080>] 0080 [<8c2b0398edaa>] 8c2b0398edaa Pagetable walk from 858483a1: L4[0x1ff] = Panic on CPU 0: FATAL PAGE FAULT [error_code=0002] Faulting linear address: 858483a1 Full message attached. >>> >>> After playing more with it and also know workarounds for various EFI >>> issues, I've found a way to boot it: avoid calling Exit BootServices. >>> There was a patch from Konrad adding /noexit option, that never get >>> committed. Similar to efi=mapbs option, I'd add efi=no-exitboot too >>> (once efi=mapbs patch is accepted). >>> >>> Anyway, I'm curious what exactly is wrong here. Is it that the firmware >>> is not happy about lack of SetVirtualAddressMap call? FWIW, the crash is >>> during GetVariable RS call. I've verified that the function itself is >>> within EfiRuntimeServicesCode, but I don't feel like tracing Lenovo >>> UEFI... >> >> This suggests that the firmware zaps a few too many pointers >> during ExitBootServices(). Perhaps internally they check >> whether pointers point into BootServices* memory, and hence the >> wrong marking in the memory map has consequences beyond the OS >> re-using such memory? >> >> A proper answer to your question can of course only be given >> by someone knowing this specific firmware version. > > I explored it a bit more and talked with a few people doing firmware > development and few conclusions: > 1. Not calling SetVirtualAddressMap(), while technically legal, is > pretty uncommon and not recommended if you want to avoid less tested > (aka buggy) UEFI code paths. > 2. Every UEFI call before SetVirtualAddressMap() call should be done > with flat physical memory. This include SetVirtualAddressMap() call > itself. Implicitly this means such calls can legally access memory areas > not marked with EFI_MEMORY_RUNTIME. I don't think this is quite right - whether non-runtime memory may be touched depends exclusively on ExitBootServices() (not) having got called (yet). > Then I've tried a different approach: call SetVirtualAddressMap(), but > with an address map that tries to pretend physical addressing (the code > under #ifndef USE_SET_VIRTUAL_ADDRESS_MAP). This mostly worked, I needed > only few changes: > - set VirtualStart back to PhysicalStart in that memory map (it was set >to directmap) > - map boot services (at least for the SetVirtualAddressMap() call time, >but haven't tried unmapping it later) > - call SetVirtualAddressMap() with that "1:1" map in place, using >efi_rs_enter/efi_rs_leave. > > This fixed the issue for me, now runtime services do work even without > disabling ExitBootServices() call. And without any extra > platform-specific command line arguments. And I think it also shouldn't break > kexec, since it uses 1:1-like map, but I haven't tried. One should > simply ignore EFI_UNSUPPORTED return code (I don't know how to avoid the > call at all after kexec). That's the point - it can't be avoided, and hence it failing is not an option. Or else there needs to be a protocol telling kexec what it is to expect, and that it in particular can't change the virtual address map to its liking. Back at the time when I put together the EFI booting code, no such protocol existed, and hence calling SetVirtualAddressMap() was not an option. I have no idea whether things have changed in the meantime. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [linux-linus bisection] complete test-amd64-i386-qemuu-rhel6hvm-amd
branch xen-unstable xenbranch xen-unstable job test-amd64-i386-qemuu-rhel6hvm-amd testid xen-boot Tree: linux git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git Tree: ovmf git://xenbits.xen.org/osstest/ovmf.git Tree: qemu git://xenbits.xen.org/qemu-xen-traditional.git Tree: qemuu git://xenbits.xen.org/qemu-xen.git Tree: seabios git://xenbits.xen.org/osstest/seabios.git Tree: xen git://xenbits.xen.org/xen.git *** Found and reproduced problem changeset *** Bug is in tree: linux git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git Bug introduced: da0c9ea146cbe92b832f1b0f694840ea8eb33cce Bug not present: 46713c3d2f8da5e3d8ddd2249bcb1d9974fb5d28 Last fail repro: http://logs.test-lab.xenproject.org/osstest/logs/142439/ (Revision log too long, omitted.) For bisection revision-tuple graph see: http://logs.test-lab.xenproject.org/osstest/results/bisect/linux-linus/test-amd64-i386-qemuu-rhel6hvm-amd.xen-boot.html Revision IDs in each graph node refer, respectively, to the Trees above. Running cs-bisection-step --graph-out=/home/logs/results/bisect/linux-linus/test-amd64-i386-qemuu-rhel6hvm-amd.xen-boot --summary-out=tmp/142439.bisection-summary --basis-template=133580 --blessings=real,real-bisect linux-linus test-amd64-i386-qemuu-rhel6hvm-amd xen-boot Searching for failure / basis pass: 142398 fail [host=rimava1] / 138849 [host=pinot1] 138813 ok. Failure / basis pass flights: 142398 / 138813 (tree with no url: minios) Tree: linux git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git Tree: ovmf git://xenbits.xen.org/osstest/ovmf.git Tree: qemu git://xenbits.xen.org/qemu-xen-traditional.git Tree: qemuu git://xenbits.xen.org/qemu-xen.git Tree: seabios git://xenbits.xen.org/osstest/seabios.git Tree: xen git://xenbits.xen.org/xen.git Latest da0c9ea146cbe92b832f1b0f694840ea8eb33cce c530a75c1e6a472b0eb9558310b518f0dfcd8860 d19040804afb2bdd60f18e8aef7da78028575fe6 d0d8ad39ecb51cd7497cd524484fe09f50876798 933ebad2470a169504799a1d95b8e410bd9847ef 43f5df79dad6738d52ea79d072de2b56eb96a91f f93abf0315efef861270c25d83c8047fd6a54ec4 Basis pass 46713c3d2f8da5e3d8ddd2249bcb1d9974fb5d28 c530a75c1e6a472b0eb9558310b518f0dfcd8860 d031fc07eb83c9d13bff3ebac25da458d5a47917 d0d8ad39ecb51cd7497cd524484fe09f50876798 9cca02d8ffc23e9688a971d858e4ffdff5389b11 30f1e41f04fb4c715d27f987f003cfc31c9ff4f3 93ef224d63f9f04a0897d64981c619eb4816c0d3 Generating revisions with ./adhoc-revtuple-generator git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git#46713c3d2f8da5e3d8ddd2249bcb1d9974fb5d28-da0c9ea146cbe92b832f1b0f694840ea8eb33cce git://xenbits.xen.org/osstest/linux-firmware.git#c530a75c1e6a472b0eb9558310b518f0dfcd8860-c530a75c1e6a472b0eb9558310b518f0dfcd8860 git://xenbits.xen.org/osstest/ovmf.git#d031fc07eb83c9d13bff3ebac25da458d5a47917-d19040804afb2bdd60f18e8aef7da78028575fe6 git://xenbits.xen.org/qemu-xen-traditional.\ git#d0d8ad39ecb51cd7497cd524484fe09f50876798-d0d8ad39ecb51cd7497cd524484fe09f50876798 git://xenbits.xen.org/qemu-xen.git#9cca02d8ffc23e9688a971d858e4ffdff5389b11-933ebad2470a169504799a1d95b8e410bd9847ef git://xenbits.xen.org/osstest/seabios.git#30f1e41f04fb4c715d27f987f003cfc31c9ff4f3-43f5df79dad6738d52ea79d072de2b56eb96a91f git://xenbits.xen.org/xen.git#93ef224d63f9f04a0897d64981c619eb4816c0d3-f93abf0315efef861270c25d83c8047fd6a54ec4 adhoc-revtuple-generator: tree discontiguous: linux-2.6 adhoc-revtuple-generator: tree discontiguous: qemu-xen Loaded 3003 nodes in revision graph Searching for test results: 138780 [host=pinot1] 138813 pass 46713c3d2f8da5e3d8ddd2249bcb1d9974fb5d28 c530a75c1e6a472b0eb9558310b518f0dfcd8860 d031fc07eb83c9d13bff3ebac25da458d5a47917 d0d8ad39ecb51cd7497cd524484fe09f50876798 9cca02d8ffc23e9688a971d858e4ffdff5389b11 30f1e41f04fb4c715d27f987f003cfc31c9ff4f3 93ef224d63f9f04a0897d64981c619eb4816c0d3 138849 [host=pinot1] 138878 fail irrelevant 138902 fail irrelevant 138962 fail irrelevant 139003 fail irrelevant 139068 fail irrelevant 139134 fail irrelevant 139237 fail irrelevant 139223 fail irrelevant 139257 fail irrelevant 139324 fail irrelevant 139306 fail irrelevant 139286 fail irrelevant 139338 fail irrelevant 139361 fail irrelevant 139383 fail irrelevant 139408 fail irrelevant 139478 fail irrelevant 139532 fail irrelevant 139584 fail irrelevant 139555 fail irrelevant 139687 fail irrelevant 139710 pass irrelevant 139676 pass 46713c3d2f8da5e3d8ddd2249bcb1d9974fb5d28 c530a75c1e6a472b0eb9558310b518f0dfcd8860 cf2d8d4978e87a80d23ecd40ef7b49f27199a13c d0d8ad39ecb51cd7497cd524484fe09f50876798 9cca02d8ffc23e9688a971d858e4ffdff5389b11 30f1e41f04fb4c715d27f987f003cfc31c9ff4f3 0c38c61aad2106e23a5fcab7e435671fb39dc44c 139616 fail irrelevant 139686 pass irrelevant 139679 pass irrelevant
Re: [Xen-devel] [PATCH v2 3/3] xen/arm: fix duplicate memory node in DT
Hi, On 10/8/19 2:15 AM, Stefano Stabellini wrote: When reserved-memory regions are present in the host device tree, dom0 is started with multiple memory nodes. Each memory node should have a unique name, but today they are all called "memory" leading to Linux printing the following warning at boot: OF: Duplicate name in base, renamed to "memory#1" This patch fixes the problem by appending a "@" to the name, as per the Device Tree specification, where matches the base of address of the first region. Fixes: 248faa637d2 (xen/arm: add reserved-memory regions to the dom0 memory node) Reported-by: Oleksandr Tyshchenko Signed-off-by: Stefano Stabellini Acked-by: Julien Grall Cheers, --- Changes in v2: - fix buf size calculation: the number is 64bit and printed as hexadecimal - move check on nr_banks to a separate patch --- xen/arch/arm/domain_build.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c index ea01aada0b..3de4dafaed 100644 --- a/xen/arch/arm/domain_build.c +++ b/xen/arch/arm/domain_build.c @@ -646,6 +646,8 @@ static int __init make_memory_node(const struct domain *d, int res, i; int reg_size = addrcells + sizecells; int nr_cells = reg_size * mem->nr_banks; +/* Placeholder for memory@ + a 64-bit number + \0 */ +char buf[24]; __be32 reg[NR_MEM_BANKS * 4 /* Worst case addrcells + sizecells */]; __be32 *cells; @@ -657,7 +659,8 @@ static int __init make_memory_node(const struct domain *d, reg_size, nr_cells); /* ePAPR 3.4 */ -res = fdt_begin_node(fdt, "memory"); +snprintf(buf, sizeof(buf), "memory@%"PRIx64, mem->bank[0].start); +res = fdt_begin_node(fdt, buf); if ( res ) return res; -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2 2/3] xen/arm: make_memory_node return error on nr_banks == 0
On 10/8/19 2:15 AM, Stefano Stabellini wrote: Call make_memory_node for reserved_memory only if we actually have any reserved_memory regions to handle. Add a check in make_memory_node to return an error if it has been called with no memory banks as argument. Fixes: 248faa637d2 (xen/arm: add reserved-memory regions to the dom0 memory node) Signed-off-by: Stefano Stabellini Acked-by: Julien Grall Cheers, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [linux-4.14 test] 142410: tolerable FAIL - PUSHED
flight 142410 linux-4.14 real [real] http://logs.test-lab.xenproject.org/osstest/logs/142410/ Failures :-/ but no regressions. Tests which did not succeed, but are not blocking: test-amd64-i386-xl-pvshim12 guest-start fail never pass test-amd64-i386-libvirt 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt 13 migrate-support-checkfail never pass test-amd64-i386-libvirt-xsm 13 migrate-support-checkfail never pass test-arm64-arm64-xl-seattle 13 migrate-support-checkfail never pass test-arm64-arm64-xl-seattle 14 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail never pass test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2 fail never pass test-arm64-arm64-xl-thunderx 13 migrate-support-checkfail never pass test-arm64-arm64-xl-thunderx 14 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit1 13 migrate-support-checkfail never pass test-arm64-arm64-xl-credit1 14 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit2 13 migrate-support-checkfail never pass test-arm64-arm64-xl-credit2 14 saverestore-support-checkfail never pass test-arm64-arm64-xl 13 migrate-support-checkfail never pass test-arm64-arm64-xl 14 saverestore-support-checkfail never pass test-arm64-arm64-xl-xsm 13 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 14 saverestore-support-checkfail never pass test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail never pass test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-arndale 13 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 14 saverestore-support-checkfail never pass test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail never pass test-armhf-armhf-xl-vhd 12 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 13 saverestore-support-checkfail never pass test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop fail never pass test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail never pass test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail never pass test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail never pass test-armhf-armhf-libvirt 13 migrate-support-checkfail never pass test-armhf-armhf-libvirt 14 saverestore-support-checkfail never pass test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop fail never pass test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-rtds 13 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit1 13 migrate-support-checkfail never pass test-armhf-armhf-xl-credit1 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 13 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 14 saverestore-support-checkfail never pass test-armhf-armhf-xl 13 migrate-support-checkfail never pass test-armhf-armhf-xl 14 saverestore-support-checkfail never pass test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop fail never pass test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stop fail never pass test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail never pass test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop fail never pass test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stop fail never pass test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass test-amd64-i386-xl-qemut-win10-i386 10 windows-install fail never pass test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass test-amd64-amd64-xl-qemut-win10-i386 10 windows-installfail never pass version targeted for testing: linux42327896f194f256e5a361e0069985bc8d209b42 baseline version: linuxdb1892238c55c5138801f131a837ccd0056f002e Last test of basis 142320 2019-10-05 11:10:44 Z3 days Testing same since 142410 2019-10-07 17:11:08 Z0 days1 attempts People who
Re: [Xen-devel] [PATCH 3/4] docs/sphinx: Introduction
On 03/10/2019, 21:59, "Andrew Cooper" wrote: Put together an introduction page for the Sphinx/RST docs, along with a glossary which will accumulate over time. Signed-off-by: Andrew Cooper Reviewed-by: Lars Kurth There were a few minor improvements which could be made, I am listing these below, but none are show-stoppers. +Xen is an open source, bare metal hypervisor. It runs as the most privileged +piece of software, and shares the resources of the hardware between virtual Maybe better: s/software/software on the system/ or s/software/software on the host/ +machines. + hardware domain + A :term:`domain`, commonly dom0, which shares responsibility with Xen + about the system as a whole. + + By default it gets all devices, including all disks and network cards, so + is responsible for multiplexing guest I/O This is a little unclear: in particular the 1st paragraph. Earlier you talk about hardware domain="responsible for hardware and marshalling guest I/O", which is clearer. Maybe: A :term:`domain`, commonly dom0, which hosts all devices, including disks and network cards and is responsible for multiplexing guest I/O is better Regards Lars ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 2/4] docs/sphinx: Indent cleanup
On 03/10/2019, 21:56, "Andrew Cooper" wrote: Sphinx, its linters, and RST modes in common editors, expect 3 spaces of indentation. Some bits already conform to this expectation. Update the rest to match. Signed-off-by: Andrew Cooper Reviewed-by: Lars Kurth ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] Xen 4.12 panic on Thinkpad W540 with UEFI mutiboot2, efi=no-rs workarounds it
On Thu, Aug 08, 2019 at 08:03:49AM +0200, Jan Beulich wrote: > On 08.08.2019 04:53, Marek Marczykowski-Górecki wrote: > > On Wed, Aug 07, 2019 at 09:26:00PM +0200, Marek Marczykowski-Górecki wrote: > > > Ok, regardless of adding proper option for that, I've hardcoded map_bs=1 > > > and it still crashes, just slightly differently: > > > > > > Xen call trace: > > > [<0080>] 0080 > > > [<8c2b0398edaa>] 8c2b0398edaa > > > > > > Pagetable walk from 858483a1: > > > L4[0x1ff] = > > > > > > > > > Panic on CPU 0: > > > FATAL PAGE FAULT > > > [error_code=0002] > > > Faulting linear address: 858483a1 > > > > > > > > > Full message attached. > > > > After playing more with it and also know workarounds for various EFI > > issues, I've found a way to boot it: avoid calling Exit BootServices. > > There was a patch from Konrad adding /noexit option, that never get > > committed. Similar to efi=mapbs option, I'd add efi=no-exitboot too > > (once efi=mapbs patch is accepted). > > > > Anyway, I'm curious what exactly is wrong here. Is it that the firmware > > is not happy about lack of SetVirtualAddressMap call? FWIW, the crash is > > during GetVariable RS call. I've verified that the function itself is > > within EfiRuntimeServicesCode, but I don't feel like tracing Lenovo > > UEFI... > > This suggests that the firmware zaps a few too many pointers > during ExitBootServices(). Perhaps internally they check > whether pointers point into BootServices* memory, and hence the > wrong marking in the memory map has consequences beyond the OS > re-using such memory? > > A proper answer to your question can of course only be given > by someone knowing this specific firmware version. I explored it a bit more and talked with a few people doing firmware development and few conclusions: 1. Not calling SetVirtualAddressMap(), while technically legal, is pretty uncommon and not recommended if you want to avoid less tested (aka buggy) UEFI code paths. 2. Every UEFI call before SetVirtualAddressMap() call should be done with flat physical memory. This include SetVirtualAddressMap() call itself. Implicitly this means such calls can legally access memory areas not marked with EFI_MEMORY_RUNTIME. For the second point, relevant part of UEFI 2.7 spec (Runtime Services - Virtual Memory Services chapter): This section contains function definitions for the virtual memory support that may be optionally used by an operating system at runtime. If an operating system chooses to make EFI runtime service calls in a virtual addressing mode instead of the flat physical mode, then the operating system must use the services in this section to switch the EFI runtime services from flat physical addressing to virtual addressing. (...) The call to SetVirtualAddressMap() must be done with the physical mappings. On successful return from this function, the system must then make any future calls with the newly assigned virtual mappings. All address space mappings must be done in accordance to the cacheability flags as specified in the original address map. I've tried to poke around this part of Xen code, including resurrecting SetVirtualAddressMap() (#define USE_SET_VIRTUAL_ADDRESS_MAP in common/efi/boot.c) and (unsurprisingly) hit multiple issues: - at this point of time, Xen is already relocated and paging is enabled - SetVirtualAddressMap() is indeed not happy about being called with new address map in place already - directmap - at which that code points - is mapped with NX, which breaks EfiRuntimeServicesCode area Then I've tried a different approach: call SetVirtualAddressMap(), but with an address map that tries to pretend physical addressing (the code under #ifndef USE_SET_VIRTUAL_ADDRESS_MAP). This mostly worked, I needed only few changes: - set VirtualStart back to PhysicalStart in that memory map (it was set to directmap) - map boot services (at least for the SetVirtualAddressMap() call time, but haven't tried unmapping it later) - call SetVirtualAddressMap() with that "1:1" map in place, using efi_rs_enter/efi_rs_leave. This fixed the issue for me, now runtime services do work even without disabling ExitBootServices() call. And without any extra platform-specific command line arguments. And I think it also shouldn't break kexec, since it uses 1:1-like map, but I haven't tried. One should simply ignore EFI_UNSUPPORTED return code (I don't know how to avoid the call at all after kexec). Any thoughts? If the above sounds good, I'll cleanup the patch and submit it. BTW Does it qualify for 4.13? On one hand it may be seen as a bugfix (fix booting on some UEFI firmwares), but on the other hand, I can't think of all the side effects. -- Best Regards,
Re: [Xen-devel] [xen-unstable test] 142383: regressions - FAIL
On 08.10.2019 13:15, Roger Pau Monné wrote: > On Tue, Oct 08, 2019 at 12:42:25PM +0200, Jürgen Groß wrote: >> On 07.10.19 22:56, osstest service owner wrote: >>> flight 142383 xen-unstable real [real] >>> http://logs.test-lab.xenproject.org/osstest/logs/142383/ >>> >>> Regressions :-( >>> >>> Tests which did not succeed and are blocking, >>> including tests which could not be run: >>> test-amd64-amd64-libvirt-pair 22 guest-migrate/src_host/dst_host fail >>> REGR. vs. 141822 >>> test-amd64-i386-libvirt-pair 22 guest-migrate/src_host/dst_host fail >>> REGR. vs. 141822 >> >> Hmm, test log says the guest didn't suspend. >> >> Could that be related to commit b183e180bce9303 ? > > The libvirt libxl daemon spits: > > 2019-10-07 12:31:09.953+: libxl-save-helper: starting save: Success > 2019-10-07 12:31:09.953+: xc: fd 40, dom 1, flags 1, hvm 0 > 2019-10-07 12:31:09.954+: xc: Saving domain 1, type x86 PV > 2019-10-07 12:31:09.954+: xc: 64 bits, 4 levels > 2019-10-07 12:31:09.959+: xc: max_mfn 0x28 > 2019-10-07 12:31:09.959+: xc: p2m list from 0xc900 to > 0xc90f, root at 0x27de0a > 2019-10-07 12:31:09.959+: xc: max_pfn 0x1, p2m_frames 256 > 2019-10-07 12:31:09.960+: xc: Failed to enable logdirty: 22,0,22 (22 = > Invalid argument): Internal error > 2019-10-07 12:31:09.960+: xc: Save failed (22 = Invalid argument): > Internal error > 2019-10-07 12:31:09.981+: libxl-save-helper: complete r=-1: Invalid > argument > 2019-10-07 12:31:09.983+: libxl: libxl.c:752:libxl__fd_flags_restore: > fnctl F_SETFL of fd 40 to 0x2 > 2019-10-07 12:31:09.983+: libxl: libxl_event.c:1873:libxl__ao_complete: > ao 0x7f760c002110: complete, rc=-8 > 2019-10-07 12:31:09.983+: libxl: libxl_event.c:1842:libxl__ao__destroy: > ao 0x7f760c002110: destroy > > Which seem to be related to the recent iommu changes, I would say it's > likely hitting the is_iommu_enabled check in paging_log_dirty_enable > and hence returning EINVAL. > > Adding Paul and Jan who worked on the series. Well, as mentioned in reply to Paul's patch earlier today, I intend to commit the agreed upon half of it later today. Suggesting this was actually a result of me noticing this log-dirty enabling failure when I looked at the logs (another time) in the morning. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] pci: clear host_maskall field on assign
On 08.10.2019 13:09, Roger Pau Monné wrote: > On Tue, Oct 08, 2019 at 11:42:23AM +0200, Jan Beulich wrote: >> On 08.10.2019 11:23, Roger Pau Monné wrote: >>> On Wed, Oct 02, 2019 at 03:33:43PM +0200, Jan Beulich wrote: On 02.10.2019 12:49, Roger Pau Monne wrote: > The current implementation of host_maskall makes it sticky across > assign and deassign calls, which means that once a guest forces Xen to > set host_maskall the maskall bit is not going to be cleared until a > call to PHYSDEVOP_prepare_msix is performed. Such call however > shouldn't be part of the normal flow when doing PCI passthrough, and > hence the flag needs to be cleared when assigning in order to prevent > host_maskall being carried over from previous assignations. > > Note that other mask fields, like guest_masked or the entry maskbit > are already reset when the msix capability is initialized. I take it you mean a guest action here, as PHYSDEVOP_prepare_msix is specifically about setting up the actual hardware's one? >>> >>> Right, or any path that calls into msix_capability_init (ie: QEMU >>> requesting to map a PIRQ to an MSIX entry will also call into >>> msix_capability_init). >>> This happens quite a bit later though, i.e. ->guest_maskall may need explicitly setting at the same time as you clear ->host_maskall. Furthermore ... > --- a/xen/drivers/passthrough/pci.c > +++ b/xen/drivers/passthrough/pci.c > @@ -1504,7 +1504,10 @@ static int assign_device(struct domain *d, u16 > seg, u8 bus, u8 devfn, u32 flag) > } > > if ( pdev->msix ) > +{ > msixtbl_init(d); > +pdev->msix->host_maskall = false; > +} ... doing just this would violate an assumed property: It ought to be fine to assert at every entry or exit point that the physical maskall bit of an MSI-X-enabled device is the logical OR of ->host_maskall and ->guest_maskall. >>> >>> Is this still valid at this point, even without my patch? >>> >>> The hardware domain should have performed a reset of the device, so >>> the state of the maskall hardware bit should be true, regardless of >>> the previous state of either the guest_maskall or the host_maskall >>> bits. >> >> But a reset _clears_ the hardware maskall bit (alongside the >> enable one). > > Right you are, I was confusing the reset state of the maskall bit and > the per-entry mask bit. > I.e. I see the following options: 1) your variant accompanied by updating of the hardware bit, 2) pdev->msix->guest_maskall = pdev->msix->host_maskall; pdev->msix->host_maskall = false; leaving the hardware bit alone, as the above transformation wouldn't change what it's supposed to be set to, 3) pdev->msix->guest_maskall = true; pdev->msix->host_maskall = false; alongside setting the bit in hardware (if not already set), >>> >>> That seems like the best option IMO, since it's the reset state of the >>> device, and should be the expected one when assigning a device to a >>> guest. >> >> As per above - no, it's not. We mask interrupts in hardware >> (through individual mask bits iirc) because pci_prepare_msix() >> gets invoked by Dom0 ahead of giving the device to the guest, >> which involves setting the enable bit (and hence unmasked >> interrupts could trigger). > > I'm afraid I'm a little bit lost. So if pci_prepare_msix gets called > before assigning the device it means there's a call to > PHYSDEVOP_prepare_msix done by dom0? This happens when pciback takes control of the device, not before every individual assignment (which is why ->host_maskall may remain set in the first place). > I somehow assumed that would only happen when dom0 is actually using > the device, but not as part of a normal flow when the device is just > being assigned to guests without the dom0 using it at all. > > Is there some documentation about what dom0 is expected to perform > when assigning and deassigning devices to guests? I'm afraid there isn't. > Given that as you correctly point out maskall is unset after device > reset, I feel that option 4 is the best one since it matches the state > of the hardware after reset. Right, that's the variant coming closest to what hardware state ought to be at that point. We'd need to double check that the per-entry mask bits are all set at that point. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [qemu-mainline bisection] complete test-amd64-amd64-xl-qemuu-win7-amd64
branch xen-unstable xenbranch xen-unstable job test-amd64-amd64-xl-qemuu-win7-amd64 testid windows-install Tree: linux git://xenbits.xen.org/linux-pvops.git Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git Tree: ovmf git://xenbits.xen.org/osstest/ovmf.git Tree: qemu git://xenbits.xen.org/qemu-xen-traditional.git Tree: qemuu git://git.qemu.org/qemu.git Tree: seabios git://xenbits.xen.org/osstest/seabios.git Tree: xen git://xenbits.xen.org/xen.git *** Found and reproduced problem changeset *** Bug is in tree: qemuu git://git.qemu.org/qemu.git Bug introduced: 4f59102571fce49af180cfc6d4cdd2b5df7bdb14 Bug not present: a578cdfbdd8f9beff5ced52b7826ddb1669abbbf Last fail repro: http://logs.test-lab.xenproject.org/osstest/logs/142437/ (Revision log too long, omitted.) For bisection revision-tuple graph see: http://logs.test-lab.xenproject.org/osstest/results/bisect/qemu-mainline/test-amd64-amd64-xl-qemuu-win7-amd64.windows-install.html Revision IDs in each graph node refer, respectively, to the Trees above. Running cs-bisection-step --graph-out=/home/logs/results/bisect/qemu-mainline/test-amd64-amd64-xl-qemuu-win7-amd64.windows-install --summary-out=tmp/142437.bisection-summary --basis-template=140282 --blessings=real,real-bisect qemu-mainline test-amd64-amd64-xl-qemuu-win7-amd64 windows-install Searching for failure / basis pass: 142388 fail [host=godello0] / 141466 [host=godello1] 141434 [host=godello1] 141377 [host=godello1] 141348 [host=godello1] 141320 [host=godello1] 141285 [host=godello1] 141259 [host=godello1] 141243 [host=godello1] 141204 [host=godello1] 141179 [host=godello1] 141087 [host=godello1] 141058 [host=godello1] 141015 [host=godello1] 140989 [host=godello1] 140966 [host=godello1] 140932 [host=godello1] 140905 [host=godello1] 140862 [host=godello1] 140832 [host=godello1] 140739 [host=godello1] 140691 [h\ ost=godello1] 140657 [host=godello1] 140635 [host=godello1] 140610 [host=godello1] 140583 [host=godello1] 140542 [host=godello1] 140458 [host=godello1] 140409 [host=godello1] 140361 [host=godello1] 140282 [host=godello1] 140232 [host=godello1] 140196 [host=godello1] 140170 [host=godello1] 140148 [host=godello1] 140119 [host=godello1] 140012 [host=godello1] 139887 [host=godello1] 139862 [host=godello1] 139796 [host=godello1] 139766 [host=godello1] 139737 [host=godello1] 139678 [host=godello1] 139\ 659 [host=godello1] 139633 [host=godello1] 139594 [host=godello1] 139573 [host=godello1] 139548 [host=godello1] 139525 [host=godello1] 139502 [host=godello1] 139462 [host=godello1] 139429 [host=godello1] 139401 [host=godello1] 139374 [host=godello1] 139352 [host=godello1] 139335 [host=godello1] 139300 [host=godello1] 139277 [host=godello1] 139251 [host=godello1] 139230 [host=godello1] 139201 [host=godello1] 139183 [host=godello1] 139140 [host=godello1] 139075 [host=godello1] 139035 [host=godello\ 1] 139014 [host=godello1] 138977 [host=godello1] 138946 [host=godello1] 138911 [host=godello1] 138890 [host=godello1] 138857 [host=godello1] 138825 [host=godello1] 138799 [host=godello1] 138762 [host=godello1] 138740 [host=godello1] 138712 [host=godello1] 138686 [host=godello1] 137600 ok. Failure / basis pass flights: 142388 / 137600 (tree with no url: minios) Tree: linux git://xenbits.xen.org/linux-pvops.git Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git Tree: ovmf git://xenbits.xen.org/osstest/ovmf.git Tree: qemu git://xenbits.xen.org/qemu-xen-traditional.git Tree: qemuu git://git.qemu.org/qemu.git Tree: seabios git://xenbits.xen.org/osstest/seabios.git Tree: xen git://xenbits.xen.org/xen.git Latest db1892238c55c5138801f131a837ccd0056f002e c530a75c1e6a472b0eb9558310b518f0dfcd8860 d19040804afb2bdd60f18e8aef7da78028575fe6 d0d8ad39ecb51cd7497cd524484fe09f50876798 4f59102571fce49af180cfc6d4cdd2b5df7bdb14 43f5df79dad6738d52ea79d072de2b56eb96a91f f93abf0315efef861270c25d83c8047fd6a54ec4 Basis pass 8cb1239889087368a792c655de99529eec219bfc c530a75c1e6a472b0eb9558310b518f0dfcd8860 f0718d1d6b47745a4249f4006807a45f2245dba1 d0d8ad39ecb51cd7497cd524484fe09f50876798 a578cdfbdd8f9beff5ced52b7826ddb1669abbbf 0932c20560574696cf87ddd12623e8c423ee821b 844aa0a13d34e9a341a8374119d2ed67d4dcd6bb Generating revisions with ./adhoc-revtuple-generator git://xenbits.xen.org/linux-pvops.git#8cb1239889087368a792c655de99529eec219bfc-db1892238c55c5138801f131a837ccd0056f002e git://xenbits.xen.org/osstest/linux-firmware.git#c530a75c1e6a472b0eb9558310b518f0dfcd8860-c530a75c1e6a472b0eb9558310b518f0dfcd8860 git://xenbits.xen.org/osstest/ovmf.git#f0718d1d6b47745a4249f4006807a45f2245dba1-d19040804afb2bdd60f18e8aef7da78028575fe6 git://xenbits.xen.org/qemu-xen-traditional.git#d0d8ad39ecb51cd7497cd524484\ fe09f50876798-d0d8ad39ecb51cd7497cd524484fe09f50876798 git://git.qemu.org/qemu.git#a578cdfbdd8f9beff5ced52b7826ddb1669abbbf-4f59102571fce49af180cfc6d4cdd2b5df7bdb14
Re: [Xen-devel] [PATCH v2 1/3] xen/arm: fix buf size in make_cpus_node
Hi Stefano, On 10/8/19 2:14 AM, Stefano Stabellini wrote: The size of buf is calculated wrongly: the number is 64bit, not 32bit. While the variable mpdir_aff is 64-bit, we only write the first 32-bit in the property reg (#address-cells == 1 and fdt_property_cell()). So what needs to be modified is the format here. Also, looking the CPU bindings (see linux/Documentation/devicetree/bindings/arm/cpus.yaml), technically only the bits [23:0] of the mpidr should be used. The rest is zeroed. This is ok because vcpuid_to_vaffinity() is always returning a value following the requirements above. However, for correctness, this may want to be fixed. Also the number is printed as a hexadecimal number, so we need 8 bytes for 32bit, not 10 bytes. As a result, it should be sizeof("cpu@") + 16 bytes for a 64-bit number + 1 byte for \0. Total = 21. Fixes: fafd682c3e (xen/arm: Create a fake cpus node in dom0 device tree) I am afraid this is not fixing this patch: snprintf(buf, sizeof(buf), "cpu@%u", cpu); So the 10 bytes were actually correct back then. The problem was introduced by commit c81a791d34 "xen/arm: Set 'reg' of cpu node for dom0 to match MPIDR's affinity". Signed-off-by: Stefano Stabellini --- Changes in v2: - patch added --- xen/arch/arm/domain_build.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c index 921b054520..60923a7051 100644 --- a/xen/arch/arm/domain_build.c +++ b/xen/arch/arm/domain_build.c @@ -788,8 +788,8 @@ static int __init make_cpus_node(const struct domain *d, void *fdt) unsigned int cpu; const void *compatible = NULL; u32 len; -/* Placeholder for cpu@ + a 32-bit number + \0 */ -char buf[15]; +/* Placeholder for cpu@ + a 64-bit number + \0 */ +char buf[21]; u32 clock_frequency; bool clock_valid; uint64_t mpidr_aff; Cheers, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [xen-unstable test] 142383: regressions - FAIL
On Tue, Oct 08, 2019 at 12:42:25PM +0200, Jürgen Groß wrote: > On 07.10.19 22:56, osstest service owner wrote: > > flight 142383 xen-unstable real [real] > > http://logs.test-lab.xenproject.org/osstest/logs/142383/ > > > > Regressions :-( > > > > Tests which did not succeed and are blocking, > > including tests which could not be run: > > test-amd64-amd64-libvirt-pair 22 guest-migrate/src_host/dst_host fail > > REGR. vs. 141822 > > test-amd64-i386-libvirt-pair 22 guest-migrate/src_host/dst_host fail > > REGR. vs. 141822 > > Hmm, test log says the guest didn't suspend. > > Could that be related to commit b183e180bce9303 ? The libvirt libxl daemon spits: 2019-10-07 12:31:09.953+: libxl-save-helper: starting save: Success 2019-10-07 12:31:09.953+: xc: fd 40, dom 1, flags 1, hvm 0 2019-10-07 12:31:09.954+: xc: Saving domain 1, type x86 PV 2019-10-07 12:31:09.954+: xc: 64 bits, 4 levels 2019-10-07 12:31:09.959+: xc: max_mfn 0x28 2019-10-07 12:31:09.959+: xc: p2m list from 0xc900 to 0xc90f, root at 0x27de0a 2019-10-07 12:31:09.959+: xc: max_pfn 0x1, p2m_frames 256 2019-10-07 12:31:09.960+: xc: Failed to enable logdirty: 22,0,22 (22 = Invalid argument): Internal error 2019-10-07 12:31:09.960+: xc: Save failed (22 = Invalid argument): Internal error 2019-10-07 12:31:09.981+: libxl-save-helper: complete r=-1: Invalid argument 2019-10-07 12:31:09.983+: libxl: libxl.c:752:libxl__fd_flags_restore: fnctl F_SETFL of fd 40 to 0x2 2019-10-07 12:31:09.983+: libxl: libxl_event.c:1873:libxl__ao_complete: ao 0x7f760c002110: complete, rc=-8 2019-10-07 12:31:09.983+: libxl: libxl_event.c:1842:libxl__ao__destroy: ao 0x7f760c002110: destroy Which seem to be related to the recent iommu changes, I would say it's likely hitting the is_iommu_enabled check in paging_log_dirty_enable and hence returning EINVAL. Adding Paul and Jan who worked on the series. Roger. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] pci: clear host_maskall field on assign
On Tue, Oct 08, 2019 at 11:42:23AM +0200, Jan Beulich wrote: > On 08.10.2019 11:23, Roger Pau Monné wrote: > > On Wed, Oct 02, 2019 at 03:33:43PM +0200, Jan Beulich wrote: > >> On 02.10.2019 12:49, Roger Pau Monne wrote: > >>> The current implementation of host_maskall makes it sticky across > >>> assign and deassign calls, which means that once a guest forces Xen to > >>> set host_maskall the maskall bit is not going to be cleared until a > >>> call to PHYSDEVOP_prepare_msix is performed. Such call however > >>> shouldn't be part of the normal flow when doing PCI passthrough, and > >>> hence the flag needs to be cleared when assigning in order to prevent > >>> host_maskall being carried over from previous assignations. > >>> > >>> Note that other mask fields, like guest_masked or the entry maskbit > >>> are already reset when the msix capability is initialized. > >> > >> I take it you mean a guest action here, as PHYSDEVOP_prepare_msix is > >> specifically about setting up the actual hardware's one? > > > > Right, or any path that calls into msix_capability_init (ie: QEMU > > requesting to map a PIRQ to an MSIX entry will also call into > > msix_capability_init). > > > >> This happens > >> quite a bit later though, i.e. ->guest_maskall may need explicitly > >> setting at the same time as you clear ->host_maskall. Furthermore ... > >>> --- a/xen/drivers/passthrough/pci.c > >>> +++ b/xen/drivers/passthrough/pci.c > >>> @@ -1504,7 +1504,10 @@ static int assign_device(struct domain *d, u16 > >>> seg, u8 bus, u8 devfn, u32 flag) > >>> } > >>> > >>> if ( pdev->msix ) > >>> +{ > >>> msixtbl_init(d); > >>> +pdev->msix->host_maskall = false; > >>> +} > >> > >> ... doing just this would violate an assumed property: It ought to > >> be fine to assert at every entry or exit point that the physical > >> maskall bit of an MSI-X-enabled device is the logical OR of > >> ->host_maskall and ->guest_maskall. > > > > Is this still valid at this point, even without my patch? > > > > The hardware domain should have performed a reset of the device, so > > the state of the maskall hardware bit should be true, regardless of > > the previous state of either the guest_maskall or the host_maskall > > bits. > > But a reset _clears_ the hardware maskall bit (alongside the > enable one). Right you are, I was confusing the reset state of the maskall bit and the per-entry mask bit. > >> I.e. I see the following > >> options: > >> > >> 1) your variant accompanied by updating of the hardware bit, > >> > >> 2) > >> > >> pdev->msix->guest_maskall = pdev->msix->host_maskall; > >> pdev->msix->host_maskall = false; > >> > >> leaving the hardware bit alone, as the above transformation > >> wouldn't change what it's supposed to be set to, > >> > >> 3) > >> > >> pdev->msix->guest_maskall = true; > >> pdev->msix->host_maskall = false; > >> > >> alongside setting the bit in hardware (if not already set), > > > > That seems like the best option IMO, since it's the reset state of the > > device, and should be the expected one when assigning a device to a > > guest. > > As per above - no, it's not. We mask interrupts in hardware > (through individual mask bits iirc) because pci_prepare_msix() > gets invoked by Dom0 ahead of giving the device to the guest, > which involves setting the enable bit (and hence unmasked > interrupts could trigger). I'm afraid I'm a little bit lost. So if pci_prepare_msix gets called before assigning the device it means there's a call to PHYSDEVOP_prepare_msix done by dom0? I somehow assumed that would only happen when dom0 is actually using the device, but not as part of a normal flow when the device is just being assigned to guests without the dom0 using it at all. Is there some documentation about what dom0 is expected to perform when assigning and deassigning devices to guests? Given that as you correctly point out maskall is unset after device reset, I feel that option 4 is the best one since it matches the state of the hardware after reset. > > >> 4) > >> > >> pdev->msix->guest_maskall = false; > >> pdev->msix->host_maskall = false; > >> > >> alongside clearing the bit in hardware (if not already clear), > >> relying on all entries being individually masked (which ought > >> to be the state after the initial msix_capability_init()). > >> > >> In all cases the operation would likely better be done by > >> calling a function to be put in x86/msi.c. > > > > Maybe name it pci_reset_msix_state? > > The common naming pattern in the file seems to be to start with > msi_ or msix_, so perhaps better msix_reset_state()? I could > live with your suggested named though. I don't have a strong opinion either, added the pci_ prefix before most global functions in the file seem to do so, msix_ prefixed functions seem to be mostly static. Thanks, Roger. ___ Xen-devel mailing list
Re: [Xen-devel] [xen-unstable test] 142383: regressions - FAIL
On 07.10.19 22:56, osstest service owner wrote: flight 142383 xen-unstable real [real] http://logs.test-lab.xenproject.org/osstest/logs/142383/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-amd64-libvirt-pair 22 guest-migrate/src_host/dst_host fail REGR. vs. 141822 test-amd64-i386-libvirt-pair 22 guest-migrate/src_host/dst_host fail REGR. vs. 141822 Hmm, test log says the guest didn't suspend. Could that be related to commit b183e180bce9303 ? Juergen ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH-for-4.13 v2] x86/mm: don't needlessly veto migration
On Tue, 8 Oct 2019 at 09:25, Jan Beulich wrote: > > On 01.10.2019 17:11, Paul Durrant wrote: > > --- a/xen/arch/x86/mm/paging.c > > +++ b/xen/arch/x86/mm/paging.c > > @@ -209,15 +209,15 @@ static int paging_free_log_dirty_bitmap(struct domain > > *d, int rc) > > return rc; > > } > > > > -int paging_log_dirty_enable(struct domain *d, bool_t log_global) > > +int paging_log_dirty_enable(struct domain *d, bool log_global) > > { > > int ret; > > > > -if ( is_iommu_enabled(d) && log_global ) > > +if ( has_arch_pdevs(d) && iommu_use_hap_pt(d) && log_global ) > > To unblock a push to master, the is_iommu_enabled() -> has_arch_pdevs() > transformation here is needed afaict. Since the other half of the > change here (and a corresponding change to assign_device()) continues > to be controversial, could we agree on splitting this patch into two? > (I'd be fine doing the legwork.) Yes, please. I don't think there's a realistic chance of me getting back to this in time for 4.13. Hopefully shortly thereafter I'll be able to deal with it though. Paul > > Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] pci: clear host_maskall field on assign
On 08.10.2019 11:23, Roger Pau Monné wrote: > On Wed, Oct 02, 2019 at 03:33:43PM +0200, Jan Beulich wrote: >> On 02.10.2019 12:49, Roger Pau Monne wrote: >>> The current implementation of host_maskall makes it sticky across >>> assign and deassign calls, which means that once a guest forces Xen to >>> set host_maskall the maskall bit is not going to be cleared until a >>> call to PHYSDEVOP_prepare_msix is performed. Such call however >>> shouldn't be part of the normal flow when doing PCI passthrough, and >>> hence the flag needs to be cleared when assigning in order to prevent >>> host_maskall being carried over from previous assignations. >>> >>> Note that other mask fields, like guest_masked or the entry maskbit >>> are already reset when the msix capability is initialized. >> >> I take it you mean a guest action here, as PHYSDEVOP_prepare_msix is >> specifically about setting up the actual hardware's one? > > Right, or any path that calls into msix_capability_init (ie: QEMU > requesting to map a PIRQ to an MSIX entry will also call into > msix_capability_init). > >> This happens >> quite a bit later though, i.e. ->guest_maskall may need explicitly >> setting at the same time as you clear ->host_maskall. Furthermore ... >>> --- a/xen/drivers/passthrough/pci.c >>> +++ b/xen/drivers/passthrough/pci.c >>> @@ -1504,7 +1504,10 @@ static int assign_device(struct domain *d, u16 seg, >>> u8 bus, u8 devfn, u32 flag) >>> } >>> >>> if ( pdev->msix ) >>> +{ >>> msixtbl_init(d); >>> +pdev->msix->host_maskall = false; >>> +} >> >> ... doing just this would violate an assumed property: It ought to >> be fine to assert at every entry or exit point that the physical >> maskall bit of an MSI-X-enabled device is the logical OR of >> ->host_maskall and ->guest_maskall. > > Is this still valid at this point, even without my patch? > > The hardware domain should have performed a reset of the device, so > the state of the maskall hardware bit should be true, regardless of > the previous state of either the guest_maskall or the host_maskall > bits. But a reset _clears_ the hardware maskall bit (alongside the enable one). >> I.e. I see the following >> options: >> >> 1) your variant accompanied by updating of the hardware bit, >> >> 2) >> >> pdev->msix->guest_maskall = pdev->msix->host_maskall; >> pdev->msix->host_maskall = false; >> >> leaving the hardware bit alone, as the above transformation >> wouldn't change what it's supposed to be set to, >> >> 3) >> >> pdev->msix->guest_maskall = true; >> pdev->msix->host_maskall = false; >> >> alongside setting the bit in hardware (if not already set), > > That seems like the best option IMO, since it's the reset state of the > device, and should be the expected one when assigning a device to a > guest. As per above - no, it's not. We mask interrupts in hardware (through individual mask bits iirc) because pci_prepare_msix() gets invoked by Dom0 ahead of giving the device to the guest, which involves setting the enable bit (and hence unmasked interrupts could trigger). >> 4) >> >> pdev->msix->guest_maskall = false; >> pdev->msix->host_maskall = false; >> >> alongside clearing the bit in hardware (if not already clear), >> relying on all entries being individually masked (which ought >> to be the state after the initial msix_capability_init()). >> >> In all cases the operation would likely better be done by >> calling a function to be put in x86/msi.c. > > Maybe name it pci_reset_msix_state? The common naming pattern in the file seems to be to start with msi_ or msix_, so perhaps better msix_reset_state()? I could live with your suggested named though. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] Ping: [PATCH] x86/CPUID: RSTR_FP_ERR_PTRS depends on FPU
On 25.09.2019 17:27, Jan Beulich wrote: > There's nothing to restore here if there's no FPU in the first place. > > Signed-off-by: Jan Beulich > --- > To be considered for 4.13 since RSTR_FP_ERR_PTRS support was introduced > just recently. And already release-acked by Jürgen. Jan > --- a/xen/tools/gen-cpuid.py > +++ b/xen/tools/gen-cpuid.py > @@ -168,8 +168,9 @@ def crunch_numbers(state): > deps = { > # FPU is taken to mean support for the x87 regisers as well as the > # instructions. MMX is documented to alias the %MM registers over > the > -# x87 %ST registers in hardware. > -FPU: [MMX], > +# x87 %ST registers in hardware. Correct restoring of error pointers > +# of course makes no sense without there being anything to restore. > +FPU: [MMX, RSTR_FP_ERR_PTRS], > > # The PSE36 feature indicates that reserved bits in a PSE superpage > # may be used as extra physical address bits. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] pci: clear host_maskall field on assign
On Wed, Oct 02, 2019 at 03:33:43PM +0200, Jan Beulich wrote: > On 02.10.2019 12:49, Roger Pau Monne wrote: > > The current implementation of host_maskall makes it sticky across > > assign and deassign calls, which means that once a guest forces Xen to > > set host_maskall the maskall bit is not going to be cleared until a > > call to PHYSDEVOP_prepare_msix is performed. Such call however > > shouldn't be part of the normal flow when doing PCI passthrough, and > > hence the flag needs to be cleared when assigning in order to prevent > > host_maskall being carried over from previous assignations. > > > > Note that other mask fields, like guest_masked or the entry maskbit > > are already reset when the msix capability is initialized. > > I take it you mean a guest action here, as PHYSDEVOP_prepare_msix is > specifically about setting up the actual hardware's one? Right, or any path that calls into msix_capability_init (ie: QEMU requesting to map a PIRQ to an MSIX entry will also call into msix_capability_init). > This happens > quite a bit later though, i.e. ->guest_maskall may need explicitly > setting at the same time as you clear ->host_maskall. Furthermore ... > > --- a/xen/drivers/passthrough/pci.c > > +++ b/xen/drivers/passthrough/pci.c > > @@ -1504,7 +1504,10 @@ static int assign_device(struct domain *d, u16 seg, > > u8 bus, u8 devfn, u32 flag) > > } > > > > if ( pdev->msix ) > > +{ > > msixtbl_init(d); > > +pdev->msix->host_maskall = false; > > +} > > ... doing just this would violate an assumed property: It ought to > be fine to assert at every entry or exit point that the physical > maskall bit of an MSI-X-enabled device is the logical OR of > ->host_maskall and ->guest_maskall. Is this still valid at this point, even without my patch? The hardware domain should have performed a reset of the device, so the state of the maskall hardware bit should be true, regardless of the previous state of either the guest_maskall or the host_maskall bits. > I.e. I see the following > options: > > 1) your variant accompanied by updating of the hardware bit, > > 2) > > pdev->msix->guest_maskall = pdev->msix->host_maskall; > pdev->msix->host_maskall = false; > > leaving the hardware bit alone, as the above transformation > wouldn't change what it's supposed to be set to, > > 3) > > pdev->msix->guest_maskall = true; > pdev->msix->host_maskall = false; > > alongside setting the bit in hardware (if not already set), That seems like the best option IMO, since it's the reset state of the device, and should be the expected one when assigning a device to a guest. > > 4) > > pdev->msix->guest_maskall = false; > pdev->msix->host_maskall = false; > > alongside clearing the bit in hardware (if not already clear), > relying on all entries being individually masked (which ought > to be the state after the initial msix_capability_init()). > > In all cases the operation would likely better be done by > calling a function to be put in x86/msi.c. Maybe name it pci_reset_msix_state? Thanks, Roger. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [ANNOUNCE] Call for agenda items for Oct, 10th Community Call @ 15:00 UTC - One week later than normal - different dial-in details
Hi all, I won't be able to kick off the call on Thursday, but should be able to join the last 30 minutes of the call. Juergen has volunteered to kick off and moderate the call, but to do this we have to change the dial-in details. Please use the following details https://www.gotomeet.me/JuergenGross You can also dial in using your phone. United States (Toll Free): 1 866 899 4679 United States: +1 (646) 749-3117 Access Code: 133-463-709 More phone numbers Argentina (Toll Free): 0 800 444 2385 Australia (Toll Free): 1 800 191 358 Australia: +61 2 8355 1038 Austria (Toll Free): 0 800 202144 Austria: +43 7 2081 5337 Bahrain (Toll Free): 800 81 305 Belarus (Toll Free): 8 820 0073 0071 Belgium (Toll Free): 0 800 78881 Belgium: +32 28 93 7002 Brazil (Toll Free): 0 800 047 4902 Brazil: +55 11 4118-4898 Bulgaria (Toll Free): 00800 120 4413 Canada (Toll Free): 1 888 299 1889 Canada: +1 (647) 497-9373 Chile (Toll Free): 800 395 146 China (Toll Free): 4006 037937 Colombia (Toll Free): 01 800 012 9057 Costa Rica (Toll Free): 0800 542 5404 Czech Republic (Toll Free): 800 143570 Denmark (Toll Free): 8025 3112 Denmark: +45 32 72 03 69 Finland (Toll Free): 0 800 917643 Finland: +358 942 72 0972 France (Toll Free): 0 805 541 052 France: +33 170 950 590 Germany (Toll Free): 0 800 723 5274 Germany: +49 692 5736 7300 Greece (Toll Free): 00 800 4414 4282 Hong Kong (Toll Free): 800 900 221 Hungary (Toll Free): (06) 80 986 259 Iceland (Toll Free): 800 7206 India (Toll Free): 18002669775 Indonesia (Toll Free): 001 803 852 9155 Ireland (Toll Free): 1 800 818 263 Ireland: +353 15 295 146 Israel (Toll Free): 1 809 453 019 Italy (Toll Free): 800 792289 Italy: +39 0 230 57 81 80 Japan (Toll Free): 0 120 242 200 Korea, Republic of (Toll Free): 00798 14 207 4828 Luxembourg (Toll Free): 800 29524 Malaysia (Toll Free): 1 800 81 6860 Mexico (Toll Free): 01 800 083 5535 Mexico: +52 55 4624 4518 Netherlands (Toll Free): 0 800 023 1954 Netherlands: +31 207 941 375 New Zealand (Toll Free): 0 800 47 0051 New Zealand: +64 9 282 9510 Norway (Toll Free): 800 33 083 Norway: +47 21 93 37 37 Panama (Toll Free): 00 800 203 0002 Peru (Toll Free): 0 800 55465 Philippines (Toll Free): 1 800 1110 1669 Poland (Toll Free): 00 800 1124748 Portugal (Toll Free): 800 819 683 Romania (Toll Free): 0 800 400 826 Russian Federation (Toll Free): 8 800 100 6216 Saudi Arabia (Toll Free): 800 844 3636 Singapore (Toll Free): 18007231322 Slovakia (Toll Free): 0 800 132 608 South Africa (Toll Free): 0 800 555 451 Spain (Toll Free): 800 900 593 Spain: +34 932 75 1230 Sweden (Toll Free): 0 200 330 924 Sweden: +46 853 527 818 Switzerland (Toll Free): 0 800 000 452 Switzerland: +41 225 4599 60 Taiwan (Toll Free): 0 800 666 846 Thailand (Toll Free): 001 800 658 129 Turkey (Toll Free): 00 800 4488 29001 Ukraine (Toll Free): 0 800 60 9142 United Arab Emirates (Toll Free): 800 044 40444 United Kingdom (Toll Free): 0 800 389 5276 United Kingdom: +44 20 3713 5011 Uruguay (Toll Free): 0004 019 1017 Viet Nam (Toll Free): 120 32 148 Join from a video-conferencing room or system. Dial in or type: 67.217.95.2 or inroomlink.goto.com Meeting ID: 133 463 709 Or dial directly: 133463709@67.217.95.2 or 67.217.95.2##133463709 New to GoToMeeting? Get the app now and be ready when your first meeting starts: https://global.gotomeeting.com/install/133463709 Regards Lars On 27/09/2019, 14:56, "Lars Kurth" wrote: Hi all, the proposed agenda is in https://cryptpad.fr/pad/#/2/pad/edit/4FGEw81flPUiivkjkuvQJ-CK/embed/present/and you can edit to add items Alternatively, you can reply to this mail directly Agenda items appreciated a few days before the call: please put your name besides items if you edit the document Best Regards Lars P.S.: If you want to be added or removed from the CC list please reply privately == Dial-in Information == ## Meeting time 15:00 - 16:00 UTC Further International meeting times: https://www.timeanddate.com/worldclock/meetingdetails.html?year=2019=10=10=15=0=0=225=224=24=179=136=37=33 ## Dial in details Web: https://www.gotomeet.me/larskurth You can also dial in using your phone. Access Code: 906-886-965 China (Toll Free): 4008 811084 Germany: +49 692 5736 7317 Poland (Toll Free): 00 800 1124759 Ukraine (Toll Free): 0 800 50 1733 United Kingdom: +44 330 221 0088 United States: +1 (571) 317-3129 Spain: +34 932 75 2004 More phone numbers Australia: +61 2 9087 3604 Austria: +43 7 2081 5427 Argentina (Toll Free): 0 800 444 3375 Bahrain (Toll Free): 800 81 111 Belarus (Toll Free): 8 820 0011 0400 Belgium: +32 28 93 7018 Brazil (Toll Free): 0 800 047 4906 Bulgaria (Toll Free): 00800 120 4417 Canada: +1 (647) 497-9391 Chile (Toll Free): 800 395 150 Colombia (Toll Free): 01 800 518 4483 Czech Republic (Toll Free): 800 500448 Denmark: +45 32 72 03 82 Finland: +358 923 17 0568 France:
[Xen-devel] [linux-4.19 test] 142407: regressions - FAIL
flight 142407 linux-4.19 real [real] http://logs.test-lab.xenproject.org/osstest/logs/142407/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-arm64-arm64-examine11 examine-serial/bootloader fail REGR. vs. 142323 test-amd64-i386-qemut-rhel6hvm-amd 12 guest-start/redhat.repeat fail REGR. vs. 142323 Tests which did not succeed, but are not blocking: test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop fail like 142323 test-amd64-i386-xl-pvshim12 guest-start fail never pass test-amd64-amd64-libvirt 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail never pass test-amd64-i386-libvirt-xsm 13 migrate-support-checkfail never pass test-arm64-arm64-xl-seattle 13 migrate-support-checkfail never pass test-arm64-arm64-xl-seattle 14 saverestore-support-checkfail never pass test-amd64-i386-libvirt 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail never pass test-arm64-arm64-xl-credit2 13 migrate-support-checkfail never pass test-arm64-arm64-xl-credit2 14 saverestore-support-checkfail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-arm64-arm64-xl-credit1 13 migrate-support-checkfail never pass test-arm64-arm64-xl-credit1 14 saverestore-support-checkfail never pass test-arm64-arm64-xl-xsm 13 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 14 saverestore-support-checkfail never pass test-arm64-arm64-xl 13 migrate-support-checkfail never pass test-arm64-arm64-xl 14 saverestore-support-checkfail never pass test-arm64-arm64-xl-thunderx 13 migrate-support-checkfail never pass test-arm64-arm64-xl-thunderx 14 saverestore-support-checkfail never pass test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail never pass test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail never pass test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2 fail never pass test-armhf-armhf-xl-arndale 13 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 14 saverestore-support-checkfail never pass test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop fail never pass test-armhf-armhf-xl 13 migrate-support-checkfail never pass test-armhf-armhf-xl 14 saverestore-support-checkfail never pass test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stop fail never pass test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop fail never pass test-armhf-armhf-xl-credit1 13 migrate-support-checkfail never pass test-armhf-armhf-xl-credit1 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 13 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-rtds 13 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail never pass test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail never pass test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stop fail never pass test-armhf-armhf-xl-vhd 12 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 13 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail never pass test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail never pass test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail never pass test-armhf-armhf-libvirt 13 migrate-support-checkfail never pass test-armhf-armhf-libvirt 14 saverestore-support-checkfail never pass test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop fail never pass test-amd64-amd64-xl-qemut-win10-i386 10 windows-installfail never pass test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass test-amd64-i386-xl-qemut-win10-i386 10 windows-install fail never pass test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass version targeted for testing: linux58fce20645303bee01d74144ec00e405be43b1ca baseline version: linux
Re: [Xen-devel] How PV frontend and backend initializes?
On Sat, Oct 05, 2019 at 10:35:11AM +, tosher 1 wrote: > I was trying to understand the following things regarding the PV driver. > > 1. Who create frontend and backend instances? That depends on where the frontend and backends run. For example Linux blkback is implemented as a module of the Linux kernel. Linux blkfront is also a Linux kernel module. OTOH there's also a block backend in QEMU called qdisk, which is obviously implemented in QEMU. I'm however unsure by what you mean with instance, so you might have to clarify exactly what you mean in order to get a more concise reply. > 2. When are these instances created? The xenstore entries are created at domain creation by the toolstack (xl/libxl). The connection between the frontend and the backend happens when the frontend driver loads and initializes the connection. > 3. How xenbus directories are created? What is the hierarchy of the > directories? They are created by the toolstack during domain creation: xl/libxl. There are documents in the public headers that describe the expected and optional xenstore nodes for each device, see: http://xenbits.xen.org/gitweb/?p=xen.git;a=tree;f=xen/include/public/io > 4. What is the role of "vifname" and who sets it? That's set in the guest config file [0] and then it's the hotplug script the one in charge of renaming the interface from it's original name into the desired one [1]. Roger. [0] https://xenbits.xen.org/docs/unstable/man/xl-network-configuration.5.html#vifname [1] http://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=tools/hotplug/Linux/vif-common.sh#l67 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] On unions usage, specifically arch.{hvm,pv}
On 08.10.2019 02:38, Marek Marczykowski-Górecki wrote: > To be honest, I think unions are very scary from security point of view. > It's quite easy to use a field that in given context have very different > meaning and easily results in security issue. In the most cases, > compiler can't help you here. And seeing "IOMMU: add missing HVM check" > patch recently, fixing a but that was there for a year, I think my point > is valid. > There are multiple unions in the Xen code base, but I'd start with the > one in x86's struct arch_domain: {pv,hvm}. In some cases (like the above > _patched_ one), it is obvious that using arch.pv or arch.hvm in given > context is valid. But in some it is very much not obvious, like usage of > d->arch.hvm.dirty_vram in _sh_propagate() > (xen/arch/x86/mm/shadow/multi.c) - at least one caller seems to deal > also with PV vcpus > (sh_page_fault->shadow_get_and_create_l1e->l2e_propagate_from_guest). > Maybe I'm missing something, or maybe I've just found a bug, I don't > know, that's my point. And this is after casual grep for arch.hvm and > picking random file (took like 1 minute). Yes, this is a (latent) issue; we're being saved here merely by the fact that the field lives far enough into the structure that there's no overlap with the PV side of the union, and hence the NULL checks of the pointer guard the code paths de-referencing it from getting entered for PV. I do have a patch to address this (in the context of arranging to compile out the whole dirty-VRAM handling code for !HVM builds) already. > I propose to implement some measures to make similar bugs less likely. > Some ideas: > 1. Add asserts for guest type, if the check isn't visible in obvious > place, near arch.pv / arch.hvm usage. > > or maybe even better: > > 2. Add wrappers (inline, #define, whatever) that perform the check > before accessing those fields. And forbid accessing those (and maybe > later others too?) unions directly, so it would be trivial to verify. > There could be multiple wrappers for most commons code patterns. For > example one for combined is_hvm_domain(d) && > some-check-on-arch.hvm-field. Or another one just adding an ASSERT() / > BUG_ON(). > > Ideally such check should be part of a release build (IMO it's better to > crash early with clear message, instead of crashing later in mysterious > way or having privilege escalation bug - if that's the alternative). But > having those checks just in debug build would be an improvement already. Adding checks where context is not obvious is certainly a good idea. As always we want to be as non-destructive as possible, i.e. avoid crashing the host at least in release builds when some alternative solution is viable. But just like for finding such issues, given the state of things it's a matter of auditing all uses to find where such checks may want adding, which someone would need to find time for. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH-for-4.13 v2] x86/mm: don't needlessly veto migration
On 01.10.2019 17:11, Paul Durrant wrote: > --- a/xen/arch/x86/mm/paging.c > +++ b/xen/arch/x86/mm/paging.c > @@ -209,15 +209,15 @@ static int paging_free_log_dirty_bitmap(struct domain > *d, int rc) > return rc; > } > > -int paging_log_dirty_enable(struct domain *d, bool_t log_global) > +int paging_log_dirty_enable(struct domain *d, bool log_global) > { > int ret; > > -if ( is_iommu_enabled(d) && log_global ) > +if ( has_arch_pdevs(d) && iommu_use_hap_pt(d) && log_global ) To unblock a push to master, the is_iommu_enabled() -> has_arch_pdevs() transformation here is needed afaict. Since the other half of the change here (and a corresponding change to assign_device()) continues to be controversial, could we agree on splitting this patch into two? (I'd be fine doing the legwork.) Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2 5/6] Add guide on Communication Best Practice
On 07.10.2019 18:13, George Dunlap wrote: > On 9/27/19 10:14 AM, Jan Beulich wrote: >> On 26.09.2019 21:39, Lars Kurth wrote: >>> +### Verbose vs. terse >>> +Due to the time it takes to review and compose code reviewer, reviewers >>> often adopt a >>> +terse style. It is not unusual to see review comments such as >>> +> typo >>> +> s/resions/regions/ >>> +> coding style >>> +> coding style: brackets not needed >>> +etc. >>> + >>> +Terse code review style has its place and can be productive for both the >>> reviewer and >>> +the author. However, overuse can come across as unfriendly, lacking >>> empathy and >>> +can thus create a negative impression with the author of a patch. This is >>> in particular >>> +true, when you do not know the author or the author is a newcomer. Terse >>> +communication styles can also be perceived as rude in some cultures. >> >> And another remark here: Not being terse in situations like the ones >> enumerated as examples above is a double waste of the reviewer's time: > > FWIW I don't think this document prohibits terse replies. It points out > that they can come across as unfriendly, and they can be perceived as > rude in some cultures; both of which are true. It then *recommends* > that reviewers compensate for it in a review opening (i.e., once per > patch / series) which expresses appreciation; which is both helpful and > relatively low cost. > > The point of the opening is to set the tone. If you start out with > something positive, and ends with "thanks", then a long series of terse > comments is more likely to be read as simply being efficient. If the > entire review consists of nothing but criticism or terse comments, it's > more likely to be read as annoyance on the part of the reviewer. > >> They shouldn't even need to make such comments, especially not many >> times for a single patch (see your mention of "overuse"). I realize >> we still have no automated mechanism to check style aspects, but >> anybody can easily look over their patches before submitting them. >> And for an occasional issue I think a terse reply is quite reasonable >> to have. > > This sort of sounds like you are *intending* to express annoyance? Implicitly by being terse, yes. I've been trying to avoid expressing such explicitly, but I have to admit there are (luckily only few) cases where I find it pretty hard to stay away. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] Latest development (master) Xen fails to boot on HP ProLiant DL20 GEN10
On 08/10/2019 06:54, Roman Shaposhnik wrote: > Sorry -- was traveling last week, but I'm still very curious to get to > the bottom of this: > > On Tue, Oct 1, 2019 at 1:25 AM Jan Beulich wrote: >> On 01.10.2019 00:38, Roman Shaposhnik wrote: >>> Btw, forgot to attach the patch with maxcpus=2 -- interestingly enough >>> Xen seems to hang much further down than before (basically after >>> attempting to build out Dom0) >> All 3 logs contain >> >> (XEN) TSC_DEADLINE disabled due to Errata; please update microcode to >> version 0x52 (or later) > Ok. This makes some sense. Btw, in a situation like this, do we expect > Xen to cope with a broken microcode or we always expect microcode to > be updated? Intel consider you to be running out-of-spec if you don't have up-to-date microcode. As for functioning correctly, all bets are off. It is quite possible the updated microcode doesn't have an impact here, but it is not worth your time or ours trying to diagnose issues in the face of an out-of-spec system. > >> Please load up-to-date microcode on the system and, > I couldn't find any kind of HP guide on how to do this on this box. > Any chance someone here may have a pointer for me? Here is some documentation I prepared earlier: https://xenbits.xen.org/docs/sphinx-unstable/admin-guide/microcode-loading.html To a first approximation, install the intel-ucode package into dom0, and it should DTRT. If it doesn't, https://github.com/intel/Intel-Linux-Processor-Microcode-Data-Files is upstream for Intel's microcode, and you can manually arrange for Xen to find and use it. ~Andrew P.S. I'd also appreciate any feedback you have on the microcode documentation, because that is the style/quality I'm aiming for across the board in the Sphinx docs, and you are possibly the first user to use it in anger. In particular, if things are not clear, I'd like to see about fixing that. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] pci: clear host_maskall field on assign
On Mon, Oct 07, 2019 at 09:38:48AM +0200, Jan Beulich wrote: >On 05.10.2019 01:58, Chao Gao wrote: >> On Wed, Oct 02, 2019 at 12:49:35PM +0200, Roger Pau Monne wrote: >>> The current implementation of host_maskall makes it sticky across >>> assign and deassign calls, which means that once a guest forces Xen to >>> set host_maskall the maskall bit is not going to be cleared until a >>> call to PHYSDEVOP_prepare_msix is performed. Such call however >>> shouldn't be part of the normal flow when doing PCI passthrough, and >>> hence the flag needs to be cleared when assigning in order to prevent >>> host_maskall being carried over from previous assignations. >>> >>> Note that other mask fields, like guest_masked or the entry maskbit >>> are already reset when the msix capability is initialized. Also note >>> that doing the reset of host_maskall there would allow the guest to >>> reset such field by enabling and disabling MSIX, which is not >>> intended. >>> >>> Signed-off-by: Roger Pau Monné >>> --- >>> Cc: Chao Gao >>> Cc: "Spassov, Stanislav" >>> Cc: Pasi Kärkkäinen >>> --- >>> Chao, Stanislav, can you please check if this patch fixes your >>> issues? >> >> I am glad to. For your testing, you can just kill qemu and destroy the >> guest. Then maskall bit of a pass-thru device will be set. And in this >> case, try to recreate the guest and check whether the maskall bit is >> cleared in guest. >> >> The solution is similar to my v1 [1]. One question IMO (IIRC, it is why >> I changed to another approach) is: why not do such reset at deivce >> deassignment such that dom0 can use a clean device. Otherwise, the >> device won't work after being unbound from pciback. But I am not so >> sure, I can check it next Tuesday. > >I too did think about this, but aiui pciback needs to issue >PHYSDEVOP_release_msix anyway, and Dom0 would then re-setup MSI-X >"from scratch", i.e. we'd clear the flag anyway in >msix_capability_init() due to msix->used_entries being zero at >the first (of possibly several) invocation(s). Yes. I just checked it on my machine and found you are right. Thanks Chao ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel