[PATCH] x86/DMI: adjustments to comply with Misra C:2012 Rule 9.3
The rule demands that all array elements be initialized (or dedicated initializers be used). Introduce a small set of macros to allow doing so without unduly affecting use sites (in particular in terms of how many elements .matches[] actually has; right now there's no use of DMI_MATCH4(), so we could even consider reducing the array size to 3). Note that DMI_MATCH() needs adjustment because of the comma included in its expansion, which - due to being unparenthesized - would otherwise cause macro arguments in the "further replacement" step to be wrong. Signed-off-by: Jan Beulich --- Of course a question is how many of these DMI table entries are in fact no longer applicable (e.g. because of naming 32-bit-only systems). Subsequently the table in dmi_scan.c itself may want cleaning up as well, yet I guess the question of stale entries is even more relevant there. An alternative to using the compound literal approach might be to go along the lines of #define DMI_MATCH4(m1, m2, m3, m4) .matches = { [0] = m1, [1] = m2, [2] = m3, [3] = m4 } I've chosen the other approach mainly because of being slightly shorter. --- a/xen/arch/x86/genapic/bigsmp.c +++ b/xen/arch/x86/genapic/bigsmp.c @@ -19,11 +19,14 @@ static int __init cf_check force_bigsmp( static const struct dmi_system_id __initconstrel bigsmp_dmi_table[] = { - { force_bigsmp, "UNISYS ES7000-ONE", { - DMI_MATCH(DMI_PRODUCT_NAME, "ES7000-ONE") -}}, + { + .ident = "UNISYS ES7000-ONE", + .callback = force_bigsmp, + DMI_MATCH1( + DMI_MATCH(DMI_PRODUCT_NAME, "ES7000-ONE")), + }, -{ } + { } }; --- a/xen/arch/x86/hvm/quirks.c +++ b/xen/arch/x86/hvm/quirks.c @@ -36,42 +36,37 @@ static int __init cf_check check_port80( { .callback = dmi_hvm_deny_port80, .ident= "Compaq Presario V6000", -.matches = { + DMI_MATCH2( DMI_MATCH(DMI_BOARD_VENDOR, "Quanta"), -DMI_MATCH(DMI_BOARD_NAME, "30B7") -} +DMI_MATCH(DMI_BOARD_NAME, "30B7")), }, { .callback = dmi_hvm_deny_port80, .ident= "HP Pavilion dv9000z", -.matches = { + DMI_MATCH2( DMI_MATCH(DMI_BOARD_VENDOR, "Quanta"), -DMI_MATCH(DMI_BOARD_NAME, "30B9") -} +DMI_MATCH(DMI_BOARD_NAME, "30B9")), }, { .callback = dmi_hvm_deny_port80, .ident= "HP Pavilion dv6000", -.matches = { + DMI_MATCH2( DMI_MATCH(DMI_BOARD_VENDOR, "Quanta"), -DMI_MATCH(DMI_BOARD_NAME, "30B8") -} +DMI_MATCH(DMI_BOARD_NAME, "30B8")), }, { .callback = dmi_hvm_deny_port80, .ident= "HP Pavilion tx1000", -.matches = { + DMI_MATCH2( DMI_MATCH(DMI_BOARD_VENDOR, "Quanta"), -DMI_MATCH(DMI_BOARD_NAME, "30BF") -} +DMI_MATCH(DMI_BOARD_NAME, "30BF")), }, { .callback = dmi_hvm_deny_port80, .ident= "Presario F700", -.matches = { + DMI_MATCH2( DMI_MATCH(DMI_BOARD_VENDOR, "Quanta"), -DMI_MATCH(DMI_BOARD_NAME, "30D3") -} +DMI_MATCH(DMI_BOARD_NAME, "30D3")), }, { } }; --- a/xen/arch/x86/ioport_emulate.c +++ b/xen/arch/x86/ioport_emulate.c @@ -43,59 +43,51 @@ static const struct dmi_system_id __init */ { .ident = "HP ProLiant DL3xx", -.matches = { +DMI_MATCH2( DMI_MATCH(DMI_BIOS_VENDOR, "HP"), -DMI_MATCH(DMI_PRODUCT_NAME, "ProLiant DL3"), -}, +DMI_MATCH(DMI_PRODUCT_NAME, "ProLiant DL3")), }, { .ident = "HP ProLiant DL5xx", -.matches = { +DMI_MATCH2( DMI_MATCH(DMI_BIOS_VENDOR, "HP"), -DMI_MATCH(DMI_PRODUCT_NAME, "ProLiant DL5"), -}, +DMI_MATCH(DMI_PRODUCT_NAME, "ProLiant DL5")), }, { .ident = "HP ProLiant DL7xx", -.matches = { +DMI_MATCH2( DMI_MATCH(DMI_BIOS_VENDOR, "HP"), -DMI_MATCH(DMI_PRODUCT_NAME, "ProLiant DL7"), -}, +DMI_MATCH(DMI_PRODUCT_NAME, "ProLiant DL7")), }, { .ident = "HP ProLiant ML3xx", -.matches = { +DMI_MATCH2( DMI_MATCH(DMI_BIOS_VENDOR, "HP"), -DMI_MATCH(DMI_PRODUCT_NAME, "ProLiant ML3"), -}, +DMI_MATCH(DMI_PRODUCT_NAME, "ProLiant ML3")), }, { .ident = "HP ProLiant ML5xx", -.matches = { +DMI_MATCH2( DMI_MATCH(DMI_BIOS_VENDOR, "HP"), -DMI_MATCH(DMI_PRODUCT_NAME, "ProLiant ML5"), -
[ovmf test] 183945: all pass - PUSHED
flight 183945 ovmf real [real] http://logs.test-lab.xenproject.org/osstest/logs/183945/ Perfect :-) All tests in this flight passed as required version targeted for testing: ovmf 59b6b5059b90883abfcbd906c411e8f59ac1aa0d baseline version: ovmf 4f99b5fb936a2a0239d5212948b7104d75d1558c Last test of basis 183942 2023-11-29 21:12:45 Z0 days Testing same since 183945 2023-11-30 05:11:04 Z0 days1 attempts People who touched revisions under test: Ashish Singhal jobs: build-amd64-xsm pass build-i386-xsm pass build-amd64 pass build-i386 pass build-amd64-libvirt pass build-i386-libvirt pass build-amd64-pvopspass build-i386-pvops pass test-amd64-amd64-xl-qemuu-ovmf-amd64 pass test-amd64-i386-xl-qemuu-ovmf-amd64 pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : To xenbits.xen.org:/home/xen/git/osstest/ovmf.git 4f99b5fb93..59b6b5059b 59b6b5059b90883abfcbd906c411e8f59ac1aa0d -> xen-tested-master
Re: [PATCH 6/6] Rename "QEMU global mutex" to "BQL" in comments and docs
Stefan Hajnoczi writes: > The term "QEMU global mutex" is identical to the more widely used Big > QEMU Lock ("BQL"). Update the code comments and documentation to use > "BQL" instead of "QEMU global mutex". > > Signed-off-by: Stefan Hajnoczi For QAPI Acked-by: Markus Armbruster
Re: [RFC KERNEL PATCH v2 1/3] xen/pci: Add xen_reset_device_state function
On 2023/11/30 11:46, Stefano Stabellini wrote: > On Fri, 24 Nov 2023, Jiqian Chen wrote: >> When device on dom0 side has been reset, the vpci on Xen side >> won't get notification, so that the cached state in vpci is >> all out of date with the real device state. >> To solve that problem, this patch add a function to clear all >> vpci device state when device is reset on dom0 side. >> >> And call that function in pcistub_init_device. Because when >> we use "pci-assignable-add" to assign a passthrough device in >> Xen, it will reset passthrough device and the vpci state will >> out of date, and then device will fail to restore bar state. >> >> Signed-off-by: Jiqian Chen >> Signed-off-by: Huang Rui >> --- >> drivers/xen/pci.c | 12 >> drivers/xen/xen-pciback/pci_stub.c | 3 +++ >> include/xen/interface/physdev.h| 2 ++ >> include/xen/pci.h | 6 ++ >> 4 files changed, 23 insertions(+) >> >> diff --git a/drivers/xen/pci.c b/drivers/xen/pci.c >> index 72d4e3f193af..e9b30bc09139 100644 >> --- a/drivers/xen/pci.c >> +++ b/drivers/xen/pci.c >> @@ -177,6 +177,18 @@ static int xen_remove_device(struct device *dev) >> return r; >> } >> >> +int xen_reset_device_state(const struct pci_dev *dev) >> +{ >> +struct physdev_pci_device device = { >> +.seg = pci_domain_nr(dev->bus), >> +.bus = dev->bus->number, >> +.devfn = dev->devfn >> +}; >> + >> +return HYPERVISOR_physdev_op(PHYSDEVOP_pci_device_state_reset, ); >> +} >> +EXPORT_SYMBOL_GPL(xen_reset_device_state); >> + >> static int xen_pci_notifier(struct notifier_block *nb, >> unsigned long action, void *data) >> { >> diff --git a/drivers/xen/xen-pciback/pci_stub.c >> b/drivers/xen/xen-pciback/pci_stub.c >> index e34b623e4b41..5a96b6c66c07 100644 >> --- a/drivers/xen/xen-pciback/pci_stub.c >> +++ b/drivers/xen/xen-pciback/pci_stub.c >> @@ -421,6 +421,9 @@ static int pcistub_init_device(struct pci_dev *dev) >> else { >> dev_dbg(>dev, "resetting (FLR, D3, etc) the device\n"); >> __pci_reset_function_locked(dev); >> +err = xen_reset_device_state(dev); >> +if (err) >> +goto config_release; > > Older versions of Xen won't have the hypercall > PHYSDEVOP_pci_device_state_reset implemented. I think we should do > something like: > > if (err && xen_pvh_domain()) > goto config_release; > > > Or even: > > if (xen_pvh_domain()) { > err = xen_reset_device_state(dev); > if (err) > goto config_release; > } > > depending on whether we want to call xen_reset_device_state also for PV > guests or not. I am assuming we don't want to error out on failure such > as -ENOENT for PV guests. Yes, only for PVH dom0, I will add the condition in next version. Thank you! > > >> pci_restore_state(dev); >> } >> /* Now disable the device (this also ensures some private device >> diff --git a/include/xen/interface/physdev.h >> b/include/xen/interface/physdev.h >> index a237af867873..231526f80f6c 100644 >> --- a/include/xen/interface/physdev.h >> +++ b/include/xen/interface/physdev.h >> @@ -263,6 +263,8 @@ struct physdev_pci_device { >> uint8_t devfn; >> }; >> >> +#define PHYSDEVOP_pci_device_state_reset 32 >> + >> #define PHYSDEVOP_DBGP_RESET_PREPARE1 >> #define PHYSDEVOP_DBGP_RESET_DONE 2 >> >> diff --git a/include/xen/pci.h b/include/xen/pci.h >> index b8337cf85fd1..b2e2e856efd6 100644 >> --- a/include/xen/pci.h >> +++ b/include/xen/pci.h >> @@ -4,10 +4,16 @@ >> #define __XEN_PCI_H__ >> >> #if defined(CONFIG_XEN_DOM0) >> +int xen_reset_device_state(const struct pci_dev *dev); >> int xen_find_device_domain_owner(struct pci_dev *dev); >> int xen_register_device_domain_owner(struct pci_dev *dev, uint16_t domain); >> int xen_unregister_device_domain_owner(struct pci_dev *dev); >> #else >> +static inline int xen_reset_device_state(const struct pci_dev *dev) >> +{ >> +return -1; >> +} >> + >> static inline int xen_find_device_domain_owner(struct pci_dev *dev) >> { >> return -1; >> -- >> 2.34.1 >> -- Best regards, Jiqian Chen.
Re: [RFC XEN PATCH v2 3/3] tools: Add new function to get gsi from irq
On 2023/11/29 00:11, Roger Pau Monné wrote: > On Tue, Nov 28, 2023 at 03:42:31PM +0100, Juergen Gross wrote: >> On 28.11.23 15:25, Roger Pau Monné wrote: >>> On Fri, Nov 24, 2023 at 06:41:36PM +0800, Jiqian Chen wrote: In PVH dom0, it uses the linux local interrupt mechanism, when it allocs irq for a gsi, it is dynamic, and follow the principle of applying first, distributing first. And if you debug the kernel codes, you will find the irq number is alloced from small to large, but the applying gsi number is not, may gsi 38 comes before gsi 28, that causes the irq number is not equal with the gsi number. And when we passthrough a device, QEMU will use its gsi number to do mapping actions, see xen_pt_realize-> xc_physdev_map_pirq, but the gsi number is got from file /sys/bus/pci/devices/:xx:xx.x/irq in current code, so it will fail when mapping. And in current codes, there is no method to translate irq to gsi for userspace. >>> >>> I think it would be cleaner to just introduce a new sysfs node that >>> contains the gsi if a device is using one (much like the irq sysfs >>> node)? >>> >>> Such ioctl to translate from IRQ to GSI has nothing to do with Xen, so >>> placing it in privcmd does seem quite strange to me. I understand >>> that for passthrough we need the GSI, but such translation layer from >>> IRQ to GSI is all Linux internal, and it would be much simpler to just >>> expose the GSI in sysfs IMO. >> >> You are aware that we have a Xen specific variant of acpi_register_gsi()? >> >> It is the Xen event channel driver being responsible for the GSI<->IRQ >> mapping. > > I'm kind of lost, this translation function is specifically needed for > PVH which doesn't use the Xen specific variant of acpi_register_gsi(), > and hence the IRQ <-> GSI relation is whatever the Linux kernel does > on native. > > I do understand that on a PV dom0 the proposed sysfs gsi node would > match the irq node, but that doesn't seem like an issue to me. > > Note also that PVH doesn't use acpi_register_gsi_xen_hvm() because > XENFEAT_hvm_pirqs feature is not exposed to PVH, so I expect it uses > the x86 acpi_register_gsi_ioapic(). Yes, PVH use acpi_register_gsi_ioapic, thank Roger for explanation. > > Thanks, Roger. -- Best regards, Jiqian Chen.
Re: [RFC XEN PATCH v2 3/3] tools: Add new function to get gsi from irq
On 2023/11/28 22:25, Roger Pau Monné wrote: > On Fri, Nov 24, 2023 at 06:41:36PM +0800, Jiqian Chen wrote: >> In PVH dom0, it uses the linux local interrupt mechanism, >> when it allocs irq for a gsi, it is dynamic, and follow >> the principle of applying first, distributing first. And >> if you debug the kernel codes, you will find the irq >> number is alloced from small to large, but the applying >> gsi number is not, may gsi 38 comes before gsi 28, that >> causes the irq number is not equal with the gsi number. >> And when we passthrough a device, QEMU will use its gsi >> number to do mapping actions, see xen_pt_realize-> >> xc_physdev_map_pirq, but the gsi number is got from file >> /sys/bus/pci/devices/:xx:xx.x/irq in current code, >> so it will fail when mapping. >> And in current codes, there is no method to translate >> irq to gsi for userspace. > > I think it would be cleaner to just introduce a new sysfs node that > contains the gsi if a device is using one (much like the irq sysfs > node)? Yes, I also ever thought this way. Add a sysfs node in /sys/bus/pci/devices/:xx:xx.x/gsi. But I am not sure sysfs or privcmd, which is better. If use sysfs node, should I need to use the macro of Xen to wrap the codes? And is it suitable to create it in function acpi_register_gsi_ioapic? > > Such ioctl to translate from IRQ to GSI has nothing to do with Xen, so > placing it in privcmd does seem quite strange to me. I understand > that for passthrough we need the GSI, but such translation layer from > IRQ to GSI is all Linux internal, and it would be much simpler to just > expose the GSI in sysfs IMO. > > Thanks, Roger. -- Best regards, Jiqian Chen.
Re: [RFC XEN PATCH v2 2/3] x86/pvh: Open PHYSDEVOP_map_pirq for PVH dom0
On 2023/11/28 23:14, Jan Beulich wrote: > On 24.11.2023 11:41, Jiqian Chen wrote: >> --- a/xen/arch/x86/hvm/hypercall.c >> +++ b/xen/arch/x86/hvm/hypercall.c >> @@ -74,6 +74,8 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) >> arg) >> { >> case PHYSDEVOP_map_pirq: >> case PHYSDEVOP_unmap_pirq: >> +if (is_hardware_domain(currd)) >> +break; >> case PHYSDEVOP_eoi: >> case PHYSDEVOP_irq_status_query: >> case PHYSDEVOP_get_free_pirq: > > If you wouldn't go the route suggested by Roger, I think you will need > to deny self-mapping requests here. Do you mean below? if (arg.domid == DOMID_SELF) return; > > Also note that both here and in patch 1 you will want to adjust a number > of style violations. Could you please descript in detail? This will greatly assist me in making modifications in the next version. Thank you! > > Jan -- Best regards, Jiqian Chen.
Re: [RFC XEN PATCH v2 2/3] x86/pvh: Open PHYSDEVOP_map_pirq for PVH dom0
On 2023/11/28 22:17, Roger Pau Monné wrote: > On Fri, Nov 24, 2023 at 06:41:35PM +0800, Jiqian Chen wrote: >> If we run Xen with PVH dom0 and hvm domU, hvm will map a pirq for >> a passthrough device by using gsi, see xen_pt_realize->xc_physdev_map_pirq >> and pci_add_dm_done->xc_physdev_map_pirq. Then xc_physdev_map_pirq will >> call into Xen, but in hvm_physdev_op, PHYSDEVOP_map_pirq is not allowed >> because currd is PVH dom0 and PVH has no X86_EMU_USE_PIRQ flag, it will >> fail at has_pirq check. >> >> So, I think we may need to allow PHYSDEVOP_map_pirq when currd is dom0 (at > > And PHYSDEVOP_unmap_pirq also? Yes, in the failed path, PHYSDEVOP_unmap_pirq will be called. I will add some descriptions about it into the commit message. > >> present dom0 is PVH). > > IMO it would be better to implement a new set of DMOP hypercalls that > handle the setup of interrupts from physical devices that are assigned > to a guest. That should allow us to get rid of the awkward PHYSDEVOP > + XEN_DOMCTL_{,un}bind_pt_irq hypercall interface, which currently > prevents QEMU from being hypervisor version agnostic (since the domctl > interface is not stable). > > I understand this might be too much to ask for, but something to take > into account. Yes, that will be a complex project. I think current change can meet the needs. We can take DMOP into account in the future. Thank you. > > Thanks, Roger. -- Best regards, Jiqian Chen.
Re: [RFC XEN PATCH v2 1/3] xen/vpci: Clear all vpci status of device
Hi Roger and Daniel P. Smith, On 2023/11/28 22:08, Roger Pau Monné wrote: > On Fri, Nov 24, 2023 at 06:41:34PM +0800, Jiqian Chen wrote: >> When a device has been reset on dom0 side, the vpci on Xen >> side won't get notification, so the cached state in vpci is >> all out of date compare with the real device state. >> To solve that problem, this patch add new hypercall to clear >> all vpci device state. And when reset device happens on dom0 >> side, dom0 can call this hypercall to notify vpci. >> >> Signed-off-by: Jiqian Chen >> Signed-off-by: Huang Rui >> --- >> xen/arch/x86/hvm/hypercall.c | 1 + >> xen/drivers/passthrough/pci.c | 21 + >> xen/drivers/pci/physdev.c | 14 ++ >> xen/drivers/vpci/vpci.c | 9 + >> xen/include/public/physdev.h | 2 ++ >> xen/include/xen/pci.h | 1 + >> xen/include/xen/vpci.h| 6 ++ >> 7 files changed, 54 insertions(+) >> >> diff --git a/xen/arch/x86/hvm/hypercall.c b/xen/arch/x86/hvm/hypercall.c >> index eeb73e1aa5..6ad5b4d5f1 100644 >> --- a/xen/arch/x86/hvm/hypercall.c >> +++ b/xen/arch/x86/hvm/hypercall.c >> @@ -84,6 +84,7 @@ long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) >> arg) >> case PHYSDEVOP_pci_mmcfg_reserved: >> case PHYSDEVOP_pci_device_add: >> case PHYSDEVOP_pci_device_remove: >> +case PHYSDEVOP_pci_device_state_reset: >> case PHYSDEVOP_dbgp_op: >> if ( !is_hardware_domain(currd) ) >> return -ENOSYS; >> diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c >> index 04d00c7c37..f871715585 100644 >> --- a/xen/drivers/passthrough/pci.c >> +++ b/xen/drivers/passthrough/pci.c >> @@ -824,6 +824,27 @@ int pci_remove_device(u16 seg, u8 bus, u8 devfn) >> return ret; >> } >> >> +int pci_reset_device_state(u16 seg, u8 bus, u8 devfn) > > You could use pci_sbdf_t here instead of 3 parameters. Will change in next version, thank you. > > I'm however unsure whether we really need this helper just to fetch > the pdev and call vpci_reset_device_state(), I think you could place > this logic directly in pci_physdev_op(). Unless there are plans to > use such logic outside of pci_physdev_op(). If I place the logic of pci_reset_device_state directly in pci_physdev_op. I think I need to declare vpci_reset_device_state in pci.h? If it is suitable, I will change in next version. > >> +{ >> +struct pci_dev *pdev; >> +int ret = -ENODEV; > > Some XSM check should be introduced fro this operation, as none of the > existing ones is suitable. See xsm_resource_unplug_pci() for example. > > xsm_resource_reset_state_pci() or some such I would assume. I don't know what I should do in XSM function(assume it is xsm_resource_reset_state_pci). Hi Daniel P. Smith, could you please give me some suggestions? Just to check the XSM_PRIV action? > > (adding the XSM maintainer for feedback). > >> + >> +pcidevs_lock(); >> + >> +pdev = pci_get_pdev(NULL, PCI_SBDF(seg, bus, devfn)); >> +if ( !pdev ) >> +goto error; >> + >> +ret = vpci_reset_device_state(pdev); >> +if (ret) >> +printk(XENLOG_ERR "PCI reset device %pp state failed\n", >> >sbdf); >> + >> +error: >> +pcidevs_unlock(); >> + >> +return ret; >> +} >> + >> /* Caller should hold the pcidevs_lock */ >> static int deassign_device(struct domain *d, uint16_t seg, uint8_t bus, >> uint8_t devfn) >> diff --git a/xen/drivers/pci/physdev.c b/xen/drivers/pci/physdev.c >> index 42db3e6d13..cfdb545654 100644 >> --- a/xen/drivers/pci/physdev.c >> +++ b/xen/drivers/pci/physdev.c >> @@ -67,6 +67,20 @@ ret_t pci_physdev_op(int cmd, >> XEN_GUEST_HANDLE_PARAM(void) arg) >> break; >> } >> >> +case PHYSDEVOP_pci_device_state_reset: { >> +struct physdev_pci_device dev; >> + >> +if ( !is_pci_passthrough_enabled() ) >> +return -EOPNOTSUPP; >> + >> +ret = -EFAULT; >> +if ( copy_from_guest(, arg, 1) != 0 ) >> +break; >> + >> +ret = pci_reset_device_state(dev.seg, dev.bus, dev.devfn); >> +break; >> +} >> + >> default: >> ret = -ENOSYS; >> break; >> diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c >> index 3bec9a4153..e9c3eb1cd6 100644 >> --- a/xen/drivers/vpci/vpci.c >> +++ b/xen/drivers/vpci/vpci.c >> @@ -103,6 +103,15 @@ int vpci_add_handlers(struct pci_dev *pdev) >> >> return rc; >> } >> + >> +int vpci_reset_device_state(struct pci_dev *pdev) >> +{ >> +ASSERT(pcidevs_locked()); >> + >> +vpci_remove_device(pdev); >> +return vpci_add_handlers(pdev); >> +} >> + >> #endif /* __XEN__ */ >> >> static int vpci_register_cmp(const struct vpci_register *r1, >> diff --git a/xen/include/public/physdev.h b/xen/include/public/physdev.h >> index f0c0d4727c..4156948903 100644 >> --- a/xen/include/public/physdev.h >> +++ b/xen/include/public/physdev.h >> @@ -305,6 +305,8 @@
RE: [PATCH v12 02/37] x86/opcode: Add the WRMSRNS instruction to the x86 opcode map
> Add the opcode used by WRMSRNS, which is the non-serializing version of > WRMSR and may replace it to improve performance, to the x86 opcode map. > > Tested-by: Shan Kang > Signed-off-by: Xin Li > Acked-by: Masami Hiramatsu (Google) Hi Masami, Boris prefers to merge the first 3 patches into one: https://lore.kernel.org/lkml/20231108123647.GBZUuA31zntox0W0gu@fat_crate.local/#t So I'm thinking could you please also ack the other 2 patches (1st and 3rd of this patch set)? Thus I can keep your acked-by Thanks! Xin > --- > arch/x86/lib/x86-opcode-map.txt | 2 +- > tools/arch/x86/lib/x86-opcode-map.txt | 2 +- > 2 files changed, 2 insertions(+), 2 deletions(-)
Re: [RFC XEN PATCH v2 3/3] tools: Add new function to get gsi from irq
n Wed, 29 Nov 2023, Stefano Stabellini wrote: > On Tue, 28 Nov 2023, Roger Pau Monné wrote: > > On Fri, Nov 24, 2023 at 06:41:36PM +0800, Jiqian Chen wrote: > > > In PVH dom0, it uses the linux local interrupt mechanism, > > > when it allocs irq for a gsi, it is dynamic, and follow > > > the principle of applying first, distributing first. And > > > if you debug the kernel codes, you will find the irq > > > number is alloced from small to large, but the applying > > > gsi number is not, may gsi 38 comes before gsi 28, that > > > causes the irq number is not equal with the gsi number. > > > And when we passthrough a device, QEMU will use its gsi > > > number to do mapping actions, see xen_pt_realize-> > > > xc_physdev_map_pirq, but the gsi number is got from file > > > /sys/bus/pci/devices/:xx:xx.x/irq in current code, > > > so it will fail when mapping. > > > And in current codes, there is no method to translate > > > irq to gsi for userspace. > > > > I think it would be cleaner to just introduce a new sysfs node that > > contains the gsi if a device is using one (much like the irq sysfs > > node)? > > > > Such ioctl to translate from IRQ to GSI has nothing to do with Xen, so > > placing it in privcmd does seem quite strange to me. I understand > > that for passthrough we need the GSI, but such translation layer from > > IRQ to GSI is all Linux internal, and it would be much simpler to just > > expose the GSI in sysfs IMO. > > Maybe something to add to drivers/xen/sys-hypervisor.c in Linux. > Juergen what do you think? Let me also add that privcmd.c is already a Linux specific interface. Although it was born to be a Xen hypercall "proxy" in reality today we have a few privcmd ioctls that don't translate into hypercalls. So I don't think that adding IOCTL_PRIVCMD_GSI_FROM_IRQ would be a problem.
Re: [RFC KERNEL PATCH v2 2/3] xen/pvh: Unmask irq for passthrough device in PVH dom0
On Fri, 24 Nov 2023, Jiqian Chen wrote: > This patch is to solve two problems we encountered when we try to > passthrough a device to hvm domU base on Xen PVH dom0. > > First, hvm guest will alloc a pirq and irq for a passthrough device > by using gsi, before that, the gsi must first has a mapping in dom0, > see Xen code pci_add_dm_done->xc_domain_irq_permission, it will call > into Xen and check whether dom0 has the mapping. See > XEN_DOMCTL_irq_permission->pirq_access_permitted, "current" is PVH > dom0 and it return irq is 0, and then return -EPERM. > This is because the passthrough device doesn't do PHYSDEVOP_map_pirq > when thay are enabled. > > Second, in PVH dom0, the gsi of a passthrough device doesn't get > registered, but gsi must be configured for it to be able to be > mapped into a domU. > > After searching codes, we can find map_pirq and register_gsi will be > done in function vioapic_write_redirent->vioapic_hwdom_map_gsi when > the gsi(aka ioapic's pin) is unmasked in PVH dom0. So the problems > can be conclude to that the gsi of a passthrough device doesn't be > unmasked. > > To solve the unmaske problem, this patch call the unmask_irq when we > assign a device to be passthrough. So that the gsi can get registered > and mapped in PVH dom0. Roger, this seems to be more of a Xen issue than a Linux issue. Why do we need the unmask check in Xen? Couldn't we just do: diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c index 4e40d3609a..df262a4a18 100644 --- a/xen/arch/x86/hvm/vioapic.c +++ b/xen/arch/x86/hvm/vioapic.c @@ -287,7 +287,7 @@ static void vioapic_write_redirent( hvm_dpci_eoi(d, gsi); } -if ( is_hardware_domain(d) && unmasked ) +if ( is_hardware_domain(d) ) { /* * NB: don't call vioapic_hwdom_map_gsi while holding hvm.irq_lock
Re: [RFC KERNEL PATCH v2 1/3] xen/pci: Add xen_reset_device_state function
On Fri, 24 Nov 2023, Jiqian Chen wrote: > When device on dom0 side has been reset, the vpci on Xen side > won't get notification, so that the cached state in vpci is > all out of date with the real device state. > To solve that problem, this patch add a function to clear all > vpci device state when device is reset on dom0 side. > > And call that function in pcistub_init_device. Because when > we use "pci-assignable-add" to assign a passthrough device in > Xen, it will reset passthrough device and the vpci state will > out of date, and then device will fail to restore bar state. > > Signed-off-by: Jiqian Chen > Signed-off-by: Huang Rui > --- > drivers/xen/pci.c | 12 > drivers/xen/xen-pciback/pci_stub.c | 3 +++ > include/xen/interface/physdev.h| 2 ++ > include/xen/pci.h | 6 ++ > 4 files changed, 23 insertions(+) > > diff --git a/drivers/xen/pci.c b/drivers/xen/pci.c > index 72d4e3f193af..e9b30bc09139 100644 > --- a/drivers/xen/pci.c > +++ b/drivers/xen/pci.c > @@ -177,6 +177,18 @@ static int xen_remove_device(struct device *dev) > return r; > } > > +int xen_reset_device_state(const struct pci_dev *dev) > +{ > + struct physdev_pci_device device = { > + .seg = pci_domain_nr(dev->bus), > + .bus = dev->bus->number, > + .devfn = dev->devfn > + }; > + > + return HYPERVISOR_physdev_op(PHYSDEVOP_pci_device_state_reset, ); > +} > +EXPORT_SYMBOL_GPL(xen_reset_device_state); > + > static int xen_pci_notifier(struct notifier_block *nb, > unsigned long action, void *data) > { > diff --git a/drivers/xen/xen-pciback/pci_stub.c > b/drivers/xen/xen-pciback/pci_stub.c > index e34b623e4b41..5a96b6c66c07 100644 > --- a/drivers/xen/xen-pciback/pci_stub.c > +++ b/drivers/xen/xen-pciback/pci_stub.c > @@ -421,6 +421,9 @@ static int pcistub_init_device(struct pci_dev *dev) > else { > dev_dbg(>dev, "resetting (FLR, D3, etc) the device\n"); > __pci_reset_function_locked(dev); > + err = xen_reset_device_state(dev); > + if (err) > + goto config_release; Older versions of Xen won't have the hypercall PHYSDEVOP_pci_device_state_reset implemented. I think we should do something like: if (err && xen_pvh_domain()) goto config_release; Or even: if (xen_pvh_domain()) { err = xen_reset_device_state(dev); if (err) goto config_release; } depending on whether we want to call xen_reset_device_state also for PV guests or not. I am assuming we don't want to error out on failure such as -ENOENT for PV guests. > pci_restore_state(dev); > } > /* Now disable the device (this also ensures some private device > diff --git a/include/xen/interface/physdev.h b/include/xen/interface/physdev.h > index a237af867873..231526f80f6c 100644 > --- a/include/xen/interface/physdev.h > +++ b/include/xen/interface/physdev.h > @@ -263,6 +263,8 @@ struct physdev_pci_device { > uint8_t devfn; > }; > > +#define PHYSDEVOP_pci_device_state_reset 32 > + > #define PHYSDEVOP_DBGP_RESET_PREPARE1 > #define PHYSDEVOP_DBGP_RESET_DONE 2 > > diff --git a/include/xen/pci.h b/include/xen/pci.h > index b8337cf85fd1..b2e2e856efd6 100644 > --- a/include/xen/pci.h > +++ b/include/xen/pci.h > @@ -4,10 +4,16 @@ > #define __XEN_PCI_H__ > > #if defined(CONFIG_XEN_DOM0) > +int xen_reset_device_state(const struct pci_dev *dev); > int xen_find_device_domain_owner(struct pci_dev *dev); > int xen_register_device_domain_owner(struct pci_dev *dev, uint16_t domain); > int xen_unregister_device_domain_owner(struct pci_dev *dev); > #else > +static inline int xen_reset_device_state(const struct pci_dev *dev) > +{ > + return -1; > +} > + > static inline int xen_find_device_domain_owner(struct pci_dev *dev) > { > return -1; > -- > 2.34.1 >
Re: [RFC XEN PATCH v2 3/3] tools: Add new function to get gsi from irq
On Tue, 28 Nov 2023, Roger Pau Monné wrote: > On Fri, Nov 24, 2023 at 06:41:36PM +0800, Jiqian Chen wrote: > > In PVH dom0, it uses the linux local interrupt mechanism, > > when it allocs irq for a gsi, it is dynamic, and follow > > the principle of applying first, distributing first. And > > if you debug the kernel codes, you will find the irq > > number is alloced from small to large, but the applying > > gsi number is not, may gsi 38 comes before gsi 28, that > > causes the irq number is not equal with the gsi number. > > And when we passthrough a device, QEMU will use its gsi > > number to do mapping actions, see xen_pt_realize-> > > xc_physdev_map_pirq, but the gsi number is got from file > > /sys/bus/pci/devices/:xx:xx.x/irq in current code, > > so it will fail when mapping. > > And in current codes, there is no method to translate > > irq to gsi for userspace. > > I think it would be cleaner to just introduce a new sysfs node that > contains the gsi if a device is using one (much like the irq sysfs > node)? > > Such ioctl to translate from IRQ to GSI has nothing to do with Xen, so > placing it in privcmd does seem quite strange to me. I understand > that for passthrough we need the GSI, but such translation layer from > IRQ to GSI is all Linux internal, and it would be much simpler to just > expose the GSI in sysfs IMO. Maybe something to add to drivers/xen/sys-hypervisor.c in Linux. Juergen what do you think?
Re: [PATCH v4 4/6] automation: switch to a wifi card on ADL system
On Fri, 24 Nov 2023, Marek Marczykowski-Górecki wrote: > Switch to a wifi card that has registers on a MSI-X page. This tests the > "x86/hvm: Allow writes to registers on the same page as MSI-X table" > feature. Switch it only for HVM test, because MSI-X adjacent write is > not supported on PV. > > This requires also including drivers and firmware in system for tests. > Remove firmware unrelated to the test, to not increase initrd size too > much (all firmware takes over 100MB compressed). > And finally adjusts test script to handle not only eth0 as a test device, > but also wlan0 and connect it to the wifi network. > > Signed-off-by: Marek Marczykowski-Górecki Reviewed-by: Stefano Stabellini > --- > This needs two new gitlab variables: WIFI_HW2_SSID and WIFI_HW2_PSK. I'll > provide them in private. > > This change requires rebuilding test containers. > > This can be applied only after QEMU change is committed. Otherwise the > test will fail. > --- > automation/gitlab-ci/test.yaml | 4 > automation/scripts/qubes-x86-64.sh | 7 +++ > automation/tests-artifacts/alpine/3.18.dockerfile | 7 +++ > automation/tests-artifacts/kernel/6.1.19.dockerfile | 2 ++ > 4 files changed, 20 insertions(+) > > diff --git a/automation/gitlab-ci/test.yaml b/automation/gitlab-ci/test.yaml > index 6aabdb9d156f..931a8fb28e1d 100644 > --- a/automation/gitlab-ci/test.yaml > +++ b/automation/gitlab-ci/test.yaml > @@ -195,6 +195,10 @@ adl-pci-pv-x86-64-gcc-debug: > > adl-pci-hvm-x86-64-gcc-debug: >extends: .adl-x86-64 > + variables: > +PCIDEV: "00:14.3" > +WIFI_SSID: "$WIFI_HW2_SSID" > +WIFI_PSK: "$WIFI_HW2_PSK" >script: > - ./automation/scripts/qubes-x86-64.sh pci-hvm 2>&1 | tee ${LOGFILE} >needs: > diff --git a/automation/scripts/qubes-x86-64.sh > b/automation/scripts/qubes-x86-64.sh > index 7eabc1bd6ad4..60498ef1e89a 100755 > --- a/automation/scripts/qubes-x86-64.sh > +++ b/automation/scripts/qubes-x86-64.sh > @@ -94,6 +94,13 @@ on_reboot = "destroy" > domU_check=" > set -x -e > interface=eth0 > +if [ -e /sys/class/net/wlan0 ]; then > +interface=wlan0 > +set +x > +wpa_passphrase "$WIFI_SSID" "$WIFI_PSK" > /etc/wpa_supplicant.conf > +set -x > +wpa_supplicant -B -iwlan0 -c /etc/wpa_supplicant.conf > +fi > ip link set \"\$interface\" up > timeout 30s udhcpc -i \"\$interface\" > pingip=\$(ip -o -4 r show default|cut -f 3 -d ' ') > diff --git a/automation/tests-artifacts/alpine/3.18.dockerfile > b/automation/tests-artifacts/alpine/3.18.dockerfile > index f1b4a8b7a191..b821a291fed3 100644 > --- a/automation/tests-artifacts/alpine/3.18.dockerfile > +++ b/automation/tests-artifacts/alpine/3.18.dockerfile > @@ -34,6 +34,13 @@ RUN \ >apk add curl && \ >apk add udev && \ >apk add pciutils && \ > + apk add wpa_supplicant && \ > + # Select firmware for hardware tests > + apk add linux-firmware-other && \ > + mkdir /lib/firmware-preserve && \ > + mv /lib/firmware/iwlwifi-so-a0-gf-a0* /lib/firmware-preserve/ && \ > + rm -rf /lib/firmware && \ > + mv /lib/firmware-preserve /lib/firmware && \ >\ ># Xen >cd / && \ > diff --git a/automation/tests-artifacts/kernel/6.1.19.dockerfile > b/automation/tests-artifacts/kernel/6.1.19.dockerfile > index 3a4096780d20..84ed5dff23ae 100644 > --- a/automation/tests-artifacts/kernel/6.1.19.dockerfile > +++ b/automation/tests-artifacts/kernel/6.1.19.dockerfile > @@ -32,6 +32,8 @@ RUN curl -fsSLO > https://cdn.kernel.org/pub/linux/kernel/v6.x/linux-"$LINUX_VERSI > make xen.config && \ > scripts/config --enable BRIDGE && \ > scripts/config --enable IGC && \ > +scripts/config --enable IWLWIFI && \ > +scripts/config --enable IWLMVM && \ > cp .config .config.orig && \ > cat .config.orig | grep XEN | grep =m |sed 's/=m/=y/g' >> .config && \ > make -j$(nproc) bzImage && \ > -- > git-series 0.9.1 >
Re: [PATCH v4 3/6] automation: prevent QEMU access to /dev/mem in PCI passthrough tests
On Fri, 24 Nov 2023, Marek Marczykowski-Górecki wrote: > /dev/mem access doesn't work in dom0 in lockdown and in stubdomain. > Simulate this environment with removing /dev/mem device node. Full test > for lockdown and stubdomain will come later, when all requirements will > be in place. > > Signed-off-by: Marek Marczykowski-Górecki Nice! I was going to suggest to do the same for other PCI Passthrough tests but this is the only one I believe? Acked-by: Stefano Stabellini > --- > This can be applied only after QEMU change is committed. Otherwise the > test will fail. > --- > automation/scripts/qubes-x86-64.sh | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/automation/scripts/qubes-x86-64.sh > b/automation/scripts/qubes-x86-64.sh > index d81ed7b931cf..7eabc1bd6ad4 100755 > --- a/automation/scripts/qubes-x86-64.sh > +++ b/automation/scripts/qubes-x86-64.sh > @@ -163,6 +163,8 @@ ifconfig eth0 up > ifconfig xenbr0 up > ifconfig xenbr0 192.168.0.1 > > +# ensure QEMU wont have access /dev/mem > +rm -f /dev/mem > # get domU console content into test log > tail -F /var/log/xen/console/guest-domU.log 2>/dev/null | sed -e > \"s/^/(domU) /\" & > xl create /etc/xen/domU.cfg > -- > git-series 0.9.1 >
Re: [XEN PATCH v2] automation/eclair: improve scheduled analyses
On Thu, 23 Nov 2023, Simone Ballarin wrote: > The scheduled analyses are intended to maintain an overall vision > of the MISRA complaince of the entire project. For this reason, > the file exclusions in "out_of_scope.ecl" should not be applied. > > This patch amends ECLAIR settings to prevent exempting files for > scheduled analyses. > > Signed-off-by: Simone Ballarin > Reviewed-by: Stefano Stabellini > > --- > Changes in v2: > - drop changes to inhibit test and build stages in scheduled pipelines. > --- > automation/eclair_analysis/ECLAIR/action.settings | 2 +- > automation/eclair_analysis/ECLAIR/analysis.ecl| 12 ++-- > automation/gitlab-ci/analyze.yaml | 2 ++ > 3 files changed, 13 insertions(+), 3 deletions(-) > > diff --git a/automation/eclair_analysis/ECLAIR/action.settings > b/automation/eclair_analysis/ECLAIR/action.settings > index f96368ffc7..3cba1a3afb 100644 > --- a/automation/eclair_analysis/ECLAIR/action.settings > +++ b/automation/eclair_analysis/ECLAIR/action.settings > @@ -134,7 +134,7 @@ push) > badgeLabel="ECLAIR ${ANALYSIS_KIND} ${ref}${variantHeadline} #${jobId}" > ;; > auto_pull_request) > -git remote remove autoPRRemote || true > +git remote remove autoPRRemote 2>/dev/null || true > git remote add autoPRRemote "${autoPRRemoteUrl}" > git fetch -q autoPRRemote > subDir="${ref}" > diff --git a/automation/eclair_analysis/ECLAIR/analysis.ecl > b/automation/eclair_analysis/ECLAIR/analysis.ecl > index fe418d6da1..2507a8e787 100644 > --- a/automation/eclair_analysis/ECLAIR/analysis.ecl > +++ b/automation/eclair_analysis/ECLAIR/analysis.ecl > @@ -2,7 +2,13 @@ > -project_name=getenv("ECLAIR_PROJECT_NAME") > -project_root=getenv("ECLAIR_PROJECT_ROOT") > > --setq=data_dir,getenv("ECLAIR_DATA_DIR") > +setq(data_dir,getenv("ECLAIR_DATA_DIR")) > +setq(analysis_kind,getenv("ANALYSIS_KIND")) > +setq(scheduled_analysis,nil) > + > +strings_map("scheduled-analysis",500,"","^.*scheduled$",0,setq(scheduled_analysis,t)) > +strings_map("scheduled-analysis",500,"","^.*$",0) > +map_strings("scheduled-analysis",analysis_kind) > > -verbose > > @@ -15,7 +21,9 @@ > > -eval_file=toolchain.ecl > -eval_file=public_APIs.ecl > --eval_file=out_of_scope.ecl > +if(scheduled_analysis, > +eval_file("out_of_scope.ecl") > +) Overall the patch looks much better. Only one question: shouldn't it be if not scheduled_analysis instead of: if scheduled_analysis I don't know the language of analysis.ecl but we are supposed to add out_of_scope.ecl in all cases except for scheduled_analysis. Looking at this change it seems to do the opposite? > -eval_file=deviations.ecl > -eval_file=call_properties.ecl > -eval_file=tagging.ecl > diff --git a/automation/gitlab-ci/analyze.yaml > b/automation/gitlab-ci/analyze.yaml > index bd9a68de31..6631db53fa 100644 > --- a/automation/gitlab-ci/analyze.yaml > +++ b/automation/gitlab-ci/analyze.yaml > @@ -28,6 +28,8 @@ >extends: .eclair-analysis >allow_failure: true >rules: > +- if: $CI_PIPELINE_SOURCE == "schedule" > + when: never > - if: $WTOKEN && $CI_PROJECT_PATH =~ /^xen-project\/people\/.*$/ >when: manual > - !reference [.eclair-analysis, rules] > -- > 2.34.1 >
Re: [PATCH v2 5/5] automation: add x86-64 livepatching test
On Tue, 28 Nov 2023, Roger Pau Monne wrote: > Introduce a new gitlab tests for livepatching, using livepatch-build-tools, > which better reflects how downstreams build live patches rather than the > in-tree tests. > > The tests applies the dummy in-tree patch example, checks that the patch is > applied correctly and then reverts and unloads it. > > Signed-off-by: Roger Pau Monné > --- > automation/gitlab-ci/build.yaml | 8 +++ > automation/gitlab-ci/test.yaml| 8 +++ > automation/scripts/build | 21 ++ > .../scripts/qemu-alpine-x86_64-livepatch.sh | 68 +++ > 4 files changed, 105 insertions(+) > create mode 100755 automation/scripts/qemu-alpine-x86_64-livepatch.sh > > diff --git a/automation/gitlab-ci/build.yaml b/automation/gitlab-ci/build.yaml > index 32af30ccedc9..22026df51b87 100644 > --- a/automation/gitlab-ci/build.yaml > +++ b/automation/gitlab-ci/build.yaml > @@ -358,6 +358,14 @@ alpine-3.18-gcc-debug: >variables: > CONTAINER: alpine:3.18 > > +alpine-3.18-gcc-livepatch: > + extends: .gcc-x86-64-build > + variables: > +CONTAINER: alpine:3.18 > +LIVEPATCH: y > +EXTRA_XEN_CONFIG: | > + CONFIG_LIVEPATCH=y > + > debian-stretch-gcc-debug: >extends: .gcc-x86-64-build-debug >variables: > diff --git a/automation/gitlab-ci/test.yaml b/automation/gitlab-ci/test.yaml > index 6aabdb9d156f..58a90be5ed0e 100644 > --- a/automation/gitlab-ci/test.yaml > +++ b/automation/gitlab-ci/test.yaml > @@ -459,3 +459,11 @@ qemu-smoke-ppc64le-powernv9-gcc: >needs: > - qemu-system-ppc64-8.1.0-ppc64-export > - debian-bullseye-gcc-ppc64le-debug > + > +qemu-alpine-x86_64-gcc-livepatch: > + extends: .qemu-x86-64 > + script: > +- ./automation/scripts/qemu-alpine-x86_64-livepatch.sh 2>&1 | tee > ${LOGFILE} > + needs: > +- *x86-64-test-needs > +- alpine-3.18-gcc-livepatch > diff --git a/automation/scripts/build b/automation/scripts/build > index b3c71fb6fb60..0a0a6dceb08c 100755 > --- a/automation/scripts/build > +++ b/automation/scripts/build > @@ -103,3 +103,24 @@ else > cp -r dist binaries/ > if [[ -f xen/xen ]] ; then cp xen/xen binaries/xen; fi > fi > + > +if [[ "$LIVEPATCH" == "y" ]]; then > +# Build a test livepatch using livepatch-build-tools. > + > +if [[ "$XEN_TARGET_ARCH" != "x86_64" ]]; then > +exit 1 > +fi > + > +# git diff --no-index returns 0 if no differences, otherwise 1. > +git diff --no-index --output=test.patch xen/arch/x86/test-smc-lp.c \ > +xen/arch/x86/test-smc-lp-alt.c > && exit 1 > + > +BUILDID=$(readelf -Wn xen/xen-syms | sed -n -e 's/^.*Build ID: //p') > + > +git clone https://xenbits.xen.org/git-http/livepatch-build-tools.git > +cd livepatch-build-tools > +make > +./livepatch-build -s ../ -p ../test.patch -o out -c ../xen/.config \ > +--depends $BUILDID --xen-depends $BUILDID > +cp out/test.livepatch ../binaries/test.livepatch > +fi I realize this is a matter of taste but if possible I would move this to qemu-alpine-x86_64-livepatch.sh not to make the build script too complex. Otherwise, plase create automation/scripts/livepatch and move this code there. You can call automation/scripts/livepatch from automation/scripts/build. Other than that, this is great! I'll let other review the livepatch specific changes in this series > diff --git a/automation/scripts/qemu-alpine-x86_64-livepatch.sh > b/automation/scripts/qemu-alpine-x86_64-livepatch.sh > new file mode 100755 > index ..da478cac4376 > --- /dev/null > +++ b/automation/scripts/qemu-alpine-x86_64-livepatch.sh > @@ -0,0 +1,68 @@ > +#!/bin/bash > + > +set -ex > + > +cd binaries > +# initrd.tar.gz is Dom0 rootfs > +mkdir -p rootfs > +cd rootfs > +tar xvzf ../initrd.tar.gz > +mkdir proc > +mkdir run > +mkdir srv > +mkdir sys > +rm var/run > +cp -ar ../dist/install/* . > +cp ../test.livepatch ./root/ > +cat << "EOF" >> etc/local.d/xen-lp.start > +#!/bin/bash > + > +set -ex > + > +trap poweroff EXIT > + > +export LD_LIBRARY_PATH=/usr/local/lib > + > +xen-livepatch test && exit 1 || true > + > +xen-livepatch load /root/test.livepatch > + > +# Cannot fail now > +xen-livepatch test > + > +xen-livepatch revert test > +xen-livepatch unload test > + > +xen-livepatch test && exit 1 || true > + > +echo "SUCCESS" > +EOF > +chmod +x etc/local.d/xen-lp.start > +echo "rc_verbose=yes" >> etc/rc.conf > +# rebuild Dom0 rootfs > +find . |cpio -H newc -o|gzip > ../xen-rootfs.cpio.gz > +cd ../.. > + > +cat >> binaries/pxelinux.0 << EOF > +#!ipxe > + > +kernel xen console=com1 console_timestamps=boot > +module bzImage console=hvc0 > +module xen-rootfs.cpio.gz > +boot > +EOF > + > +# Run the test > +rm -f smoke.serial > +timeout -k 1 360 \ > +qemu-system-x86_64 \ > +-cpu qemu64,+svm \ > +-m 2G -smp 2 \ > +-monitor none -serial stdio \ > +-nographic \ > +-device
Re: [PATCH v2 3/5] xen/x86: introduce self modifying code test
On Tue, 28 Nov 2023, Roger Pau Monne wrote: > Introduce a helper to perform checks related to self modifying code, and start > by creating a simple test to check that alternatives have been applied. > > Such test is hooked into the boot process and called just after alternatives > have been applied. In case of failure a message is printed, and the > hypervisor > is tainted as not having passed the tests, this does require introducing a new > taint bit (printed as 'A'). > > A new sysctl is also introduced to run the tests on demand. While there are > no > current users introduced here, further changes will introduce those, and it's > helpful to have the interface defined in the sysctl header from the start. > > Signed-off-by: Roger Pau Monné > --- > Changes since v1: > - Rework test and interface. > --- > tools/include/xenctrl.h | 2 + > tools/libs/ctrl/xc_misc.c | 14 ++ > xen/arch/x86/Makefile | 1 + > xen/arch/x86/include/asm/test-smc.h | 18 > xen/arch/x86/setup.c| 3 ++ > xen/arch/x86/sysctl.c | 7 +++ > xen/arch/x86/test-smc.c | 68 + If possible, could we name this differently? I am asking because of these: https://documentation-service.arm.com/static/5f8ea482f86e16515cdbe3c6?token= https://gitlab.com/xen-project/xen/-/blob/staging/xen/arch/arm/vsmc.c?ref_type=heads > xen/common/kernel.c | 5 ++- > xen/include/public/sysctl.h | 9 > xen/include/xen/lib.h | 1 + > 10 files changed, 126 insertions(+), 2 deletions(-) > create mode 100644 xen/arch/x86/include/asm/test-smc.h > create mode 100644 xen/arch/x86/test-smc.c > > diff --git a/tools/include/xenctrl.h b/tools/include/xenctrl.h > index 2ef8b4e05422..0f87ffa4affd 100644 > --- a/tools/include/xenctrl.h > +++ b/tools/include/xenctrl.h > @@ -2658,6 +2658,8 @@ int xc_dt_overlay(xc_interface *xch, void *overlay_fdt, >uint32_t overlay_fdt_size, uint8_t overlay_op); > #endif > > +int xc_test_smc(xc_interface *xch, uint32_t tests, uint32_t *result); > + > /* Compat shims */ > #include "xenctrl_compat.h" > > diff --git a/tools/libs/ctrl/xc_misc.c b/tools/libs/ctrl/xc_misc.c > index 5ecdfa2c7934..7f7ece589cc2 100644 > --- a/tools/libs/ctrl/xc_misc.c > +++ b/tools/libs/ctrl/xc_misc.c > @@ -1021,6 +1021,20 @@ int xc_livepatch_replace(xc_interface *xch, char > *name, uint32_t timeout, uint32 > return _xc_livepatch_action(xch, name, LIVEPATCH_ACTION_REPLACE, > timeout, flags); > } > > +int xc_test_smc(xc_interface *xch, uint32_t tests, uint32_t *result) > +{ > +struct xen_sysctl sysctl = { > +.cmd = XEN_SYSCTL_test_smc, > +.u.smc.tests = tests, > +}; > +int rc = do_sysctl(xch, ); > + > +if ( !rc ) > +*result = sysctl.u.smc.results; > + > +return rc; > +} > + > /* > * Local variables: > * mode: C > diff --git a/xen/arch/x86/Makefile b/xen/arch/x86/Makefile > index f629157086d0..bdd2183a2fd7 100644 > --- a/xen/arch/x86/Makefile > +++ b/xen/arch/x86/Makefile > @@ -65,6 +65,7 @@ obj-y += smpboot.o > obj-y += spec_ctrl.o > obj-y += srat.o > obj-y += string.o > +obj-y += test-smc.o > obj-y += time.o > obj-y += traps.o > obj-y += tsx.o > diff --git a/xen/arch/x86/include/asm/test-smc.h > b/xen/arch/x86/include/asm/test-smc.h > new file mode 100644 > index ..18b23dbdbf2d > --- /dev/null > +++ b/xen/arch/x86/include/asm/test-smc.h > @@ -0,0 +1,18 @@ > +#ifndef _ASM_X86_TEST_SMC_H_ > +#define _ASM_X86_TEST_SMC_H_ > + > +#include > + > +int test_smc(uint32_t selection, uint32_t *results); > + > +#endif /* _ASM_X86_TEST_SMC_H_ */ > + > +/* > + * Local variables: > + * mode: C > + * c-file-style: "BSD" > + * c-basic-offset: 4 > + * tab-width: 4 > + * indent-tabs-mode: nil > + * End: > + */ > diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c > index f6b8a3efd752..1f90d30204fe 100644 > --- a/xen/arch/x86/setup.c > +++ b/xen/arch/x86/setup.c > @@ -58,6 +58,7 @@ > #include > #include > #include > +#include > > /* opt_nosmp: If true, secondary processors are ignored. */ > static bool __initdata opt_nosmp; > @@ -1952,6 +1953,8 @@ void __init noreturn __start_xen(unsigned long mbi_p) > > alternative_branches(); > > +test_smc(XEN_SYSCTL_TEST_SMC_ALL, NULL); > + > /* > * NB: when running as a PV shim VCPUOP_up/down is wired to the shim > * physical cpu_add/remove functions, so launch the guest with only > diff --git a/xen/arch/x86/sysctl.c b/xen/arch/x86/sysctl.c > index 1d40d82c5ad2..77d091f4bd59 100644 > --- a/xen/arch/x86/sysctl.c > +++ b/xen/arch/x86/sysctl.c > @@ -25,6 +25,7 @@ > #include > #include > #include > +#include > #include > #include > #include > @@ -423,6 +424,12 @@ long arch_do_sysctl( > break; > } > > +case XEN_SYSCTL_test_smc: > +ret = test_smc(sysctl->u.smc.tests, >u.smc.results); >
Re: [PATCH v2 2/5] automation/alpine: add elfutils-dev
On Tue, 28 Nov 2023, Roger Pau Monne wrote: > In preparation for adding some livepatch-build-tools test update the Alpine > container to also install elfutils-dev. > > Signed-off-by: Roger Pau Monné Acked-by: Stefano Stabellini > --- > automation/build/alpine/3.18.dockerfile | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/automation/build/alpine/3.18.dockerfile > b/automation/build/alpine/3.18.dockerfile > index 4ae9cb5e9e30..aac2d8cc82d9 100644 > --- a/automation/build/alpine/3.18.dockerfile > +++ b/automation/build/alpine/3.18.dockerfile > @@ -47,3 +47,5 @@ RUN apk --no-cache add \ >libcap-ng-dev \ >ninja \ >pixman-dev \ > + # livepatch-tools deps > + elfutils-dev > -- > 2.43.0 >
Re: [PATCH v6 4/5] [FUTURE] xen/arm: enable vPCI for domUs
On 11/14/23 04:13, Jan Beulich wrote: > On 13.11.2023 23:21, Stewart Hildebrand wrote: >> --- a/xen/arch/x86/include/asm/domain.h >> +++ b/xen/arch/x86/include/asm/domain.h >> @@ -503,6 +503,8 @@ struct arch_domain >> #define has_vpit(d)(!!((d)->arch.emulation_flags & X86_EMU_PIT)) >> #define has_pirq(d)(!!((d)->arch.emulation_flags & >> X86_EMU_USE_PIRQ)) >> >> +#define arch_needs_vpci(d) ({ (void)(d); false; }) > > See my comments on the v5 thread on both this and ... So, the goal here is to return true for a PVH dom0, and false otherwise (for now). Since dom0 can't feasibly be full HVM, and is_hvm_domain(d) returns true for PVH, how about the following? /* TODO: re-visit when vPCI is enabled for PVH domUs. */ #define arch_needs_vpci(d) ({ \ const struct domain *_d = (d); \ is_hardware_domain(_d) && is_hvm_domain(_d); }) Link to v5 thread for reference [1] [1] https://lists.xenproject.org/archives/html/xen-devel/2023-11/msg00968.html > >> --- a/xen/drivers/passthrough/pci.c >> +++ b/xen/drivers/passthrough/pci.c >> @@ -1542,6 +1542,18 @@ void iommu_dev_iotlb_flush_timeout(struct domain *d, >> struct pci_dev *pdev) >> pcidevs_unlock(); >> } >> >> +static bool needs_vpci(const struct domain *d) >> +{ >> +if ( is_hardware_domain(d) ) >> +return false; > > ... this. I'll move this check to the Arm arch_needs_vpci() in xen/arch/arm/include/asm/domain.h > (It is generally a good idea to wait a little with sending new > versions, when you can't be sure yet whether the earlier discussion has > settled.) (Sorry, I'll be better about this going forward.) > > Jan
Re: [PATCH] automation: Switch u-boot boot command to bootz for arm32 tests
On Fri, 24 Nov 2023, Michal Orzel wrote: > Thanks to recent changes added to ImageBuilder to support the bootz > command, which allows obtaining the effective image size (including NOLOAD > sections) from the zImage header, switch the BOOT_CMD for arm32 tests to > bootz. Among other scenarios, this change will enable us, in the future, > to add tests with UBSAN enabled Xen, which would otherwise fail due to > incorrect image placement resulting in overlapping. > > Signed-off-by: Michal Orzel Acked-by: Stefano Stabellini > --- > Before adding UBSAN CI tests, we still need to decide if we want to add > support > to panic on UBSAN epilogue guarded by some config option or to just grep for > UBSAN message. > > CI pipeline: > https://gitlab.com/xen-project/people/morzel/xen-orzelmichal/-/pipelines/1083821754 > --- > automation/scripts/qemu-smoke-dom0-arm32.sh | 2 +- > automation/scripts/qemu-smoke-dom0less-arm32.sh | 2 +- > 2 files changed, 2 insertions(+), 2 deletions(-) > > diff --git a/automation/scripts/qemu-smoke-dom0-arm32.sh > b/automation/scripts/qemu-smoke-dom0-arm32.sh > index a4b487b08055..d91648905669 100755 > --- a/automation/scripts/qemu-smoke-dom0-arm32.sh > +++ b/automation/scripts/qemu-smoke-dom0-arm32.sh > @@ -68,7 +68,7 @@ XEN_CMD="console=dtuart dom0_mem=1024M bootscrub=0 > console_timestamps=boot" > NUM_DOMUS=0 > > LOAD_CMD="tftpb" > -BOOT_CMD="bootm" > +BOOT_CMD="bootz" > UBOOT_SOURCE="boot.source" > UBOOT_SCRIPT="boot.scr"' > config > > diff --git a/automation/scripts/qemu-smoke-dom0less-arm32.sh > b/automation/scripts/qemu-smoke-dom0less-arm32.sh > index 7e3cfbe9c4d0..e31b6b9014e1 100755 > --- a/automation/scripts/qemu-smoke-dom0less-arm32.sh > +++ b/automation/scripts/qemu-smoke-dom0less-arm32.sh > @@ -101,7 +101,7 @@ DOMU_MEM[0]="512" > NUM_DOMUS=1 > > LOAD_CMD="tftpb" > -BOOT_CMD="bootm" > +BOOT_CMD="bootz" > UBOOT_SOURCE="boot.source" > UBOOT_SCRIPT="boot.scr"' > config > > -- > 2.25.1 >
Linux 6.7-rc1+: WARNING at drivers/xen/evtchn.c:167 evtchn_interrupt
Hi, While testing 6.7-rc3 on Qubes OS I found several warning like in the subject in dom0 log. I see them when running 6.7-rc1 too. I'm not sure what exactly triggers the issue, but my guess would be unbinding an event channel from userspace (closing vchan connection). Specific message: [ 83.973377] [ cut here ] [ 83.975523] Interrupt for port 77, but apparently not enabled; per-user a0e9f1d1 [ 83.979083] WARNING: CPU: 1 PID: 2432 at drivers/xen/evtchn.c:167 evtchn_interrupt+0xb6/0xc0 [xen_evtchn] [ 83.982869] Modules linked in: joydev intel_rapl_msr intel_rapl_common snd_hda_codec_generic ledtrig_audio snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi ppdev snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm parport_pc parport pcspkr snd_timer snd e1000e i2c_piix4 soundcore fuse loop xenfs dm_thin_pool dm_persistent_data dm_bio_prison crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni polyval_generic ghash_clmulni_intel sha512_ssse3 sha256_ssse3 sha1_ssse3 virtio_scsi bochs serio_raw drm_vram_helper xhci_pci xhci_pci_renesas floppy drm_ttm_helper xhci_hcd ttm ata_generic pata_acpi qemu_fw_cfg virtio_console xen_privcmd xen_pciback xen_blkback xen_gntalloc xen_gntdev xen_evtchn scsi_dh_rdac scsi_dh_emc scsi_dh_alua uinput dm_multipath [ 84.019753] CPU: 1 PID: 2432 Comm: qrexec-client Not tainted 6.7.0-0.rc3.1.qubes.1.fc37.x86_64 #1 [ 84.027045] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014 [ 84.033828] RIP: e030:evtchn_interrupt+0xb6/0xc0 [xen_evtchn] [ 84.037814] Code: ba 01 00 00 00 be 1d 00 00 00 48 8d bb 88 00 00 00 e8 ce 93 27 c1 eb b4 8b 76 20 48 89 da 48 c7 c7 70 52 21 c0 e8 0a 5c f0 c0 <0f> 0b e9 61 ff ff ff 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 [ 84.048932] RSP: e02b:c90041e7fd60 EFLAGS: 00010082 [ 84.051252] RAX: RBX: 888104fc9a80 RCX: 0027 [ 84.054569] RDX: 88813ff21608 RSI: 0001 RDI: 88813ff21600 [ 84.057606] RBP: 88810522a280 R08: R09: c90041e7fbf8 [ 84.060864] R10: 0003 R11: 82f460c8 R12: 888105d5b6a4 [ 84.064043] R13: 888105d5b760 R14: 88810522a280 R15: 888105d5b600 [ 84.066598] FS: 75ee3eb8ed40() GS:88813ff0() knlGS: [ 84.069251] CS: 1e030 DS: ES: CR0: 80050033 [ 84.071293] CR2: 7b86fa0f0f5c CR3: 000106e18000 CR4: 00040660 [ 84.073911] Call Trace: [ 84.074746] [ 84.075471] ? evtchn_interrupt+0xb6/0xc0 [xen_evtchn] [ 84.077448] ? __warn+0x81/0x130 [ 84.078671] ? evtchn_interrupt+0xb6/0xc0 [xen_evtchn] [ 84.080507] ? report_bug+0x171/0x1a0 [ 84.081830] ? prb_read_valid+0x1b/0x30 [ 84.083682] ? handle_bug+0x41/0x70 [ 84.084952] ? exc_invalid_op+0x17/0x70 [ 84.086314] ? asm_exc_invalid_op+0x1a/0x20 [ 84.088520] ? evtchn_interrupt+0xb6/0xc0 [xen_evtchn] [ 84.090300] __free_irq+0x114/0x2b0 [ 84.091478] free_irq+0x32/0x70 [ 84.092522] unbind_from_irqhandler+0x31/0xb0 [ 84.094035] evtchn_release+0x2b/0xa0 [xen_evtchn] [ 84.095643] __fput+0x9a/0x2c0 [ 84.096734] __x64_sys_close+0x3d/0x80 [ 84.097995] do_syscall_64+0x63/0xe0 [ 84.099188] ? __vm_munmap+0xbb/0x150 [ 84.100431] ? syscall_exit_work+0x103/0x130 [ 84.101854] ? syscall_exit_to_user_mode+0x2b/0x40 [ 84.103509] ? do_syscall_64+0x6f/0xe0 [ 84.104757] ? syscall_exit_to_user_mode+0x2b/0x40 [ 84.106350] ? do_syscall_64+0x6f/0xe0 [ 84.107587] ? exit_to_user_mode_prepare+0xbc/0xd0 [ 84.109180] entry_SYSCALL_64_after_hwframe+0x6e/0x76 [ 84.110863] RIP: 0033:0x75ee3ed15a0a [ 84.111999] Code: 00 00 0f 05 48 3d 00 f0 ff ff 77 44 c3 0f 1f 00 48 83 ec 18 89 7c 24 0c e8 33 bc f8 ff 8b 7c 24 0c 89 c2 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 36 89 d7 89 44 24 0c e8 93 bc f8 ff 8b 44 24 [ 84.118267] RSP: 002b:70ad0f70 EFLAGS: 0293 ORIG_RAX: 0003 [ 84.120756] RAX: ffda RBX: 55d3d8f70970 RCX: 75ee3ed15a0a [ 84.123675] RDX: RSI: RDI: 000f [ 84.126012] RBP: R08: R09: [ 84.128366] R10: 75ee3ec31468 R11: 0293 R12: [ 84.130699] R13: R14: 70ad2eaa R15: 55d3d8f6f2a0 [ 84.133050] [ 84.133810] ---[ end trace ]--- Full log: https://openqa.qubes-os.org/tests/86647/logfile?filename=serial0.txt -- Best Regards, Marek Marczykowski-Górecki Invisible Things Lab signature.asc Description: PGP signature
[linux-5.4 test] 183929: tolerable FAIL - PUSHED
flight 183929 linux-5.4 real [real] http://logs.test-lab.xenproject.org/osstest/logs/183929/ Failures :-/ but no regressions. Tests which are failing intermittently (not blocking): test-arm64-arm64-libvirt-raw 18 guest-start.2fail in 183907 pass in 183929 test-armhf-armhf-libvirt-qcow2 17 guest-start/debian.repeat fail in 183907 pass in 183929 test-armhf-armhf-libvirt 8 xen-boot fail pass in 183907 test-armhf-armhf-xl-multivcpu 14 guest-start fail pass in 183907 test-armhf-armhf-xl-rtds 14 guest-startfail pass in 183907 test-armhf-armhf-xl 19 guest-start.2 fail pass in 183907 test-armhf-armhf-xl-vhd 17 guest-start/debian.repeat fail pass in 183907 Regressions which are regarded as allowable (not blocking): test-armhf-armhf-xl-rtds 18 guest-start/debian.repeat fail in 183907 REGR. vs. 183803 Tests which did not succeed, but are not blocking: test-armhf-armhf-xl-multivcpu 18 guest-start/debian.repeat fail in 183907 like 183797 test-armhf-armhf-libvirt 16 saverestore-support-check fail in 183907 like 183803 test-armhf-armhf-xl-multivcpu 15 migrate-support-check fail in 183907 never pass test-armhf-armhf-xl-multivcpu 16 saverestore-support-check fail in 183907 never pass test-armhf-armhf-xl-rtds15 migrate-support-check fail in 183907 never pass test-armhf-armhf-xl-rtds 16 saverestore-support-check fail in 183907 never pass test-armhf-armhf-libvirt15 migrate-support-check fail in 183907 never pass test-armhf-armhf-xl-credit1 14 guest-start fail like 183797 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 183803 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 183803 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 183803 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 183803 test-armhf-armhf-xl-credit2 18 guest-start/debian.repeatfail like 183803 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 183803 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 183803 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail like 183803 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 183803 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check fail like 183803 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 183803 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 183803 test-amd64-i386-xl-pvshim14 guest-start fail never pass test-amd64-i386-libvirt-xsm 15 migrate-support-checkfail never pass test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail never pass test-amd64-i386-libvirt 15 migrate-support-checkfail never pass test-arm64-arm64-xl 15 migrate-support-checkfail never pass test-arm64-arm64-xl 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit2 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit2 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit1 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit1 16 saverestore-support-checkfail never pass test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail never pass test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass test-armhf-armhf-xl-arndale 15 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 16 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass test-armhf-armhf-xl-credit2 15 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 16 saverestore-support-checkfail never pass test-armhf-armhf-xl 15 migrate-support-checkfail never pass test-armhf-armhf-xl 16 saverestore-support-checkfail never pass test-amd64-i386-libvirt-raw 14 migrate-support-checkfail never pass test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail never pass test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail never pass test-arm64-arm64-xl-vhd 14 migrate-support-checkfail never pass test-arm64-arm64-xl-vhd 15 saverestore-support-checkfail
Re: [PATCH v10 13/17] vpci: add initial support for virtual PCI bus topology
On Wed, 29 Nov 2023, Roger Pau Monné wrote: > On Tue, Nov 28, 2023 at 11:45:34PM +, Volodymyr Babchuk wrote: > > Hi Roger, > > > > Roger Pau Monné writes: > > > > > On Wed, Nov 22, 2023 at 01:18:32PM -0800, Stefano Stabellini wrote: > > >> On Wed, 22 Nov 2023, Roger Pau Monné wrote: > > >> > On Tue, Nov 21, 2023 at 05:12:15PM -0800, Stefano Stabellini wrote: > > >> > > Let me expand on this. Like I wrote above, I think it is important > > >> > > that > > >> > > Xen vPCI is the only in-use PCI Root Complex emulator. If it makes > > >> > > the > > >> > > QEMU implementation easier, it is OK if QEMU emulates an unneeded and > > >> > > unused PCI Root Complex. From Xen point of view, it doesn't exist. > > >> > > > > >> > > In terms if ioreq registration, QEMU calls > > >> > > xendevicemodel_map_pcidev_to_ioreq_server for each PCI BDF it wants > > >> > > to > > >> > > emulate. That way, Xen vPCI knows exactly what PCI config space > > >> > > reads/writes to forward to QEMU. > > >> > > > > >> > > Let's say that: > > >> > > - 00:02.0 is PCI passthrough device > > >> > > - 00:03.0 is a PCI emulated device > > >> > > > > >> > > QEMU would register 00:03.0 and vPCI would know to forward anything > > >> > > related to 00:03.0 to QEMU, but not 00:02.0. > > >> > > > >> > I think there's some work here so that we have a proper hierarchy > > >> > inside of Xen. Right now both ioreq and vpci expect to decode the > > >> > accesses to the PCI config space, and setup (MM)IO handlers to trap > > >> > ECAM, see vpci_ecam_{read,write}(). > > >> > > > >> > I think we want to move to a model where vPCI doesn't setup MMIO traps > > >> > itself, and instead relies on ioreq to do the decoding and forwarding > > >> > of accesses. We need some work in order to represent an internal > > >> > ioreq handler, but that shouldn't be too complicated. IOW: vpci > > >> > should register devices it's handling with ioreq, much like QEMU does. > > >> > > >> I think this could be a good idea. > > >> > > >> This would be the very first IOREQ handler implemented in Xen itself, > > >> rather than outside of Xen. Some code refactoring might be required, > > >> which worries me given that vPCI is at v10 and has been pending for > > >> years. I think it could make sense as a follow-up series, not v11. > > > > > > That's perfectly fine for me, most of the series here just deal with > > > the logic to intercept guest access to the config space and is > > > completely agnostic as to how the accesses are intercepted. > > > > > >> I think this idea would be beneficial if, in the example above, vPCI > > >> doesn't really need to know about device 00:03.0. vPCI registers via > > >> IOREQ the PCI Root Complex and device 00:02.0 only, QEMU registers > > >> 00:03.0, and everything works. vPCI is not involved at all in PCI config > > >> space reads and writes for 00:03.0. If this is the case, then moving > > >> vPCI to IOREQ could be good. > > > > > > Given your description above, with the root complex implemented in > > > vPCI, we would need to mandate vPCI together with ioreqs even if no > > > passthrough devices are using vPCI itself (just for the emulation of > > > the root complex). Which is fine, just wanted to mention the > > > dependency. > > > > > >> On the other hand if vPCI actually needs to know that 00:03.0 exists, > > >> perhaps because it changes something in the PCI Root Complex emulation > > >> or vPCI needs to take some action when PCI config space registers of > > >> 00:03.0 are written to, then I think this model doesn't work well. If > > >> this is the case, then I think it would be best to keep vPCI as MMIO > > >> handler and let vPCI forward to IOREQ when appropriate. > > > > > > At first approximation I don't think we would have such interactions, > > > otherwise the whole premise of ioreq being able to register individual > > > PCI devices would be broken. > > > > > > XenSever already has scenarios with two different user-space emulators > > > (ie: two different ioreq servers) handling accesses to different > > > devices in the same PCI bus, and there's no interaction with the root > > > complex required. Good to hear > > Out of curiosity: how legacy PCI interrupts are handled in this case? In > > my understanding, it is Root Complex's responsibility to propagate > > correct IRQ levels to an interrupt controller? > > I'm unsure whether my understanding of the question is correct, so my > reply might not be what you are asking for, sorry. > > Legacy IRQs (GSI on x86) are setup directly by the toolstack when the > device is assigned to the guest, using PHYSDEVOP_map_pirq + > XEN_DOMCTL_bind_pt_irq. Those hypercalls bind together a host IO-APIC > pin to a guest IO-APIC pin, so that interrupts originating from that > host IO-APIC pin are always forwarded to the guest an injected as > originating from the guest IO-APIC pin. > > Note that the device will always use the same IO-APIC pin, this is not > configured by
Re: [PATCH v6 00/17] Device tree based NUMA support for Arm
Hi Stefano, Thanks for the email! > On Nov 30, 2023, at 09:06, Stefano Stabellini wrote: > > Hi Henry, > > Thank you for all your work on this series and in general upstreaming > other patches too! It’s our pleasure. > How do we test it? Is there a way to test this series with QEMU and some > special device tree? I am asking because during review it would make > things easier if we could exercise the new code somehow. Emmm, let me add some details. I tested the code using FVP. From my understanding, you only need to tweak the device tree for the memory, cpus node and add a "distance-map” node following the device tree binding [1]. But I understand your pain of lacking the FVP to play with the code (same for the R82 support). For these, we’ve checked internally with the legal experts about the possibility to include the FVP in the upstream’s GitLab CI containers without triggering the redistribution issue. It seem that now all the FVPs referenced on [2], including both the “FVP_Base_RevC-2xAEMvA” and the “FVP_BaseR_AEMv8R”, are licensed under lightweight Eco System EULA that has no restrictions on the redistribution. Therefore I think now we are safe to pack them in the upstream’s containers and use it for the GitLab CI testing. I think we can add a GitLab CI job to test NUMA (and future Arm features), it is just we need to design and do a extendable framework for it. I can work on it in the next version of the NUMA series. What do you think? > I haven't had a chance to review this properly but I noticed that a few > code additions are not protected by CONFIG_NUMA. Maybe they should? I think you are referring to the code in xen/arch/arm/setup.c correct? These parts of the code is designed to work for both NUMA and non-NUMA setups (see the in-code comment on top of them). And actually, the function numa_set_cpu_core_mask() should not be protected by CONFIG_NUMA, as it is fixing an actual "bug" described in the commit message of the patch #15. > Also given that this is not a small series, I wanted to check with you > if this is a good time to do a full review of the series (in the sense > that you are OK with handling review feedback now) or whether you would > rather have us do the review at another time. Thanks for being considerate! I have been planned a task and reserved my bandwidth internally to work on the upstream of this series. Also personally I prioritize the upstream’s comment/feedback in my working schedule. So I think I will handle the review feedback in time, don’t worry :) [1] https://www.kernel.org/doc/Documentation/devicetree/bindings/numa.txt [2] https://developer.arm.com/Tools%20and%20Software/Fixed%20Virtual%20Platforms Kind regards, Henry > > Cheers, > > Stefano > > > On Mon, 20 Nov 2023, Henry Wang wrote: >> The preparation work to support NUMA on Arm has been merged >> and can be found at [1] and [2]. The initial discussions of >> the Arm NUMA support can be found at [3]. >> >> - Background of this series: >> >> Xen memory allocation and scheduler modules are NUMA aware. >> But actually, on x86 has implemented the architecture APIs >> to support NUMA. Arm was providing a set of fake architecture >> APIs to make it compatible with NUMA awared memory allocation >> and scheduler. >> >> Arm system was working well as a single node NUMA system with >> these fake APIs, because we didn't have multiple nodes NUMA >> system on Arm. But in recent years, more and more Arm devices >> support multiple nodes NUMA system. >> >> So now we have a new problem. When Xen is running on these Arm >> devices, Xen still treat them as single node SMP systems. The >> NUMA affinity capability of Xen memory allocation and scheduler >> becomes meaningless. Because they rely on input data that does >> not reflect real NUMA layout. >> >> Xen still think the access time for all of the memory is the >> same for all CPUs. However, Xen may allocate memory to a VM >> from different NUMA nodes with different access speeds. This >> difference can be amplified in workloads inside VM, causing >> performance instability and timeouts. >> >> So in this patch series, we implement a set of NUMA API to use >> device tree to describe the NUMA layout. We reuse most of the >> code of x86 NUMA to create and maintain the mapping between >> memory and CPU, create the matrix between any two NUMA nodes. >> Except ACPI and some x86 specified code, we have moved other >> code to common. In next stage, when we implement ACPI based >> NUMA for Arm64, we may move the ACPI NUMA code to common too, >> but in current stage, we keep it as x86 only. >> >> This patch serires has been tested and booted well on FVP in >> Arm64 mode with NUMA configs in device tree and one HPE x86 >> NUMA machine. >> >> [1] >> https://lists.xenproject.org/archives/html/xen-devel/2022-06/msg00499.html >> [2] >> https://lists.xenproject.org/archives/html/xen-devel/2022-11/msg01043.html >> [3] >>
Re: [PATCH v6 00/17] Device tree based NUMA support for Arm
Hi Henry, Thank you for all your work on this series and in general upstreaming other patches too! How do we test it? Is there a way to test this series with QEMU and some special device tree? I am asking because during review it would make things easier if we could exercise the new code somehow. I haven't had a chance to review this properly but I noticed that a few code additions are not protected by CONFIG_NUMA. Maybe they should? Also given that this is not a small series, I wanted to check with you if this is a good time to do a full review of the series (in the sense that you are OK with handling review feedback now) or whether you would rather have us do the review at another time. Cheers, Stefano On Mon, 20 Nov 2023, Henry Wang wrote: > The preparation work to support NUMA on Arm has been merged > and can be found at [1] and [2]. The initial discussions of > the Arm NUMA support can be found at [3]. > > - Background of this series: > > Xen memory allocation and scheduler modules are NUMA aware. > But actually, on x86 has implemented the architecture APIs > to support NUMA. Arm was providing a set of fake architecture > APIs to make it compatible with NUMA awared memory allocation > and scheduler. > > Arm system was working well as a single node NUMA system with > these fake APIs, because we didn't have multiple nodes NUMA > system on Arm. But in recent years, more and more Arm devices > support multiple nodes NUMA system. > > So now we have a new problem. When Xen is running on these Arm > devices, Xen still treat them as single node SMP systems. The > NUMA affinity capability of Xen memory allocation and scheduler > becomes meaningless. Because they rely on input data that does > not reflect real NUMA layout. > > Xen still think the access time for all of the memory is the > same for all CPUs. However, Xen may allocate memory to a VM > from different NUMA nodes with different access speeds. This > difference can be amplified in workloads inside VM, causing > performance instability and timeouts. > > So in this patch series, we implement a set of NUMA API to use > device tree to describe the NUMA layout. We reuse most of the > code of x86 NUMA to create and maintain the mapping between > memory and CPU, create the matrix between any two NUMA nodes. > Except ACPI and some x86 specified code, we have moved other > code to common. In next stage, when we implement ACPI based > NUMA for Arm64, we may move the ACPI NUMA code to common too, > but in current stage, we keep it as x86 only. > > This patch serires has been tested and booted well on FVP in > Arm64 mode with NUMA configs in device tree and one HPE x86 > NUMA machine. > > [1] https://lists.xenproject.org/archives/html/xen-devel/2022-06/msg00499.html > [2] https://lists.xenproject.org/archives/html/xen-devel/2022-11/msg01043.html > [3] https://lists.xenproject.org/archives/html/xen-devel/2021-09/msg01903.html > > Henry Wang (1): > xen/arm: Set correct per-cpu cpu_core_mask > > Wei Chen (16): > xen/arm: use NR_MEM_BANKS to override default NR_NODE_MEMBLKS > xen/arm: implement helpers to get and update NUMA status > xen/arm: implement node distance helpers for Arm > xen/arm: use arch_get_ram_range to get memory ranges from bootinfo > xen/arm: build NUMA cpu_to_node map in dt_smp_init_cpus > xen/arm: Add boot and secondary CPU to NUMA system > xen/arm: introduce a helper to parse device tree processor node > xen/arm: introduce a helper to parse device tree memory node > xen/arm: introduce a helper to parse device tree NUMA distance map > xen/arm: unified entry to parse all NUMA data from device tree > xen/arm: keep guest still be NUMA unware > xen/arm: enable device tree based NUMA in system init > xen/arm: implement numa_node_to_arch_nid for device tree NUMA > xen/arm: use CONFIG_NUMA to gate node_online_map in smpboot > xen/arm: Provide Kconfig options for Arm to enable NUMA > docs: update numa command line to support Arm > > CHANGELOG.md | 1 + > SUPPORT.md| 1 + > docs/misc/xen-command-line.pandoc | 2 +- > xen/arch/arm/Kconfig | 11 ++ > xen/arch/arm/Makefile | 2 + > xen/arch/arm/domain_build.c | 6 + > xen/arch/arm/include/asm/numa.h | 91 - > xen/arch/arm/numa-dt.c| 299 ++ > xen/arch/arm/numa.c | 184 ++ > xen/arch/arm/setup.c | 17 ++ > xen/arch/arm/smpboot.c| 38 > xen/arch/x86/include/asm/numa.h | 1 - > xen/arch/x86/srat.c | 2 +- > xen/include/xen/numa.h| 10 + > 14 files changed, 661 insertions(+), 4 deletions(-) > create mode 100644 xen/arch/arm/numa-dt.c > create mode 100644 xen/arch/arm/numa.c > > -- > 2.25.1 >
[libvirt test] 183927: tolerable FAIL - PUSHED
flight 183927 libvirt real [real] flight 183943 libvirt real-retest [real] http://logs.test-lab.xenproject.org/osstest/logs/183927/ http://logs.test-lab.xenproject.org/osstest/logs/183943/ Failures :-/ but no regressions. Tests which are failing intermittently (not blocking): test-armhf-armhf-libvirt-raw 13 guest-start fail pass in 183943-retest Tests which did not succeed, but are not blocking: test-armhf-armhf-libvirt-raw 15 saverestore-support-check fail in 183943 like 183880 test-armhf-armhf-libvirt-raw 14 migrate-support-check fail in 183943 never pass test-armhf-armhf-libvirt 16 saverestore-support-checkfail like 183880 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check fail like 183880 test-amd64-i386-libvirt-xsm 15 migrate-support-checkfail never pass test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail never pass test-amd64-i386-libvirt 15 migrate-support-checkfail never pass test-amd64-amd64-libvirt 15 migrate-support-checkfail never pass test-arm64-arm64-libvirt 15 migrate-support-checkfail never pass test-arm64-arm64-libvirt 16 saverestore-support-checkfail never pass test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass test-armhf-armhf-libvirt 15 migrate-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass test-amd64-i386-libvirt-raw 14 migrate-support-checkfail never pass test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail never pass test-arm64-arm64-libvirt-qcow2 14 migrate-support-checkfail never pass test-arm64-arm64-libvirt-qcow2 15 saverestore-support-checkfail never pass test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail never pass test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass version targeted for testing: libvirt f3573b5efaa2002c6c9efc379e3f8578c9bbbdf5 baseline version: libvirt 6f11304849f303cbc2d3896261eff9f91acae52d Last test of basis 183880 2023-11-28 04:18:52 Z1 days Testing same since 183927 2023-11-29 04:18:51 Z0 days1 attempts People who touched revisions under test: Andrea Bolognani jobs: build-amd64-xsm pass build-arm64-xsm pass build-i386-xsm pass build-amd64 pass build-arm64 pass build-armhf pass build-i386 pass build-amd64-libvirt pass build-arm64-libvirt pass build-armhf-libvirt pass build-i386-libvirt pass build-amd64-pvopspass build-arm64-pvopspass build-armhf-pvopspass build-i386-pvops pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsmpass test-amd64-amd64-libvirt-xsm pass test-arm64-arm64-libvirt-xsm pass test-amd64-i386-libvirt-xsm pass test-amd64-amd64-libvirt pass test-arm64-arm64-libvirt pass test-armhf-armhf-libvirt pass test-amd64-i386-libvirt pass test-amd64-amd64-libvirt-pairpass test-amd64-i386-libvirt-pair pass test-arm64-arm64-libvirt-qcow2 pass test-armhf-armhf-libvirt-qcow2 pass test-arm64-arm64-libvirt-raw pass test-armhf-armhf-libvirt-raw fail test-amd64-i386-libvirt-raw pass test-amd64-amd64-libvirt-vhd pass
[ovmf test] 183942: all pass - PUSHED
flight 183942 ovmf real [real] http://logs.test-lab.xenproject.org/osstest/logs/183942/ Perfect :-) All tests in this flight passed as required version targeted for testing: ovmf 4f99b5fb936a2a0239d5212948b7104d75d1558c baseline version: ovmf 68d506e0d15c0c412142be68ed006c65b641560f Last test of basis 183932 2023-11-29 07:41:09 Z0 days Testing same since 183942 2023-11-29 21:12:45 Z0 days1 attempts People who touched revisions under test: Michael Kubacki jobs: build-amd64-xsm pass build-i386-xsm pass build-amd64 pass build-i386 pass build-amd64-libvirt pass build-i386-libvirt pass build-amd64-pvopspass build-i386-pvops pass test-amd64-amd64-xl-qemuu-ovmf-amd64 pass test-amd64-i386-xl-qemuu-ovmf-amd64 pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : To xenbits.xen.org:/home/xen/git/osstest/ovmf.git 68d506e0d1..4f99b5fb93 4f99b5fb936a2a0239d5212948b7104d75d1558c -> xen-tested-master
Re: We are not able to virtualize FreeBSD using xen 4.17 on Arm 32 bit
Hi Mario, I suggest to do one step at a time and first to get it fully working using "xl" to create the FreeBSD guest. And afterwords to try to get libvirt to work. Both xl and libvirt call the same library underneath (libxl) to create guests, but xl is tested a lot more with Xen on ARM. I am not going to comment on how to create a FreeBSD xenvm.img as I don't have any knowledge on FreeBSD. On Wed, 29 Nov 2023, Mario Marietto wrote: > Hello to everyone. > > I tried to use xen as a hypervisor instead of kvm + libvirt + virt-manager to > boot FreeBSD on my ARM Chromebook where I have installed > Devuan 5,since Stefano said : > > "That might work. libvirt + virt-manager with the xen accelerator might work > on the ARM Chromebook. That's because as far as I know Xen > integration in libvirt is done via linking to libxl directly and libxl is > supported and working on ARM" > > Unfortunately something is not working properly. What I did has been to > reboot the machine in xen,enable libvirtd & and virtlogd & and > virt-manager &,but this is what happened : > > Traceback (most recent call last): > File "/usr/lib/xen-4.17/bin/pygrub", line 884, in > raise RuntimeError("Unable to find partition containing kernel") > RuntimeError: Unable to find partition containing kernel > > I think it does not recognize the FreeBSD file system structure and its > kernel. Libvirt seems to have been programmed to boot Linux,not > FreeBSD. > > In Fact,I did the counterproof and it seems to be like this : > > > According with this post : > > https://blog.roberthallam.org/2020/05/solving-unable-to-find-partition-containing-kernel/ > > I've created a file called menu.lst inside the boot directory of the image > file called "debian.img",adding the following content inside : > > default 0 > timeout 10 > title Debian > root (hd0,1) > kernel /boot/vmlinux-6.1.59-stb-xen-cbe+ root=/dev/xvda > initrd /boot/initrd.img-6.1.59-stb-xen-cbe+ > > and I tried again to boot the image using virt-manager. It gave this error > again : > > root@devuan-bunsen:/mnt/zroot2/zroot2/OS/Chromebook/FreeBSD-guestOS/linux-xen/debian2/boot# > 2023-11-29 15:21:09.266+: 2467: error : > libxlDomainStartPerform:1256 : internal error: libxenlight failed to create > new domain 'debian12' > > but giving a look inside the log file and I found this interesting situation ; > > Using to parse /boot/grub/menu.lst > (B )0 [1;24r [m [?7h [?1h = [H [J [?1h = [1B pyGRUB version 0.6 > [1B [0m > lk > [1B [0m x [0;7m Debian 12 > [m [0m x > [1B [0m x [72C [0m x > [1B [0m x [72C [0m x > [1B [0m x [72C [0m x > [1B [0m x [72C [0m x > [1B [0m x [72C [0m x > [1B [0m x [72C [0m x > [1B [0m x [72C [0m x > [1B [0m > mj > [1B [70D [0m Use the ↑ and ↓ keys to select which > entry is highlighted. [1B [58DPress enter to boot the selected OS, 'e' to > edit the [1B [52Dcommands before booting, 'a' to modify the > kernel arguments [1B [59Dbefore booting, or 'c' for a command line. [12A [26C > [17B [68DWill boot selected entry in 10 seconds [?1h = > [J [17A [73C [17B [68DWill boot selected entry in 9 seconds [?1h = > [J [17A [73C [17B [68DWill boot selected entry in 8 seconds [?1h = > [J [17A [73C [17B [68DWill boot selected entry in 7 seconds [?1h = > [J [17A [73C [17B [68DWill boot selected entry in 6 seconds [?1h = > [J [17A [73C [17B [68DWill boot selected entry in 5 seconds [?1h = > [J [17A [73C [17B [68DWill boot selected entry in 4 seconds [?1h = > [J [17A [73C [17B [68DWill boot selected entry in 3 seconds [?1h = > [J [17A [73C [17B [68DWill boot selected entry in 2 seconds [?1h = > [J [17A [73C [17B [68DWill boot selected entry in 1 seconds [?1l > [24;1H > [?1l > > > so,it seems that it tried to boot,but for an unknown reason,it still gives > the error. > > Anyway : My xen setup is not broken anymore ; Using Libvirt Linux seems to be > able to boot,FreeBSD does not. Using the "raw" method of > booting FreeBSD as domU could be another story that I will try soon. > > But before trying to compile the correct freebsd kernel that's recognized by > xen using the Elliott Michell code,I need to understand well > what's the procedure that will work. So below you can read what I will try to > do : > > $ truncate -s 100G xenvm.img > $ mdconfig -f xenvm.img -u 0 > $ newfs /dev/md0 > $ mount /dev/md0 /mnt > $ git clone https://gitlab.com/ehem/freebsd-src.git > $ cd freebsd-src > $ make -DNO_MODULES KERNCONF=GENERIC TARGET=arm TARGET_ARCH=armv7 > DESTDIR=/build buildkernel > $ echo "/dev/xbd0 / ufs rw 1 > 1" > /mnt/etc/fstab > > $ nano /build/etc/ttys (add the line 'xc0 "/usr/libexec/getty Pc" xterm on > secure") > $ umount /build > $ mdconfig -d -u 0 > > Do you see errors ? some missing ? > --->
xen | Successful pipeline for staging | f0dd0cd9
Pipeline #1089709752 has passed! Project: xen ( https://gitlab.com/xen-project/xen ) Branch: staging ( https://gitlab.com/xen-project/xen/-/commits/staging ) Commit: f0dd0cd9 ( https://gitlab.com/xen-project/xen/-/commit/f0dd0cd9598f22ee5509bb5d1466e4821834c4ba ) Commit Message: docs/misra: add guidance on the format of Dir 4... Commit Author: Nicola Vetrini Committed by: Julien Grall Pipeline #1089709752 ( https://gitlab.com/xen-project/xen/-/pipelines/1089709752 ) triggered by Ganis ( https://gitlab.com/ganis ) successfully completed 129 jobs in 3 stages. -- You're receiving this email because of your account on gitlab.com.
[PATCH 1/6] system/cpus: rename qemu_mutex_lock_iothread() to qemu_bql_lock()
The Big QEMU Lock (BQL) has many names and they are confusing. The actual QemuMutex variable is called qemu_global_mutex but it's commonly referred to as the BQL in discussions and some code comments. The locking APIs, however, are called qemu_mutex_lock_iothread() and qemu_mutex_unlock_iothread(). The "iothread" name is historic and comes from when the main thread was split into into KVM vcpu threads and the "iothread" (now called the main loop thread). I have contributed to the confusion myself by introducing a separate --object iothread, a separate concept unrelated to the BQL. The "iothread" name is no longer appropriate for the BQL. Rename the locking APIs to: - void qemu_bql_lock(void) - void qemu_bql_unlock(void) - bool qemu_bql_locked(void) There are more APIs with "iothread" in their names. Subsequent patches will rename them. There are also comments and documentation that will be updated in later patches. Signed-off-by: Stefan Hajnoczi --- include/block/aio-wait.h | 2 +- include/qemu/main-loop.h | 26 +++--- accel/accel-blocker.c| 10 +-- accel/dummy-cpus.c | 8 +- accel/hvf/hvf-accel-ops.c| 4 +- accel/kvm/kvm-accel-ops.c| 4 +- accel/kvm/kvm-all.c | 22 ++--- accel/tcg/cpu-exec.c | 26 +++--- accel/tcg/cputlb.c | 16 ++-- accel/tcg/tcg-accel-ops-icount.c | 4 +- accel/tcg/tcg-accel-ops-mttcg.c | 12 +-- accel/tcg/tcg-accel-ops-rr.c | 14 ++-- accel/tcg/tcg-accel-ops.c| 2 +- accel/tcg/translate-all.c| 2 +- cpu-common.c | 4 +- dump/dump.c | 4 +- hw/core/cpu-common.c | 6 +- hw/i386/intel_iommu.c| 6 +- hw/i386/kvm/xen_evtchn.c | 16 ++-- hw/i386/kvm/xen_overlay.c| 2 +- hw/i386/kvm/xen_xenstore.c | 2 +- hw/intc/arm_gicv3_cpuif.c| 2 +- hw/intc/s390_flic.c | 18 ++-- hw/misc/edu.c| 4 +- hw/misc/imx6_src.c | 2 +- hw/misc/imx7_src.c | 2 +- hw/net/xen_nic.c | 8 +- hw/ppc/pegasos2.c| 2 +- hw/ppc/ppc.c | 4 +- hw/ppc/spapr.c | 2 +- hw/ppc/spapr_rng.c | 4 +- hw/ppc/spapr_softmmu.c | 4 +- hw/remote/mpqemu-link.c | 12 +-- hw/remote/vfio-user-obj.c| 2 +- hw/s390x/s390-skeys.c| 2 +- migration/block-dirty-bitmap.c | 4 +- migration/block.c| 16 ++-- migration/colo.c | 60 +++--- migration/dirtyrate.c| 12 +-- migration/migration.c| 52 ++-- migration/ram.c | 12 +-- replay/replay-internal.c | 2 +- semihosting/console.c| 8 +- stubs/iothread-lock.c| 6 +- system/cpu-throttle.c| 4 +- system/cpus.c| 28 +++ system/dirtylimit.c | 4 +- system/memory.c | 2 +- system/physmem.c | 8 +- system/runstate.c| 2 +- system/watchpoint.c | 4 +- target/arm/arm-powerctl.c| 14 ++-- target/arm/helper.c | 4 +- target/arm/hvf/hvf.c | 8 +- target/arm/kvm.c | 4 +- target/arm/kvm64.c | 4 +- target/arm/ptw.c | 6 +- target/arm/tcg/helper-a64.c | 8 +- target/arm/tcg/m_helper.c| 4 +- target/arm/tcg/op_helper.c | 24 +++--- target/arm/tcg/psci.c| 2 +- target/hppa/int_helper.c | 8 +- target/i386/hvf/hvf.c| 6 +- target/i386/kvm/hyperv.c | 4 +- target/i386/kvm/kvm.c| 28 +++ target/i386/kvm/xen-emu.c| 14 ++-- target/i386/nvmm/nvmm-accel-ops.c| 4 +- target/i386/nvmm/nvmm-all.c | 20 ++--- target/i386/tcg/sysemu/fpu_helper.c | 6 +- target/i386/tcg/sysemu/misc_helper.c | 4 +- target/i386/whpx/whpx-accel-ops.c| 4 +- target/i386/whpx/whpx-all.c | 24 +++--- target/loongarch/csr_helper.c| 4 +- target/mips/kvm.c| 4 +- target/mips/tcg/sysemu/cp0_helper.c | 4 +- target/openrisc/sys_helper.c | 16 ++-- target/ppc/excp_helper.c | 12 +-- target/ppc/kvm.c | 4 +- target/ppc/misc_helper.c | 8 +- target/ppc/timebase_helper.c | 8 +- target/s390x/kvm/kvm.c | 4 +- target/s390x/tcg/misc_helper.c | 118 +-- target/sparc/int32_helper.c | 2 +-
[PATCH 6/6] Rename "QEMU global mutex" to "BQL" in comments and docs
The term "QEMU global mutex" is identical to the more widely used Big QEMU Lock ("BQL"). Update the code comments and documentation to use "BQL" instead of "QEMU global mutex". Signed-off-by: Stefan Hajnoczi --- docs/devel/multi-thread-tcg.rst | 7 +++ docs/devel/qapi-code-gen.rst | 2 +- docs/devel/replay.rst | 2 +- docs/devel/multiple-iothreads.txt | 16 include/block/blockjob.h | 6 +++--- include/io/task.h | 2 +- include/qemu/coroutine-core.h | 2 +- include/qemu/coroutine.h | 2 +- hw/block/dataplane/virtio-blk.c | 8 hw/block/virtio-blk.c | 2 +- hw/scsi/virtio-scsi-dataplane.c | 6 +++--- net/tap.c | 2 +- 12 files changed, 28 insertions(+), 29 deletions(-) diff --git a/docs/devel/multi-thread-tcg.rst b/docs/devel/multi-thread-tcg.rst index c9541a7b20..7302c3bf53 100644 --- a/docs/devel/multi-thread-tcg.rst +++ b/docs/devel/multi-thread-tcg.rst @@ -226,10 +226,9 @@ instruction. This could be a future optimisation. Emulated hardware state --- -Currently thanks to KVM work any access to IO memory is automatically -protected by the global iothread mutex, also known as the BQL (Big -QEMU Lock). Any IO region that doesn't use global mutex is expected to -do its own locking. +Currently thanks to KVM work any access to IO memory is automatically protected +by the BQL (Big QEMU Lock). Any IO region that doesn't use the BQL is expected +to do its own locking. However IO memory isn't the only way emulated hardware state can be modified. Some architectures have model specific registers that diff --git a/docs/devel/qapi-code-gen.rst b/docs/devel/qapi-code-gen.rst index 7f78183cd4..ea8228518c 100644 --- a/docs/devel/qapi-code-gen.rst +++ b/docs/devel/qapi-code-gen.rst @@ -594,7 +594,7 @@ blocking the guest and other background operations. Coroutine safety can be hard to prove, similar to thread safety. Common pitfalls are: -- The global mutex isn't held across ``qemu_coroutine_yield()``, so +- The BQL isn't held across ``qemu_coroutine_yield()``, so operations that used to assume that they execute atomically may have to be more careful to protect against changes in the global state. diff --git a/docs/devel/replay.rst b/docs/devel/replay.rst index 0244be8b9c..effd856f0c 100644 --- a/docs/devel/replay.rst +++ b/docs/devel/replay.rst @@ -184,7 +184,7 @@ modes. Reading and writing requests are created by CPU thread of QEMU. Later these requests proceed to block layer which creates "bottom halves". Bottom halves consist of callback and its parameters. They are processed when -main loop locks the global mutex. These locks are not synchronized with +main loop locks the BQL. These locks are not synchronized with replaying process because main loop also processes the events that do not affect the virtual machine state (like user interaction with monitor). diff --git a/docs/devel/multiple-iothreads.txt b/docs/devel/multiple-iothreads.txt index a3e949f6b3..828e5527a3 100644 --- a/docs/devel/multiple-iothreads.txt +++ b/docs/devel/multiple-iothreads.txt @@ -5,7 +5,7 @@ the COPYING file in the top-level directory. This document explains the IOThread feature and how to write code that runs -outside the QEMU global mutex. +outside the BQL. The main loop and IOThreads --- @@ -29,13 +29,13 @@ scalability bottleneck on hosts with many CPUs. Work can be spread across several IOThreads instead of just one main loop. When set up correctly this can improve I/O latency and reduce jitter seen by the guest. -The main loop is also deeply associated with the QEMU global mutex, which is a -scalability bottleneck in itself. vCPU threads and the main loop use the QEMU -global mutex to serialize execution of QEMU code. This mutex is necessary -because a lot of QEMU's code historically was not thread-safe. +The main loop is also deeply associated with the BQL, which is a +scalability bottleneck in itself. vCPU threads and the main loop use the BQL +to serialize execution of QEMU code. This mutex is necessary because a lot of +QEMU's code historically was not thread-safe. The fact that all I/O processing is done in a single main loop and that the -QEMU global mutex is contended by all vCPU threads and the main loop explain +BQL is contended by all vCPU threads and the main loop explain why it is desirable to place work into IOThreads. The experimental virtio-blk data-plane implementation has been benchmarked and @@ -66,7 +66,7 @@ There are several old APIs that use the main loop AioContext: Since they implicitly work on the main loop they cannot be used in code that runs in an IOThread. They might cause a crash or deadlock if called from an -IOThread since the QEMU global mutex is not held. +IOThread since the BQL is not held. Instead, use the AioContext functions directly (see
[PATCH 3/6] qemu/main-loop: rename qemu_cond_wait_iothread() to qemu_cond_wait_bql()
The name "iothread" is overloaded. Use the term Big QEMU Lock (BQL) instead, it is already widely used and unambiguous. Signed-off-by: Stefan Hajnoczi --- include/qemu/main-loop.h | 8 accel/tcg/tcg-accel-ops-rr.c | 4 ++-- hw/display/virtio-gpu.c | 2 +- hw/ppc/spapr_events.c | 2 +- system/cpu-throttle.c | 2 +- system/cpus.c | 4 ++-- target/i386/nvmm/nvmm-accel-ops.c | 2 +- target/i386/whpx/whpx-accel-ops.c | 2 +- 8 files changed, 13 insertions(+), 13 deletions(-) diff --git a/include/qemu/main-loop.h b/include/qemu/main-loop.h index 0b6a3e4824..ec2a70f041 100644 --- a/include/qemu/main-loop.h +++ b/include/qemu/main-loop.h @@ -373,17 +373,17 @@ G_DEFINE_AUTOPTR_CLEANUP_FUNC(BQLLockAuto, qemu_bql_auto_unlock) = qemu_bql_auto_lock(__FILE__, __LINE__) /* - * qemu_cond_wait_iothread: Wait on condition for the main loop mutex + * qemu_cond_wait_bql: Wait on condition for the main loop mutex * * This function atomically releases the main loop mutex and causes * the calling thread to block on the condition. */ -void qemu_cond_wait_iothread(QemuCond *cond); +void qemu_cond_wait_bql(QemuCond *cond); /* - * qemu_cond_timedwait_iothread: like the previous, but with timeout + * qemu_cond_timedwait_bql: like the previous, but with timeout */ -void qemu_cond_timedwait_iothread(QemuCond *cond, int ms); +void qemu_cond_timedwait_bql(QemuCond *cond, int ms); /* internal interfaces */ diff --git a/accel/tcg/tcg-accel-ops-rr.c b/accel/tcg/tcg-accel-ops-rr.c index c21215a094..1e5a688085 100644 --- a/accel/tcg/tcg-accel-ops-rr.c +++ b/accel/tcg/tcg-accel-ops-rr.c @@ -111,7 +111,7 @@ static void rr_wait_io_event(void) while (all_cpu_threads_idle()) { rr_stop_kick_timer(); -qemu_cond_wait_iothread(first_cpu->halt_cond); +qemu_cond_wait_bql(first_cpu->halt_cond); } rr_start_kick_timer(); @@ -198,7 +198,7 @@ static void *rr_cpu_thread_fn(void *arg) /* wait for initial kick-off after machine start */ while (first_cpu->stopped) { -qemu_cond_wait_iothread(first_cpu->halt_cond); +qemu_cond_wait_bql(first_cpu->halt_cond); /* process any pending work */ CPU_FOREACH(cpu) { diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c index b016d3bac8..67c5be1a4e 100644 --- a/hw/display/virtio-gpu.c +++ b/hw/display/virtio-gpu.c @@ -1512,7 +1512,7 @@ void virtio_gpu_reset(VirtIODevice *vdev) g->reset_finished = false; qemu_bh_schedule(g->reset_bh); while (!g->reset_finished) { -qemu_cond_wait_iothread(>reset_cond); +qemu_cond_wait_bql(>reset_cond); } } else { virtio_gpu_reset_bh(g); diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c index deb4641505..cb0587 100644 --- a/hw/ppc/spapr_events.c +++ b/hw/ppc/spapr_events.c @@ -899,7 +899,7 @@ void spapr_mce_req_event(PowerPCCPU *cpu, bool recovered) } return; } -qemu_cond_wait_iothread(>fwnmi_machine_check_interlock_cond); +qemu_cond_wait_bql(>fwnmi_machine_check_interlock_cond); if (spapr->fwnmi_machine_check_addr == -1) { /* * If the machine was reset while waiting for the interlock, diff --git a/system/cpu-throttle.c b/system/cpu-throttle.c index e98836311b..1d2b73369e 100644 --- a/system/cpu-throttle.c +++ b/system/cpu-throttle.c @@ -54,7 +54,7 @@ static void cpu_throttle_thread(CPUState *cpu, run_on_cpu_data opaque) endtime_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME) + sleeptime_ns; while (sleeptime_ns > 0 && !cpu->stop) { if (sleeptime_ns > SCALE_MS) { -qemu_cond_timedwait_iothread(cpu->halt_cond, +qemu_cond_timedwait_bql(cpu->halt_cond, sleeptime_ns / SCALE_MS); } else { qemu_bql_unlock(); diff --git a/system/cpus.c b/system/cpus.c index d5b98c11f5..eb24a4db8e 100644 --- a/system/cpus.c +++ b/system/cpus.c @@ -513,12 +513,12 @@ void qemu_bql_unlock(void) qemu_mutex_unlock(_global_mutex); } -void qemu_cond_wait_iothread(QemuCond *cond) +void qemu_cond_wait_bql(QemuCond *cond) { qemu_cond_wait(cond, _global_mutex); } -void qemu_cond_timedwait_iothread(QemuCond *cond, int ms) +void qemu_cond_timedwait_bql(QemuCond *cond, int ms) { qemu_cond_timedwait(cond, _global_mutex, ms); } diff --git a/target/i386/nvmm/nvmm-accel-ops.c b/target/i386/nvmm/nvmm-accel-ops.c index 387ccfcce5..0fe8a76820 100644 --- a/target/i386/nvmm/nvmm-accel-ops.c +++ b/target/i386/nvmm/nvmm-accel-ops.c @@ -48,7 +48,7 @@ static void *qemu_nvmm_cpu_thread_fn(void *arg) } } while (cpu_thread_is_idle(cpu)) { -qemu_cond_wait_iothread(cpu->halt_cond); +qemu_cond_wait_bql(cpu->halt_cond); } qemu_wait_io_event_common(cpu); } while
[PATCH 0/6] Make Big QEMU Lock naming consistent
The Big QEMU Lock ("BQL") has two other names: "iothread lock" and "QEMU global mutex". The term "iothread lock" is easily confused with the unrelated --object iothread (iothread.c). This series updates the code and documentation to consistently use "BQL". This makes the code easier to understand. Stefan Hajnoczi (6): system/cpus: rename qemu_mutex_lock_iothread() to qemu_bql_lock() qemu/main-loop: rename QEMU_IOTHREAD_LOCK_GUARD to QEMU_BQL_LOCK_GUARD qemu/main-loop: rename qemu_cond_wait_iothread() to qemu_cond_wait_bql() system/cpus: rename qemu_global_mutex to qemu_bql Replace "iothread lock" with "BQL" in comments Rename "QEMU global mutex" to "BQL" in comments and docs docs/devel/multi-thread-tcg.rst | 7 +- docs/devel/qapi-code-gen.rst | 2 +- docs/devel/replay.rst| 2 +- docs/devel/reset.rst | 2 +- docs/devel/multiple-iothreads.txt| 16 ++-- hw/display/qxl.h | 2 +- include/block/aio-wait.h | 2 +- include/block/blockjob.h | 6 +- include/exec/cpu-common.h| 2 +- include/exec/memory.h| 4 +- include/exec/ramblock.h | 2 +- include/io/task.h| 2 +- include/migration/register.h | 8 +- include/qemu/coroutine-core.h| 2 +- include/qemu/coroutine.h | 2 +- include/qemu/main-loop.h | 54 ++-- target/arm/internals.h | 4 +- accel/accel-blocker.c| 10 +-- accel/dummy-cpus.c | 8 +- accel/hvf/hvf-accel-ops.c| 4 +- accel/kvm/kvm-accel-ops.c| 4 +- accel/kvm/kvm-all.c | 22 ++--- accel/tcg/cpu-exec.c | 26 +++--- accel/tcg/cputlb.c | 20 ++--- accel/tcg/tcg-accel-ops-icount.c | 6 +- accel/tcg/tcg-accel-ops-mttcg.c | 12 +-- accel/tcg/tcg-accel-ops-rr.c | 18 ++-- accel/tcg/tcg-accel-ops.c| 2 +- accel/tcg/translate-all.c| 2 +- cpu-common.c | 4 +- dump/dump.c | 4 +- hw/block/dataplane/virtio-blk.c | 8 +- hw/block/virtio-blk.c| 2 +- hw/core/cpu-common.c | 6 +- hw/display/virtio-gpu.c | 2 +- hw/i386/intel_iommu.c| 6 +- hw/i386/kvm/xen_evtchn.c | 30 +++ hw/i386/kvm/xen_gnttab.c | 2 +- hw/i386/kvm/xen_overlay.c| 2 +- hw/i386/kvm/xen_xenstore.c | 2 +- hw/intc/arm_gicv3_cpuif.c| 2 +- hw/intc/s390_flic.c | 18 ++-- hw/mips/mips_int.c | 2 +- hw/misc/edu.c| 4 +- hw/misc/imx6_src.c | 2 +- hw/misc/imx7_src.c | 2 +- hw/net/xen_nic.c | 8 +- hw/ppc/pegasos2.c| 2 +- hw/ppc/ppc.c | 6 +- hw/ppc/spapr.c | 2 +- hw/ppc/spapr_events.c| 2 +- hw/ppc/spapr_rng.c | 4 +- hw/ppc/spapr_softmmu.c | 4 +- hw/remote/mpqemu-link.c | 14 ++-- hw/remote/vfio-user-obj.c| 2 +- hw/s390x/s390-skeys.c| 2 +- hw/scsi/virtio-scsi-dataplane.c | 6 +- migration/block-dirty-bitmap.c | 14 ++-- migration/block.c| 40 - migration/colo.c | 62 +++--- migration/dirtyrate.c| 12 +-- migration/migration.c| 54 ++-- migration/ram.c | 16 ++-- net/tap.c| 2 +- replay/replay-internal.c | 2 +- semihosting/console.c| 8 +- stubs/iothread-lock.c| 6 +- system/cpu-throttle.c| 6 +- system/cpus.c| 52 ++-- system/dirtylimit.c | 4 +- system/memory.c | 2 +- system/physmem.c | 14 ++-- system/runstate.c| 2 +- system/watchpoint.c | 4 +- target/arm/arm-powerctl.c| 14 ++-- target/arm/helper.c | 6 +- target/arm/hvf/hvf.c | 8 +- target/arm/kvm.c | 4 +- target/arm/kvm64.c | 4 +- target/arm/ptw.c | 6 +- target/arm/tcg/helper-a64.c | 8 +- target/arm/tcg/m_helper.c| 6 +- target/arm/tcg/op_helper.c | 24 +++--- target/arm/tcg/psci.c| 2 +- target/hppa/int_helper.c | 8 +- target/i386/hvf/hvf.c| 6 +- target/i386/kvm/hyperv.c | 4 +- target/i386/kvm/kvm.c| 28 +++ target/i386/kvm/xen-emu.c| 16 ++--
[PATCH 5/6] Replace "iothread lock" with "BQL" in comments
The term "iothread lock" is obsolete. The APIs use Big QEMU Lock (BQL) in their names. Update the code comments to use "BQL" instead of "iothread lock". Signed-off-by: Stefan Hajnoczi --- docs/devel/reset.rst | 2 +- hw/display/qxl.h | 2 +- include/exec/cpu-common.h| 2 +- include/exec/memory.h| 4 ++-- include/exec/ramblock.h | 2 +- include/migration/register.h | 8 target/arm/internals.h | 4 ++-- accel/tcg/cputlb.c | 4 ++-- accel/tcg/tcg-accel-ops-icount.c | 2 +- hw/remote/mpqemu-link.c | 2 +- migration/block-dirty-bitmap.c | 10 +- migration/block.c| 24 migration/colo.c | 2 +- migration/migration.c| 2 +- migration/ram.c | 4 ++-- system/physmem.c | 6 +++--- target/arm/helper.c | 2 +- target/arm/tcg/m_helper.c| 2 +- ui/spice-core.c | 2 +- util/rcu.c | 2 +- audio/coreaudio.m| 4 ++-- ui/cocoa.m | 6 +++--- 22 files changed, 49 insertions(+), 49 deletions(-) diff --git a/docs/devel/reset.rst b/docs/devel/reset.rst index 38ed1790f7..d4e79718ba 100644 --- a/docs/devel/reset.rst +++ b/docs/devel/reset.rst @@ -19,7 +19,7 @@ Triggering reset This section documents the APIs which "users" of a resettable object should use to control it. All resettable control functions must be called while holding -the iothread lock. +the BQL. You can apply a reset to an object using ``resettable_assert_reset()``. You need to call ``resettable_release_reset()`` to release the object from reset. To diff --git a/hw/display/qxl.h b/hw/display/qxl.h index fdac14edad..e0a85a5ca4 100644 --- a/hw/display/qxl.h +++ b/hw/display/qxl.h @@ -159,7 +159,7 @@ OBJECT_DECLARE_SIMPLE_TYPE(PCIQXLDevice, PCI_QXL) * * Use with care; by the time this function returns, the returned pointer is * not protected by RCU anymore. If the caller is not within an RCU critical - * section and does not hold the iothread lock, it must have other means of + * section and does not hold the BQL, it must have other means of * protecting the pointer, such as a reference to the region that includes * the incoming ram_addr_t. * diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h index 41115d8919..fef3138d29 100644 --- a/include/exec/cpu-common.h +++ b/include/exec/cpu-common.h @@ -92,7 +92,7 @@ RAMBlock *qemu_ram_block_by_name(const char *name); * * By the time this function returns, the returned pointer is not protected * by RCU anymore. If the caller is not within an RCU critical section and - * does not hold the iothread lock, it must have other means of protecting the + * does not hold the BQL, it must have other means of protecting the * pointer, such as a reference to the memory region that owns the RAMBlock. */ RAMBlock *qemu_ram_block_from_host(void *ptr, bool round_offset, diff --git a/include/exec/memory.h b/include/exec/memory.h index 831f7c996d..ad6466b07e 100644 --- a/include/exec/memory.h +++ b/include/exec/memory.h @@ -1962,7 +1962,7 @@ int memory_region_get_fd(MemoryRegion *mr); * * Use with care; by the time this function returns, the returned pointer is * not protected by RCU anymore. If the caller is not within an RCU critical - * section and does not hold the iothread lock, it must have other means of + * section and does not hold the BQL, it must have other means of * protecting the pointer, such as a reference to the region that includes * the incoming ram_addr_t. * @@ -1979,7 +1979,7 @@ MemoryRegion *memory_region_from_host(void *ptr, ram_addr_t *offset); * * Use with care; by the time this function returns, the returned pointer is * not protected by RCU anymore. If the caller is not within an RCU critical - * section and does not hold the iothread lock, it must have other means of + * section and does not hold the BQL, it must have other means of * protecting the pointer, such as a reference to the region that includes * the incoming ram_addr_t. * diff --git a/include/exec/ramblock.h b/include/exec/ramblock.h index 69c6a53902..a2bc0a345d 100644 --- a/include/exec/ramblock.h +++ b/include/exec/ramblock.h @@ -34,7 +34,7 @@ struct RAMBlock { ram_addr_t max_length; void (*resized)(const char*, uint64_t length, void *host); uint32_t flags; -/* Protected by iothread lock. */ +/* Protected by BQL. */ char idstr[256]; /* RCU-enabled, writes protected by the ramlist lock */ QLIST_ENTRY(RAMBlock) next; diff --git a/include/migration/register.h b/include/migration/register.h index fed1d04a3c..9ab1f79512 100644 --- a/include/migration/register.h +++ b/include/migration/register.h @@ -17,7 +17,7 @@ #include "hw/vmstate-if.h" typedef struct SaveVMHandlers { -/* This runs inside the iothread
[PATCH 4/6] system/cpus: rename qemu_global_mutex to qemu_bql
The APIs using qemu_global_mutex now follow the Big QEMU Lock (BQL) nomenclature. It's a little strange that the actual QemuMutex variable that embodies the BQL is called qemu_global_mutex instead of qemu_bql. Rename it for consistency. Signed-off-by: Stefan Hajnoczi --- system/cpus.c | 20 ++-- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/system/cpus.c b/system/cpus.c index eb24a4db8e..138720a540 100644 --- a/system/cpus.c +++ b/system/cpus.c @@ -65,7 +65,7 @@ #endif /* CONFIG_LINUX */ -static QemuMutex qemu_global_mutex; +static QemuMutex qemu_bql; /* * The chosen accelerator is supposed to register this. @@ -389,14 +389,14 @@ void qemu_init_cpu_loop(void) qemu_init_sigbus(); qemu_cond_init(_cpu_cond); qemu_cond_init(_pause_cond); -qemu_mutex_init(_global_mutex); +qemu_mutex_init(_bql); qemu_thread_get_self(_thread); } void run_on_cpu(CPUState *cpu, run_on_cpu_func func, run_on_cpu_data data) { -do_run_on_cpu(cpu, func, data, _global_mutex); +do_run_on_cpu(cpu, func, data, _bql); } static void qemu_cpu_stop(CPUState *cpu, bool exit) @@ -428,7 +428,7 @@ void qemu_wait_io_event(CPUState *cpu) slept = true; qemu_plugin_vcpu_idle_cb(cpu); } -qemu_cond_wait(cpu->halt_cond, _global_mutex); +qemu_cond_wait(cpu->halt_cond, _bql); } if (slept) { qemu_plugin_vcpu_resume_cb(cpu); @@ -502,7 +502,7 @@ void qemu_bql_lock_impl(const char *file, int line) QemuMutexLockFunc bql_lock = qatomic_read(_bql_mutex_lock_func); g_assert(!qemu_bql_locked()); -bql_lock(_global_mutex, file, line); +bql_lock(_bql, file, line); set_bql_locked(true); } @@ -510,17 +510,17 @@ void qemu_bql_unlock(void) { g_assert(qemu_bql_locked()); set_bql_locked(false); -qemu_mutex_unlock(_global_mutex); +qemu_mutex_unlock(_bql); } void qemu_cond_wait_bql(QemuCond *cond) { -qemu_cond_wait(cond, _global_mutex); +qemu_cond_wait(cond, _bql); } void qemu_cond_timedwait_bql(QemuCond *cond, int ms) { -qemu_cond_timedwait(cond, _global_mutex, ms); +qemu_cond_timedwait(cond, _bql, ms); } /* signal CPU creation */ @@ -571,7 +571,7 @@ void pause_all_vcpus(void) replay_mutex_unlock(); while (!all_vcpus_paused()) { -qemu_cond_wait(_pause_cond, _global_mutex); +qemu_cond_wait(_pause_cond, _bql); CPU_FOREACH(cpu) { qemu_cpu_kick(cpu); } @@ -649,7 +649,7 @@ void qemu_init_vcpu(CPUState *cpu) cpus_accel->create_vcpu_thread(cpu); while (!cpu->created) { -qemu_cond_wait(_cpu_cond, _global_mutex); +qemu_cond_wait(_cpu_cond, _bql); } } -- 2.42.0
[PATCH 2/6] qemu/main-loop: rename QEMU_IOTHREAD_LOCK_GUARD to QEMU_BQL_LOCK_GUARD
The name "iothread" is overloaded. Use the term Big QEMU Lock (BQL) instead, it is already widely used and unambiguous. Signed-off-by: Stefan Hajnoczi --- include/qemu/main-loop.h | 20 ++-- hw/i386/kvm/xen_evtchn.c | 14 +++--- hw/i386/kvm/xen_gnttab.c | 2 +- hw/mips/mips_int.c| 2 +- hw/ppc/ppc.c | 2 +- target/i386/kvm/xen-emu.c | 2 +- target/ppc/excp_helper.c | 2 +- target/ppc/helper_regs.c | 2 +- target/riscv/cpu_helper.c | 4 ++-- 9 files changed, 25 insertions(+), 25 deletions(-) diff --git a/include/qemu/main-loop.h b/include/qemu/main-loop.h index d6f75e57bd..0b6a3e4824 100644 --- a/include/qemu/main-loop.h +++ b/include/qemu/main-loop.h @@ -344,13 +344,13 @@ void qemu_bql_lock_impl(const char *file, int line); void qemu_bql_unlock(void); /** - * QEMU_IOTHREAD_LOCK_GUARD + * QEMU_BQL_LOCK_GUARD * - * Wrap a block of code in a conditional qemu_mutex_{lock,unlock}_iothread. + * Wrap a block of code in a conditional qemu_bql_{lock,unlock}. */ -typedef struct IOThreadLockAuto IOThreadLockAuto; +typedef struct BQLLockAuto BQLLockAuto; -static inline IOThreadLockAuto *qemu_iothread_auto_lock(const char *file, +static inline BQLLockAuto *qemu_bql_auto_lock(const char *file, int line) { if (qemu_bql_locked()) { @@ -358,19 +358,19 @@ static inline IOThreadLockAuto *qemu_iothread_auto_lock(const char *file, } qemu_bql_lock_impl(file, line); /* Anything non-NULL causes the cleanup function to be called */ -return (IOThreadLockAuto *)(uintptr_t)1; +return (BQLLockAuto *)(uintptr_t)1; } -static inline void qemu_iothread_auto_unlock(IOThreadLockAuto *l) +static inline void qemu_bql_auto_unlock(BQLLockAuto *l) { qemu_bql_unlock(); } -G_DEFINE_AUTOPTR_CLEANUP_FUNC(IOThreadLockAuto, qemu_iothread_auto_unlock) +G_DEFINE_AUTOPTR_CLEANUP_FUNC(BQLLockAuto, qemu_bql_auto_unlock) -#define QEMU_IOTHREAD_LOCK_GUARD() \ -g_autoptr(IOThreadLockAuto) _iothread_lock_auto __attribute__((unused)) \ -= qemu_iothread_auto_lock(__FILE__, __LINE__) +#define QEMU_BQL_LOCK_GUARD() \ +g_autoptr(BQLLockAuto) _bql_lock_auto __attribute__((unused)) \ += qemu_bql_auto_lock(__FILE__, __LINE__) /* * qemu_cond_wait_iothread: Wait on condition for the main loop mutex diff --git a/hw/i386/kvm/xen_evtchn.c b/hw/i386/kvm/xen_evtchn.c index 07d0ff0253..3ab686bd79 100644 --- a/hw/i386/kvm/xen_evtchn.c +++ b/hw/i386/kvm/xen_evtchn.c @@ -1127,7 +1127,7 @@ int xen_evtchn_reset_op(struct evtchn_reset *reset) return -ESRCH; } -QEMU_IOTHREAD_LOCK_GUARD(); +QEMU_BQL_LOCK_GUARD(); return xen_evtchn_soft_reset(); } @@ -1145,7 +1145,7 @@ int xen_evtchn_close_op(struct evtchn_close *close) return -EINVAL; } -QEMU_IOTHREAD_LOCK_GUARD(); +QEMU_BQL_LOCK_GUARD(); qemu_mutex_lock(>port_lock); ret = close_port(s, close->port, _kvm_routes); @@ -1272,7 +1272,7 @@ int xen_evtchn_bind_pirq_op(struct evtchn_bind_pirq *pirq) return -EINVAL; } -QEMU_IOTHREAD_LOCK_GUARD(); +QEMU_BQL_LOCK_GUARD(); if (s->pirq[pirq->pirq].port) { return -EBUSY; @@ -1824,7 +1824,7 @@ int xen_physdev_map_pirq(struct physdev_map_pirq *map) return -ENOTSUP; } -QEMU_IOTHREAD_LOCK_GUARD(); +QEMU_BQL_LOCK_GUARD(); QEMU_LOCK_GUARD(>port_lock); if (map->domid != DOMID_SELF && map->domid != xen_domid) { @@ -1884,7 +1884,7 @@ int xen_physdev_unmap_pirq(struct physdev_unmap_pirq *unmap) return -EINVAL; } -QEMU_IOTHREAD_LOCK_GUARD(); +QEMU_BQL_LOCK_GUARD(); qemu_mutex_lock(>port_lock); if (!pirq_inuse(s, pirq)) { @@ -1924,7 +1924,7 @@ int xen_physdev_eoi_pirq(struct physdev_eoi *eoi) return -ENOTSUP; } -QEMU_IOTHREAD_LOCK_GUARD(); +QEMU_BQL_LOCK_GUARD(); QEMU_LOCK_GUARD(>port_lock); if (!pirq_inuse(s, pirq)) { @@ -1956,7 +1956,7 @@ int xen_physdev_query_pirq(struct physdev_irq_status_query *query) return -ENOTSUP; } -QEMU_IOTHREAD_LOCK_GUARD(); +QEMU_BQL_LOCK_GUARD(); QEMU_LOCK_GUARD(>port_lock); if (!pirq_inuse(s, pirq)) { diff --git a/hw/i386/kvm/xen_gnttab.c b/hw/i386/kvm/xen_gnttab.c index 0a24f53f20..ee5f8cf257 100644 --- a/hw/i386/kvm/xen_gnttab.c +++ b/hw/i386/kvm/xen_gnttab.c @@ -176,7 +176,7 @@ int xen_gnttab_map_page(uint64_t idx, uint64_t gfn) return -EINVAL; } -QEMU_IOTHREAD_LOCK_GUARD(); +QEMU_BQL_LOCK_GUARD(); QEMU_LOCK_GUARD(>gnt_lock); xen_overlay_do_map_page(>gnt_aliases[idx], gpa); diff --git a/hw/mips/mips_int.c b/hw/mips/mips_int.c index 6c32e466a3..c2454f9724 100644 --- a/hw/mips/mips_int.c +++ b/hw/mips/mips_int.c @@ -36,7 +36,7 @@ static void cpu_mips_irq_request(void *opaque, int irq, int level) return; } -QEMU_IOTHREAD_LOCK_GUARD(); +QEMU_BQL_LOCK_GUARD(); if
Re: [PATCH v6 2/5] xen/vpci: move xen_domctl_createdomain vPCI flag to common
On 11/14/23 04:11, Jan Beulich wrote: > On 13.11.2023 23:21, Stewart Hildebrand wrote: >> @@ -709,10 +710,17 @@ int arch_sanitise_domain_config(struct >> xen_domctl_createdomain *config) >> return -EINVAL; >> } >> >> +if ( vpci && !hvm ) >> +{ >> +dprintk(XENLOG_INFO, "vPCI requested for non-HVM guest\n"); >> +return -EINVAL; >> +} >> + >> return 0; >> } > > As said on the v5 thread, I think my comment was misguided (I'm sorry) > and this wants keeping in common code as you had it. I'll move it back to xen/common/domain.c. No worries. > >> --- a/xen/include/public/arch-x86/xen.h >> +++ b/xen/include/public/arch-x86/xen.h >> @@ -283,15 +283,16 @@ struct xen_arch_domainconfig { >> #define XEN_X86_EMU_PIT (1U<<_XEN_X86_EMU_PIT) >> #define _XEN_X86_EMU_USE_PIRQ 9 >> #define XEN_X86_EMU_USE_PIRQ(1U<<_XEN_X86_EMU_USE_PIRQ) >> -#define _XEN_X86_EMU_VPCI 10 >> -#define XEN_X86_EMU_VPCI(1U<<_XEN_X86_EMU_VPCI) >> +/* >> + * Note: bit 10 was previously used for a XEN_X86_EMU_VPCI flag. This bit >> should >> + * not be re-used without careful consideration. >> + */ > > I think a multi-line comment is drawing overly much attention to this. > How about "Note: Bit 10 was previously used for XEN_X86_EMU_VPCI. Re-use > with care." which I think fits in a single line comment. Sounds good. > > Jan
Re: [RFC PATCH v2 17/19] heki: x86: Update permissions counters during text patching
On 11/27/23 14:08, Peter Zijlstra wrote: > On Mon, Nov 27, 2023 at 10:48:29AM -0600, Madhavan T. Venkataraman wrote: >> Apologies for the late reply. I was on vacation. Please see my response >> below: >> >> On 11/13/23 02:19, Peter Zijlstra wrote: >>> On Sun, Nov 12, 2023 at 09:23:24PM -0500, Mickaël Salaün wrote: From: Madhavan T. Venkataraman X86 uses a function called __text_poke() to modify executable code. This patching function is used by many features such as KProbes and FTrace. Update the permissions counters for the text page so that write permissions can be temporarily established in the EPT to modify the instructions in that page. Cc: Borislav Petkov Cc: Dave Hansen Cc: H. Peter Anvin Cc: Ingo Molnar Cc: Kees Cook Cc: Madhavan T. Venkataraman Cc: Mickaël Salaün Cc: Paolo Bonzini Cc: Sean Christopherson Cc: Thomas Gleixner Cc: Vitaly Kuznetsov Cc: Wanpeng Li Signed-off-by: Madhavan T. Venkataraman --- Changes since v1: * New patch --- arch/x86/kernel/alternative.c | 5 arch/x86/mm/heki.c| 49 +++ include/linux/heki.h | 14 ++ 3 files changed, 68 insertions(+) diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c index 517ee01503be..64fd8757ba5c 100644 --- a/arch/x86/kernel/alternative.c +++ b/arch/x86/kernel/alternative.c @@ -18,6 +18,7 @@ #include #include #include +#include #include #include #include @@ -1801,6 +1802,7 @@ static void *__text_poke(text_poke_f func, void *addr, const void *src, size_t l */ pgprot = __pgprot(pgprot_val(PAGE_KERNEL) & ~_PAGE_GLOBAL); + heki_text_poke_start(pages, cross_page_boundary ? 2 : 1, pgprot); /* * The lock is not really needed, but this allows to avoid open-coding. */ @@ -1865,7 +1867,10 @@ static void *__text_poke(text_poke_f func, void *addr, const void *src, size_t l } local_irq_restore(flags); + pte_unmap_unlock(ptep, ptl); + heki_text_poke_end(pages, cross_page_boundary ? 2 : 1, pgprot); + return addr; } >>> >>> This makes no sense, we already use a custom CR3 with userspace alias >>> for the actual pages to write to, why are you then frobbing permissions >>> on that *again* ? >> >> Today, the permissions for a guest page in the extended page table >> (EPT) are RWX (unless permissions are restricted for some specific >> reason like for shadow page table pages). In this Heki feature, we >> don't allow RWX by default in the EPT. We only allow those permissions >> in the EPT that the guest page actually needs. E.g., for a text page, >> it is R_X in both the guest page table and the EPT. > > To what end? If you always mirror what the guest does, you've not > actually gained anything. > >> For text patching, the above code establishes an alternate mapping in >> the guest page table that is RW_ so that the text can be patched. That >> needs to be reflected in the EPT so that the EPT permissions will >> change from R_X to RWX. In other words, RWX is allowed only as >> necessary. At the end of patching, the EPT permissions are restored to >> R_X. >> >> Does that address your comment? > > No, if you want to mirror the native PTEs why don't you hook into the > paravirt page-table muck and get all that for free? > > Also, this is the user range, are you saying you're also playing these > daft games with user maps? I think that we should have done a better job of communicating the threat model in Heki and how we are trying to address it. I will correct that here. I think this will help answer many questions. Some of these questions also came up in the LPC when we presented this. Apologies for the slightly long answer. It is for everyone's benefit. Bear with me. Threat Model In the threat model in Heki, the attacker is a user space attacker who exploits a kernel vulnerability to gain more privileges or bypass the kernel's access control and self-protection mechanisms. In the context of the guest page table, one of the things that the threat model translates to is a hacker gaining access to a guest page with RWX permissions. E.g., by adding execute permissions to a writable page or by adding write permissions to an executable page. Today, the permissions for a guest page in the extended page table are RWX by default. So, if a hacker manages to establish RWX for a page in the guest page table, then that is all he needs to do some damage. How to defeat the threat To defeat this, we need to establish the correct permissions for a guest page in the extended page table as well. That is, R_X for a text page, R__ for a read-only page and RW_
[xen-unstable-smoke test] 183939: tolerable all pass - PUSHED
flight 183939 xen-unstable-smoke real [real] http://logs.test-lab.xenproject.org/osstest/logs/183939/ Failures :-/ but no regressions. Tests which did not succeed, but are not blocking: test-amd64-amd64-libvirt 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 16 saverestore-support-checkfail never pass test-armhf-armhf-xl 15 migrate-support-checkfail never pass test-armhf-armhf-xl 16 saverestore-support-checkfail never pass version targeted for testing: xen f0dd0cd9598f22ee5509bb5d1466e4821834c4ba baseline version: xen 902377b690f42ddf44ae91c4b0751d597f1cd694 Last test of basis 183934 2023-11-29 11:04:12 Z0 days Testing same since 183939 2023-11-29 18:02:12 Z0 days1 attempts People who touched revisions under test: Julien Grall Nicola Vetrini jobs: build-arm64-xsm pass build-amd64 pass build-armhf pass build-amd64-libvirt pass test-armhf-armhf-xl pass test-arm64-arm64-xl-xsm pass test-amd64-amd64-xl-qemuu-debianhvm-amd64pass test-amd64-amd64-libvirt pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : To xenbits.xen.org:/home/xen/git/xen.git 902377b690..f0dd0cd959 f0dd0cd9598f22ee5509bb5d1466e4821834c4ba -> smoke
Re: xen 4.15.5: msr_relaxed required for MSR 0x1a2
On Mon, Nov 20, 2023 at 10:24:05AM +0100, Roger Pau Monné wrote: > On Mon, Nov 20, 2023 at 08:27:36AM +, James Dingwall wrote: > > On Fri, Nov 17, 2023 at 10:56:30AM +0100, Jan Beulich wrote: > > > On 17.11.2023 10:18, James Dingwall wrote: > > > > On Thu, Nov 16, 2023 at 04:32:47PM +, Andrew Cooper wrote: > > > >> On 16/11/2023 4:15 pm, James Dingwall wrote: > > > >>> Hi, > > > >>> > > > >>> Per the msr_relaxed documentation: > > > >>> > > > >>>"If using this option is necessary to fix an issue, please report > > > >>> a bug." > > > >>> > > > >>> After recently upgrading an environment from Xen 4.14.5 to Xen 4.15.5 > > > >>> we > > > >>> started experiencing a BSOD at boot with one of our Windows guests. > > > >>> We found > > > >>> that enabling `msr_relaxed = 1` in the guest configuration has > > > >>> resolved the > > > >>> problem. With a debug build of Xen and `hvm_debug=2048` on the > > > >>> command line > > > >>> the following messages were caught as the BSOD happened: > > > >>> > > > >>> (XEN) [HVM:11.0] ecx=0x1a2 > > > >>> (XEN) vmx.c:3298:d11v0 RDMSR 0x01a2 unimplemented > > > >>> (XEN) d11v0 VIRIDIAN CRASH: 1e c096 f80b8de81eb5 0 0 > > > >>> > > > >>> I found that MSR 0x1a2 is MSR_TEMPERATURE_TARGET and from that this > > > >>> patch > > > >>> series from last month: > > > >>> > > > >>> https://patchwork.kernel.org/project/xen-devel/list/?series=796550 > > > >>> > > > >>> Picking out just a small part of that fixes the problem for us. > > > >>> Although the > > > >>> the patch is against 4.15.5 I think it would be relevant to more > > > >>> recent > > > >>> releases too. > > > >> > > > >> Which version of Windows, and what hardware? > > > >> > > > >> The Viridian Crash isn't about the RDMSR itself - it's presumably > > > >> collateral damage shortly thereafter. > > > >> > > > >> Does filling in 0 for that MSR also resolve the issue? It's model > > > >> specific and we absolutely cannot pass it through from real hardware > > > >> like that. > > > >> > > > > > > > > Hi Andrew, > > > > > > > > Thanks for your response. The guest is running Windows 10 and the crash > > > > happens in a proprietary hardware driver. A little bit of knowledge as > > > > they say was enough to stop the crash but I don't understand the impact > > > > of what I've actually done... > > > > > > > > To rework the patch I'd need a bit of guidance, if I understand your > > > > suggestion I set the MSR to 0 with this change in emul-priv-op.c: > > > > > > For the purpose of the experiment suggested by Andrew ... > > > > > > > diff --git a/xen/arch/x86/pv/emul-priv-op.c > > > > b/xen/arch/x86/pv/emul-priv-op.c > > > > index ed97b1d6fcc..66f5e417df6 100644 > > > > --- a/xen/arch/x86/pv/emul-priv-op.c > > > > +++ b/xen/arch/x86/pv/emul-priv-op.c > > > > @@ -976,6 +976,10 @@ static int read_msr(unsigned int reg, uint64_t > > > > *val, > > > > *val = 0; > > > > return X86EMUL_OKAY; > > > > > > > > +case MSR_TEMPERATURE_TARGET: > > > > +*val = 0; > > > > +return X86EMUL_OKAY; > > > > + > > > > case MSR_P6_PERFCTR(0) ... MSR_P6_PERFCTR(7): > > > > case MSR_P6_EVNTSEL(0) ... MSR_P6_EVNTSEL(3): > > > > case MSR_CORE_PERF_FIXED_CTR0 ... MSR_CORE_PERF_FIXED_CTR2: > > > > > > ... you wouldn't need this (affects PV domains only), and ... > > > > > > > and this in vmx.c: > > > > > > > > diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c > > > > index 54023a92587..bbf37b7f272 100644 > > > > --- a/xen/arch/x86/hvm/vmx/vmx.c > > > > +++ b/xen/arch/x86/hvm/vmx/vmx.c > > > > @@ -3259,6 +3259,11 @@ static int vmx_msr_read_intercept(unsigned int > > > > msr, uint64_t *msr_content) > > > > if ( !nvmx_msr_read_intercept(msr, msr_content) ) > > > > goto gp_fault; > > > > break; > > > > + > > > > +case MSR_TEMPERATURE_TARGET: > > > > +*msr_content = 0; > > > > +break; > > I think the preference now is to add such handling directly in > guest_rdmsr()? Protected with a: > > if ( !(cp->x86_vendor & (X86_VENDOR_INTEL)) ) > goto gp_fault; > It is possible we can patch the the driver which is triggering the BSOD but it seems unlileky we'd be able to roll that out in advance of doing the Xen upgrade for dom0. If the problem we are encountering is specific to our situation rather than a general case issue then we can easily carry a patch for that. Thanks for the help, James
[PATCH 12/12] block: remove outdated AioContext locking comments
The AioContext lock no longer exists. There is one noteworthy change: - * More specifically, these functions use BDRV_POLL_WHILE(bs), which - * requires the caller to be either in the main thread and hold - * the BlockdriverState (bs) AioContext lock, or directly in the - * home thread that runs the bs AioContext. Calling them from - * another thread in another AioContext would cause deadlocks. + * More specifically, these functions use BDRV_POLL_WHILE(bs), which requires + * the caller to be either in the main thread or directly in the home thread + * that runs the bs AioContext. Calling them from another thread in another + * AioContext would cause deadlocks. I am not sure whether deadlocks are still possible. Maybe they have just moved to the fine-grained locks that have replaced the AioContext. Since I am not sure if the deadlocks are gone, I have kept the substance unchanged and just removed mention of the AioContext. Signed-off-by: Stefan Hajnoczi --- include/block/block-common.h | 3 -- include/block/block-io.h | 9 ++-- include/block/block_int-common.h | 2 - block.c | 73 ++-- block/block-backend.c| 8 --- block/export/vhost-user-blk-server.c | 4 -- tests/qemu-iotests/202 | 2 +- tests/qemu-iotests/203 | 3 +- 8 files changed, 22 insertions(+), 82 deletions(-) diff --git a/include/block/block-common.h b/include/block/block-common.h index d7599564db..a846023a09 100644 --- a/include/block/block-common.h +++ b/include/block/block-common.h @@ -70,9 +70,6 @@ * automatically takes the graph rdlock when calling the wrapped function. In * the same way, no_co_wrapper_bdrv_wrlock functions automatically take the * graph wrlock. - * - * If the first parameter of the function is a BlockDriverState, BdrvChild or - * BlockBackend pointer, the AioContext lock for it is taken in the wrapper. */ #define no_co_wrapper #define no_co_wrapper_bdrv_rdlock diff --git a/include/block/block-io.h b/include/block/block-io.h index 8eb39a858b..b49e0537dd 100644 --- a/include/block/block-io.h +++ b/include/block/block-io.h @@ -332,11 +332,10 @@ bdrv_co_copy_range(BdrvChild *src, int64_t src_offset, * "I/O or GS" API functions. These functions can run without * the BQL, but only in one specific iothread/main loop. * - * More specifically, these functions use BDRV_POLL_WHILE(bs), which - * requires the caller to be either in the main thread and hold - * the BlockdriverState (bs) AioContext lock, or directly in the - * home thread that runs the bs AioContext. Calling them from - * another thread in another AioContext would cause deadlocks. + * More specifically, these functions use BDRV_POLL_WHILE(bs), which requires + * the caller to be either in the main thread or directly in the home thread + * that runs the bs AioContext. Calling them from another thread in another + * AioContext would cause deadlocks. * * Therefore, these functions are not proper I/O, because they * can't run in *any* iothreads, but only in a specific one. diff --git a/include/block/block_int-common.h b/include/block/block_int-common.h index 4e31d161c5..151279d481 100644 --- a/include/block/block_int-common.h +++ b/include/block/block_int-common.h @@ -1192,8 +1192,6 @@ struct BlockDriverState { /* The error object in use for blocking operations on backing_hd */ Error *backing_blocker; -/* Protected by AioContext lock */ - /* * If we are reading a disk image, give its size in sectors. * Generally read-only; it is written to by load_snapshot and diff --git a/block.c b/block.c index 91ace5d2d5..e773584dfd 100644 --- a/block.c +++ b/block.c @@ -1616,11 +1616,6 @@ out: g_free(gen_node_name); } -/* - * The caller must always hold @bs AioContext lock, because this function calls - * bdrv_refresh_total_sectors() which polls when called from non-coroutine - * context. - */ static int no_coroutine_fn GRAPH_UNLOCKED bdrv_open_driver(BlockDriverState *bs, BlockDriver *drv, const char *node_name, QDict *options, int open_flags, Error **errp) @@ -2901,7 +2896,7 @@ uint64_t bdrv_qapi_perm_to_blk_perm(BlockPermission qapi_perm) * Replaces the node that a BdrvChild points to without updating permissions. * * If @new_bs is non-NULL, the parent of @child must already be drained through - * @child and the caller must hold the AioContext lock for @new_bs. + * @child. */ static void GRAPH_WRLOCK bdrv_replace_child_noperm(BdrvChild *child, BlockDriverState *new_bs) @@ -3041,9 +3036,8 @@ static TransactionActionDrv bdrv_attach_child_common_drv = { * * Returns new created child. * - * The caller must hold the AioContext lock for @child_bs. Both @parent_bs and - * @child_bs can move to a different AioContext in this function. Callers must - * make sure that their AioContext locking is still correct after this. + * Both @parent_bs
[PATCH 11/12] job: remove outdated AioContext locking comments
The AioContext lock no longer exists. Signed-off-by: Stefan Hajnoczi --- include/qemu/job.h | 20 1 file changed, 20 deletions(-) diff --git a/include/qemu/job.h b/include/qemu/job.h index e502787dd8..9ea98b5927 100644 --- a/include/qemu/job.h +++ b/include/qemu/job.h @@ -67,8 +67,6 @@ typedef struct Job { /** * The completion function that will be called when the job completes. - * Called with AioContext lock held, since many callback implementations - * use bdrv_* functions that require to hold the lock. */ BlockCompletionFunc *cb; @@ -264,9 +262,6 @@ struct JobDriver { * * This callback will not be invoked if the job has already failed. * If it fails, abort and then clean will be called. - * - * Called with AioContext lock held, since many callbacs implementations - * use bdrv_* functions that require to hold the lock. */ int (*prepare)(Job *job); @@ -277,9 +272,6 @@ struct JobDriver { * * All jobs will complete with a call to either .commit() or .abort() but * never both. - * - * Called with AioContext lock held, since many callback implementations - * use bdrv_* functions that require to hold the lock. */ void (*commit)(Job *job); @@ -290,9 +282,6 @@ struct JobDriver { * * All jobs will complete with a call to either .commit() or .abort() but * never both. - * - * Called with AioContext lock held, since many callback implementations - * use bdrv_* functions that require to hold the lock. */ void (*abort)(Job *job); @@ -301,9 +290,6 @@ struct JobDriver { * .commit() or .abort(). Regardless of which callback is invoked after * completion, .clean() will always be called, even if the job does not * belong to a transaction group. - * - * Called with AioContext lock held, since many callbacs implementations - * use bdrv_* functions that require to hold the lock. */ void (*clean)(Job *job); @@ -318,17 +304,12 @@ struct JobDriver { * READY). * (If the callback is NULL, the job is assumed to terminate * without I/O.) - * - * Called with AioContext lock held, since many callback implementations - * use bdrv_* functions that require to hold the lock. */ bool (*cancel)(Job *job, bool force); /** * Called when the job is freed. - * Called with AioContext lock held, since many callback implementations - * use bdrv_* functions that require to hold the lock. */ void (*free)(Job *job); }; @@ -424,7 +405,6 @@ void job_ref_locked(Job *job); * Release a reference that was previously acquired with job_ref_locked() or * job_create(). If it's the last reference to the object, it will be freed. * - * Takes AioContext lock internally to invoke a job->driver callback. * Called with job lock held. */ void job_unref_locked(Job *job); -- 2.42.0
Re: [PATCH v4 12/14] xen/asm-generic: introduce stub header softirq.h
Hi Oleksii, On 11/27/23 8:13 AM, Oleksii Kurochko wrote: > is common between Arm, PPC and RISC-V so it is > moved to asm-generic. > > Drop Arm and PPC's softirq.h and use asm-generic version instead. > Acked-by: Shawn Anastasio Thanks, Shawn
[PATCH 10/12] scsi: remove outdated AioContext lock comment
The SCSI subsystem no longer uses the AioContext lock. Request processing runs exclusively in the BlockBackend's AioContext since "scsi: only access SCSIDevice->requests from one thread" and hence the lock is unnecessary. Signed-off-by: Stefan Hajnoczi --- hw/scsi/scsi-disk.c | 1 - 1 file changed, 1 deletion(-) diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c index f1bd5f5c6e..ef0d21d737 100644 --- a/hw/scsi/scsi-disk.c +++ b/hw/scsi/scsi-disk.c @@ -351,7 +351,6 @@ done: scsi_req_unref(>req); } -/* Called with AioContext lock held */ static void scsi_dma_complete(void *opaque, int ret) { SCSIDiskReq *r = (SCSIDiskReq *)opaque; -- 2.42.0
[PATCH 06/12] scsi: remove AioContext locking
The AioContext lock no longer has any effect. Remove it. Signed-off-by: Stefan Hajnoczi --- include/hw/virtio/virtio-scsi.h | 14 -- hw/scsi/scsi-bus.c | 2 -- hw/scsi/scsi-disk.c | 28 hw/scsi/virtio-scsi.c | 18 -- 4 files changed, 4 insertions(+), 58 deletions(-) diff --git a/include/hw/virtio/virtio-scsi.h b/include/hw/virtio/virtio-scsi.h index da8cb928d9..7f0573b1bf 100644 --- a/include/hw/virtio/virtio-scsi.h +++ b/include/hw/virtio/virtio-scsi.h @@ -101,20 +101,6 @@ struct VirtIOSCSI { uint32_t host_features; }; -static inline void virtio_scsi_acquire(VirtIOSCSI *s) -{ -if (s->ctx) { -aio_context_acquire(s->ctx); -} -} - -static inline void virtio_scsi_release(VirtIOSCSI *s) -{ -if (s->ctx) { -aio_context_release(s->ctx); -} -} - void virtio_scsi_common_realize(DeviceState *dev, VirtIOHandleOutput ctrl, VirtIOHandleOutput evt, diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c index b8bfde9565..0031164cc3 100644 --- a/hw/scsi/scsi-bus.c +++ b/hw/scsi/scsi-bus.c @@ -1725,9 +1725,7 @@ void scsi_device_purge_requests(SCSIDevice *sdev, SCSISense sense) { scsi_device_for_each_req_async(sdev, scsi_device_purge_one_req, NULL); -aio_context_acquire(blk_get_aio_context(sdev->conf.blk)); blk_drain(sdev->conf.blk); -aio_context_release(blk_get_aio_context(sdev->conf.blk)); scsi_device_set_ua(sdev, sense); } diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c index 2c1bbb3530..f1bd5f5c6e 100644 --- a/hw/scsi/scsi-disk.c +++ b/hw/scsi/scsi-disk.c @@ -2325,14 +2325,10 @@ static void scsi_disk_reset(DeviceState *dev) { SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev.qdev, dev); uint64_t nb_sectors; -AioContext *ctx; scsi_device_purge_requests(>qdev, SENSE_CODE(RESET)); -ctx = blk_get_aio_context(s->qdev.conf.blk); -aio_context_acquire(ctx); blk_get_geometry(s->qdev.conf.blk, _sectors); -aio_context_release(ctx); nb_sectors /= s->qdev.blocksize / BDRV_SECTOR_SIZE; if (nb_sectors) { @@ -2531,13 +2527,11 @@ static void scsi_unrealize(SCSIDevice *dev) static void scsi_hd_realize(SCSIDevice *dev, Error **errp) { SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, dev); -AioContext *ctx = NULL; + /* can happen for devices without drive. The error message for missing * backend will be issued in scsi_realize */ if (s->qdev.conf.blk) { -ctx = blk_get_aio_context(s->qdev.conf.blk); -aio_context_acquire(ctx); if (!blkconf_blocksizes(>qdev.conf, errp)) { goto out; } @@ -2549,15 +2543,11 @@ static void scsi_hd_realize(SCSIDevice *dev, Error **errp) } scsi_realize(>qdev, errp); out: -if (ctx) { -aio_context_release(ctx); -} } static void scsi_cd_realize(SCSIDevice *dev, Error **errp) { SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, dev); -AioContext *ctx; int ret; uint32_t blocksize = 2048; @@ -2573,8 +2563,6 @@ static void scsi_cd_realize(SCSIDevice *dev, Error **errp) blocksize = dev->conf.physical_block_size; } -ctx = blk_get_aio_context(dev->conf.blk); -aio_context_acquire(ctx); s->qdev.blocksize = blocksize; s->qdev.type = TYPE_ROM; s->features |= 1 << SCSI_DISK_F_REMOVABLE; @@ -2582,7 +2570,6 @@ static void scsi_cd_realize(SCSIDevice *dev, Error **errp) s->product = g_strdup("QEMU CD-ROM"); } scsi_realize(>qdev, errp); -aio_context_release(ctx); } @@ -2713,7 +2700,6 @@ static int get_device_type(SCSIDiskState *s) static void scsi_block_realize(SCSIDevice *dev, Error **errp) { SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, dev); -AioContext *ctx; int sg_version; int rc; @@ -2728,9 +2714,6 @@ static void scsi_block_realize(SCSIDevice *dev, Error **errp) "be removed in a future version"); } -ctx = blk_get_aio_context(s->qdev.conf.blk); -aio_context_acquire(ctx); - /* check we are using a driver managing SG_IO (version 3 and after) */ rc = blk_ioctl(s->qdev.conf.blk, SG_GET_VERSION_NUM, _version); if (rc < 0) { @@ -2738,18 +2721,18 @@ static void scsi_block_realize(SCSIDevice *dev, Error **errp) if (rc != -EPERM) { error_append_hint(errp, "Is this a SCSI device?\n"); } -goto out; +return; } if (sg_version < 3) { error_setg(errp, "scsi generic interface too old"); -goto out; +return; } /* get device type from INQUIRY data */ rc = get_device_type(s); if (rc < 0) { error_setg(errp, "INQUIRY failed"); -goto out; +return; } /* Make a guess for the block size, we'll fix it when the guest sends. @@ -2769,9 +2752,6 @@
[PATCH 07/12] aio-wait: draw equivalence between AIO_WAIT_WHILE() and AIO_WAIT_WHILE_UNLOCKED()
Now that the AioContext lock no longer exists, AIO_WAIT_WHILE() and AIO_WAIT_WHILE_UNLOCKED() are equivalent. A future patch will get rid of AIO_WAIT_WHILE_UNLOCKED(). Signed-off-by: Stefan Hajnoczi --- include/block/aio-wait.h | 16 1 file changed, 4 insertions(+), 12 deletions(-) diff --git a/include/block/aio-wait.h b/include/block/aio-wait.h index 5449b6d742..157f105916 100644 --- a/include/block/aio-wait.h +++ b/include/block/aio-wait.h @@ -63,9 +63,6 @@ extern AioWait global_aio_wait; * @ctx: the aio context, or NULL if multiple aio contexts (for which the * caller does not hold a lock) are involved in the polling condition. * @cond: wait while this conditional expression is true - * @unlock: whether to unlock and then lock again @ctx. This applies - * only when waiting for another AioContext from the main loop. - * Otherwise it's ignored. * * Wait while a condition is true. Use this to implement synchronous * operations that require event loop activity. @@ -78,7 +75,7 @@ extern AioWait global_aio_wait; * wait on conditions between two IOThreads since that could lead to deadlock, * go via the main loop instead. */ -#define AIO_WAIT_WHILE_INTERNAL(ctx, cond, unlock) ({ \ +#define AIO_WAIT_WHILE_INTERNAL(ctx, cond) ({ \ bool waited_ = false; \ AioWait *wait_ = _aio_wait; \ AioContext *ctx_ = (ctx); \ @@ -95,13 +92,7 @@ extern AioWait global_aio_wait; assert(qemu_get_current_aio_context() == \ qemu_get_aio_context());\ while ((cond)) { \ -if (unlock && ctx_) { \ -aio_context_release(ctx_); \ -} \ aio_poll(qemu_get_aio_context(), true);\ -if (unlock && ctx_) { \ -aio_context_acquire(ctx_); \ -} \ waited_ = true;\ } \ } \ @@ -109,10 +100,11 @@ extern AioWait global_aio_wait; waited_; }) #define AIO_WAIT_WHILE(ctx, cond) \ -AIO_WAIT_WHILE_INTERNAL(ctx, cond, true) +AIO_WAIT_WHILE_INTERNAL(ctx, cond) +/* TODO replace this with AIO_WAIT_WHILE() in a future patch */ #define AIO_WAIT_WHILE_UNLOCKED(ctx, cond) \ -AIO_WAIT_WHILE_INTERNAL(ctx, cond, false) +AIO_WAIT_WHILE_INTERNAL(ctx, cond) /** * aio_wait_kick: -- 2.42.0
[PATCH 04/12] graph-lock: remove AioContext locking
Stop acquiring/releasing the AioContext lock in bdrv_graph_wrlock()/bdrv_graph_unlock() since the lock no longer has any effect. The distinction between bdrv_graph_wrunlock() and bdrv_graph_wrunlock_ctx() becomes meaningless and they can be collapsed into one function. Signed-off-by: Stefan Hajnoczi --- include/block/graph-lock.h | 21 ++--- block.c| 50 +++--- block/backup.c | 4 +-- block/blklogwrites.c | 8 ++--- block/blkverify.c | 4 +-- block/block-backend.c | 11 +++ block/commit.c | 16 +- block/graph-lock.c | 44 ++ block/mirror.c | 22 ++--- block/qcow2.c | 4 +-- block/quorum.c | 8 ++--- block/replication.c| 14 - block/snapshot.c | 4 +-- block/stream.c | 12 +++ block/vmdk.c | 20 ++-- blockdev.c | 8 ++--- blockjob.c | 12 +++ tests/unit/test-bdrv-drain.c | 40 tests/unit/test-bdrv-graph-mod.c | 20 ++-- scripts/block-coroutine-wrapper.py | 4 +-- 20 files changed, 133 insertions(+), 193 deletions(-) diff --git a/include/block/graph-lock.h b/include/block/graph-lock.h index 22b5db1ed9..d7545e82d0 100644 --- a/include/block/graph-lock.h +++ b/include/block/graph-lock.h @@ -110,34 +110,17 @@ void unregister_aiocontext(AioContext *ctx); * * The wrlock can only be taken from the main loop, with BQL held, as only the * main loop is allowed to modify the graph. - * - * If @bs is non-NULL, its AioContext is temporarily released. - * - * This function polls. Callers must not hold the lock of any AioContext other - * than the current one and the one of @bs. */ void no_coroutine_fn TSA_ACQUIRE(graph_lock) TSA_NO_TSA -bdrv_graph_wrlock(BlockDriverState *bs); +bdrv_graph_wrlock(void); /* * bdrv_graph_wrunlock: * Write finished, reset global has_writer to 0 and restart * all readers that are waiting. - * - * If @bs is non-NULL, its AioContext is temporarily released. */ void no_coroutine_fn TSA_RELEASE(graph_lock) TSA_NO_TSA -bdrv_graph_wrunlock(BlockDriverState *bs); - -/* - * bdrv_graph_wrunlock_ctx: - * Write finished, reset global has_writer to 0 and restart - * all readers that are waiting. - * - * If @ctx is non-NULL, its lock is temporarily released. - */ -void no_coroutine_fn TSA_RELEASE(graph_lock) TSA_NO_TSA -bdrv_graph_wrunlock_ctx(AioContext *ctx); +bdrv_graph_wrunlock(void); /* * bdrv_graph_co_rdlock: diff --git a/block.c b/block.c index bfb0861ec6..25e1ebc606 100644 --- a/block.c +++ b/block.c @@ -1708,12 +1708,12 @@ bdrv_open_driver(BlockDriverState *bs, BlockDriver *drv, const char *node_name, open_failed: bs->drv = NULL; -bdrv_graph_wrlock(NULL); +bdrv_graph_wrlock(); if (bs->file != NULL) { bdrv_unref_child(bs, bs->file); assert(!bs->file); } -bdrv_graph_wrunlock(NULL); +bdrv_graph_wrunlock(); g_free(bs->opaque); bs->opaque = NULL; @@ -3575,9 +3575,9 @@ int bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd, bdrv_ref(drain_bs); bdrv_drained_begin(drain_bs); -bdrv_graph_wrlock(backing_hd); +bdrv_graph_wrlock(); ret = bdrv_set_backing_hd_drained(bs, backing_hd, errp); -bdrv_graph_wrunlock(backing_hd); +bdrv_graph_wrunlock(); bdrv_drained_end(drain_bs); bdrv_unref(drain_bs); @@ -3790,13 +3790,13 @@ BdrvChild *bdrv_open_child(const char *filename, return NULL; } -bdrv_graph_wrlock(NULL); +bdrv_graph_wrlock(); ctx = bdrv_get_aio_context(bs); aio_context_acquire(ctx); child = bdrv_attach_child(parent, bs, bdref_key, child_class, child_role, errp); aio_context_release(ctx); -bdrv_graph_wrunlock(NULL); +bdrv_graph_wrunlock(); return child; } @@ -4650,9 +4650,9 @@ int bdrv_reopen_multiple(BlockReopenQueue *bs_queue, Error **errp) aio_context_release(ctx); } -bdrv_graph_wrlock(NULL); +bdrv_graph_wrlock(); tran_commit(tran); -bdrv_graph_wrunlock(NULL); +bdrv_graph_wrunlock(); QTAILQ_FOREACH_REVERSE(bs_entry, bs_queue, entry) { BlockDriverState *bs = bs_entry->state.bs; @@ -4669,9 +4669,9 @@ int bdrv_reopen_multiple(BlockReopenQueue *bs_queue, Error **errp) goto cleanup; abort: -bdrv_graph_wrlock(NULL); +bdrv_graph_wrlock(); tran_abort(tran); -bdrv_graph_wrunlock(NULL); +bdrv_graph_wrunlock(); QTAILQ_FOREACH_SAFE(bs_entry, bs_queue, entry, next) { if (bs_entry->prepared) { @@ -4852,12 +4852,12 @@ bdrv_reopen_parse_file_or_backing(BDRVReopenState *reopen_state, }
[PATCH 09/12] docs: remove AioContext lock from IOThread docs
Encourage the use of locking primitives and stop mentioning the AioContext lock since it is being removed. Signed-off-by: Stefan Hajnoczi --- docs/devel/multiple-iothreads.txt | 45 +++ 1 file changed, 15 insertions(+), 30 deletions(-) diff --git a/docs/devel/multiple-iothreads.txt b/docs/devel/multiple-iothreads.txt index a3e949f6b3..4865196bde 100644 --- a/docs/devel/multiple-iothreads.txt +++ b/docs/devel/multiple-iothreads.txt @@ -88,27 +88,18 @@ loop, depending on which AioContext instance the caller passes in. How to synchronize with an IOThread --- -AioContext is not thread-safe so some rules must be followed when using file -descriptors, event notifiers, timers, or BHs across threads: +Variables that can be accessed by multiple threads require some form of +synchronization such as qemu_mutex_lock(), rcu_read_lock(), etc. -1. AioContext functions can always be called safely. They handle their -own locking internally. - -2. Other threads wishing to access the AioContext must use -aio_context_acquire()/aio_context_release() for mutual exclusion. Once the -context is acquired no other thread can access it or run event loop iterations -in this AioContext. - -Legacy code sometimes nests aio_context_acquire()/aio_context_release() calls. -Do not use nesting anymore, it is incompatible with the BDRV_POLL_WHILE() macro -used in the block layer and can lead to hangs. - -There is currently no lock ordering rule if a thread needs to acquire multiple -AioContexts simultaneously. Therefore, it is only safe for code holding the -QEMU global mutex to acquire other AioContexts. +AioContext functions like aio_set_fd_handler(), aio_set_event_notifier(), +aio_bh_new(), and aio_timer_new() are thread-safe. They can be used to trigger +activity in an IOThread. Side note: the best way to schedule a function call across threads is to call -aio_bh_schedule_oneshot(). No acquire/release or locking is needed. +aio_bh_schedule_oneshot(). + +The main loop thread can wait synchronously for a condition using +AIO_WAIT_WHILE(). AioContext and the block layer -- @@ -124,22 +115,16 @@ Block layer code must therefore expect to run in an IOThread and avoid using old APIs that implicitly use the main loop. See the "How to program for IOThreads" above for information on how to do that. -If main loop code such as a QMP function wishes to access a BlockDriverState -it must first call aio_context_acquire(bdrv_get_aio_context(bs)) to ensure -that callbacks in the IOThread do not run in parallel. - Code running in the monitor typically needs to ensure that past requests from the guest are completed. When a block device is running in an IOThread, the IOThread can also process requests from the guest (via ioeventfd). To achieve both objects, wrap the code between bdrv_drained_begin() and bdrv_drained_end(), thus creating a "drained -section". The functions must be called between aio_context_acquire() -and aio_context_release(). You can freely release and re-acquire the -AioContext within a drained section. +section". -Long-running jobs (usually in the form of coroutines) are best scheduled in -the BlockDriverState's AioContext to avoid the need to acquire/release around -each bdrv_*() call. The functions bdrv_add/remove_aio_context_notifier, -or alternatively blk_add/remove_aio_context_notifier if you use BlockBackends, -can be used to get a notification whenever bdrv_try_change_aio_context() moves a +Long-running jobs (usually in the form of coroutines) are often scheduled in +the BlockDriverState's AioContext. The functions +bdrv_add/remove_aio_context_notifier, or alternatively +blk_add/remove_aio_context_notifier if you use BlockBackends, can be used to +get a notification whenever bdrv_try_change_aio_context() moves a BlockDriverState to a different AioContext. -- 2.42.0
[PATCH 08/12] aio: remove aio_context_acquire()/aio_context_release() API
Delete these functions because nothing calls these functions anymore. I introduced these APIs in commit 98563fc3ec44 ("aio: add aio_context_acquire() and aio_context_release()") in 2014. It's with a sigh of relief that I delete these APIs almost 10 years later. Thanks to Paolo Bonzini's vision for multi-queue QEMU, we got an understanding of where the code needed to go in order to remove the limitations that the original dataplane and the IOThread/AioContext approach that followed it. Emanuele Giuseppe Esposito had the splendid determination to convert large parts of the codebase so that they no longer needed the AioContext lock. This was a painstaking process, both in the actual code changes required and the iterations of code review that Emanuele eeked out of Kevin and me over many months. Kevin Wolf tackled multitudes of graph locking conversions to protect in-flight I/O from run-time changes to the block graph as well as the clang Thread Safety Analysis annotations that allow the compiler to check whether the graph lock is being used correctly. And me, well, I'm just here to add some pizzazz to the QEMU multi-queue block layer :). Thank you to everyone who helped with this effort, including Eric Blake, code reviewer extraordinaire, and others who I've forgotten to mention. Signed-off-by: Stefan Hajnoczi --- include/block/aio.h | 17 - util/async.c| 10 -- 2 files changed, 27 deletions(-) diff --git a/include/block/aio.h b/include/block/aio.h index f08b358077..af05512a7d 100644 --- a/include/block/aio.h +++ b/include/block/aio.h @@ -278,23 +278,6 @@ void aio_context_ref(AioContext *ctx); */ void aio_context_unref(AioContext *ctx); -/* Take ownership of the AioContext. If the AioContext will be shared between - * threads, and a thread does not want to be interrupted, it will have to - * take ownership around calls to aio_poll(). Otherwise, aio_poll() - * automatically takes care of calling aio_context_acquire and - * aio_context_release. - * - * Note that this is separate from bdrv_drained_begin/bdrv_drained_end. A - * thread still has to call those to avoid being interrupted by the guest. - * - * Bottom halves, timers and callbacks can be created or removed without - * acquiring the AioContext. - */ -void aio_context_acquire(AioContext *ctx); - -/* Relinquish ownership of the AioContext. */ -void aio_context_release(AioContext *ctx); - /** * aio_bh_schedule_oneshot_full: Allocate a new bottom half structure that will * run only once and as soon as possible. diff --git a/util/async.c b/util/async.c index dfd44ef612..460529057c 100644 --- a/util/async.c +++ b/util/async.c @@ -719,16 +719,6 @@ void aio_context_unref(AioContext *ctx) g_source_unref(>source); } -void aio_context_acquire(AioContext *ctx) -{ -/* TODO remove this function */ -} - -void aio_context_release(AioContext *ctx) -{ -/* TODO remove this function */ -} - QEMU_DEFINE_STATIC_CO_TLS(AioContext *, my_aiocontext) AioContext *qemu_get_current_aio_context(void) -- 2.42.0
[PATCH 00/12] aio: remove AioContext lock
This series removes the AioContext locking APIs from QEMU. aio_context_acquire() and aio_context_release() are currently only needed to support the locking discipline required by AIO_POLL_WHILE() (except for a stray user that I converted in Patch 1). AIO_POLL_WHILE() doesn't really need the AioContext lock anymore, so it's possible to remove the API. This is a nice simplification because the AioContext locking rules were sometimes tricky or underspecified, leading to many bugs of the years. This patch series removes these APIs across the codebase and cleans up the documentation/comments that refers to them. Patch 1 is a AioContext lock user I forgot to convert in my earlier SCSI conversion series. Patch 2 removes tests for the AioContext lock because they will no longer be needed when the lock is gone. Patches 3-9 remove the AioContext lock. These can be reviewed by categorizing the call sites into 1. places that take the lock because they call an API that requires the lock (ultimately AIO_POLL_WHILE()) and 2. places that take the lock to protect state. There should be no instances of case 2 left. If you see one, you've found a bug in this patch series! Patches 10-12 remove comments. Based-on: 20231123194931.171598-1-stefa...@redhat.com ("[PATCH 0/4] scsi: eliminate AioContext lock") Since SCSI needs to stop relying on the AioContext lock before we can remove the lock. Stefan Hajnoczi (12): virtio-scsi: replace AioContext lock with tmf_bh_lock tests: remove aio_context_acquire() tests aio: make aio_context_acquire()/aio_context_release() a no-op graph-lock: remove AioContext locking block: remove AioContext locking scsi: remove AioContext locking aio-wait: draw equivalence between AIO_WAIT_WHILE() and AIO_WAIT_WHILE_UNLOCKED() aio: remove aio_context_acquire()/aio_context_release() API docs: remove AioContext lock from IOThread docs scsi: remove outdated AioContext lock comment job: remove outdated AioContext locking comments block: remove outdated AioContext locking comments docs/devel/multiple-iothreads.txt| 45 ++-- include/block/aio-wait.h | 16 +- include/block/aio.h | 17 -- include/block/block-common.h | 3 - include/block/block-global-state.h | 9 +- include/block/block-io.h | 12 +- include/block/block_int-common.h | 2 - include/block/graph-lock.h | 21 +- include/block/snapshot.h | 2 - include/hw/virtio/virtio-scsi.h | 17 +- include/qemu/job.h | 20 -- block.c | 357 --- block/backup.c | 4 +- block/blklogwrites.c | 8 +- block/blkverify.c| 4 +- block/block-backend.c| 33 +-- block/commit.c | 16 +- block/copy-before-write.c| 22 +- block/export/export.c| 22 +- block/export/vhost-user-blk-server.c | 4 - block/graph-lock.c | 44 +--- block/io.c | 45 +--- block/mirror.c | 41 +-- block/monitor/bitmap-qmp-cmds.c | 20 +- block/monitor/block-hmp-cmds.c | 29 --- block/qapi-sysemu.c | 27 +- block/qapi.c | 18 +- block/qcow2.c| 4 +- block/quorum.c | 8 +- block/raw-format.c | 5 - block/replication.c | 72 +- block/snapshot.c | 26 +- block/stream.c | 12 +- block/vmdk.c | 20 +- block/write-threshold.c | 6 - blockdev.c | 315 +-- blockjob.c | 30 +-- hw/block/dataplane/virtio-blk.c | 10 - hw/block/dataplane/xen-block.c | 17 +- hw/block/virtio-blk.c| 45 +--- hw/core/qdev-properties-system.c | 9 - hw/scsi/scsi-bus.c | 2 - hw/scsi/scsi-disk.c | 29 +-- hw/scsi/virtio-scsi.c| 80 +++--- job.c| 16 -- migration/block.c| 33 +-- migration/migration-hmp-cmds.c | 3 - migration/savevm.c | 22 -- net/colo-compare.c | 2 - qemu-img.c | 4 - qemu-io.c| 10 +- qemu-nbd.c | 2 - replay/replay-debugging.c| 4 - tests/unit/test-aio.c| 67 + tests/unit/test-bdrv-drain.c | 91 ++- tests/unit/test-bdrv-graph-mod.c | 26 +- tests/unit/test-block-iothread.c | 31 --- tests/unit/test-blockjob.c | 137 -- tests/unit/test-replication.c| 11 - util/async.c | 14 -- util/vhost-user-server.c | 3 -
[PATCH 03/12] aio: make aio_context_acquire()/aio_context_release() a no-op
aio_context_acquire()/aio_context_release() has been replaced by fine-grained locking to protect state shared by multiple threads. The AioContext lock still plays the role of balancing locking in AIO_WAIT_WHILE() and many functions in QEMU either require that the AioContext lock is held or not held for this reason. In other words, the AioContext lock is purely there for consistency with itself and serves no real purpose anymore. Stop actually acquiring/releasing the lock in aio_context_acquire()/aio_context_release() so that subsequent patches can remove callers across the codebase incrementally. I have performed "make check" and qemu-iotests stress tests across x86-64, ppc64le, and aarch64 to confirm that there are no failures as a result of eliminating the lock. Signed-off-by: Stefan Hajnoczi --- util/async.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/util/async.c b/util/async.c index 8f90ddc304..04ee83d220 100644 --- a/util/async.c +++ b/util/async.c @@ -725,12 +725,12 @@ void aio_context_unref(AioContext *ctx) void aio_context_acquire(AioContext *ctx) { -qemu_rec_mutex_lock(>lock); +/* TODO remove this function */ } void aio_context_release(AioContext *ctx) { -qemu_rec_mutex_unlock(>lock); +/* TODO remove this function */ } QEMU_DEFINE_STATIC_CO_TLS(AioContext *, my_aiocontext) -- 2.42.0
[PATCH 01/12] virtio-scsi: replace AioContext lock with tmf_bh_lock
Protect the Task Management Function BH state with a lock. The TMF BH runs in the main loop thread. An IOThread might process a TMF at the same time as the TMF BH is running. Therefore tmf_bh_list and tmf_bh must be protected by a lock. Run TMF request completion in the IOThread using aio_wait_bh_oneshot(). This avoids more locking to protect the virtqueue and SCSI layer state. Signed-off-by: Stefan Hajnoczi --- include/hw/virtio/virtio-scsi.h | 3 +- hw/scsi/virtio-scsi.c | 62 ++--- 2 files changed, 43 insertions(+), 22 deletions(-) diff --git a/include/hw/virtio/virtio-scsi.h b/include/hw/virtio/virtio-scsi.h index 779568ab5d..da8cb928d9 100644 --- a/include/hw/virtio/virtio-scsi.h +++ b/include/hw/virtio/virtio-scsi.h @@ -85,8 +85,9 @@ struct VirtIOSCSI { /* * TMFs deferred to main loop BH. These fields are protected by - * virtio_scsi_acquire(). + * tmf_bh_lock. */ +QemuMutex tmf_bh_lock; QEMUBH *tmf_bh; QTAILQ_HEAD(, VirtIOSCSIReq) tmf_bh_list; diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c index 9c751bf296..4f8d35facc 100644 --- a/hw/scsi/virtio-scsi.c +++ b/hw/scsi/virtio-scsi.c @@ -123,6 +123,30 @@ static void virtio_scsi_complete_req(VirtIOSCSIReq *req) virtio_scsi_free_req(req); } +static void virtio_scsi_complete_req_bh(void *opaque) +{ +VirtIOSCSIReq *req = opaque; + +virtio_scsi_complete_req(req); +} + +/* + * Called from virtio_scsi_do_one_tmf_bh() in main loop thread. The main loop + * thread cannot touch the virtqueue since that could race with an IOThread. + */ +static void virtio_scsi_complete_req_from_main_loop(VirtIOSCSIReq *req) +{ +VirtIOSCSI *s = req->dev; + +if (!s->ctx || s->ctx == qemu_get_aio_context()) { +/* No need to schedule a BH when there is no IOThread */ +virtio_scsi_complete_req(req); +} else { +/* Run request completion in the IOThread */ +aio_wait_bh_oneshot(s->ctx, virtio_scsi_complete_req_bh, req); +} +} + static void virtio_scsi_bad_req(VirtIOSCSIReq *req) { virtio_error(VIRTIO_DEVICE(req->dev), "wrong size for virtio-scsi headers"); @@ -338,10 +362,7 @@ static void virtio_scsi_do_one_tmf_bh(VirtIOSCSIReq *req) out: object_unref(OBJECT(d)); - -virtio_scsi_acquire(s); -virtio_scsi_complete_req(req); -virtio_scsi_release(s); +virtio_scsi_complete_req_from_main_loop(req); } /* Some TMFs must be processed from the main loop thread */ @@ -354,18 +375,16 @@ static void virtio_scsi_do_tmf_bh(void *opaque) GLOBAL_STATE_CODE(); -virtio_scsi_acquire(s); +WITH_QEMU_LOCK_GUARD(>tmf_bh_lock) { +QTAILQ_FOREACH_SAFE(req, >tmf_bh_list, next, tmp) { +QTAILQ_REMOVE(>tmf_bh_list, req, next); +QTAILQ_INSERT_TAIL(, req, next); +} -QTAILQ_FOREACH_SAFE(req, >tmf_bh_list, next, tmp) { -QTAILQ_REMOVE(>tmf_bh_list, req, next); -QTAILQ_INSERT_TAIL(, req, next); +qemu_bh_delete(s->tmf_bh); +s->tmf_bh = NULL; } -qemu_bh_delete(s->tmf_bh); -s->tmf_bh = NULL; - -virtio_scsi_release(s); - QTAILQ_FOREACH_SAFE(req, , next, tmp) { QTAILQ_REMOVE(, req, next); virtio_scsi_do_one_tmf_bh(req); @@ -379,8 +398,7 @@ static void virtio_scsi_reset_tmf_bh(VirtIOSCSI *s) GLOBAL_STATE_CODE(); -virtio_scsi_acquire(s); - +/* Called after ioeventfd has been stopped, so tmf_bh_lock is not needed */ if (s->tmf_bh) { qemu_bh_delete(s->tmf_bh); s->tmf_bh = NULL; @@ -393,19 +411,19 @@ static void virtio_scsi_reset_tmf_bh(VirtIOSCSI *s) req->resp.tmf.response = VIRTIO_SCSI_S_TARGET_FAILURE; virtio_scsi_complete_req(req); } - -virtio_scsi_release(s); } static void virtio_scsi_defer_tmf_to_bh(VirtIOSCSIReq *req) { VirtIOSCSI *s = req->dev; -QTAILQ_INSERT_TAIL(>tmf_bh_list, req, next); +WITH_QEMU_LOCK_GUARD(>tmf_bh_lock) { +QTAILQ_INSERT_TAIL(>tmf_bh_list, req, next); -if (!s->tmf_bh) { -s->tmf_bh = qemu_bh_new(virtio_scsi_do_tmf_bh, s); -qemu_bh_schedule(s->tmf_bh); +if (!s->tmf_bh) { +s->tmf_bh = qemu_bh_new(virtio_scsi_do_tmf_bh, s); +qemu_bh_schedule(s->tmf_bh); +} } } @@ -1235,6 +1253,7 @@ static void virtio_scsi_device_realize(DeviceState *dev, Error **errp) Error *err = NULL; QTAILQ_INIT(>tmf_bh_list); +qemu_mutex_init(>tmf_bh_lock); virtio_scsi_common_realize(dev, virtio_scsi_handle_ctrl, @@ -1277,6 +1296,7 @@ static void virtio_scsi_device_unrealize(DeviceState *dev) qbus_set_hotplug_handler(BUS(>bus), NULL); virtio_scsi_common_unrealize(dev); +qemu_mutex_destroy(>tmf_bh_lock); } static Property virtio_scsi_properties[] = { -- 2.42.0
[PATCH 02/12] tests: remove aio_context_acquire() tests
The aio_context_acquire() API is being removed. Drop the test case that calls the API. Signed-off-by: Stefan Hajnoczi --- tests/unit/test-aio.c | 67 +-- 1 file changed, 1 insertion(+), 66 deletions(-) diff --git a/tests/unit/test-aio.c b/tests/unit/test-aio.c index 337b6e4ea7..e77d86be87 100644 --- a/tests/unit/test-aio.c +++ b/tests/unit/test-aio.c @@ -100,76 +100,12 @@ static void event_ready_cb(EventNotifier *e) /* Tests using aio_*. */ -typedef struct { -QemuMutex start_lock; -EventNotifier notifier; -bool thread_acquired; -} AcquireTestData; - -static void *test_acquire_thread(void *opaque) -{ -AcquireTestData *data = opaque; - -/* Wait for other thread to let us start */ -qemu_mutex_lock(>start_lock); -qemu_mutex_unlock(>start_lock); - -/* event_notifier_set might be called either before or after - * the main thread's call to poll(). The test case's outcome - * should be the same in either case. - */ -event_notifier_set(>notifier); -aio_context_acquire(ctx); -aio_context_release(ctx); - -data->thread_acquired = true; /* success, we got here */ - -return NULL; -} - static void set_event_notifier(AioContext *nctx, EventNotifier *notifier, EventNotifierHandler *handler) { aio_set_event_notifier(nctx, notifier, handler, NULL, NULL); } -static void dummy_notifier_read(EventNotifier *n) -{ -event_notifier_test_and_clear(n); -} - -static void test_acquire(void) -{ -QemuThread thread; -AcquireTestData data; - -/* Dummy event notifier ensures aio_poll() will block */ -event_notifier_init(, false); -set_event_notifier(ctx, , dummy_notifier_read); -g_assert(!aio_poll(ctx, false)); /* consume aio_notify() */ - -qemu_mutex_init(_lock); -qemu_mutex_lock(_lock); -data.thread_acquired = false; - -qemu_thread_create(, "test_acquire_thread", - test_acquire_thread, - , QEMU_THREAD_JOINABLE); - -/* Block in aio_poll(), let other thread kick us and acquire context */ -aio_context_acquire(ctx); -qemu_mutex_unlock(_lock); /* let the thread run */ -g_assert(aio_poll(ctx, true)); -g_assert(!data.thread_acquired); -aio_context_release(ctx); - -qemu_thread_join(); -set_event_notifier(ctx, , NULL); -event_notifier_cleanup(); - -g_assert(data.thread_acquired); -} - static void test_bh_schedule(void) { BHTestData data = { .n = 0 }; @@ -879,7 +815,7 @@ static void test_worker_thread_co_enter(void) qemu_thread_get_self(_thread); co = qemu_coroutine_create(co_check_current_thread, _thread); -qemu_thread_create(_thread, "test_acquire_thread", +qemu_thread_create(_thread, "test_aio_co_enter", test_aio_co_enter, co, QEMU_THREAD_JOINABLE); @@ -899,7 +835,6 @@ int main(int argc, char **argv) while (g_main_context_iteration(NULL, false)); g_test_init(, , NULL); -g_test_add_func("/aio/acquire", test_acquire); g_test_add_func("/aio/bh/schedule", test_bh_schedule); g_test_add_func("/aio/bh/schedule10", test_bh_schedule10); g_test_add_func("/aio/bh/cancel", test_bh_cancel); -- 2.42.0
Re: [PATCH v4 12/14] xen/asm-generic: introduce stub header softirq.h
Hi Oleksii, On 11/27/23 8:13 AM, Oleksii Kurochko wrote: > is common between Arm, PPC and RISC-V so it is > moved to asm-generic. > > Drop Arm and PPC's softirq.h and use asm-generic version instead. > Acked-by: Shawn Anastasio Thanks, Shawn
Re: [PATCH v4 11/14] xen/asm-generic: introduce stub header numa.h
On 11/27/23 8:13 AM, Oleksii Kurochko wrote: > diff --git a/xen/arch/arm/include/asm/numa.h b/xen/include/asm-generic/numa.h > similarity index 76% > rename from xen/arch/arm/include/asm/numa.h > rename to xen/include/asm-generic/numa.h > index e2bee2bd82..b00fca4978 100644 > --- a/xen/arch/arm/include/asm/numa.h > +++ b/xen/include/asm-generic/numa.h > @@ -1,12 +1,15 @@ > -#ifndef __ARCH_ARM_NUMA_H > -#define __ARCH_ARM_NUMA_H > +/* SPDX-License-Identifier: GPL-2.0-only */ > +#ifndef __ASM_GENERIC_NUMA_H > +#define __ASM_GENERIC_NUMA_H > > -#include > +#include > > -typedef u8 nodeid_t; > +typedef uint8_t nodeid_t; > > #ifndef CONFIG_NUMA > > +#include > + > /* Fake one node for now. See also node_online_map. */ > #define cpu_to_node(cpu) 0 > #define node_to_cpumask(node) (cpu_online_map) > @@ -26,7 +29,8 @@ extern mfn_t first_valid_mfn; Minor nit: in this part of the file (that wasn't included by git in this diff), there's a comment that references Arm: /* * TODO: make first_valid_mfn static when NUMA is supported on Arm, this * is required because the dummy helpers are using it. */ extern mfn_t first_valid_mfn; This should probably be changed to generic/GENERIC as you've done elsewhere in the series. In any case, Acked-by: Shawn Anastasio Thanks, Shawn
Re: [RFC PATCH v2 18/19] heki: x86: Protect guest kernel memory using the KVM hypervisor
On 11/27/23 14:03, Peter Zijlstra wrote: > On Mon, Nov 27, 2023 at 11:05:23AM -0600, Madhavan T. Venkataraman wrote: >> Apologies for the late reply. I was on vacation. Please see my response >> below: >> >> On 11/13/23 02:54, Peter Zijlstra wrote: >>> On Sun, Nov 12, 2023 at 09:23:25PM -0500, Mickaël Salaün wrote: From: Madhavan T. Venkataraman Implement a hypervisor function, kvm_protect_memory() that calls the KVM_HC_PROTECT_MEMORY hypercall to request the KVM hypervisor to set specified permissions on a list of guest pages. Using the protect_memory() function, set proper EPT permissions for all guest pages. Use the MEM_ATTR_IMMUTABLE property to protect the kernel static sections and the boot-time read-only sections. This enables to make sure a compromised guest will not be able to change its main physical memory page permissions. However, this also disable any feature that may change the kernel's text section (e.g., ftrace, Kprobes), but they can still be used on kernel modules. Module loading/unloading, and eBPF JIT is allowed without restrictions for now, but we'll need a way to authenticate these code changes to really improve the guests' security. We plan to use module signatures, but there is no solution yet to authenticate eBPF programs. Being able to use ftrace and Kprobes in a secure way is a challenge not solved yet. We're looking for ideas to make this work. Likewise, the JUMP_LABEL feature cannot work because the kernel's text section is read-only. >>> >>> What is the actual problem? As is the kernel text map is already RO and >>> never changed. >> >> For the JUMP_LABEL optimization, the text needs to be patched at some point. >> That patching requires a writable mapping of the text page at the time of >> patching. >> >> In this Heki feature, we currently lock down the kernel text at the end of >> kernel boot just before kicking off the init process. The lockdown is >> implemented by setting the permissions of a text page to R_X in the extended >> page table and not allowing write permissions in the EPT after that. So, >> jump label >> patching during kernel boot is not a problem. But doing it after kernel >> boot is a problem. > > But you see, that's exactly what the kernel already does with the normal > permissions. They get set to RX after init and are never changed. > > See the previous patch, we establish a read-write alias and write there. > > You seem to lack basic understanding of how the kernel works in this > regard, which makes me very nervous about you touching any of this. > > I must also say I really dislike your extra/random permssion calls all > over the place. They don't really get us anything afaict. Why can't you > plumb into the existing set_memory_*() family? I have responded to your comments on your other email. Please read my response there. Thanks. Madhavan
Re: [PATCH v4 10/14] xen/asm-generic: introduce stub header monitor.h
On 11/29/23 6:39 AM, Oleksii wrote: > Hi Shawn, > > On Tue, 2023-11-28 at 16:21 -0600, Shawn Anastasio wrote: >> Hi Oleksii, >> >> On 11/27/23 8:13 AM, Oleksii Kurochko wrote: >>> diff --git a/xen/arch/ppc/include/asm/Makefile >>> b/xen/arch/ppc/include/asm/Makefile >>> index 319e90955b..bcddcc181a 100644 >>> --- a/xen/arch/ppc/include/asm/Makefile >>> +++ b/xen/arch/ppc/include/asm/Makefile >>> @@ -5,6 +5,7 @@ generic-y += div64.h >>> generic-y += hardirq.h >>> generic-y += hypercall.h >>> generic-y += iocap.h >>> +generic-y += monitor.h >>> generic-y += paging.h >>> generic-y += percpu.h >>> generic-y += random.h >>> diff --git a/xen/arch/ppc/include/asm/monitor.h >>> b/xen/arch/ppc/include/asm/monitor.h >>> deleted file mode 100644 >>> index e5b0282bf1..00 >>> --- a/xen/arch/ppc/include/asm/monitor.h >>> +++ /dev/null >>> @@ -1,43 +0,0 @@ >>> -/* SPDX-License-Identifier: GPL-2.0-only */ >>> -/* Derived from xen/arch/arm/include/asm/monitor.h */ >>> -#ifndef __ASM_PPC_MONITOR_H__ >>> -#define __ASM_PPC_MONITOR_H__ >>> - >>> -#include >>> -#include >>> - >>> -static inline >>> -void arch_monitor_allow_userspace(struct domain *d, bool >>> allow_userspace) >>> -{ >>> -} >>> - >>> -static inline >>> -int arch_monitor_domctl_op(struct domain *d, struct >>> xen_domctl_monitor_op *mop) >>> -{ >>> -/* No arch-specific monitor ops on PPC. */ >>> -return -EOPNOTSUPP; >>> -} >>> - >>> -int arch_monitor_domctl_event(struct domain *d, >>> - struct xen_domctl_monitor_op *mop); >>> - >>> -static inline >>> -int arch_monitor_init_domain(struct domain *d) >>> -{ >>> -/* No arch-specific domain initialization on PPC. */ >>> -return 0; >>> -} >>> - >>> -static inline >>> -void arch_monitor_cleanup_domain(struct domain *d) >>> -{ >>> -/* No arch-specific domain cleanup on PPC. */ >>> -} >>> - >>> -static inline uint32_t arch_monitor_get_capabilities(struct domain >>> *d) >>> -{ >>> -BUG_ON("unimplemented"); >> >> I'm not sure how I feel about this assertion being dropped in the >> generic header. In general my philosophy when creating these stub >> headers was to insert as many of these assertions as possible so we >> don't end up in a scenario where during early bringup I miss a >> function >> that was originally stubbed but ought to be implemented eventually. >> >> Looking at ARM's monitor.h too, it seems that this function is the >> only >> one that differs from the generic/stub version. I'm wondering if it >> would make sense to leave this function out of the generic header, to >> be >> defined by the arch similar to what you've done with some other >> functions in this series. That would also allow ARM to be brought in >> to >> using the generic variant. > Thanks for the comment. > > For RISC-V, I did in the same way ( about BUG() and unimplemented for > time being functions ). > > I agree that such implementation isn't correct for generic header. I > think the best one solution will be to include > in whwere only this function will be implemented ( > because implementation of other functions are the same for Arm, PPC and > RISC-V ). > That approach sounds great to me. > ~ Oleksii Thanks, Shawn
Re: [PATCH v4 02/14] xen/asm-generic: introduce generic device.h
On 11/29/23 6:49 AM, Oleksii wrote: > On Tue, 2023-11-28 at 15:28 -0600, Shawn Anastasio wrote: >> Hi Oleksii, >> >> On 11/27/23 8:13 AM, Oleksii Kurochko wrote: >>> diff --git a/xen/arch/ppc/include/asm/irq.h >>> b/xen/arch/ppc/include/asm/irq.h >>> index 5c37d0cf25..49193fddff 100644 >>> --- a/xen/arch/ppc/include/asm/irq.h >>> +++ b/xen/arch/ppc/include/asm/irq.h >>> @@ -3,7 +3,9 @@ >>> #define __ASM_PPC_IRQ_H__ >>> >>> #include >>> +#ifdef CONFIG_HAS_DEVICE_TREE >> >> I realize that you were likely following PPC's device.h which also >> checks CONFIG_HAS_DEVICE_TREE, but I'm not sure that it makes sense >> to >> check this conditional in PPC code at all -- we will always have >> HAS_DEVICE_TREE (selected by PPC) and I can't imagine a scenario >> where >> this will ever not be the case. > What about case if ACPI is used? Does ACPI is supported by PPC? > > But if you are sure that CONFIG_HAS_DEVICE_TREE will be always selected > then I am OK to remove this change. > ACPI isn't supported by any PPC platform, we always use device tree. >> >> Unless Jan (or someone else) disagrees, I'd rather this conditional >> be >> dropped inside of PPC code. > I'll double check but I think I had a compilation issue if it isn't > defined. > I'm not encountering any issues locally with the conditional dropped, but if you are able to reproduce them then let me know. > Thanks. > > ~ Oleksii Thanks, Shawn
Re: [PATCH] xen/arm: Move static event channel feature to a separate module
On 29/11/2023 18:54, Julien Grall wrote: > > > Hi Michal, > > On 29/11/2023 18:41, Michal Orzel wrote: >> On 29/11/2023 18:17, Julien Grall wrote: >>> That said, I could settle on defining the two helpers in the *.c >>> directly because they are not meant to be used outside of a single *.c. >>> >>> Simarly... >>> diff --git a/xen/arch/arm/include/asm/static-evtchn.h b/xen/arch/arm/include/asm/static-evtchn.h new file mode 100644 index ..472673fae345 --- /dev/null +++ b/xen/arch/arm/include/asm/static-evtchn.h @@ -0,0 +1,41 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ + +#ifndef __ASM_STATIC_EVTCHN_H_ +#define __ASM_STATIC_EVTCHN_H_ + +#ifdef CONFIG_STATIC_EVTCHN + +#include + +#define STATIC_EVTCHN_NODE_SIZE_CELLS 2 >>> >>> ... this used to be defined in domain_build.c. AFAICT the only use is >>> now in static-evtchn.c. So why is it defined in the header? >>> >>> If this is moved in the *.c, then the header static-evtchn.h would only >>> contain alloc_static_evtchn(). This could be moved in domain_build.h or >>> setup.h (I don't have any preference). >> Apart from a prototype, we still need a stub. Therefore I would prefer to >> still have a header (will >> be needed for future upgrades e.g. port exposure in fdt) and move the >> prototype and a stub there (the macro >> I can move to *.c). It just looks better for me + we reduce ifdefery in >> domain_build/setup.h. >> Would you be ok with that? > > I don't much like headers containing just one prototype. But I can live > with them if you think more will be added. I can at least think of adding support for discovering static ports. Thank you for meeting me halfway. ~Michal
Re: [PATCH] .gitignore: generalize *.new
Hi Jan, On 28/11/2023 13:51, Jan Beulich wrote: It's not only in xen/include/xen/ that we generate (intermediate) *.new files. Signed-off-by: Jan Beulich Acked-by: Julien Grall Cheers, -- Julien Grall
Re: [PATCH] xen/arm: Move static event channel feature to a separate module
Hi Michal, On 29/11/2023 18:41, Michal Orzel wrote: On 29/11/2023 18:17, Julien Grall wrote: That said, I could settle on defining the two helpers in the *.c directly because they are not meant to be used outside of a single *.c. Simarly... diff --git a/xen/arch/arm/include/asm/static-evtchn.h b/xen/arch/arm/include/asm/static-evtchn.h new file mode 100644 index ..472673fae345 --- /dev/null +++ b/xen/arch/arm/include/asm/static-evtchn.h @@ -0,0 +1,41 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ + +#ifndef __ASM_STATIC_EVTCHN_H_ +#define __ASM_STATIC_EVTCHN_H_ + +#ifdef CONFIG_STATIC_EVTCHN + +#include + +#define STATIC_EVTCHN_NODE_SIZE_CELLS 2 ... this used to be defined in domain_build.c. AFAICT the only use is now in static-evtchn.c. So why is it defined in the header? If this is moved in the *.c, then the header static-evtchn.h would only contain alloc_static_evtchn(). This could be moved in domain_build.h or setup.h (I don't have any preference). Apart from a prototype, we still need a stub. Therefore I would prefer to still have a header (will be needed for future upgrades e.g. port exposure in fdt) and move the prototype and a stub there (the macro I can move to *.c). It just looks better for me + we reduce ifdefery in domain_build/setup.h. Would you be ok with that? I don't much like headers containing just one prototype. But I can live with them if you think more will be added. Cheers, -- Julien Grall
Re: [PATCH 5/5] xen/arm: address violations of MISRA C:2012 Rule 11.8
Hi, On 24/11/2023 17:29, Simone Ballarin wrote: From: Maria Celeste Cesario Add or amend casts to comply with Rule 11.8. The violations are resolved either: - by adding a missing const qualifier in the cast - by removing a cast to non-const on a const-qualified object No functional change. Signed-off-by: Maria Celeste Cesario Signed-off-by: Simone Ballarin --- xen/arch/arm/domain_build.c | 2 +- xen/arch/arm/include/asm/atomic.h | 2 +- xen/arch/arm/include/asm/regs.h | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c index 2dd2926b41..c17214f738 100644 --- a/xen/arch/arm/domain_build.c +++ b/xen/arch/arm/domain_build.c @@ -2749,7 +2749,7 @@ static int __init handle_passthrough_prop(struct kernel_info *kinfo, if ( node == NULL ) { printk(XENLOG_ERR "Couldn't find node %s in host_dt!\n", - (char *)xen_path->data); + xen_path->data); I am a little bit puzzled why the cast was originally added here. Stefano, do you remember? Also, this hunk will not apply on staging (the code has moved to dom0less-build.c). This will want a new version. return -EINVAL; } diff --git a/xen/arch/arm/include/asm/atomic.h b/xen/arch/arm/include/asm/atomic.h index 64314d59b3..517216d2a8 100644 --- a/xen/arch/arm/include/asm/atomic.h +++ b/xen/arch/arm/include/asm/atomic.h @@ -154,7 +154,7 @@ static always_inline void write_atomic_size(volatile void *p, */ static inline int atomic_read(const atomic_t *v) { -return *(volatile int *)>counter; +return *(const volatile int *)>counter; } static inline int _atomic_read(atomic_t v) diff --git a/xen/arch/arm/include/asm/regs.h b/xen/arch/arm/include/asm/regs.h index 8a0db95415..79050937f3 100644 --- a/xen/arch/arm/include/asm/regs.h +++ b/xen/arch/arm/include/asm/regs.h @@ -48,7 +48,7 @@ static inline bool regs_mode_is_32bit(const struct cpu_user_regs *regs) static inline bool guest_mode(const struct cpu_user_regs *r) { -unsigned long diff = (char *)guest_cpu_user_regs() - (char *)(r); +unsigned long diff = (char *)guest_cpu_user_regs() - (const char *)(r); NIT: I would take the opportunity to use 'const char*' for the first one as well. /* Frame pointer must point into current CPU stack. */ ASSERT(diff < STACK_SIZE); /* If not a guest frame, it must be a hypervisor frame. */ Cheers, -- Julien Grall
Re: [PATCH] xen/arm: Move static event channel feature to a separate module
Hi Julien, On 29/11/2023 18:17, Julien Grall wrote: > > > Hi Michal > > On 29/11/2023 16:34, Michal Orzel wrote: >> Move static event channel feature related code to a separate module >> (static-evtchn.{c,h}) in the spirit of fine granular configuration, so >> that the feature can be disabled if not needed. >> >> Introduce Kconfig option CONFIG_STATIC_EVTCHN, enabled by default (to >> keep the current behavior) dependent on CONFIG_DOM0LESS. While it could >> be possible to create a loopback connection for dom0 only, this use case >> does not really need this feature and all the docs and commit messages >> refer explicitly to the use in dom0less system. >> >> The only function visible externally is alloc_static_evtchn(), so move >> the prototype to static-evtchn.h and provide a stub in case a feature >> is disabled. Guard static_evtchn_created in struct dt_device_node and >> move its helpers to static-evtchn.h. > > I guess this is a matter of taste, but I think > dt_device_set_static_evtchn_created() and > dt_device_static_evtchn_created() are better suited in device_tree.h > because they are touching a field in the device tree structure. > > This would stay consistent with the helper dt_device_set_protected() > which is only used by the IOMMU code yet is defined in device_tree.h. Good point about consistency. I will keep the helpers in device_tree.h + add guards. > > That said, I could settle on defining the two helpers in the *.c > directly because they are not meant to be used outside of a single *.c. > > Simarly... > >> diff --git a/xen/arch/arm/include/asm/static-evtchn.h >> b/xen/arch/arm/include/asm/static-evtchn.h >> new file mode 100644 >> index ..472673fae345 >> --- /dev/null >> +++ b/xen/arch/arm/include/asm/static-evtchn.h >> @@ -0,0 +1,41 @@ >> +/* SPDX-License-Identifier: GPL-2.0-only */ >> + >> +#ifndef __ASM_STATIC_EVTCHN_H_ >> +#define __ASM_STATIC_EVTCHN_H_ >> + >> +#ifdef CONFIG_STATIC_EVTCHN >> + >> +#include >> + >> +#define STATIC_EVTCHN_NODE_SIZE_CELLS 2 > > ... this used to be defined in domain_build.c. AFAICT the only use is > now in static-evtchn.c. So why is it defined in the header? > > If this is moved in the *.c, then the header static-evtchn.h would only > contain alloc_static_evtchn(). This could be moved in domain_build.h or > setup.h (I don't have any preference). Apart from a prototype, we still need a stub. Therefore I would prefer to still have a header (will be needed for future upgrades e.g. port exposure in fdt) and move the prototype and a stub there (the macro I can move to *.c). It just looks better for me + we reduce ifdefery in domain_build/setup.h. Would you be ok with that? ~Michal
Re: [XEN PATCH v5 1/2] automation/eclair: make the docs for MISRA C:2012 Dir 4.1 visible to ECLAIR
Hi, On 29/11/2023 04:16, Stefano Stabellini wrote: On Fri, 17 Nov 2023, Nicola Vetrini wrote: To be able to check for the existence of the necessary subsections in the documentation for MISRA C:2012 Dir 4.1, ECLAIR needs to have a source file that is built. This file is generated from 'C-runtime-failures.rst' in docs/misra and the configuration is updated accordingly. Signed-off-by: Nicola Vetrini It looks like you also addressed all Julien's comments appropriately They are indeed. So I have committed the series. Cheers, -- Julien Grall
Re: [XEN PATCH v6] arm/bitops: encapsulate violation of MISRA C:2012 Rule 10.1
Hi Nicola, On 29/11/2023 10:32, Nicola Vetrini wrote: The definitions of ffs{l}? violate Rule 10.1, by using the well-known pattern (x & -x); its usage is wrapped by the ISOLATE_LSB macro. No functional change. Signed-off-by: Nicola Vetrini Reviewed-by: Stefano Stabellini This is now committed. Cheers, -- Julien Grall
Re: [PATCH] xen/arm: Move static event channel feature to a separate module
Hi Michal On 29/11/2023 16:34, Michal Orzel wrote: Move static event channel feature related code to a separate module (static-evtchn.{c,h}) in the spirit of fine granular configuration, so that the feature can be disabled if not needed. Introduce Kconfig option CONFIG_STATIC_EVTCHN, enabled by default (to keep the current behavior) dependent on CONFIG_DOM0LESS. While it could be possible to create a loopback connection for dom0 only, this use case does not really need this feature and all the docs and commit messages refer explicitly to the use in dom0less system. The only function visible externally is alloc_static_evtchn(), so move the prototype to static-evtchn.h and provide a stub in case a feature is disabled. Guard static_evtchn_created in struct dt_device_node and move its helpers to static-evtchn.h. I guess this is a matter of taste, but I think dt_device_set_static_evtchn_created() and dt_device_static_evtchn_created() are better suited in device_tree.h because they are touching a field in the device tree structure. This would stay consistent with the helper dt_device_set_protected() which is only used by the IOMMU code yet is defined in device_tree.h. That said, I could settle on defining the two helpers in the *.c directly because they are not meant to be used outside of a single *.c. Simarly... diff --git a/xen/arch/arm/include/asm/static-evtchn.h b/xen/arch/arm/include/asm/static-evtchn.h new file mode 100644 index ..472673fae345 --- /dev/null +++ b/xen/arch/arm/include/asm/static-evtchn.h @@ -0,0 +1,41 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ + +#ifndef __ASM_STATIC_EVTCHN_H_ +#define __ASM_STATIC_EVTCHN_H_ + +#ifdef CONFIG_STATIC_EVTCHN + +#include + +#define STATIC_EVTCHN_NODE_SIZE_CELLS 2 ... this used to be defined in domain_build.c. AFAICT the only use is now in static-evtchn.c. So why is it defined in the header? If this is moved in the *.c, then the header static-evtchn.h would only contain alloc_static_evtchn(). This could be moved in domain_build.h or setup.h (I don't have any preference). Cheers, -- Julien Grall
Re: We are not able to virtualize FreeBSD using xen 4.17 on Arm 32 bit
On Tue, Nov 28, 2023 at 03:04:50PM -0800, Elliott Mitchell wrote: > On Tue, Nov 28, 2023 at 04:10:50PM +0100, Roger Pau Monné wrote: > > On Tue, Nov 28, 2023 at 03:09:14PM +0100, Mario Marietto wrote: > > > For booting a FreeBSD kernel as a guest OS on XEN,should we install xen > > > 4.18 from source ? > > > I don't think so, I'm not aware of the FreeBSD port requiring a > > specific version of Xen. I do think the work is limited to aarch64 > > however, so there's no support in sight for arm32 FreeBSD guests as > > far as I'm aware. > > I've only ever tried arm64, but since arm32 didn't appear to need much > to have operational I tried to make it possible. In theory it /should/ > work on arm32, but I've never tried it. What was missing was I had never > added it to the configuration and one link was needed. Updated "submit" > branch has the tiny adjustment. > > (the only difference is the hypercall wrappers, register naming and where > the op code goes, very simple compatibility) Ugh, goof and send this before the final check completes... Appears a bit more work is needed to get this operational. Mainly need to figure out the Clang option to enable the key op code. Issue is the arm hypercall.h header and the HYPERVISOR_*() wrappers. I fear the traditional approach might be easier to get working. -- (\___(\___(\__ --=> 8-) EHM <=-- __/)___/)___/) \BS (| ehem+sig...@m5p.com PGP 87145445 |) / \_CS\ | _ -O #include O- _ | / _/ 8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445
Re: We are not able to virtualize FreeBSD using xen 4.17 on Arm 32 bit
Hello to everyone. I tried to use xen as a hypervisor instead of kvm + libvirt + virt-manager to boot FreeBSD on my ARM Chromebook where I have installed Devuan 5,since Stefano said : "That might work. libvirt + virt-manager with the xen accelerator might work on the ARM Chromebook. That's because as far as I know Xen integration in libvirt is done via linking to libxl directly and libxl is supported and working on ARM" Unfortunately something is not working properly. What I did has been to reboot the machine in xen,enable libvirtd & and virtlogd & and virt-manager &,but this is what happened : Traceback (most recent call last): File "/usr/lib/xen-4.17/bin/pygrub", line 884, in raise RuntimeError("Unable to find partition containing kernel") RuntimeError: Unable to find partition containing kernel I think it does not recognize the FreeBSD file system structure and its kernel. Libvirt seems to have been programmed to boot Linux,not FreeBSD. In Fact,I did the counterproof and it seems to be like this : According with this post : https://blog.roberthallam.org/2020/05/solving-unable-to-find-partition-containing-kernel/ I've created a file called menu.lst inside the boot directory of the image file called "debian.img",adding the following content inside : default 0 timeout 10 title Debian root (hd0,1) kernel /boot/vmlinux-6.1.59-stb-xen-cbe+ root=/dev/xvda initrd /boot/initrd.img-6.1.59-stb-xen-cbe+ and I tried again to boot the image using virt-manager. It gave this error again : root@devuan-bunsen:/mnt/zroot2/zroot2/OS/Chromebook/FreeBSD-guestOS/linux-xen/debian2/boot# 2023-11-29 15:21:09.266+: 2467: error : libxlDomainStartPerform:1256 : internal error: libxenlight failed to create new domain 'debian12' but giving a look inside the log file and I found this interesting situation ; Using to parse /boot/grub/menu.lst (B )0 [1;24r [m [?7h [?1h = [H [J [?1h = [1BpyGRUB version 0.6 [1B [0m lk [1B [0m x [0;7m Debian 12 [m [0m x [1B [0m x [72C [0m x [1B [0m x [72C [0m x [1B [0m x [72C [0m x [1B [0m x [72C [0m x [1B [0m x [72C [0m x [1B [0m x [72C [0m x [1B [0m x [72C [0m x [1B [0m mj [1B [70D [0m Use the ↑ and ↓ keys to select which entry is highlighted. [1B [58DPress enter to boot the selected OS, 'e' to edit the [1B [52Dcommands before booting, 'a' to modify the kernel arguments [1B [59Dbefore booting, or 'c' for a command line. [12A [26C [17B [68DWill boot selected entry in 10 seconds [?1h = [J [17A [73C [17B [68DWill boot selected entry in 9 seconds [?1h = [J [17A [73C [17B [68DWill boot selected entry in 8 seconds [?1h = [J [17A [73C [17B [68DWill boot selected entry in 7 seconds [?1h = [J [17A [73C [17B [68DWill boot selected entry in 6 seconds [?1h = [J [17A [73C [17B [68DWill boot selected entry in 5 seconds [?1h = [J [17A [73C [17B [68DWill boot selected entry in 4 seconds [?1h = [J [17A [73C [17B [68DWill boot selected entry in 3 seconds [?1h = [J [17A [73C [17B [68DWill boot selected entry in 2 seconds [?1h = [J [17A [73C [17B [68DWill boot selected entry in 1 seconds [?1l > [24;1H [?1l > so,it seems that it tried to boot,but for an unknown reason,it still gives the error. Anyway : My xen setup is not broken anymore ; Using Libvirt Linux seems to be able to boot,FreeBSD does not. Using the "raw" method of booting FreeBSD as domU could be another story that I will try soon. But before trying to compile the correct freebsd kernel that's recognized by xen using the Elliott Michell code,I need to understand well what's the procedure that will work. So below you can read what I will try to do : $ truncate -s 100G xenvm.img $ mdconfig -f xenvm.img -u 0 $ newfs /dev/md0 $ mount /dev/md0 /mnt $ git clone https://gitlab.com/ehem/freebsd-src.git $ cd freebsd-src $ make -DNO_MODULES KERNCONF=GENERIC TARGET=arm TARGET_ARCH=armv7 DESTDIR=/build buildkernel $ echo "/dev/xbd0 / ufs rw 1 1" > /mnt/etc/fstab $ nano /build/etc/ttys (add the line 'xc0 "/usr/libexec/getty Pc" xterm on secure") $ umount /build $ mdconfig -d -u 0 Do you see errors ? some missing ? ---> I've only ever tried arm64, but since arm32 didn't appear to need much to be operational I tried to make it possible. In theory it /should/ work on arm32, but I've never tried it. What was missing was I had never added it to the configuration and one link was needed. Updated "submit" branch has a tiny adjustment. I didn't understand how to apply the patch that's inside the "submit" branch. On Wed, Nov 29, 2023 at 12:05 AM Elliott Mitchell wrote: > On Tue, Nov 28, 2023 at 04:10:50PM +0100, Roger Pau Monné wrote: > > On Tue, Nov 28, 2023 at 03:09:14PM +0100, Mario Marietto wrote: > > > For booting a FreeBSD kernel as a guest OS on XEN,should we install xen > > > 4.18 from source ? > > > I don't think so, I'm not aware of the
[xen-unstable test] 183922: tolerable FAIL
flight 183922 xen-unstable real [real] http://logs.test-lab.xenproject.org/osstest/logs/183922/ Failures :-/ but no regressions. Tests which did not succeed, but are not blocking: test-amd64-amd64-xl-qemuu-win7-amd64 12 windows-install fail like 183860 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 183877 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 183877 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 183877 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 183877 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail like 183877 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 183877 test-armhf-armhf-libvirt 16 saverestore-support-checkfail like 183877 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 183877 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check fail like 183877 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 183877 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 183877 test-amd64-i386-libvirt 15 migrate-support-checkfail never pass test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail never pass test-amd64-i386-libvirt-xsm 15 migrate-support-checkfail never pass test-amd64-i386-xl-pvshim14 guest-start fail never pass test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail never pass test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit1 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit1 16 saverestore-support-checkfail never pass test-arm64-arm64-xl 15 migrate-support-checkfail never pass test-arm64-arm64-xl 16 saverestore-support-checkfail never pass test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 16 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass test-armhf-armhf-xl 15 migrate-support-checkfail never pass test-armhf-armhf-xl 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit2 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit2 16 saverestore-support-checkfail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass test-armhf-armhf-xl-arndale 15 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit1 15 migrate-support-checkfail never pass test-armhf-armhf-xl-credit1 16 saverestore-support-checkfail never pass test-amd64-amd64-libvirt 15 migrate-support-checkfail never pass test-amd64-i386-libvirt-raw 14 migrate-support-checkfail never pass test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail never pass test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail never pass test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail never pass test-arm64-arm64-xl-vhd 14 migrate-support-checkfail never pass test-arm64-arm64-xl-vhd 15 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 15 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 15 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 16 saverestore-support-checkfail never pass test-armhf-armhf-libvirt 15 migrate-support-checkfail never pass test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 14 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 15 saverestore-support-checkfail never pass version targeted for testing: xen 80c153c48b255bae61948827241c26671207cf4e baseline version: xen 80c153c48b255bae61948827241c26671207cf4e Last test of basis 183922 2023-11-29 01:52:19 Z0 days Testing same since (not found) 0 attempts jobs: build-amd64-xsm pass build-arm64-xsm pass
[PATCH] tools/xg: Fix potential memory leak in cpu policy getters/setters
They allocate two different hypercall buffers, but leak the first allocation if the second one failed due to an early return that bypasses cleanup. Remove the early exit and go through _post() instead. Invoking _post() is benign even if _pre() failed. Fixes: 6b85e427098c ('x86/sysctl: Implement XEN_SYSCTL_get_cpu_policy') Fixes: 60529dfeca14 ('x86/domctl: Implement XEN_DOMCTL_get_cpu_policy') Fixes: 14ba07e6f816 ('x86/domctl: Implement XEN_DOMCTL_set_cpumsr_policy') Signed-off-by: Alejandro Vallejo --- tools/libs/guest/xg_cpuid_x86.c | 86 +++-- 1 file changed, 39 insertions(+), 47 deletions(-) diff --git a/tools/libs/guest/xg_cpuid_x86.c b/tools/libs/guest/xg_cpuid_x86.c index 4497087daa..db5aebc815 100644 --- a/tools/libs/guest/xg_cpuid_x86.c +++ b/tools/libs/guest/xg_cpuid_x86.c @@ -136,20 +136,20 @@ static int get_system_cpu_policy(xc_interface *xch, uint32_t index, DECLARE_HYPERCALL_BOUNCE(msrs, *nr_msrs * sizeof(*msrs), XC_HYPERCALL_BUFFER_BOUNCE_OUT); -int ret; - -if ( xc_hypercall_bounce_pre(xch, leaves) || - xc_hypercall_bounce_pre(xch, msrs) ) -return -1; +int ret = -1; -sysctl.cmd = XEN_SYSCTL_get_cpu_policy; -sysctl.u.cpu_policy.index = index; -sysctl.u.cpu_policy.nr_leaves = *nr_leaves; -set_xen_guest_handle(sysctl.u.cpu_policy.leaves, leaves); -sysctl.u.cpu_policy.nr_msrs = *nr_msrs; -set_xen_guest_handle(sysctl.u.cpu_policy.msrs, msrs); - -ret = do_sysctl(xch, ); +if ( !xc_hypercall_bounce_pre(xch, leaves) && + !xc_hypercall_bounce_pre(xch, msrs) ) +{ +sysctl.cmd = XEN_SYSCTL_get_cpu_policy; +sysctl.u.cpu_policy.index = index; +sysctl.u.cpu_policy.nr_leaves = *nr_leaves; +set_xen_guest_handle(sysctl.u.cpu_policy.leaves, leaves); +sysctl.u.cpu_policy.nr_msrs = *nr_msrs; +set_xen_guest_handle(sysctl.u.cpu_policy.msrs, msrs); + +ret = do_sysctl(xch, ); +} xc_hypercall_bounce_post(xch, leaves); xc_hypercall_bounce_post(xch, msrs); @@ -174,20 +174,20 @@ static int get_domain_cpu_policy(xc_interface *xch, uint32_t domid, DECLARE_HYPERCALL_BOUNCE(msrs, *nr_msrs * sizeof(*msrs), XC_HYPERCALL_BUFFER_BOUNCE_OUT); -int ret; - -if ( xc_hypercall_bounce_pre(xch, leaves) || - xc_hypercall_bounce_pre(xch, msrs) ) -return -1; - -domctl.cmd = XEN_DOMCTL_get_cpu_policy; -domctl.domain = domid; -domctl.u.cpu_policy.nr_leaves = *nr_leaves; -set_xen_guest_handle(domctl.u.cpu_policy.leaves, leaves); -domctl.u.cpu_policy.nr_msrs = *nr_msrs; -set_xen_guest_handle(domctl.u.cpu_policy.msrs, msrs); +int ret = -1; -ret = do_domctl(xch, ); +if ( !xc_hypercall_bounce_pre(xch, leaves) && + !xc_hypercall_bounce_pre(xch, msrs) ) +{ +domctl.cmd = XEN_DOMCTL_get_cpu_policy; +domctl.domain = domid; +domctl.u.cpu_policy.nr_leaves = *nr_leaves; +set_xen_guest_handle(domctl.u.cpu_policy.leaves, leaves); +domctl.u.cpu_policy.nr_msrs = *nr_msrs; +set_xen_guest_handle(domctl.u.cpu_policy.msrs, msrs); + +ret = do_domctl(xch, ); +} xc_hypercall_bounce_post(xch, leaves); xc_hypercall_bounce_post(xch, msrs); @@ -214,32 +214,24 @@ int xc_set_domain_cpu_policy(xc_interface *xch, uint32_t domid, DECLARE_HYPERCALL_BOUNCE(msrs, nr_msrs * sizeof(*msrs), XC_HYPERCALL_BUFFER_BOUNCE_IN); -int ret; - -if ( err_leaf_p ) -*err_leaf_p = -1; -if ( err_subleaf_p ) -*err_subleaf_p = -1; -if ( err_msr_p ) -*err_msr_p = -1; +int ret = -1; -if ( xc_hypercall_bounce_pre(xch, leaves) ) -return -1; - -if ( xc_hypercall_bounce_pre(xch, msrs) ) -return -1; - -domctl.cmd = XEN_DOMCTL_set_cpu_policy; -domctl.domain = domid; -domctl.u.cpu_policy.nr_leaves = nr_leaves; -set_xen_guest_handle(domctl.u.cpu_policy.leaves, leaves); -domctl.u.cpu_policy.nr_msrs = nr_msrs; -set_xen_guest_handle(domctl.u.cpu_policy.msrs, msrs); domctl.u.cpu_policy.err_leaf = -1; domctl.u.cpu_policy.err_subleaf = -1; domctl.u.cpu_policy.err_msr = -1; -ret = do_domctl(xch, ); +if ( !xc_hypercall_bounce_pre(xch, leaves) && + !xc_hypercall_bounce_pre(xch, msrs) ) +{ +domctl.cmd = XEN_DOMCTL_set_cpu_policy; +domctl.domain = domid; +domctl.u.cpu_policy.nr_leaves = nr_leaves; +set_xen_guest_handle(domctl.u.cpu_policy.leaves, leaves); +domctl.u.cpu_policy.nr_msrs = nr_msrs; +set_xen_guest_handle(domctl.u.cpu_policy.msrs, msrs); + +ret = do_domctl(xch, ); +} xc_hypercall_bounce_post(xch, leaves); xc_hypercall_bounce_post(xch, msrs); -- 2.34.1
Re: [PATCH v8 2/2] xen/vpci: header: filter PCI capabilities
On 11/29/23 09:05, Roger Pau Monné wrote: > On Tue, Nov 28, 2023 at 02:44:25PM -0500, Stewart Hildebrand wrote: >> Currently, Xen vPCI only supports virtualizing the MSI and MSI-X >> capabilities. >> Hide all other PCI capabilities (including extended capabilities) from domUs >> for >> now, even though there may be certain devices/drivers that depend on being >> able >> to discover certain capabilities. >> >> We parse the physical PCI capabilities linked list and add vPCI register >> handlers for the next elements, inserting our own next value, thus >> presenting a >> modified linked list to the domU. >> >> Introduce helper functions vpci_hw_read8 and vpci_read_val. The vpci_read_val >> helper function returns a fixed value, which may be used for RAZ registers, >> or >^ read as zero I'll change it >> registers whose value doesn't change. >> >> Introduce pci_find_next_cap_ttl() helper while adapting the logic from >> pci_find_next_cap() to suit our needs, and implement the existing >> pci_find_next_cap() in terms of the new helper. >> >> Signed-off-by: Stewart Hildebrand > > LGTM, some nits below: > > Reviewed-by: Roger Pau Monné Thanks! > >> --- >> v7->v8: >> * use to array instead of match function >> * include lib.h for ARRAY_SIZE >> * don't emulate PCI_CAPABILITY_LIST register if PCI_STATUS_CAP_LIST bit is >> not >> set in hardware >> * spell out RAZ/WI acronym >> * dropped R-b tag since the patch has changed moderately since the last rev >> >> v6->v7: >> * no change >> >> v5->v6: >> * add register handlers before status register handler in init_bars() >> * s/header->mask_cap_list/mask_cap_list/ >> >> v4->v5: >> * use more appropriate types, continued >> * get rid of unnecessary hook function >> * add Jan's R-b >> >> v3->v4: >> * move mask_cap_list setting to this patch >> * leave pci_find_next_cap signature alone >> * use more appropriate types >> >> v2->v3: >> * get rid of > 0 in loop condition >> * implement pci_find_next_cap in terms of new pci_find_next_cap_ttl function >> so >> that hypothetical future callers wouldn't be required to pass >> * change NULL to (void *)0 for RAZ value passed to vpci_read_val >> * change type of ttl to unsigned int >> * remember to mask off the low 2 bits of next in the initial loop iteration >> * change return type of pci_find_next_cap and pci_find_next_cap_ttl >> * avoid wrapping the PCI_STATUS_CAP_LIST condition by using ! instead of == 0 >> >> v1->v2: >> * change type of ttl to int >> * use switch statement instead of if/else >> * adapt existing pci_find_next_cap helper instead of rolling our own >> * pass ttl as in/out >> * "pass through" the lower 2 bits of the next pointer >> * squash helper functions into this patch to avoid transient dead code >> situation >> * extended capabilities RAZ/WI >> --- >> xen/drivers/pci/pci.c | 31 --- >> xen/drivers/vpci/header.c | 63 +++ >> xen/drivers/vpci/vpci.c | 12 >> xen/include/xen/pci.h | 3 ++ >> xen/include/xen/vpci.h| 5 >> 5 files changed, 104 insertions(+), 10 deletions(-) >> >> diff --git a/xen/drivers/pci/pci.c b/xen/drivers/pci/pci.c >> index 3569ccb24e9e..1645b3118220 100644 >> --- a/xen/drivers/pci/pci.c >> +++ b/xen/drivers/pci/pci.c >> @@ -39,31 +39,42 @@ unsigned int pci_find_cap_offset(pci_sbdf_t sbdf, >> unsigned int cap) >> return 0; >> } >> >> -unsigned int pci_find_next_cap(pci_sbdf_t sbdf, unsigned int pos, >> - unsigned int cap) >> +unsigned int pci_find_next_cap_ttl(pci_sbdf_t sbdf, unsigned int pos, >> + unsigned int *cap, unsigned int n, >> + unsigned int *ttl) >> { >> -u8 id; >> -int ttl = 48; >> +unsigned int id, i; > > Nit: those can be defined inside the while loop. I'll move them > >> -while ( ttl-- ) >> +while ( (*ttl)-- ) >> { >> pos = pci_conf_read8(sbdf, pos); >> if ( pos < 0x40 ) >> break; >> >> -pos &= ~3; >> -id = pci_conf_read8(sbdf, pos + PCI_CAP_LIST_ID); >> +id = pci_conf_read8(sbdf, (pos & ~3) + PCI_CAP_LIST_ID); >> >> if ( id == 0xff ) >> break; >> -if ( id == cap ) >> -return pos; >> +for ( i = 0; i < n; i++ ) >> +{ >> +if ( id == cap[i] ) >> +return pos; >> +} >> >> -pos += PCI_CAP_LIST_NEXT; >> +pos = (pos & ~3) + PCI_CAP_LIST_NEXT; >> } >> + >> return 0; >> } >> >> +unsigned int pci_find_next_cap(pci_sbdf_t sbdf, unsigned int pos, >> + unsigned int cap) >> +{ >> +unsigned int ttl = 48; >> + >> +return pci_find_next_cap_ttl(sbdf, pos, , 1, ) & ~3; >> +} >> + >> /** >> * pci_find_ext_capability - Find an extended capability >> * @sbdf: PCI device to
Re: Important - Documentation Discussion
Hi all, After gathering different feedback from members of the community, please see below. *We will be moving ahead with Git + RST + Sphinx + Kroki as the new platforms for user documentation.* *Many of you have specified you don't necessarily have a preference for the platform, but instead a need for updated content. * Anticipated next steps: 1. Begin scoping how documentation will be split out (e.g. getting started/tutorials/how-tos) 2. Building out the new documentation platform 3. Migrate the existing/relevant content over to the new platform 4. Commit changes in Git 5. Use Sphinx to generate HTML or other formats needed from RST files 6. Align this to be launched on our new Xen Project webpage. (We may have the existing content visible whilst we work on revamping the documentation.) 7. Create smaller working groups to update and plan documentation, and update these periodically, before pushing to the new webpage. This will be a joint effort within the community, and I am hoping to count on our members to facilitate these changes. I'll be asking everyone for their help, but if you'd like to volunteer on any of the specific steps, please let me know. The smaller working groups are likely to split into teams concerning the subprojects, but this is open for discussion. Many thanks, Kelly Choi Xen Project Community Manager XenServer, Cloud Software Group On Fri, Nov 17, 2023 at 1:58 PM Kelly Choi wrote: > Hey Alejandro, > > Thanks for your feedback. > I'll consider all your points, and any other comments from the community > before proceeding on the next steps. > > If anyone else has any further ideas, please let me know *Friday 24th > November 2023.* > > Many thanks, > Kelly Choi > > Open Source Community Manager > XenServer, Cloud Software Group > > > On Wed, Nov 15, 2023 at 12:16 PM Alejandro Vallejo < > alejandro.vall...@cloud.com> wrote: > >> Hi, >> >> On Wed, Nov 15, 2023 at 11:43:46AM +, Kelly Choi wrote: >> > Hi all, >> > >> > As you may be aware, we are in the process of reviewing our existing >> > documentation platform and content. In order to meet the requirements of >> > the community and project, I need your input and feedback. >> > >> > The aim of the new documentation is to encourage community members to >> > collaborate in updating content (where possible) and educate users on >> how >> > the project works. The updated documentation will be hosted on our new >> > upcoming website. >> > >> > *Suggestion for user-orientated documentation:* >> > >> >- git - for hosting (gitlab recommended) >> >- RST - for the format of the documentation >> >- Sphynx - to generate web pages and PDF and other formats from RST >> In my experience Sphinx's strength is in its ability to cross-reference >> the >> code. That isn't something terribly helpful for user documentation, and it >> makes the whole thing harder to set up for no clear benefit. >> >> For user-facing docs I'd propose `mkdocs` instead, which is a lot more >> focused and simpler to set up (can be done literally in a couple of >> minutes). The main difference would be that it takes Markdown rather than >> RST[1]. It trivially supports plugins for interesting things, like mermaid >> (for sequence/block diagrams or FSMs) >> >> Main website: https://www.mkdocs.org/ >> Plugin catalog: https://github.com/mkdocs/catalog >> * mermaid plugin: https://github.com/fralau/mkdocs-mermaid2-plugin >> * kroki plugin: https://kroki.io/ >> >> [1] I happen to prefer Markdown, as I find it easier to read and write, >> but >> that's just personal preference >> >> > >> > *Suggestion for developer reference documentation:* >> > >> >- Doxygen >> >- Kerneldoc >> >- Open to other suggestions here >> There's 2 areas here. The format for the annotations, and the >> visualization >> frontend. They need not be the same. Using Doxygen seems the less >> "not-invented-here" approach, while kerneldoc would >> >> As for the frontend I would suggest to _not_ use Doxygen itself as the >> generated websites are hideous to look at. Sphinx (through Breathe) or any >> other Doxygen-database parse wouldr do the job as well providing a much >> (much) better output. >> >> > >> > Example of how documentation will be split: >> > >> >1. Getting started/beginner user guides >> >2. Learning orientated tutorials >> >3. Goal-orientated how-to guides >> >4. Developer related documentation >> (1-3) seem like pretty much the same thing. Guides of increasing >> complexity >> meant to train a new user/admin. Dividing such a section in those 3 blocks >> seems sensible. >> >> (4) could be split in a "Developer Manual", which would contain plain >> explanation for dev-heavy concepts, and a "Reference Manual" with links to >> the Doxygen-esque websites and a higher focus on implementation details. >> >> > >> > Side note: Whilst I appreciate everyone has a different vision of what >> > ideal documentation
[PATCH] xen/arm: Move static event channel feature to a separate module
Move static event channel feature related code to a separate module (static-evtchn.{c,h}) in the spirit of fine granular configuration, so that the feature can be disabled if not needed. Introduce Kconfig option CONFIG_STATIC_EVTCHN, enabled by default (to keep the current behavior) dependent on CONFIG_DOM0LESS. While it could be possible to create a loopback connection for dom0 only, this use case does not really need this feature and all the docs and commit messages refer explicitly to the use in dom0less system. The only function visible externally is alloc_static_evtchn(), so move the prototype to static-evtchn.h and provide a stub in case a feature is disabled. Guard static_evtchn_created in struct dt_device_node and move its helpers to static-evtchn.h. Signed-off-by: Michal Orzel --- This is a follow up to Luca's dom0less features modularization. I'm not sure what's the best strategy for static_evtchn_created helpers. We could also guard them in device_tree.h for ease of removal when we find a better way for handling the flag. --- xen/arch/arm/Kconfig | 8 ++ xen/arch/arm/Makefile| 1 + xen/arch/arm/domain_build.c | 146 - xen/arch/arm/include/asm/setup.h | 1 - xen/arch/arm/include/asm/static-evtchn.h | 41 ++ xen/arch/arm/setup.c | 1 + xen/arch/arm/static-evtchn.c | 159 +++ xen/include/xen/device_tree.h| 14 +- 8 files changed, 212 insertions(+), 159 deletions(-) create mode 100644 xen/arch/arm/include/asm/static-evtchn.h create mode 100644 xen/arch/arm/static-evtchn.c diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig index f73b62e50d00..50e9bfae1ac8 100644 --- a/xen/arch/arm/Kconfig +++ b/xen/arch/arm/Kconfig @@ -217,6 +217,14 @@ config STATIC_SHM help This option enables statically shared memory on a dom0less system. +config STATIC_EVTCHN + bool "Static event channel support on a dom0less system" + depends on DOM0LESS_BOOT + default y + help + This option enables establishing static event channel communication + between domains on a dom0less system (domU-domU as well as domU-dom0). + endmenu menu "ARM errata workaround via the alternative framework" diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile index 809772417c14..33c677672fe6 100644 --- a/xen/arch/arm/Makefile +++ b/xen/arch/arm/Makefile @@ -51,6 +51,7 @@ obj-y += setup.o obj-y += shutdown.o obj-y += smp.o obj-y += smpboot.o +obj-$(CONFIG_STATIC_EVTCHN) += static-evtchn.init.o obj-$(CONFIG_STATIC_MEMORY) += static-memory.init.o obj-$(CONFIG_STATIC_SHM) += static-shmem.init.o obj-y += sysctl.o diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c index df66fb88d8ec..613b2885cebb 100644 --- a/xen/arch/arm/domain_build.c +++ b/xen/arch/arm/domain_build.c @@ -38,8 +38,6 @@ #include #include -#define STATIC_EVTCHN_NODE_SIZE_CELLS 2 - static unsigned int __initdata opt_dom0_max_vcpus; integer_param("dom0_max_vcpus", opt_dom0_max_vcpus); @@ -1919,150 +1917,6 @@ void __init evtchn_allocate(struct domain *d) d->arch.hvm.params[HVM_PARAM_CALLBACK_IRQ] = val; } -static int __init get_evtchn_dt_property(const struct dt_device_node *np, - uint32_t *port, uint32_t *phandle) -{ -const __be32 *prop = NULL; -uint32_t len; - -prop = dt_get_property(np, "xen,evtchn", ); -if ( !prop ) -{ -printk(XENLOG_ERR "xen,evtchn property should not be empty.\n"); -return -EINVAL; -} - -if ( !len || len < dt_cells_to_size(STATIC_EVTCHN_NODE_SIZE_CELLS) ) -{ -printk(XENLOG_ERR "xen,evtchn property value is not valid.\n"); -return -EINVAL; -} - -*port = dt_next_cell(1, ); -*phandle = dt_next_cell(1, ); - -return 0; -} - -static int __init alloc_domain_evtchn(struct dt_device_node *node) -{ -int rc; -uint32_t domU1_port, domU2_port, remote_phandle; -struct dt_device_node *remote_node; -const struct dt_device_node *p1_node, *p2_node; -struct evtchn_alloc_unbound alloc_unbound; -struct evtchn_bind_interdomain bind_interdomain; -struct domain *d1 = NULL, *d2 = NULL; - -if ( !dt_device_is_compatible(node, "xen,evtchn-v1") ) -return 0; - -/* - * Event channel is already created while parsing the other side of - * evtchn node. - */ -if ( dt_device_static_evtchn_created(node) ) -return 0; - -rc = get_evtchn_dt_property(node, _port, _phandle); -if ( rc ) -return rc; - -remote_node = dt_find_node_by_phandle(remote_phandle); -if ( !remote_node ) -{ -printk(XENLOG_ERR -"evtchn: could not find remote evtchn phandle\n"); -return -EINVAL; -} - -rc = get_evtchn_dt_property(remote_node, _port, _phandle); -if ( rc ) -return rc; - -if (
[xen-unstable-smoke test] 183934: tolerable all pass - PUSHED
flight 183934 xen-unstable-smoke real [real] http://logs.test-lab.xenproject.org/osstest/logs/183934/ Failures :-/ but no regressions. Tests which did not succeed, but are not blocking: test-amd64-amd64-libvirt 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 16 saverestore-support-checkfail never pass test-armhf-armhf-xl 15 migrate-support-checkfail never pass test-armhf-armhf-xl 16 saverestore-support-checkfail never pass version targeted for testing: xen 902377b690f42ddf44ae91c4b0751d597f1cd694 baseline version: xen 80c153c48b255bae61948827241c26671207cf4e Last test of basis 183851 2023-11-24 09:03:53 Z5 days Failing since183871 2023-11-27 14:00:26 Z2 days 11 attempts Testing same since 183934 2023-11-29 11:04:12 Z0 days1 attempts People who touched revisions under test: Alejandro Vallejo Andrew Cooper Federico Serafini Frediano Ziglio Jan Beulich Julien Grall Luca Fancellu Maria Celeste Cesario Maria Celeste Cesario Michal Orzel Nicola Vetrini Oleksii Kurochko Roger Pau Monne Roger Pau Monné Shawn Anastasio Simone Ballarin Simone Ballarin Stefano Stabellini Tamas K Lengyel jobs: build-arm64-xsm pass build-amd64 pass build-armhf pass build-amd64-libvirt pass test-armhf-armhf-xl pass test-arm64-arm64-xl-xsm pass test-amd64-amd64-xl-qemuu-debianhvm-amd64pass test-amd64-amd64-libvirt pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : To xenbits.xen.org:/home/xen/git/xen.git 80c153c48b..902377b690 902377b690f42ddf44ae91c4b0751d597f1cd694 -> smoke
Re: [XEN PATCH 0/7] address some violations of MISRA C Rule 8.4
On 2023-11-29 16:24, Nicola Vetrini wrote: Hi all, this series addresses some of the remaining violations of MISRA C:2012 Rule 8.4. Some of the modifications are done according to the feedback received in this thread [1] missing a reference: [1] https://lore.kernel.org/xen-devel/51df0275d0988af618c22adb8d551...@bugseng.com/T/#t -- Nicola Vetrini, BSc Software Engineer, BUGSENG srl (https://bugseng.com)
[XEN PATCH 7/7] xen/page_alloc: deviate first_valid_mfn for MISRA C Rule 8.4
No functional change. Signed-off-by: Nicola Vetrini --- The preferred way to deviate is to use asmlinkage, but this modification is only the consequence of NUMA on ARM (and possibly PPC) being a work in progress. As stated in the comment above the textual deviation, first_valid_mfn will likely then become static and there would be no need for the comment anymore. This works towards having the analysis for this rule clean (i.e. no violations); the interest in having a clean rule is that then it could be used to signal newly introduced violations by making the analysis job fail. --- xen/common/page_alloc.c | 1 + 1 file changed, 1 insertion(+) diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c index 9b5df74fddab..794d7689b179 100644 --- a/xen/common/page_alloc.c +++ b/xen/common/page_alloc.c @@ -258,6 +258,7 @@ static PAGE_LIST_HEAD(page_broken_list); * first_valid_mfn is exported because it is use in ARM specific NUMA * helpers. See comment in arch/arm/include/asm/numa.h. */ +/* SAF-1-safe */ mfn_t first_valid_mfn = INVALID_MFN_INITIALIZER; struct bootmem_region { -- 2.34.1
[XEN PATCH 2/7] x86/i8259: add missing header for init_IRQ declaration
No functional change. Signed-off-by: Nicola Vetrini --- xen/arch/x86/i8259.c | 1 + 1 file changed, 1 insertion(+) diff --git a/xen/arch/x86/i8259.c b/xen/arch/x86/i8259.c index e0fa1f96b4f2..470d690c3594 100644 --- a/xen/arch/x86/i8259.c +++ b/xen/arch/x86/i8259.c @@ -19,6 +19,7 @@ #include #include #include +#include #include #include -- 2.34.1
[XEN PATCH 3/7] xen/x86: add missing instances of asmlinkage attributes
No functional change. Signed-off-by: Nicola Vetrini --- xen/arch/x86/desc.c | 2 +- xen/arch/x86/efi/efi-boot.h | 5 +++-- xen/arch/x86/mm.c | 2 +- xen/arch/x86/smpboot.c | 2 +- 4 files changed, 6 insertions(+), 5 deletions(-) diff --git a/xen/arch/x86/desc.c b/xen/arch/x86/desc.c index 39080ca67211..b332adedf28e 100644 --- a/xen/arch/x86/desc.c +++ b/xen/arch/x86/desc.c @@ -91,7 +91,7 @@ seg_desc_t boot_compat_gdt[PAGE_SIZE / sizeof(seg_desc_t)] = * References boot_cpu_gdt_table for a short period, until the CPUs switch * onto their per-CPU GDTs. */ -const struct desc_ptr boot_gdtr = { +const struct desc_ptr asmlinkage boot_gdtr = { .limit = LAST_RESERVED_GDT_BYTE, .base = (unsigned long)(boot_gdt - FIRST_RESERVED_GDT_ENTRY), }; diff --git a/xen/arch/x86/efi/efi-boot.h b/xen/arch/x86/efi/efi-boot.h index 86467da301e5..8ea64e31cdc2 100644 --- a/xen/arch/x86/efi/efi-boot.h +++ b/xen/arch/x86/efi/efi-boot.h @@ -808,8 +808,9 @@ static const char *__init get_option(const char *cmd, const char *opt) return o; } -void __init efi_multiboot2(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable, - const char *cmdline) +void asmlinkage __init efi_multiboot2(EFI_HANDLE ImageHandle, + EFI_SYSTEM_TABLE *SystemTable, + const char *cmdline) { EFI_GRAPHICS_OUTPUT_PROTOCOL *gop; EFI_HANDLE gop_handle; diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c index 0a66db10b959..9ce21c443bea 100644 --- a/xen/arch/x86/mm.c +++ b/xen/arch/x86/mm.c @@ -143,7 +143,7 @@ /* Mapping of the fixmap space needed early. */ l1_pgentry_t __section(".bss.page_aligned") __aligned(PAGE_SIZE) l1_fixmap[L1_PAGETABLE_ENTRIES]; -l1_pgentry_t __section(".bss.page_aligned") __aligned(PAGE_SIZE) +l1_pgentry_t asmlinkage __section(".bss.page_aligned") __aligned(PAGE_SIZE) l1_fixmap_x[L1_PAGETABLE_ENTRIES]; bool __read_mostly machine_to_phys_mapping_valid; diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c index 4c54ecbc91d7..8aa621533f3d 100644 --- a/xen/arch/x86/smpboot.c +++ b/xen/arch/x86/smpboot.c @@ -310,7 +310,7 @@ static void set_cpu_sibling_map(unsigned int cpu) } } -void start_secondary(void *unused) +void asmlinkage start_secondary(void *unused) { struct cpu_info *info = get_cpu_info(); -- 2.34.1
[XEN PATCH 1/7] xen/arm: mmu: add headers for missing declarations
The definitions needing the inclusion of asm/setup.h are boot_{first,second,third}(_id)?, whereas vmap.h is needed by arch_vmap_virt_end. No functional change. Signed-off-by: Nicola Vetrini --- xen/arch/arm/mmu/setup.c | 1 + xen/arch/arm/mmu/smpboot.c | 2 ++ 2 files changed, 3 insertions(+) diff --git a/xen/arch/arm/mmu/setup.c b/xen/arch/arm/mmu/setup.c index a5a9b538ff13..d5264e51bc44 100644 --- a/xen/arch/arm/mmu/setup.c +++ b/xen/arch/arm/mmu/setup.c @@ -8,6 +8,7 @@ #include #include #include +#include #include diff --git a/xen/arch/arm/mmu/smpboot.c b/xen/arch/arm/mmu/smpboot.c index 12f1a5d761e7..b6fc0aae07f1 100644 --- a/xen/arch/arm/mmu/smpboot.c +++ b/xen/arch/arm/mmu/smpboot.c @@ -7,6 +7,8 @@ #include +#include + /* * Static start-of-day pagetables that we use before the allocators * are up. These are used by all CPUs during bringup before switching -- 2.34.1
[XEN PATCH 6/7] xen/x86: remove stale comment
The comment referred to the declaration for do_mca, which now is part of hypercall-defs.h, therefore the comment is stale. No functional change. Signed-off-by: Nicola Vetrini --- xen/arch/x86/cpu/mcheck/mce.c| 2 +- xen/arch/x86/include/asm/hypercall.h | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/xen/arch/x86/cpu/mcheck/mce.c b/xen/arch/x86/cpu/mcheck/mce.c index 779a458cd88f..53493c8e4778 100644 --- a/xen/arch/x86/cpu/mcheck/mce.c +++ b/xen/arch/x86/cpu/mcheck/mce.c @@ -14,7 +14,7 @@ #include #include #include -#include /* for do_mca */ +#include #include #include diff --git a/xen/arch/x86/include/asm/hypercall.h b/xen/arch/x86/include/asm/hypercall.h index ec2edc771e9d..76658fff19ff 100644 --- a/xen/arch/x86/include/asm/hypercall.h +++ b/xen/arch/x86/include/asm/hypercall.h @@ -12,7 +12,7 @@ #include #include #include -#include /* for do_mca */ +#include #include #define __HYPERVISOR_paging_domctl_cont __HYPERVISOR_arch_1 -- 2.34.1
[XEN PATCH 5/7] docs/misra: add entry to exclude-list.json
x86/efi/check.c is not part of the final Xen binary, therefore it doesn't need to conform to MISRA guidelines at the moment. Signed-off-by: Nicola Vetrini --- docs/misra/exclude-list.json | 4 1 file changed, 4 insertions(+) diff --git a/docs/misra/exclude-list.json b/docs/misra/exclude-list.json index b858a0baa106..24e4de3ca524 100644 --- a/docs/misra/exclude-list.json +++ b/docs/misra/exclude-list.json @@ -93,6 +93,10 @@ "rel_path": "arch/x86/x86_64/mmconf-fam10h.c", "comment": "Imported from Linux, ignore for now" }, +{ +"rel_path": "arch/x86/efi/check.c", +"comment": "The resulting code is not included in the final Xen binary, ignore for now" +}, { "rel_path": "common/bitmap.c", "comment": "Imported from Linux, ignore for now" -- 2.34.1
[XEN PATCH 4/7] x86/viridian: make build_assertions static
This is consistent with other instances of the same function and also resolves a violation of MISRA C:2012 Rule 8.4. No functional change. Signed-off-by: Nicola Vetrini --- xen/arch/x86/hvm/viridian/synic.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/xen/arch/x86/hvm/viridian/synic.c b/xen/arch/x86/hvm/viridian/synic.c index 8cf600cec68f..3375e55e95ca 100644 --- a/xen/arch/x86/hvm/viridian/synic.c +++ b/xen/arch/x86/hvm/viridian/synic.c @@ -18,7 +18,7 @@ #include "private.h" -void __init __maybe_unused build_assertions(void) +static void __init __maybe_unused build_assertions(void) { BUILD_BUG_ON(sizeof(struct hv_message) != HV_MESSAGE_SIZE); } -- 2.34.1
[XEN PATCH 0/7] address some violations of MISRA C Rule 8.4
Hi all, this series addresses some of the remaining violations of MISRA C:2012 Rule 8.4. Some of the modifications are done according to the feedback received in this thread [1] Nicola Vetrini (7): xen/arm: mmu: add headers for missing declarations x86/i8259: add missing header for init_IRQ declaration xen/x86: add missing instances of asmlinkage attributes x86/viridian: make build_assertions static docs/misra: add entry to exclude-list.json xen/x86: remove stale comment xen/page_alloc: deviate first_valid_mfn for MISRA C Rule 8.4 docs/misra/exclude-list.json | 4 xen/arch/arm/mmu/setup.c | 1 + xen/arch/arm/mmu/smpboot.c | 2 ++ xen/arch/x86/cpu/mcheck/mce.c| 2 +- xen/arch/x86/desc.c | 2 +- xen/arch/x86/efi/efi-boot.h | 5 +++-- xen/arch/x86/hvm/viridian/synic.c| 2 +- xen/arch/x86/i8259.c | 1 + xen/arch/x86/include/asm/hypercall.h | 2 +- xen/arch/x86/mm.c| 2 +- xen/arch/x86/smpboot.c | 2 +- xen/common/page_alloc.c | 1 + 12 files changed, 18 insertions(+), 8 deletions(-) -- 2.34.1
Re: [PATCH v8 1/2] xen/vpci: header: status register handler
On 11/29/23 06:03, Roger Pau Monné wrote: > On Tue, Nov 28, 2023 at 02:44:24PM -0500, Stewart Hildebrand wrote: >> Introduce a handler for the PCI status register, with ability to mask >> the capabilities bit. The status register contains RsvdZ bits, >> read-only bits, and write-1-to-clear bits. Additionally, we use RsvdP to >> mask the capabilities bit. Introduce bitmasks to handle these in vPCI. >> If a bit in the bitmask is set, then the special meaning applies: >> >> ro_mask: read normal, guest write ignore (preserve on write to hardware) >> rw1c_mask: read normal, write 1 to clear >> rsvdp_mask: read as zero, guest write ignore (preserve on write to >> hardware) >> rsvdz_mask: read as zero, guest write ignore (write zero to hardware) >> >> The RO/RW1C/RsvdP/RsvdZ naming and definitions were borrowed from the >> PCI Express Base 6.1 specification. RsvdP/RsvdZ bits help Xen enforce >> our view of the world. Xen preserves the value of read-only bits on >> write to hardware, discarding the guests write value. This is done in >> case hardware wrongly implements R/O bits as R/W. >> >> The mask_cap_list flag will be set in a follow-on patch. >> >> Signed-off-by: Stewart Hildebrand > > Thanks for adding the tests, this is looking very good, just a couple > of cosmetics comments mostly, and a question whether we should refuse > masks that have bit set outside the register size instead of > attempting to adjust them. > >> --- >> v7->v8: >> * move PCI_STATUS_UDF to rsvdz_mask (per PCI Express Base 6 spec) >> * add support for rsvdp bits >> * add tests for ro/rw1c/rsvdp/rsvdz bits in tools/tests/vpci/main.c >> * dropped R-b tag [1] since the patch has changed moderately since the last >> rev >> >> [1] >> https://lists.xenproject.org/archives/html/xen-devel/2023-09/msg00909.html >> >> v6->v7: >> * re-work args passed to vpci_add_register_mask() (called in init_bars()) >> * also check for overlap of (rsvdz_mask & ro_mask) in add_register() >> * slightly adjust masking operation in vpci_write_helper() >> >> v5->v6: >> * remove duplicate PCI_STATUS_CAP_LIST in constant definition >> * style fixup in constant definitions >> * s/res_mask/rsvdz_mask/ >> * combine a new masking operation into single line >> * preserve r/o bits on write >> * get rid of status_read. Instead, use rsvdz_mask for conditionally masking >> PCI_STATUS_CAP_LIST bit >> * add comment about PCI_STATUS_CAP_LIST and rsvdp behavior >> * add sanity checks in add_register >> * move mask_cap_list from struct vpci_header to local variable >> >> v4->v5: >> * add support for res_mask >> * add support for ro_mask (squash ro_mask patch) >> * add constants for reserved, read-only, and rw1c masks >> >> v3->v4: >> * move mask_cap_list setting to the capabilities patch >> * single pci_conf_read16 in status_read >> * align mask_cap_list bitfield in struct vpci_header >> * change to rw1c bit mask instead of treating whole register as rw1c >> * drop subsystem prefix on renamed add_register function >> >> v2->v3: >> * new patch >> --- >> tools/tests/vpci/main.c| 98 ++ >> xen/drivers/vpci/header.c | 12 + >> xen/drivers/vpci/vpci.c| 62 +++- >> xen/include/xen/pci_regs.h | 9 >> xen/include/xen/vpci.h | 9 >> 5 files changed, 178 insertions(+), 12 deletions(-) >> >> diff --git a/tools/tests/vpci/main.c b/tools/tests/vpci/main.c >> index b9a0a6006bb9..b0bb993be297 100644 >> --- a/tools/tests/vpci/main.c >> +++ b/tools/tests/vpci/main.c >> @@ -70,6 +70,26 @@ static void vpci_write32(const struct pci_dev *pdev, >> unsigned int reg, >> *(uint32_t *)data = val; >> } >> >> +struct mask_data { >> +uint32_t val; >> +uint32_t rw1c_mask; >> +}; >> + >> +static uint32_t vpci_read32_mask(const struct pci_dev *pdev, unsigned int >> reg, >> + void *data) >> +{ >> +struct mask_data *md = data; > > Newline, and possibly const for md. Will do, and will do > >> +return md->val; >> +} >> + >> +static void vpci_write32_mask(const struct pci_dev *pdev, unsigned int reg, >> + uint32_t val, void *data) >> +{ >> +struct mask_data *md = data; > > Newline. Will do > >> +md->val = val | (md->val & md->rw1c_mask); >> +md->val &= ~(val & md->rw1c_mask); >> +} >> + >> #define VPCI_READ(reg, size, data) ({ \ >> data = vpci_read((pci_sbdf_t){ .sbdf = 0 }, reg, size); \ >> }) >> @@ -94,9 +114,20 @@ static void vpci_write32(const struct pci_dev *pdev, >> unsigned int reg, >> assert(!vpci_add_register(test_pdev.vpci, fread, fwrite, off, size, >> \ >>)) >> >> +#define VPCI_ADD_REG_MASK(fread, fwrite, off, size, store, >>\ >> + ro_mask, rw1c_mask, rsvdp_mask, rsvdz_mask) >>\ >> +assert(!vpci_add_register_mask(test_pdev.vpci, fread, fwrite, off, >> size,
xen | Successful pipeline for staging | 902377b6
Pipeline #1089097255 has passed! Project: xen ( https://gitlab.com/xen-project/xen ) Branch: staging ( https://gitlab.com/xen-project/xen/-/commits/staging ) Commit: 902377b6 ( https://gitlab.com/xen-project/xen/-/commit/902377b690f42ddf44ae91c4b0751d597f1cd694 ) Commit Message: xen/livepatch: fix livepatch tests The current... Commit Author: Roger Pau Monne Committed by: Andrew Cooper ( https://gitlab.com/andyhhp ) Pipeline #1089097255 ( https://gitlab.com/xen-project/xen/-/pipelines/1089097255 ) triggered by Ganis ( https://gitlab.com/ganis ) successfully completed 129 jobs in 3 stages. -- You're receiving this email because of your account on gitlab.com.
Re: [PATCH] Config.mk: drop -Wdeclaration-after-statement
On 29.11.23 11:47, Julien Grall wrote: Hi, + Anthony for the tools + Juergen for Xenstored On 29/11/2023 11:34, Alexander Kanavin wrote: On 11/29/23 08:51, Jan Beulich wrote: On 28.11.2023 18:47, Alexander Kanavin wrote: Such constructs are fully allowed by C99: https://gcc.gnu.org/onlinedocs/gcc-12.2.0/gcc/Mixed-Labels-and-Declarations.html#Mixed-Labels-and-Declarations There's more to this: It may also be a policy of ours (or of any sub-component) to demand that declarations and statements are properly separated. This would therefore need discussing first. The error is coming from python 3.12 headers and not from anything in xen tree, no? As you don't have control over those headers, I'm not sure what other solution there could be. We seem to add -Wno-declaration-after-statement for some components in tools/. So one possibility would be to move the flags to an hypervisor specific makefile (in xen/). Anthony/Juergen, do you have any concern if the tools are built without -Wdeclaration-after-statement? I could live with that. Juergen OpenPGP_0xB0DE9DD628BF132F.asc Description: OpenPGP public key OpenPGP_signature.asc Description: OpenPGP digital signature
Re: [PATCH] xen/livepatch: fix livepatch tests
On Wed, Nov 29, 2023 at 10:32:32AM +, Ross Lagerwall wrote: > On Tue, Nov 28, 2023 at 5:41 PM Roger Pau Monne wrote: > > > > The current set of in-tree livepatch tests in xen/test/livepatch started > > failing after the constify of the payload funcs array, and the movement of > > the > > status data into a separate array. > > > > Fix the tests so they respect the constness of the funcs array and also make > > use of the new location of the per-func state data. > > > > Fixes: 82182ad7b46e ('livepatch: do not use .livepatch.funcs section to > > store internal state') > > Signed-off-by: Roger Pau Monné > > --- > > I will see about getting those tests build in gitlab, in the meantime we > > should > > take this fix in order to unblock osstest. > > --- > > xen/test/livepatch/xen_action_hooks.c | 12 +- > > xen/test/livepatch/xen_action_hooks_marker.c | 20 ++--- > > xen/test/livepatch/xen_action_hooks_noapply.c | 22 +++ > > xen/test/livepatch/xen_action_hooks_nofunc.c | 6 ++--- > > .../livepatch/xen_action_hooks_norevert.c | 22 +++ > > xen/test/livepatch/xen_prepost_hooks.c| 8 +++ > > xen/test/livepatch/xen_prepost_hooks_fail.c | 2 +- > > 7 files changed, 53 insertions(+), 39 deletions(-) > > > snip > > diff --git a/xen/test/livepatch/xen_action_hooks_norevert.c > > b/xen/test/livepatch/xen_action_hooks_norevert.c > > index ef77e720713e..1c4873f55640 100644 > > --- a/xen/test/livepatch/xen_action_hooks_norevert.c > > +++ b/xen/test/livepatch/xen_action_hooks_norevert.c > > @@ -25,9 +25,10 @@ static int pre_apply_hook(livepatch_payload_t *payload) > > > > for (i = 0; i < payload->nfuncs; i++) > > { > > -struct livepatch_func *func = >funcs[i]; > > +const struct livepatch_func *func = >funcs[i]; > > +struct livepatch_fstate *fstate = >fstate[i]; > > > > -BUG_ON(func->applied == LIVEPATCH_FUNC_APPLIED); > > +BUG_ON(fstate->applied == LIVEPATCH_FUNC_APPLIED); > > printk(KERN_DEBUG "%s: pre applied: %s\n", __func__, func->name); > > } > > > > @@ -44,9 +45,10 @@ static void post_apply_hook(livepatch_payload_t *payload) > > > > for (i = 0; i < payload->nfuncs; i++) > > { > > -struct livepatch_func *func = >funcs[i]; > > +const struct livepatch_func *func = >funcs[i]; > > +struct livepatch_fstate *fstate = >fstate[i]; > > > > -BUG_ON(func->applied != LIVEPATCH_FUNC_APPLIED); > > +BUG_ON(fstate->applied != LIVEPATCH_FUNC_APPLIED); > > printk(KERN_DEBUG "%s: post applied: %s\n", __func__, func->name); > > } > > > > @@ -62,8 +64,9 @@ static int pre_revert_hook(livepatch_payload_t *payload) > > for (i = 0; i < payload->nfuncs; i++) > > { > > struct livepatch_func *func = >funcs[i]; > > const here too? > > With that fixed... > Reviewed-by: Ross Lagerwall Oh, so that file is not even built. Will see about getting it built. Thanks, Roger.
Re: [PATCH v8 2/2] xen/vpci: header: filter PCI capabilities
On Tue, Nov 28, 2023 at 02:44:25PM -0500, Stewart Hildebrand wrote: > Currently, Xen vPCI only supports virtualizing the MSI and MSI-X capabilities. > Hide all other PCI capabilities (including extended capabilities) from domUs > for > now, even though there may be certain devices/drivers that depend on being > able > to discover certain capabilities. > > We parse the physical PCI capabilities linked list and add vPCI register > handlers for the next elements, inserting our own next value, thus presenting > a > modified linked list to the domU. > > Introduce helper functions vpci_hw_read8 and vpci_read_val. The vpci_read_val > helper function returns a fixed value, which may be used for RAZ registers, or ^ read as zero > registers whose value doesn't change. > > Introduce pci_find_next_cap_ttl() helper while adapting the logic from > pci_find_next_cap() to suit our needs, and implement the existing > pci_find_next_cap() in terms of the new helper. > > Signed-off-by: Stewart Hildebrand LGTM, some nits below: Reviewed-by: Roger Pau Monné > --- > v7->v8: > * use to array instead of match function > * include lib.h for ARRAY_SIZE > * don't emulate PCI_CAPABILITY_LIST register if PCI_STATUS_CAP_LIST bit is not > set in hardware > * spell out RAZ/WI acronym > * dropped R-b tag since the patch has changed moderately since the last rev > > v6->v7: > * no change > > v5->v6: > * add register handlers before status register handler in init_bars() > * s/header->mask_cap_list/mask_cap_list/ > > v4->v5: > * use more appropriate types, continued > * get rid of unnecessary hook function > * add Jan's R-b > > v3->v4: > * move mask_cap_list setting to this patch > * leave pci_find_next_cap signature alone > * use more appropriate types > > v2->v3: > * get rid of > 0 in loop condition > * implement pci_find_next_cap in terms of new pci_find_next_cap_ttl function > so > that hypothetical future callers wouldn't be required to pass > * change NULL to (void *)0 for RAZ value passed to vpci_read_val > * change type of ttl to unsigned int > * remember to mask off the low 2 bits of next in the initial loop iteration > * change return type of pci_find_next_cap and pci_find_next_cap_ttl > * avoid wrapping the PCI_STATUS_CAP_LIST condition by using ! instead of == 0 > > v1->v2: > * change type of ttl to int > * use switch statement instead of if/else > * adapt existing pci_find_next_cap helper instead of rolling our own > * pass ttl as in/out > * "pass through" the lower 2 bits of the next pointer > * squash helper functions into this patch to avoid transient dead code > situation > * extended capabilities RAZ/WI > --- > xen/drivers/pci/pci.c | 31 --- > xen/drivers/vpci/header.c | 63 +++ > xen/drivers/vpci/vpci.c | 12 > xen/include/xen/pci.h | 3 ++ > xen/include/xen/vpci.h| 5 > 5 files changed, 104 insertions(+), 10 deletions(-) > > diff --git a/xen/drivers/pci/pci.c b/xen/drivers/pci/pci.c > index 3569ccb24e9e..1645b3118220 100644 > --- a/xen/drivers/pci/pci.c > +++ b/xen/drivers/pci/pci.c > @@ -39,31 +39,42 @@ unsigned int pci_find_cap_offset(pci_sbdf_t sbdf, > unsigned int cap) > return 0; > } > > -unsigned int pci_find_next_cap(pci_sbdf_t sbdf, unsigned int pos, > - unsigned int cap) > +unsigned int pci_find_next_cap_ttl(pci_sbdf_t sbdf, unsigned int pos, > + unsigned int *cap, unsigned int n, > + unsigned int *ttl) > { > -u8 id; > -int ttl = 48; > +unsigned int id, i; Nit: those can be defined inside the while loop. > -while ( ttl-- ) > +while ( (*ttl)-- ) > { > pos = pci_conf_read8(sbdf, pos); > if ( pos < 0x40 ) > break; > > -pos &= ~3; > -id = pci_conf_read8(sbdf, pos + PCI_CAP_LIST_ID); > +id = pci_conf_read8(sbdf, (pos & ~3) + PCI_CAP_LIST_ID); > > if ( id == 0xff ) > break; > -if ( id == cap ) > -return pos; > +for ( i = 0; i < n; i++ ) > +{ > +if ( id == cap[i] ) > +return pos; > +} > > -pos += PCI_CAP_LIST_NEXT; > +pos = (pos & ~3) + PCI_CAP_LIST_NEXT; > } > + > return 0; > } > > +unsigned int pci_find_next_cap(pci_sbdf_t sbdf, unsigned int pos, > + unsigned int cap) > +{ > +unsigned int ttl = 48; > + > +return pci_find_next_cap_ttl(sbdf, pos, , 1, ) & ~3; > +} > + > /** > * pci_find_ext_capability - Find an extended capability > * @sbdf: PCI device to query > diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c > index 351318121e48..d7dc0c82a6ba 100644 > --- a/xen/drivers/vpci/header.c > +++ b/xen/drivers/vpci/header.c > @@ -18,6 +18,7 @@ > */ > > #include > +#include >
[PATCH v5 2/5] x86/paravirt: move some functions and defines to alternative
As a preparation for replacing paravirt patching completely by alternative patching, move some backend functions and #defines to alternative code and header. Signed-off-by: Juergen Gross --- V4: - rename x86_nop() to nop_func() and x86_BUG() to BUG_func() (Boris Petkov) --- arch/x86/include/asm/alternative.h| 16 arch/x86/include/asm/paravirt.h | 12 - arch/x86/include/asm/paravirt_types.h | 4 +-- arch/x86/include/asm/qspinlock_paravirt.h | 4 +-- arch/x86/kernel/alternative.c | 10 arch/x86/kernel/kvm.c | 4 +-- arch/x86/kernel/paravirt.c| 30 +++ arch/x86/xen/irq.c| 2 +- 8 files changed, 41 insertions(+), 41 deletions(-) diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h index 65f79092c9d9..ce788ab4e77c 100644 --- a/arch/x86/include/asm/alternative.h +++ b/arch/x86/include/asm/alternative.h @@ -330,6 +330,22 @@ static inline int alternatives_text_reserved(void *start, void *end) */ #define ASM_NO_INPUT_CLOBBER(clbr...) "i" (0) : clbr +/* Macro for creating assembler functions avoiding any C magic. */ +#define DEFINE_ASM_FUNC(func, instr, sec) \ + asm (".pushsection " #sec ", \"ax\"\n" \ +".global " #func "\n\t"\ +".type " #func ", @function\n\t" \ +ASM_FUNC_ALIGN "\n"\ +#func ":\n\t" \ +ASM_ENDBR \ +instr "\n\t" \ +ASM_RET\ +".size " #func ", . - " #func "\n\t" \ +".popsection") + +void BUG_func(void); +void nop_func(void); + #else /* __ASSEMBLY__ */ #ifdef CONFIG_SMP diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h index aa76ac7c806c..f18bfa7f3070 100644 --- a/arch/x86/include/asm/paravirt.h +++ b/arch/x86/include/asm/paravirt.h @@ -720,18 +720,6 @@ static __always_inline unsigned long arch_local_irq_save(void) #undef PVOP_VCALL4 #undef PVOP_CALL4 -#define DEFINE_PARAVIRT_ASM(func, instr, sec) \ - asm (".pushsection " #sec ", \"ax\"\n" \ -".global " #func "\n\t"\ -".type " #func ", @function\n\t" \ -ASM_FUNC_ALIGN "\n"\ -#func ":\n\t" \ -ASM_ENDBR \ -instr "\n\t" \ -ASM_RET\ -".size " #func ", . - " #func "\n\t" \ -".popsection") - extern void default_banner(void); void native_pv_lock_init(void) __init; diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h index 483e19e5ca7a..166e9618158f 100644 --- a/arch/x86/include/asm/paravirt_types.h +++ b/arch/x86/include/asm/paravirt_types.h @@ -540,8 +540,6 @@ int paravirt_disable_iospace(void); __PVOP_VCALL(op, PVOP_CALL_ARG1(arg1), PVOP_CALL_ARG2(arg2),\ PVOP_CALL_ARG3(arg3), PVOP_CALL_ARG4(arg4)) -void _paravirt_nop(void); -void paravirt_BUG(void); unsigned long paravirt_ret0(void); #ifdef CONFIG_PARAVIRT_XXL u64 _paravirt_ident_64(u64); @@ -551,7 +549,7 @@ void pv_native_irq_enable(void); unsigned long pv_native_read_cr2(void); #endif -#define paravirt_nop ((void *)_paravirt_nop) +#define paravirt_nop ((void *)nop_func) extern struct paravirt_patch_site __parainstructions[], __parainstructions_end[]; diff --git a/arch/x86/include/asm/qspinlock_paravirt.h b/arch/x86/include/asm/qspinlock_paravirt.h index 85b6e3609cb9..ef9697f20129 100644 --- a/arch/x86/include/asm/qspinlock_paravirt.h +++ b/arch/x86/include/asm/qspinlock_paravirt.h @@ -56,8 +56,8 @@ __PV_CALLEE_SAVE_REGS_THUNK(__pv_queued_spin_unlock_slowpath, ".spinlock.text"); "pop%rdx\n\t" \ FRAME_END -DEFINE_PARAVIRT_ASM(__raw_callee_save___pv_queued_spin_unlock, - PV_UNLOCK_ASM, .spinlock.text); +DEFINE_ASM_FUNC(__raw_callee_save___pv_queued_spin_unlock, + PV_UNLOCK_ASM, .spinlock.text); #else /* CONFIG_64BIT */ diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c index be35c8ccf826..ca25dd280b8c 100644 --- a/arch/x86/kernel/alternative.c +++ b/arch/x86/kernel/alternative.c @@ -385,6 +385,16 @@ apply_relocation(u8 *buf, size_t len, u8 *dest, u8 *src, size_t src_len) } } +/* Low-level backend functions usable from alternative code replacements. */ +DEFINE_ASM_FUNC(nop_func, "", .entry.text); +EXPORT_SYMBOL_GPL(nop_func); + +noinstr void BUG_func(void) +{ + BUG(); +} +EXPORT_SYMBOL_GPL(BUG_func); +
[PATCH v5 0/5] 86/paravirt: Get rid of paravirt patching
This is a small series getting rid of paravirt patching by switching completely to alternative patching for the same functionality. The basic idea is to add the capability to switch from indirect to direct calls via a special alternative patching option. This removes _some_ of the paravirt macro maze, but most of it needs to stay due to the need of hiding the call instructions from the compiler in order to avoid needless register save/restore. What is going away is the nasty stacking of alternative and paravirt patching and (of course) the special .parainstructions linker section. I have tested the series on bare metal and as Xen PV domain to still work. Note that objtool might need some changes to cope with the new indirect call patching mechanism. Additionally some paravirt handling can probably be removed from it. Changes in V5: - addressed Boris' comments - rebased on top of the tip/master branch Changes in V4: - addressed Boris' comments in patch 1 - fixed bugs found by kernel test robot (patch 2) Changes in V3: - split v2 patch 3 into 2 patches as requested by Peter and Ingo Changes in V2: - split last patch into 2 - rebase of patch 2 as suggested by Peter - addressed Peter's comments for patch 3 Juergen Gross (5): x86/paravirt: introduce ALT_NOT_XEN x86/paravirt: move some functions and defines to alternative x86/alternative: add indirect call patching x86/paravirt: switch mixed paravirt/alternative calls to alternative_2 x86/paravirt: remove no longer needed paravirt patching code arch/x86/include/asm/alternative.h| 30 +- arch/x86/include/asm/paravirt.h | 77 - arch/x86/include/asm/paravirt_types.h | 85 +-- arch/x86/include/asm/qspinlock_paravirt.h | 4 +- arch/x86/include/asm/text-patching.h | 12 --- arch/x86/kernel/alternative.c | 125 ++ arch/x86/kernel/callthunks.c | 17 ++- arch/x86/kernel/kvm.c | 4 +- arch/x86/kernel/module.c | 20 +--- arch/x86/kernel/paravirt.c| 54 ++ arch/x86/kernel/vmlinux.lds.S | 13 --- arch/x86/tools/relocs.c | 2 +- arch/x86/xen/irq.c| 2 +- 13 files changed, 161 insertions(+), 284 deletions(-) -- 2.35.3
Re: [PATCH] xen/public: fix flexible array definitions
On 25/07/2023 2:55 pm, Juergen Gross wrote: > Flexible arrays in public headers can be problematic with some > compilers. > > Replace them with arr[XEN_FLEX_ARRAY_DIM] in order to avoid compilation > errors. > > This includes arrays defined as "arr[1]", as seen with a recent Linux > kernel [1]. > > [1]: https://bugzilla.kernel.org/show_bug.cgi?id=217693 > > Signed-off-by: Juergen Gross I know this is a change in the public headers, and I know it will cause changes in the behaviour of sizeof() against these, but 1) We expect people to copy these files, so the change here isn't breaking others, and 2) The use of sizeof() on these structs is buggy in the first place, and 3) The use of sizeof() with these structs is unlikely because they're variadic 4) It really genuinely is UB as reported by toolchains It may not be great, but it's the least bad of a lot of bad options. This definitely needs a note in CHANGELOG. Subject to something suitable there, Reviewed-by: Andrew Cooper ~Andrew
Re: [PATCH] Config.mk: drop -Wdeclaration-after-statement
On Wed, Nov 29, 2023 at 11:47:24AM +0100, Julien Grall wrote: > Hi, > > + Anthony for the tools > + Juergen for Xenstored > > On 29/11/2023 11:34, Alexander Kanavin wrote: > > On 11/29/23 08:51, Jan Beulich wrote: > > > > > On 28.11.2023 18:47, Alexander Kanavin wrote: > > > > Such constructs are fully allowed by C99: > > > > https://gcc.gnu.org/onlinedocs/gcc-12.2.0/gcc/Mixed-Labels-and-Declarations.html#Mixed-Labels-and-Declarations > > > There's more to this: It may also be a policy of ours (or of any > > > sub-component) > > > to demand that declarations and statements are properly separated. > > > This would > > > therefore need discussing first. > > > > The error is coming from python 3.12 headers and not from anything in > > xen tree, no? As you don't have control over those headers, I'm not sure > > what other solution there could be. > > We seem to add -Wno-declaration-after-statement for some components in > tools/. So one possibility would be to move the flags to an hypervisor > specific makefile (in xen/). You mean xen/Makefile I hope. > Anthony/Juergen, do you have any concern if the tools are built without > -Wdeclaration-after-statement? I don't, and as you said, there's already quite a few -Wno-declaration-after-statement. It can be nice to add a new variable in the middle of a function, it's like creating a new scope without adding extra indentation (if we wanted a new scope, we would need {} thus the intend). Cheers, -- Anthony PERARD
Re: [XEN PATCH] x86/guest_walk: address violations of MISRA C:2012 Rule 8.3
On 29/11/23 12:09, Jan Beulich wrote: On 29.11.2023 10:35, Federico Serafini wrote: I take this opportunity to inform that we are really close to the end with Rule 8.3 for x86, this is the situation: - do_multicall(), Stefano sent a patch; - guest_walk_tables(), Andrew will take care of it; - xenmem_add_to_physmap_one(), this is the last one. For the latter, I see you (x86) share the declaration with ARM, where "gfn" is used for the last parameter instead of "gpfn". Do you agree in changing the name in the definition from "gpfn" to "gfn"? Yes. If you agree, do you have any suggestions on how to rename the local variable "gfn"? Considering its exclusive use for the XENMAPSPACE_gmfn case, I'm inclined to suggest "gmfn", despite this being a term we generally try to get rid of. Maybe Andrew or Roger have a better suggestion. Along with renaming "gpfn" it would be nice (for consistency) to also rename "old_gpfn" at the same time. Thank you. -- Federico Serafini, M.Sc. Software Engineer, BUGSENG (http://bugseng.com)
Re: [PATCH v4 02/14] xen/asm-generic: introduce generic device.h
On Tue, 2023-11-28 at 15:28 -0600, Shawn Anastasio wrote: > Hi Oleksii, > > On 11/27/23 8:13 AM, Oleksii Kurochko wrote: > > diff --git a/xen/arch/ppc/include/asm/Makefile > > b/xen/arch/ppc/include/asm/Makefile > > index ece7fa66dd..df4c1ebb08 100644 > > --- a/xen/arch/ppc/include/asm/Makefile > > +++ b/xen/arch/ppc/include/asm/Makefile > > @@ -1,3 +1,4 @@ > > # SPDX-License-Identifier: GPL-2.0-only > > +generic-y += device.h > > generic-y += paging.h > > generic-y += vm_event.h > > diff --git a/xen/arch/ppc/include/asm/device.h > > b/xen/arch/ppc/include/asm/device.h > > deleted file mode 100644 > > index 8253e61d51..00 > > --- a/xen/arch/ppc/include/asm/device.h > > +++ /dev/null > > @@ -1,53 +0,0 @@ > > -/* SPDX-License-Identifier: GPL-2.0-only */ > > -#ifndef __ASM_PPC_DEVICE_H__ > > -#define __ASM_PPC_DEVICE_H__ > > - > > -enum device_type > > -{ > > - DEV_DT, > > - DEV_PCI, > > -}; > > - > > -struct device { > > - enum device_type type; > > -#ifdef CONFIG_HAS_DEVICE_TREE > > - struct dt_device_node *of_node; /* Used by drivers imported > > from Linux */ > > -#endif > > -}; > > - > > -enum device_class > > -{ > > - DEVICE_SERIAL, > > - DEVICE_IOMMU, > > - DEVICE_PCI_HOSTBRIDGE, > > - /* Use for error */ > > - DEVICE_UNKNOWN, > > -}; > > - > > -struct device_desc { > > - /* Device name */ > > - const char *name; > > - /* Device class */ > > - enum device_class class; > > - /* List of devices supported by this driver */ > > - const struct dt_device_match *dt_match; > > - /* > > - * Device initialization. > > - * > > - * -EAGAIN is used to indicate that device probing is > > deferred. > > - */ > > - int (*init)(struct dt_device_node *dev, const void *data); > > -}; > > - > > -typedef struct device device_t; > > - > > -#define DT_DEVICE_START(name_, namestr_, > > class_) \ > > -static const struct device_desc __dev_desc_##name_ > > __used \ > > -__section(".dev.info") = > > { \ > > - .name = > > namestr_, \ > > - .class = > > class_, \ > > - > > -#define > > DT_DEVICE_END \ > > -}; > > - > > -#endif /* __ASM_PPC_DEVICE_H__ */ > > diff --git a/xen/arch/ppc/include/asm/irq.h > > b/xen/arch/ppc/include/asm/irq.h > > index 5c37d0cf25..49193fddff 100644 > > --- a/xen/arch/ppc/include/asm/irq.h > > +++ b/xen/arch/ppc/include/asm/irq.h > > @@ -3,7 +3,9 @@ > > #define __ASM_PPC_IRQ_H__ > > > > #include > > +#ifdef CONFIG_HAS_DEVICE_TREE > > I realize that you were likely following PPC's device.h which also > checks CONFIG_HAS_DEVICE_TREE, but I'm not sure that it makes sense > to > check this conditional in PPC code at all -- we will always have > HAS_DEVICE_TREE (selected by PPC) and I can't imagine a scenario > where > this will ever not be the case. What about case if ACPI is used? Does ACPI is supported by PPC? But if you are sure that CONFIG_HAS_DEVICE_TREE will be always selected then I am OK to remove this change. > > Unless Jan (or someone else) disagrees, I'd rather this conditional > be > dropped inside of PPC code. I'll double check but I think I had a compilation issue if it isn't defined. > > > #include > > +#endif > > #include > > > > /* TODO */ > > @@ -25,9 +27,11 @@ static inline void arch_move_irqs(struct vcpu > > *v) > > BUG_ON("unimplemented"); > > } > > > > +#ifdef CONFIG_HAS_DEVICE_TREE > > Ditto. > > > static inline int platform_get_irq(const struct dt_device_node > > *device, int index) > > { > > BUG_ON("unimplemented"); > > } > > +#endif > > > > #endif /* __ASM_PPC_IRQ_H__ */ Thanks. ~ Oleksii
Re: [PATCH v4 10/14] xen/asm-generic: introduce stub header monitor.h
On Wed, 2023-11-29 at 09:21 +0100, Jan Beulich wrote: > On 29.11.2023 09:19, Jan Beulich wrote: > > On 28.11.2023 23:21, Shawn Anastasio wrote: > > > On 11/27/23 8:13 AM, Oleksii Kurochko wrote: > > > > --- a/xen/arch/ppc/include/asm/monitor.h > > > > +++ /dev/null > > > > @@ -1,43 +0,0 @@ > > > > -/* SPDX-License-Identifier: GPL-2.0-only */ > > > > -/* Derived from xen/arch/arm/include/asm/monitor.h */ > > > > -#ifndef __ASM_PPC_MONITOR_H__ > > > > -#define __ASM_PPC_MONITOR_H__ > > > > - > > > > -#include > > > > -#include > > > > - > > > > -static inline > > > > -void arch_monitor_allow_userspace(struct domain *d, bool > > > > allow_userspace) > > > > -{ > > > > -} > > > > - > > > > -static inline > > > > -int arch_monitor_domctl_op(struct domain *d, struct > > > > xen_domctl_monitor_op *mop) > > > > -{ > > > > - /* No arch-specific monitor ops on PPC. */ > > > > - return -EOPNOTSUPP; > > > > -} > > > > - > > > > -int arch_monitor_domctl_event(struct domain *d, > > > > - struct xen_domctl_monitor_op > > > > *mop); > > > > - > > > > -static inline > > > > -int arch_monitor_init_domain(struct domain *d) > > > > -{ > > > > - /* No arch-specific domain initialization on PPC. */ > > > > - return 0; > > > > -} > > > > - > > > > -static inline > > > > -void arch_monitor_cleanup_domain(struct domain *d) > > > > -{ > > > > - /* No arch-specific domain cleanup on PPC. */ > > > > -} > > > > - > > > > -static inline uint32_t arch_monitor_get_capabilities(struct > > > > domain *d) > > > > -{ > > > > - BUG_ON("unimplemented"); > > > > > > I'm not sure how I feel about this assertion being dropped in the > > > generic header. In general my philosophy when creating these stub > > > headers was to insert as many of these assertions as possible so > > > we > > > don't end up in a scenario where during early bringup I miss a > > > function > > > that was originally stubbed but ought to be implemented > > > eventually. > > > > > > Looking at ARM's monitor.h too, it seems that this function is > > > the only > > > one that differs from the generic/stub version. I'm wondering if > > > it > > > would make sense to leave this function out of the generic > > > header, to be > > > defined by the arch similar to what you've done with some other > > > functions in this series. That would also allow ARM to be brought > > > in to > > > using the generic variant. > > > > Yet then where would that function live, if not in > > arch/*/include/asm/monitor.h? > > Hmm, maybe implicitly you're proposing that > arch/*/include/asm/monitor.h > include include/asm-generic/monitor.h in such a case, and define this > one > function on top? I think it can be a solution. The same I suggest in my direct reply to Shawn message ( I didn't see your answer ). If everyone is OK with such solution, I can apply it for the next version of patch series. ~ Oleksii
Re: [PATCH v4 10/14] xen/asm-generic: introduce stub header monitor.h
Hi Shawn, On Tue, 2023-11-28 at 16:21 -0600, Shawn Anastasio wrote: > Hi Oleksii, > > On 11/27/23 8:13 AM, Oleksii Kurochko wrote: > > The header is shared between several archs so it is > > moved to asm-generic. > > > > Signed-off-by: Oleksii Kurochko > > Acked-by: Jan Beulich . > > --- > > Changes in V4: > > - Removed the double blank line. > > - Added Acked-by: Jan Beulich . > > - Update the commit message > > --- > > Changes in V3: > > - Use forward-declaration of struct domain instead of " #include > > ". > > - Add ' include ' > > - Drop PPC's monitor.h. > > --- > > Changes in V2: > > - remove inclusion of "+#include " > > - add "struct xen_domctl_monitor_op;" > > - remove one of SPDX tags. > > --- > > xen/arch/ppc/include/asm/Makefile | 1 + > > xen/arch/ppc/include/asm/monitor.h | 43 - > > xen/include/asm-generic/monitor.h | 62 > > ++ > > 3 files changed, 63 insertions(+), 43 deletions(-) > > delete mode 100644 xen/arch/ppc/include/asm/monitor.h > > create mode 100644 xen/include/asm-generic/monitor.h > > > > diff --git a/xen/arch/ppc/include/asm/Makefile > > b/xen/arch/ppc/include/asm/Makefile > > index 319e90955b..bcddcc181a 100644 > > --- a/xen/arch/ppc/include/asm/Makefile > > +++ b/xen/arch/ppc/include/asm/Makefile > > @@ -5,6 +5,7 @@ generic-y += div64.h > > generic-y += hardirq.h > > generic-y += hypercall.h > > generic-y += iocap.h > > +generic-y += monitor.h > > generic-y += paging.h > > generic-y += percpu.h > > generic-y += random.h > > diff --git a/xen/arch/ppc/include/asm/monitor.h > > b/xen/arch/ppc/include/asm/monitor.h > > deleted file mode 100644 > > index e5b0282bf1..00 > > --- a/xen/arch/ppc/include/asm/monitor.h > > +++ /dev/null > > @@ -1,43 +0,0 @@ > > -/* SPDX-License-Identifier: GPL-2.0-only */ > > -/* Derived from xen/arch/arm/include/asm/monitor.h */ > > -#ifndef __ASM_PPC_MONITOR_H__ > > -#define __ASM_PPC_MONITOR_H__ > > - > > -#include > > -#include > > - > > -static inline > > -void arch_monitor_allow_userspace(struct domain *d, bool > > allow_userspace) > > -{ > > -} > > - > > -static inline > > -int arch_monitor_domctl_op(struct domain *d, struct > > xen_domctl_monitor_op *mop) > > -{ > > - /* No arch-specific monitor ops on PPC. */ > > - return -EOPNOTSUPP; > > -} > > - > > -int arch_monitor_domctl_event(struct domain *d, > > - struct xen_domctl_monitor_op *mop); > > - > > -static inline > > -int arch_monitor_init_domain(struct domain *d) > > -{ > > - /* No arch-specific domain initialization on PPC. */ > > - return 0; > > -} > > - > > -static inline > > -void arch_monitor_cleanup_domain(struct domain *d) > > -{ > > - /* No arch-specific domain cleanup on PPC. */ > > -} > > - > > -static inline uint32_t arch_monitor_get_capabilities(struct domain > > *d) > > -{ > > - BUG_ON("unimplemented"); > > I'm not sure how I feel about this assertion being dropped in the > generic header. In general my philosophy when creating these stub > headers was to insert as many of these assertions as possible so we > don't end up in a scenario where during early bringup I miss a > function > that was originally stubbed but ought to be implemented eventually. > > Looking at ARM's monitor.h too, it seems that this function is the > only > one that differs from the generic/stub version. I'm wondering if it > would make sense to leave this function out of the generic header, to > be > defined by the arch similar to what you've done with some other > functions in this series. That would also allow ARM to be brought in > to > using the generic variant. Thanks for the comment. For RISC-V, I did in the same way ( about BUG() and unimplemented for time being functions ). I agree that such implementation isn't correct for generic header. I think the best one solution will be to include in whwere only this function will be implemented ( because implementation of other functions are the same for Arm, PPC and RISC-V ). > > > - return 0; > > -} > > - > > -#endif /* __ASM_PPC_MONITOR_H__ */ > > diff --git a/xen/include/asm-generic/monitor.h b/xen/include/asm- > > generic/monitor.h > > new file mode 100644 > > index 00..6be8614431 > > --- /dev/null > > +++ b/xen/include/asm-generic/monitor.h > > @@ -0,0 +1,62 @@ > > +/* SPDX-License-Identifier: GPL-2.0 */ > > +/* > > + * include/asm-GENERIC/monitor.h > > + * > > + * Arch-specific monitor_op domctl handler. > > + * > > + * Copyright (c) 2015 Tamas K Lengyel (ta...@tklengyel.com) > > + * Copyright (c) 2016, Bitdefender S.R.L. > > + * > > + */ > > + > > +#ifndef __ASM_GENERIC_MONITOR_H__ > > +#define __ASM_GENERIC_MONITOR_H__ > > + > > +#include > > + > > +struct domain; > > +struct xen_domctl_monitor_op; > > + > > +static inline > > +void arch_monitor_allow_userspace(struct domain *d, bool > > allow_userspace) > > +{ > > +} > > + > > +static inline > > +int arch_monitor_domctl_op(struct
Re: [PATCH v4 00/14] Introduce generic headers
On Wed, 2023-11-29 at 10:25 +0100, Jan Beulich wrote: > On 27.11.2023 15:13, Oleksii Kurochko wrote: > > Oleksii Kurochko (14): > > xen/asm-generic: introduce stub header paging.h > > xen/asm-generic: introduce generic device.h > > xen/asm-generic: introduce generic hypercall.h > > xen/asm-generic: introduce generic header iocap.h > > xen/asm-generic: introduce stub header > > xen/asm-generic: introduce generic header percpu.h > > xen/asm-generic: introduce generalized hardirq.h > > xen/asm-generic: introduce generic div64.h header > > xen/asm-generic: introduce generic header altp2m.h > > xen/asm-generic: introduce stub header monitor.h > > xen/asm-generic: introduce stub header numa.h > > xen/asm-generic: introduce stub header softirq.h > > xen: ifdef inclusion of in > > > > xen/asm-generic: ifdef inclusion of > > I've applied what I think was suitably ack-ed, re-basing over patches > which > need further work. For the rest I'd like to ask that you do the > necessary > re-basing, perhaps moving in particular the device.h change later in > the > series (so that "easier" changes can go in without further tweaking). Thanks a lot. I'll take into account your advice. ~ Oleksii
[linux-linus test] 183917: tolerable FAIL - PUSHED
flight 183917 linux-linus real [real] http://logs.test-lab.xenproject.org/osstest/logs/183917/ Failures :-/ but no regressions. Tests which did not succeed, but are not blocking: test-armhf-armhf-libvirt 16 saverestore-support-checkfail like 183884 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 183884 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 183884 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 183884 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail like 183884 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 183884 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check fail like 183884 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 183884 test-amd64-amd64-libvirt 15 migrate-support-checkfail never pass test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail never pass test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit1 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit1 16 saverestore-support-checkfail never pass test-arm64-arm64-xl 15 migrate-support-checkfail never pass test-arm64-arm64-xl 16 saverestore-support-checkfail never pass test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit2 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit2 16 saverestore-support-checkfail never pass test-armhf-armhf-libvirt 15 migrate-support-checkfail never pass test-armhf-armhf-xl 15 migrate-support-checkfail never pass test-armhf-armhf-xl 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-arndale 15 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit1 15 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail never pass test-armhf-armhf-xl-credit1 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-raw 14 migrate-support-checkfail never pass test-amd64-amd64-libvirt-qcow2 14 migrate-support-checkfail never pass test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail never pass test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass test-arm64-arm64-xl-vhd 14 migrate-support-checkfail never pass test-arm64-arm64-xl-vhd 15 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 14 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 15 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 15 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-rtds 15 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass version targeted for testing: linux18d46e76d7c2eedd8577fae67e3f1d4db25018b0 baseline version: linuxdf60cee26a2e3d937a319229e335cb3f9c1f16d2 Last test of basis 183884 2023-11-28 08:21:29 Z1 days Testing same since 183917 2023-11-28 22:12:07 Z0 days1 attempts People who touched revisions under test: Bragatheswaran Manickavel David Sterba Filipe Manana Jann Horn Linus Torvalds Qu Wenruo jobs: build-amd64-xsm pass build-arm64-xsm pass build-i386-xsm pass build-amd64 pass build-arm64 pass build-armhf pass build-i386 pass build-amd64-libvirt pass build-arm64-libvirt