Re: [SeaBIOS] [RFC v2 0/3] Support multiple pci domains in pci_device
Hi Kevin, On 08/28/2018 08:02 PM, Kevin O'Connor wrote: On Tue, Aug 28, 2018 at 12:14:58PM +0200, Gerd Hoffmann wrote: Hi, Where is the pxb-pcie device? :$somewhere? Or $domain:00:00.0? :$somewhere (On PCI domain 0) Cool, so we don't have a chicken-and-egg issue. If we can access pxb-pcie registers before configuring MMCFG then yes, we should use pxb-pcie registers for that. Yes, we can. Ok, so we can configure mmcfg as a hidden pci bar, similar to the q35 mmcfg. Any configuration hints can be passed as a pci vendor capability (similar to the bridge window size hints), if needed. Just so I understand, the proposal is to have SeaBIOS search for pxb-pcie devices on the main PCI bus and allocate address space for each. (These devices would not be considered pci buses in the traditional sense.) Then SeaBIOS will traverse that address space (MMCFG) and allocate BARs (both address space and io space) for the PCI devices found in that address space. Finally, QEMU will take all those allocations and use them when generating the ACPI tables. Did I get that right? Yes, the pxb-pcie exposes a new PCI root bus, but we want it in a different PCI domain. This is done in order to remove the 256 PCI Express device limitation on a PCI Express machine. Does the plan sound sane? Thanks, Marcel -Kevin ___ SeaBIOS mailing list SeaBIOS@seabios.org https://mail.coreboot.org/mailman/listinfo/seabios
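The "hidden pci bar, similar to the q35 mmcfg" idea mirrors what SeaBIOS already does for the Q35 MCH, where the MMCONFIG base address is written to the PCIEXBAR register (config offset 0x60) with an enable bit in the low dword. A minimal sketch of the register composition; the idea that a pxb-pcie would expose an analogous hidden 64-bit register is an assumption from this thread, not existing code:

```c
#include <stdint.h>

#define Q35_HOST_BRIDGE_PCIEXBAR    0x60  /* PCIEXBAR config offset on the Q35 MCH */
#define Q35_HOST_BRIDGE_PCIEXBAREN  1     /* enable bit in the low dword */

/* Compose the two dwords written to map MMCONFIG at 'base'.
 * A pxb-pcie device could expose a similar hidden register pair. */
static uint32_t pciexbar_low(uint64_t base)
{
    return (uint32_t)base | Q35_HOST_BRIDGE_PCIEXBAREN;
}

static uint32_t pciexbar_high(uint64_t base)
{
    return (uint32_t)(base >> 32);
}
```

For the q35 default base of 0xb0000000 this yields a low dword of 0xb0000001 and a high dword of 0.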
Re: [SeaBIOS] [RFC v2 0/3] Support multiple pci domains in pci_device
Hi Gerd, On 08/28/2018 09:07 AM, Gerd Hoffmann wrote: Hi, Since we will not use all 256 buses of an extra PCI domain, I think this space will allow us to support more PCI domains. Depends on the use case I guess. If you just need many pcie devices this probably doesn't help. If you want them for numa support then yes, more domains with fewer devices each can be useful then. We already support multiple NUMA nodes. We want more devices. Still, having 4x the number of devices we previously supported is a good step forward. What will the flow look like? 1. QEMU passes to SeaBIOS information of how many extra PCI domains it needs, and how many buses per domain. How will it pass this info? A vendor specific capability, some PCI registers or modifying the extra-pci-roots fw_cfg file? Where is the pxb-pcie device? :$somewhere? Or $domain:00:00.0? :$somewhere (On PCI domain 0) 2. SeaBIOS assigns the address for each PCI Domain and returns the information to QEMU. How will it do that? Some pxb-pcie registers? Or do we model the MMCFG like a PCI BAR? If we can access pxb-pcie registers before configuring MMCFG then yes, we should use pxb-pcie registers for that. Yes, we can. Thanks Gerd! Marcel cheers, Gerd
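The space argument above can be made concrete: a full ECAM/MMCONFIG window covers 256 buses at 1 MiB of config space per bus (32 devices × 8 functions × 4 KiB), i.e. 256 MiB per domain, so a domain that only needs a handful of buses needs proportionally less. A sketch of the arithmetic (illustrative only, not SeaBIOS code):

```c
#include <stdint.h>

/* Each bus takes 32 dev * 8 fn * 4 KiB = 1 MiB of ECAM space. */
static uint64_t ecam_size(uint32_t nr_buses)
{
    return (uint64_t)nr_buses << 20;
}

/* How many domains of nr_buses each fit into a window,
 * e.g. the unused 2G..2.75G region (768 MiB) discussed below? */
static uint32_t domains_in_window(uint64_t window, uint32_t nr_buses)
{
    return (uint32_t)(window / ecam_size(nr_buses));
}
```

With full 256-bus domains the 768 MiB window holds only 3 extra domains; with, say, 16 buses per domain it would hold 48.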
Re: [SeaBIOS] [RFC v2 0/3] Support multiple pci domains in pci_device
Hi Gerd, On 08/28/2018 07:12 AM, Zihan Yang wrote: Gerd Hoffmann wrote on Mon, Aug 27, 2018 at 7:04 AM: Hi, However, QEMU only binds ports 0xcf8 and 0xcfc to bus pcie.0. To avoid bus conflicts, we should use other port pairs for buses under new domains. I would skip support for IO based configuration and use only MMCONFIG for extra root buses. The question remains: how do we assign MMCONFIG space for each PCI domain. Thanks for your comments! Allocation-wise it would be easiest to place them above 4G. Right after memory, or after etc/reserved-memory-end (if that fw_cfg file is present), where the 64bit pci bars would have been placed. Move the pci bars up in address space to make room. Only problem is that seabios wouldn't be able to access mmconfig then. Placing them below 4G would work at least for a few pci domains. The q35 mmconfig bar is placed at 0xb0000000 -> 0xbfffffff, basically for historical reasons. Old qemu versions had 2.75G low memory on q35 (up to 0xafffffff), and I think old machine types still have that for live migration compatibility reasons. Modern qemu uses 2G only, to make gigabyte alignment work. 32bit pci bars are placed above 0xc0000000. The address space from 2G to 2.75G (0x80000000 -> 0xafffffff) is unused on new machine types. Enough room for three additional mmconfig bars (full size), so four pci domains total if you add the q35 one. Maybe we can support 4 domains first before we come up with a better solution. But I'm not sure if four domains are enough for those who want too many devices? (Adding Michael) Since we will not use all 256 buses of an extra PCI domain, I think this space will allow us to support more PCI domains. What will the flow look like? 1. QEMU passes to SeaBIOS information of how many extra PCI domains it needs, and how many buses per domain. How will it pass this info? A vendor specific capability, some PCI registers or modifying the extra-pci-roots fw_cfg file? 2. SeaBIOS assigns the address for each PCI Domain and returns the information to QEMU. 
How will it do that? Some pxb-pcie registers? Or do we model the MMCFG like a PCI BAR? 3. Once QEMU gets the MMCFG addresses, it can answer mmio configuration cycles. 4. SeaBIOS queries all PCI domains' devices, computes and assigns IO/MEM resources (for PCI domains > 0 it will use MMCFG to configure the PCI devices) 5. QEMU uses the IO/MEM information to create the _CRS for each extra PCI host bridge. 6. SeaBIOS gets the ACPI tables from QEMU and passes them to the guest OS. Thanks, Marcel cheers, Gerd
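Step 4 above boils down to plain ECAM addressing once each domain has an MMCFG base: the config-space address of a register is base + (bus << 20) + (dev << 15) + (fn << 12) + offset. A hedged sketch of the offset computation (SeaBIOS would then do readl/writel on such a pointer; the function name here is illustrative):

```c
#include <stdint.h>

/* ECAM offset of a (bus, dev, fn, reg) tuple within one MMCONFIG window. */
static uint64_t mmconfig_offset(uint8_t bus, uint8_t dev, uint8_t fn, uint16_t reg)
{
    return ((uint64_t)bus << 20)
         | ((uint32_t)dev << 15)
         | ((uint32_t)fn << 12)
         | (reg & 0xfff);  /* 4 KiB of config space per function */
}
```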
Re: [SeaBIOS] [PATCH v3 3/3] pci: recognize RH PCI legacy bridge resource reservation capability
On 08/27/2018 05:22 AM, Liu, Jing2 wrote: Hi Marcel, On 8/25/2018 11:59 PM, Marcel Apfelbaum wrote: On 08/24/2018 11:53 AM, Jing Liu wrote: Enable the firmware recognizing RedHat legacy PCI bridge device ID, so QEMU can reserve additional PCI bridge resource capability. Change the debug level lower to 3 when it is non-QEMU bridge. Signed-off-by: Jing Liu --- src/fw/pciinit.c | 50 +- src/hw/pci_ids.h | 1 + 2 files changed, 30 insertions(+), 21 deletions(-) diff --git a/src/fw/pciinit.c b/src/fw/pciinit.c index 62a32f1..c0634bc 100644 --- a/src/fw/pciinit.c +++ b/src/fw/pciinit.c @@ -525,30 +525,38 @@ static void pci_bios_init_platform(void) static u8 pci_find_resource_reserve_capability(u16 bdf) { - if (pci_config_readw(bdf, PCI_VENDOR_ID) == PCI_VENDOR_ID_REDHAT && - pci_config_readw(bdf, PCI_DEVICE_ID) == - PCI_DEVICE_ID_REDHAT_ROOT_PORT) { - u8 cap = 0; - do { - cap = pci_find_capability(bdf, PCI_CAP_ID_VNDR, cap); - } while (cap && - pci_config_readb(bdf, cap + PCI_CAP_REDHAT_TYPE_OFFSET) != - REDHAT_CAP_RESOURCE_RESERVE); - if (cap) { - u8 cap_len = pci_config_readb(bdf, cap + PCI_CAP_FLAGS); - if (cap_len < RES_RESERVE_CAP_SIZE) { - dprintf(1, "PCI: QEMU resource reserve cap length %d is invalid\n", - cap_len); - return 0; - } - } else { - dprintf(1, "PCI: QEMU resource reserve cap not found\n"); + u16 device_id; + + if (pci_config_readw(bdf, PCI_VENDOR_ID) != PCI_VENDOR_ID_REDHAT) { + dprintf(3, "PCI: This is non-QEMU bridge.\n"); + return 0; + } + + device_id = pci_config_readw(bdf, PCI_DEVICE_ID); + + if (device_id != PCI_DEVICE_ID_REDHAT_ROOT_PORT && + device_id != PCI_DEVICE_ID_REDHAT_BRIDGE) { + dprintf(1, "PCI: QEMU resource reserve cap device ID doesn't match.\n"); + return 0; + } + u8 cap = 0; + + do { + cap = pci_find_capability(bdf, PCI_CAP_ID_VNDR, cap); + } while (cap && + pci_config_readb(bdf, cap + PCI_CAP_REDHAT_TYPE_OFFSET) != + REDHAT_CAP_RESOURCE_RESERVE); + if (cap) { + u8 cap_len = pci_config_readb(bdf, cap + PCI_CAP_FLAGS); + if (cap_len 
< RES_RESERVE_CAP_SIZE) { + dprintf(1, "PCI: QEMU resource reserve cap length %d is invalid\n", + cap_len); + return 0; } - return cap; } else { - dprintf(1, "PCI: QEMU resource reserve cap VID or DID doesn't match.\n"); - return 0; I am sorry for the late review. Did you drop the above line on purpose? Thanks for the review! I replaced the above message with the following, which checks the vendor-id and device-id respectively. + if (pci_config_readw(bdf, PCI_VENDOR_ID) != PCI_VENDOR_ID_REDHAT) { + dprintf(3, "PCI: This is non-QEMU bridge.\n"); + return 0; + } + + device_id = pci_config_readw(bdf, PCI_DEVICE_ID); + + if (device_id != PCI_DEVICE_ID_REDHAT_ROOT_PORT && + device_id != PCI_DEVICE_ID_REDHAT_BRIDGE) { + dprintf(1, "PCI: QEMU resource reserve cap device ID doesn't match.\n"); + return 0; + } I understand. Reviewed-by: Marcel Apfelbaum Thanks, Marcel Thanks, Jing Thanks, Marcel
Re: [SeaBIOS] [RFC v2 2/3] pci_device: Add pci domain support
On 08/09/2018 08:43 AM, Zihan Yang wrote: Most parts of seabios assume only PCI domain 0. This patch adds support for multiple domains in pci devices, which involves some API changes. For compatibility, interfaces such as pci_config_read[b|w|l] still exist so that existing domain 0 devices need no modification, but whenever a device wants to reside in a different domain, it should add the *_dom suffix to the above functions, e.g. pci_config_readl_dom(..., domain_nr) to read from a specific host bridge other than the q35 host bridge. It is not related only to q35. It is about PCI host bridges other than the main one. Also, the user should check the device domain when using the foreachpci() macro to filter out undesired devices that reside in a different domain. Signed-off-by: Zihan Yang --- src/fw/coreboot.c | 2 +- src/fw/csm.c | 2 +- src/fw/paravirt.c | 2 +- src/fw/pciinit.c | 261 ++--- src/hw/pci.c | 69 +++--- src/hw/pci.h | 42 ++--- src/hw/pci_ids.h | 7 +- src/hw/pcidevice.c | 8 +- src/hw/pcidevice.h | 4 +- 9 files changed, 227 insertions(+), 170 deletions(-) diff --git a/src/fw/coreboot.c b/src/fw/coreboot.c index 7c0954b..c955dfd 100644 --- a/src/fw/coreboot.c +++ b/src/fw/coreboot.c @@ -254,7 +254,7 @@ coreboot_platform_setup(void) { if (!CONFIG_COREBOOT) return; -pci_probe_devices(); +pci_probe_devices(0); struct cb_memory *cbm = CBMemTable; if (!cbm) diff --git a/src/fw/csm.c b/src/fw/csm.c index 03b4bb8..e94f614 100644 --- a/src/fw/csm.c +++ b/src/fw/csm.c @@ -63,7 +63,7 @@ static void csm_maininit(struct bregs *regs) { interface_init(); -pci_probe_devices(); +pci_probe_devices(0); csm_compat_table.PnPInstallationCheckSegment = SEG_BIOS; csm_compat_table.PnPInstallationCheckOffset = get_pnp_offset(); diff --git a/src/fw/paravirt.c b/src/fw/paravirt.c index 6b14542..ef4d487 100644 --- a/src/fw/paravirt.c +++ b/src/fw/paravirt.c @@ -155,7 +155,7 @@ qemu_platform_setup(void) return; if (runningOnXen()) { -pci_probe_devices(); +pci_probe_devices(0); xen_hypercall_setup(); 
xen_biostable_setup(); return; diff --git a/src/fw/pciinit.c b/src/fw/pciinit.c index 6e6a434..fcdcd38 100644 --- a/src/fw/pciinit.c +++ b/src/fw/pciinit.c @@ -51,6 +51,7 @@ u64 pcimem_end = BUILD_PCIMEM_END; u64 pcimem64_start = BUILD_PCIMEM64_START; u64 pcimem64_end = BUILD_PCIMEM64_END; u64 pci_io_low_end = 0xa000; +u64 pxb_mcfg_size = 0; struct pci_region_entry { struct pci_device *dev; @@ -88,9 +89,9 @@ static void pci_set_io_region_addr(struct pci_device *pci, int bar, u64 addr, int is64) { u32 ofs = pci_bar(pci, bar); -pci_config_writel(pci->bdf, ofs, addr); +pci_config_writel_dom(pci->bdf, ofs, addr, pci->domain_nr); if (is64) -pci_config_writel(pci->bdf, ofs + 4, addr >> 32); +pci_config_writel_dom(pci->bdf, ofs + 4, addr >> 32, pci->domain_nr); } @@ -405,25 +406,29 @@ static void pci_bios_init_device(struct pci_device *pci) /* map the interrupt */ u16 bdf = pci->bdf; -int pin = pci_config_readb(bdf, PCI_INTERRUPT_PIN); +int pin = pci_config_readb_dom(bdf, PCI_INTERRUPT_PIN, pci->domain_nr); if (pin != 0) -pci_config_writeb(bdf, PCI_INTERRUPT_LINE, pci_slot_get_irq(pci, pin)); +pci_config_writeb_dom(bdf, PCI_INTERRUPT_LINE, pci_slot_get_irq(pci, pin), + pci->domain_nr); pci_init_device(pci_device_tbl, pci, NULL); /* enable memory mappings */ -pci_config_maskw(bdf, PCI_COMMAND, 0, - PCI_COMMAND_IO | PCI_COMMAND_MEMORY | PCI_COMMAND_SERR); +pci_config_maskw_dom(bdf, PCI_COMMAND, 0, + PCI_COMMAND_IO | PCI_COMMAND_MEMORY | PCI_COMMAND_SERR, + pci->domain_nr); /* enable SERR# for forwarding */ if (pci->header_type & PCI_HEADER_TYPE_BRIDGE) -pci_config_maskw(bdf, PCI_BRIDGE_CONTROL, 0, - PCI_BRIDGE_CTL_SERR); +pci_config_maskw_dom(bdf, PCI_BRIDGE_CONTROL, 0, + PCI_BRIDGE_CTL_SERR, pci->domain_nr); } -static void pci_bios_init_devices(void) +static void pci_bios_init_devices(int domain_nr) { struct pci_device *pci; foreachpci(pci) { +if (pci->domain_nr != domain_nr) +continue; pci_bios_init_device(pci); } } @@ -520,6 +525,10 @@ static void 
pxb_mem_addr_setup(struct pci_device *dev, void *arg) It seems to be a new function, but I can't find the definition. Can you please point me to it? * read mcfg_base and mcfg_size from it just now. Instead, we directly add * this item to e820 */ e820_add(mcfg_base.val, mcfg_size, E820_RESERVED); + +/* Add PXBHosts so that we can initialize them later */ +
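The *_dom accessors discussed in this patch could dispatch on the domain: domain 0 keeps the legacy 0xcf8/0xcfc mechanism, while other domains resolve a register through a per-domain MMCONFIG base. A hypothetical sketch of that address resolution (the table size, names, and return convention are assumptions, not the patch's actual code):

```c
#include <stdint.h>

#define MAX_PCI_DOMAINS 4

/* Filled in during pxb-pcie setup in this hypothetical scheme. */
static uint64_t domain_mmconfig_base[MAX_PCI_DOMAINS];

/* Resolve the MMIO address of a config register for a given domain.
 * bdf is the usual (bus << 8) | devfn encoding, so each bdf owns a
 * 4 KiB ECAM slot. Returns 0 if the domain has no MMCONFIG window. */
static uint64_t pci_config_addr_dom(int domain, uint16_t bdf, uint16_t reg)
{
    if (domain < 0 || domain >= MAX_PCI_DOMAINS || !domain_mmconfig_base[domain])
        return 0;
    return domain_mmconfig_base[domain] + ((uint64_t)bdf << 12) + (reg & 0xfff);
}
```

A real pci_config_readl_dom() would then dereference this address as a volatile pointer for domain > 0 and fall back to the port-IO path for domain 0.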
Re: [SeaBIOS] [RFC v2 0/3] Support multiple pci domains in pci_device
Hi, On 08/09/2018 08:43 AM, Zihan Yang wrote: Currently seabios assumes there is only one pci domain (0), and almost everything operates on pci domain 0 by default. This patch aims to add multiple pci domain support for pci_device, while preserving the original API for compatibility. This is a necessary addition to support your QEMU patches. Please send a link to the QEMU series on your next re-spin. The reason to get seabios involved is that the pxb-pcie host bus created in QEMU is now in a different PCI domain, and its bus number would start from 0 instead of bus_nr. Actually bus_nr should not be used when in another non-zero domain. That is not necessarily true. As we discussed on the QEMU devel mailing list, it is possible for PCI buses of a different domain to start from a positive bus number. Supporting both bus_nr and domain_nr makes sense. However, QEMU only binds ports 0xcf8 and 0xcfc to bus pcie.0. To avoid bus conflicts, we should use other port pairs for buses under new domains. I would skip support for IO based configuration and use only MMCONFIG for extra root buses. The question remains: how do we assign MMCONFIG space for each PCI domain. Thanks, Marcel Current issues: * when trying to read the config space of pcie_pci_bridge, it actually reads out the result of the mch. I'm working on this weird behavior. 
Changelog: v2 <- v1: - Fix bugs in filtering domains when traversing pci devices - Reformat some hardcoded codes, such as probing the pci device in pci_setup Zihan Yang (3): fw/pciinit: Recognize pxb-pcie-dev device pci_device: Add pci domain support pci: filter undesired domain when traversing pci src/fw/coreboot.c| 2 +- src/fw/csm.c | 2 +- src/fw/mptable.c | 1 + src/fw/paravirt.c| 3 +- src/fw/pciinit.c | 276 ++- src/hw/ahci.c| 1 + src/hw/ata.c | 1 + src/hw/esp-scsi.c| 1 + src/hw/lsi-scsi.c| 1 + src/hw/megasas.c | 1 + src/hw/mpt-scsi.c| 1 + src/hw/nvme.c| 1 + src/hw/pci.c | 69 +++-- src/hw/pci.h | 42 +--- src/hw/pci_ids.h | 6 +- src/hw/pcidevice.c | 11 +- src/hw/pcidevice.h | 8 +- src/hw/pvscsi.c | 1 + src/hw/sdcard.c | 1 + src/hw/usb-ehci.c| 1 + src/hw/usb-ohci.c| 1 + src/hw/usb-uhci.c| 1 + src/hw/usb-xhci.c| 1 + src/hw/virtio-blk.c | 1 + src/hw/virtio-scsi.c | 1 + src/optionroms.c | 3 + 26 files changed, 268 insertions(+), 170 deletions(-)
Re: [SeaBIOS] [PATCH v3 3/3] pci: recognize RH PCI legacy bridge resource reservation capability
On 08/24/2018 11:53 AM, Jing Liu wrote: Enable the firmware recognizing RedHat legacy PCI bridge device ID, so QEMU can reserve additional PCI bridge resource capability. Change the debug level lower to 3 when it is non-QEMU bridge. Signed-off-by: Jing Liu --- src/fw/pciinit.c | 50 +- src/hw/pci_ids.h | 1 + 2 files changed, 30 insertions(+), 21 deletions(-) diff --git a/src/fw/pciinit.c b/src/fw/pciinit.c index 62a32f1..c0634bc 100644 --- a/src/fw/pciinit.c +++ b/src/fw/pciinit.c @@ -525,30 +525,38 @@ static void pci_bios_init_platform(void) static u8 pci_find_resource_reserve_capability(u16 bdf) { -if (pci_config_readw(bdf, PCI_VENDOR_ID) == PCI_VENDOR_ID_REDHAT && -pci_config_readw(bdf, PCI_DEVICE_ID) == -PCI_DEVICE_ID_REDHAT_ROOT_PORT) { -u8 cap = 0; -do { -cap = pci_find_capability(bdf, PCI_CAP_ID_VNDR, cap); -} while (cap && - pci_config_readb(bdf, cap + PCI_CAP_REDHAT_TYPE_OFFSET) != -REDHAT_CAP_RESOURCE_RESERVE); -if (cap) { -u8 cap_len = pci_config_readb(bdf, cap + PCI_CAP_FLAGS); -if (cap_len < RES_RESERVE_CAP_SIZE) { -dprintf(1, "PCI: QEMU resource reserve cap length %d is invalid\n", -cap_len); -return 0; -} -} else { -dprintf(1, "PCI: QEMU resource reserve cap not found\n"); +u16 device_id; + +if (pci_config_readw(bdf, PCI_VENDOR_ID) != PCI_VENDOR_ID_REDHAT) { +dprintf(3, "PCI: This is non-QEMU bridge.\n"); +return 0; +} + +device_id = pci_config_readw(bdf, PCI_DEVICE_ID); + +if (device_id != PCI_DEVICE_ID_REDHAT_ROOT_PORT && +device_id != PCI_DEVICE_ID_REDHAT_BRIDGE) { +dprintf(1, "PCI: QEMU resource reserve cap device ID doesn't match.\n"); +return 0; +} +u8 cap = 0; + +do { +cap = pci_find_capability(bdf, PCI_CAP_ID_VNDR, cap); +} while (cap && + pci_config_readb(bdf, cap + PCI_CAP_REDHAT_TYPE_OFFSET) != + REDHAT_CAP_RESOURCE_RESERVE); +if (cap) { +u8 cap_len = pci_config_readb(bdf, cap + PCI_CAP_FLAGS); +if (cap_len < RES_RESERVE_CAP_SIZE) { +dprintf(1, "PCI: QEMU resource reserve cap length %d is invalid\n", +cap_len); +return 0; } -return cap; } 
else { -dprintf(1, "PCI: QEMU resource reserve cap VID or DID doesn't match.\n"); -return 0; I am sorry for the late review. Did you drop the above line on purpose? Thanks, Marcel +dprintf(1, "PCI: QEMU resource reserve cap not found\n"); } +return cap; } / diff --git a/src/hw/pci_ids.h b/src/hw/pci_ids.h index 38fa2ca..1096461 100644 --- a/src/hw/pci_ids.h +++ b/src/hw/pci_ids.h @@ -2265,6 +2265,7 @@ #define PCI_VENDOR_ID_REDHAT 0x1b36 #define PCI_DEVICE_ID_REDHAT_ROOT_PORT 0x000C +#define PCI_DEVICE_ID_REDHAT_BRIDGE 0x0001 #define PCI_VENDOR_ID_TEKRAM 0x1de1 #define PCI_DEVICE_ID_TEKRAM_DC290 0xdc29
Re: [SeaBIOS] [PATCH v2 2/3] pci: clean up the debug message for pci capability found
On 08/16/2018 05:32 AM, Jing Liu wrote: Improve the debug message when the QEMU resource reserve cap is not found and when the vendor-id or device-id doesn't match the REDHAT special ones. Signed-off-by: Jing Liu --- src/fw/pciinit.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/src/fw/pciinit.c b/src/fw/pciinit.c index d2cea2b..62a32f1 100644 --- a/src/fw/pciinit.c +++ b/src/fw/pciinit.c @@ -541,10 +541,12 @@ static u8 pci_find_resource_reserve_capability(u16 bdf) cap_len); return 0; } +} else { +dprintf(1, "PCI: QEMU resource reserve cap not found\n"); } return cap; } else { -dprintf(1, "PCI: QEMU resource reserve cap not found\n"); +dprintf(1, "PCI: QEMU resource reserve cap VID or DID doesn't match.\n"); return 0; } } Reviewed-by: Marcel Apfelbaum Thanks, Marcel
Re: [SeaBIOS] [PATCH v2 1/3] pci: fix the return value for truncated capability
On 08/16/2018 05:32 AM, Jing Liu wrote: Return zero when finding a truncated capability. Signed-off-by: Jing Liu --- src/fw/pciinit.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/fw/pciinit.c b/src/fw/pciinit.c index 3a2f747..d2cea2b 100644 --- a/src/fw/pciinit.c +++ b/src/fw/pciinit.c @@ -539,6 +539,7 @@ static u8 pci_find_resource_reserve_capability(u16 bdf) if (cap_len < RES_RESERVE_CAP_SIZE) { dprintf(1, "PCI: QEMU resource reserve cap length %d is invalid\n", cap_len); +return 0; } } return cap; Reviewed-by: Marcel Apfelbaum Thanks, Marcel
Re: [SeaBIOS] [PATCH] pci: fix 'io hints' capability for RedHat PCI bridges
Hi, I forgot to CC Kevin and Aleksandr (the original committer). Thanks, Marcel On 11/01/2018 22:15, Marcel Apfelbaum wrote: Commit ec6cb17f (pci: enable RedHat PCI bridges to reserve additional resources on PCI init) added a new vendor specific PCI capability for RedHat PCI bridges allowing them to reserve additional buses and/or IO/MEM space. When adding the IO hints PCI capability to the pcie-root-port without specifying a value for bus reservation, the subordinate bus computation is wrong and the guest kernel gets messed up. Fix it by returning to prev code if the value for bus reservation is not set. Removed also a wrong debug print "PCI: invalid QEMU resource reserve cap offset" which appears if the 'IO hints' capability is not present. Signed-off-by: Marcel Apfelbaum <mar...@redhat.com> --- src/fw/pciinit.c | 14 +- 1 file changed, 5 insertions(+), 9 deletions(-) diff --git a/src/fw/pciinit.c b/src/fw/pciinit.c index 7f0e439..3a2f747 100644 --- a/src/fw/pciinit.c +++ b/src/fw/pciinit.c @@ -540,8 +540,6 @@ static u8 pci_find_resource_reserve_capability(u16 bdf) dprintf(1, "PCI: QEMU resource reserve cap length %d is invalid\n", cap_len); } -} else { -dprintf(1, "PCI: invalid QEMU resource reserve cap offset\n"); } return cap; } else { @@ -619,13 +617,11 @@ pci_bios_init_bus_rec(int bus, u8 *pci_bus) res_bus); res_bus = 0; } -} -if (secbus + res_bus > *pci_bus) { -dprintf(1, "PCI: QEMU resource reserve cap: bus = %u\n", -res_bus); -res_bus = secbus + res_bus; -} else { -res_bus = *pci_bus; +if (secbus + res_bus > *pci_bus) { +dprintf(1, "PCI: QEMU resource reserve cap: bus = %u\n", +res_bus); +res_bus = secbus + res_bus; +} } } dprintf(1, "PCI: subordinate bus = 0x%x -> 0x%x\n", ___ SeaBIOS mailing list SeaBIOS@seabios.org https://mail.coreboot.org/mailman/listinfo/seabios
[SeaBIOS] [PATCH] pci: fix 'io hints' capability for RedHat PCI bridges
Commit ec6cb17f (pci: enable RedHat PCI bridges to reserve additional resources on PCI init) added a new vendor specific PCI capability for RedHat PCI bridges allowing them to reserve additional buses and/or IO/MEM space. When adding the IO hints PCI capability to the pcie-root-port without specifying a value for bus reservation, the subordinate bus computation is wrong and the guest kernel gets messed up. Fix it by returning to prev code if the value for bus reservation is not set. Removed also a wrong debug print "PCI: invalid QEMU resource reserve cap offset" which appears if the 'IO hints' capability is not present. Signed-off-by: Marcel Apfelbaum <mar...@redhat.com> --- src/fw/pciinit.c | 14 +- 1 file changed, 5 insertions(+), 9 deletions(-) diff --git a/src/fw/pciinit.c b/src/fw/pciinit.c index 7f0e439..3a2f747 100644 --- a/src/fw/pciinit.c +++ b/src/fw/pciinit.c @@ -540,8 +540,6 @@ static u8 pci_find_resource_reserve_capability(u16 bdf) dprintf(1, "PCI: QEMU resource reserve cap length %d is invalid\n", cap_len); } -} else { -dprintf(1, "PCI: invalid QEMU resource reserve cap offset\n"); } return cap; } else { @@ -619,13 +617,11 @@ pci_bios_init_bus_rec(int bus, u8 *pci_bus) res_bus); res_bus = 0; } -} -if (secbus + res_bus > *pci_bus) { -dprintf(1, "PCI: QEMU resource reserve cap: bus = %u\n", -res_bus); -res_bus = secbus + res_bus; -} else { -res_bus = *pci_bus; +if (secbus + res_bus > *pci_bus) { +dprintf(1, "PCI: QEMU resource reserve cap: bus = %u\n", +res_bus); +res_bus = secbus + res_bus; +} } } dprintf(1, "PCI: subordinate bus = 0x%x -> 0x%x\n", -- 2.13.5 ___ SeaBIOS mailing list SeaBIOS@seabios.org https://mail.coreboot.org/mailman/listinfo/seabios
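The fix above moves the bus-number bump inside the "capability present and bus reservation set" branch, so an unset reservation no longer corrupts the subordinate bus. The intended arithmetic, restated as a hedged stand-alone sketch (parameter names are illustrative, not the patch's actual variables):

```c
#include <stdint.h>

/* Compute a bridge's subordinate bus number, optionally growing it by a
 * QEMU-provided reservation. res_bus_valid is 0 when the capability is
 * absent or its bus field says "ignore"; in that case the running maximum
 * from the recursive bus scan is kept unchanged. */
static uint8_t subordinate_bus(uint8_t secbus, uint8_t cur_max_bus,
                               int res_bus_valid, uint8_t res_bus)
{
    if (res_bus_valid && secbus + res_bus > cur_max_bus)
        return secbus + res_bus;    /* reserve extra buses behind the bridge */
    return cur_max_bus;             /* no (effective) reservation */
}
```

The pre-fix bug was equivalent to applying the `secbus + res_bus` comparison even when `res_bus_valid` was 0.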
Re: [SeaBIOS] [Qemu-devel] [PATCH v7 1/4] hw/pci: introduce pcie-pci-bridge device
On 20/09/2017 16:57, Eduardo Habkost wrote: On Wed, Sep 20, 2017 at 09:52:01AM +, Aleksandr Bezzubikov wrote: On Wed, Sep 20, 2017 at 10:13, Marcel Apfelbaum <mar...@redhat.com>: On 19/09/2017 23:34, Eduardo Habkost wrote: On Fri, Aug 18, 2017 at 02:36:47AM +0300, Aleksandr Bezzubikov wrote: Introduce a new PCIExpress-to-PCI Bridge device, which is a hot-pluggable PCI Express device and supports device hot-plug with SHPC. This device is intended to replace the DMI-to-PCI Bridge. Signed-off-by: Aleksandr Bezzubikov <zuban...@gmail.com> Reviewed-by: Marcel Apfelbaum <mar...@redhat.com> It's possible to crash QEMU by instantiating this device, with: $ qemu-system-ppc64 -machine prep -device pcie-pci-bridge qemu-system-ppc64: qemu/memory.c:1533: memory_region_finalize: Assertion `!mr->container' failed. Aborted Hi Eduardo, I didn't investigate the root cause. Thanks for reporting it! Aleksandr, can you have a look? Maybe we should not compile the device for ppc arch. (x86 and arm are enough) I will see what we can do. Is x86 and arm really enough? I would investigate the original cause before disabling the device on other architectures, as we could be hiding a bug that's also present in x86. Agreed, it's worth finding out the reason. But the restriction still makes sense. 
Thanks, Marcel The backtrace looks like broken error handling logic somewhere: #0 0x7fffea9ff1f7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56 #1 0x7fffeaa008e8 in __GI_abort () at abort.c:90 #2 0x7fffea9f8266 in __assert_fail_base (fmt=0x7fffeab4ae68 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x55be4ac1 "!mr->container", file=file@entry=0x55be49c4 "/root/qemu/memory.c", line=line@entry=1533, function=function@entry=0x55be5100 <__PRETTY_FUNCTION__.28908> "memory_region_finalize") at assert.c:92 #3 0x7fffea9f8312 in __GI___assert_fail (assertion=assertion@entry=0x55be4ac1 "!mr->container", file=file@entry=0x55be49c4 "/root/qemu/memory.c", line=line@entry=1533, function=function@entry=0x55be5100 <__PRETTY_FUNCTION__.28908> "memory_region_finalize") at assert.c:101 #4 0x557ff2df in memory_region_finalize (obj=) at /root/qemu/memory.c:1533 #5 0x55ae77a2 in object_unref (type=, obj=0x57c00d80) at /root/qemu/qom/object.c:453 #6 0x55ae77a2 in object_unref (data=0x57c00d80) at /root/qemu/qom/object.c:467 #7 0x55ae77a2 in object_unref (obj=0x57c00d80) at /root/qemu/qom/object.c:902 #8 0x55ae67d7 in object_property_del_child (obj=0x57ab6500, child=child@entry=0x57c00d80, errp=0x0) at /root/qemu/qom/object.c:427 #9 0x55ae6ff4 in object_unparent (obj=obj@entry=0x57c00d80) at /root/qemu/qom/object.c:446 #10 0x55a1c94e in shpc_free (d=d@entry=0x57ab6500) at /root/qemu/hw/pci/shpc.c:676 #11 0x55a12560 in pcie_pci_bridge_realize (d=0x57ab6500, errp=0x7fffd530) at /root/qemu/hw/pci-bridge/pcie_pci_bridge.c:84 #12 0x55a18d07 in pci_qdev_realize (qdev=0x57ab6500, errp=0x7fffd5d0) at /root/qemu/hw/pci/pci.c:2024 #13 0x559b53aa in device_set_realized (obj=, value=, errp=0x7fffd708) at /root/qemu/hw/core/qdev.c:914 #14 0x55ae62fe in property_set_bool (obj=0x57ab6500, v=, name=, opaque=0x57ab7b30, errp=0x7fffd708) at /root/qemu/qom/object.c:1886 #15 0x55aea3ef in object_property_set_qobject (obj=obj@entry=0x57ab6500, 
value=value@entry=0x57ab86b0, name=name@entry=0x55c4f217 "realized", errp=errp@entry=0x7fffd708) at /root/qemu/qom/qom-qobject.c:27 #16 0x55ae80a0 in object_property_set_bool (obj=0x57ab6500, value=, name=0x55c4f217 "realized", errp=0x7fffd708) at /root/qemu/qom/object.c:1162 #17 0x55949824 in qdev_device_add (opts=0x567795b0, errp=errp@entry=0x7fffd7e0) at /root/qemu/qdev-monitor.c:630 #18 0x5594be87 in device_init_func (opaque=, opts=, errp=) at /root/qemu/vl.c:2418 #19 0x55bc85ba in qemu_opts_foreach (list=, func=func@entry=0x5594be60 , opaque=opaque@entry=0x0, errp=errp@entry=0x0) at /root/qemu/util/qemu-option.c:1104 #20 0x5579f497 in main (argc=, argv=, envp=) at /root/qemu/vl.c:4745 (gdb) fr 11 #11 0x55a12560 in pcie_pci_bridge_realize (d=0x57ab6500, errp=0x7fffd530) at /root/qemu/hw/pci-bridge/pcie_pci_bridge.c:84 84 shpc_free(d); (gdb) l 79 pcie_aer_exit(d); 80 aer_error: 81 pm_error: 82 pcie_cap_exit(d); 83 cap_error: 84 shpc_free(d); 85 error: 86 pci_bridge_exitfn(d); 87 } 88 (gdb) ___ SeaBIOS mailing list SeaBIOS@seabios.org https://mail.coreboot.org/mailman/listinfo/seabios
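The backtrace shows the classic pitfall with stacked goto cleanup labels: the failing path in pcie_pci_bridge_realize() jumps to a label that frees state which was never initialized on that path (shpc_free() runs even when SHPC setup itself failed). A generic sketch of the correct ordering, where each label sits below the failure it handles and only undoes steps that already succeeded (step names are hypothetical, not QEMU code):

```c
/* Flags standing in for "resource X was initialized". */
static int step_a_done, step_b_done;

static int init_a(int fail) { if (fail) return -1; step_a_done = 1; return 0; }
static int init_b(int fail) { if (fail) return -1; step_b_done = 1; return 0; }
static void undo_a(void)    { step_a_done = 0; }

/* Unwind in reverse order. A failed init_a() jumps *past* undo_a(),
 * so nothing that was never set up gets torn down -- the opposite of
 * what the crashing realize() error chain did with shpc_free(). */
static int realize(int fail_a, int fail_b)
{
    if (init_a(fail_a))
        goto a_error;
    if (init_b(fail_b))
        goto b_error;
    return 0;
b_error:
    undo_a();       /* init_a() succeeded, so it must be undone */
a_error:
    return -1;
}
```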
Re: [SeaBIOS] [Qemu-devel] [PATCH v7 1/4] hw/pci: introduce pcie-pci-bridge device
On 20/09/2017 12:52, Aleksandr Bezzubikov wrote: On Wed, Sep 20, 2017 at 10:13, Marcel Apfelbaum <mar...@redhat.com>: On 19/09/2017 23:34, Eduardo Habkost wrote: > On Fri, Aug 18, 2017 at 02:36:47AM +0300, Aleksandr Bezzubikov wrote: >> Introduce a new PCIExpress-to-PCI Bridge device, >> which is a hot-pluggable PCI Express device and >> supports device hot-plug with SHPC. >> >> This device is intended to replace the DMI-to-PCI Bridge. >> >> Signed-off-by: Aleksandr Bezzubikov <zuban...@gmail.com> >> Reviewed-by: Marcel Apfelbaum <mar...@redhat.com> > > It's possible to crash QEMU by instantiating this device, with: > > $ qemu-system-ppc64 -machine prep -device pcie-pci-bridge > qemu-system-ppc64: qemu/memory.c:1533: memory_region_finalize: Assertion `!mr->container' failed. > Aborted Hi Eduardo, > > I didn't investigate the root cause. > Thanks for reporting it! Aleksandr, can you have a look? Maybe we should not compile the device for ppc arch. (x86 and arm are enough) I will see what we can do. Is x86 and arm really enough? Well, I am being selfish, and it works for me lately :). Seriously speaking, the new generic PCI Express Port was restricted to x86 and arm for reasons I don't remember. Since your work has the same scope, the restriction makes sense. Please grep for CONFIG_PCIE_PORT to convince yourself and to help coding it. Thanks, Marcel Appreciated, Marcel -- Aleksandr Bezzubikov
Re: [SeaBIOS] [Qemu-devel] [PATCH v7 1/4] hw/pci: introduce pcie-pci-bridge device
On 19/09/2017 23:34, Eduardo Habkost wrote: On Fri, Aug 18, 2017 at 02:36:47AM +0300, Aleksandr Bezzubikov wrote: Introduce a new PCIExpress-to-PCI Bridge device, which is a hot-pluggable PCI Express device and supports device hot-plug with SHPC. This device is intended to replace the DMI-to-PCI Bridge. Signed-off-by: Aleksandr Bezzubikov <zuban...@gmail.com> Reviewed-by: Marcel Apfelbaum <mar...@redhat.com> It's possible to crash QEMU by instantiating this device, with: $ qemu-system-ppc64 -machine prep -device pcie-pci-bridge qemu-system-ppc64: qemu/memory.c:1533: memory_region_finalize: Assertion `!mr->container' failed. Aborted Hi Eduardo, I didn't investigate the root cause. Thanks for reporting it! Aleksandr, can you have a look? Maybe we should not compile the device for ppc arch. (x86 and arm are enough) Appreciated, Marcel
Re: [SeaBIOS] [PATCH RESEND v7 0/3] Red Hat PCI bridge resource reserve capability
On 10/09/2017 21:34, Aleksandr Bezzubikov wrote: On Fri, Aug 18, 2017 at 2:33, Aleksandr Bezzubikov: Now PCI bridges get a bus range number on system init, based on the currently plugged devices. That's why when one wants to hotplug another bridge, it needs its child bus, which the parent is unable to provide (speaking about a virtual device). The suggested workaround is to have a vendor-specific capability in Red Hat PCI bridges that contains the number of additional buses to reserve (as well as IO/MEM/PREF space limit hints) on BIOS PCI init. So this capability is intended only for pure QEMU->SeaBIOS usage. Considering all the aforesaid, this series is directly connected with the QEMU series "Generic PCIE-PCI Bridge". Although the new PCI capability is supposed to contain various limits along with the bus number to reserve, for now only its full layout is proposed. And only the bus_reserve field is used in QEMU and BIOS. Limits usage is still a subject for implementation, as for now the main goal of this series is to provide the necessary support from the firmware side for PCIE-PCI bridge hotplug. Changes v6->v7: 0. Resend - fix a bug with incorrect subordinate bus default value. 1. Do not use alignment in case of IO reservation cap usage. 2. Log additional buses reservation events. Changes v5->v6: 1. Remove unnecessary indents and line breaks (addresses Marcel's comments) 2. Count IO/MEM/PREF region size as a maximum of the necessary size and the one provided in the RESOURCE_RESERVE capability (addresses Marcel's comment). 3. Make the cap debug message more detailed (addresses Marcel's comment). 4. Change pref_32 and pref_64 cap fields comment. Changes v4->v5: 1. Rename capability-related #defines 2. Move capability IO/MEM/PREF fields values usage to the regions creation stage (addresses Marcel's comment) 3. The capability layout change: separate pref_mem into pref_mem_32 and pref_mem_64 fields (QEMU side has the same changes) (addresses Laszlo's comment) 4. 
Extract the capability lookup and check to a separate function (addresses Marcel's comment) - despite Marcel's comment, do not extract the field check for -1, since it increases code length and doesn't look nice because of the different field types
5. Fix the capability's comment (addresses Marcel's comment)
6. Fix the 3rd patch message

Changes v3->v4:
1. Use all QEMU PCI capability fields - addresses Michael's comment
2. Changes of the capability layout (the QEMU side has the same changes):
- change reservation field types: bus_res - uint32_t, others - uint64_t
- interpret a -1 value as 'ignore'

Changes v2->v3:
1. Merge commit 2 (Red Hat vendor ID) into commit 4 - addresses Marcel's comment, and add the Generic PCIE Root Port device ID - addresses Michael's comment.
2. Changes of the capability layout (the QEMU side has the same changes):
- add a 'type' field to distinguish multiple RedHat-specific capabilities - addresses Michael's comment
- do not mimic the PCI Config space register layout, but use mutually exclusive, differently sized fields for IO and prefetchable memory limits - addresses Laszlo's comment
- use defines instead of a structure and offsetof - addresses Michael's comment
3. Interpret the 'bus_reserve' field as a minimum necessary range to reserve - addresses Gerd's comment
4. pci_find_capability moved to pci.c - addresses Kevin's comment
5. Move the capability layout header to src/fw/dev-pci.h - addresses Kevin's comment
6. Add the capability documentation - addresses Michael's comment
7. Add capability length and bus_reserve field sanity checks - addresses Michael's comment

Changes v1->v2:
1. New #define for the Red Hat vendor added (addresses Konrad's comment).
2. Refactored the pci_find_capability function (addresses Marcel's comment).
3. Capability reworked:
- data type added;
- reserve space in a structure for IO, memory and prefetchable memory limits.
Aleksandr Bezzubikov (3):
  pci: refactor pci_find_capability to get bdf as the first argument instead of the whole pci_device
  pci: add QEMU-specific PCI capability structure
  pci: enable RedHat PCI bridges to reserve additional resources on PCI init

 src/fw/dev-pci.h   |  53 ++
 src/fw/pciinit.c   | 108 +---
 src/hw/pci.c       |  25
 src/hw/pci.h       |   1 +
 src/hw/pci_ids.h   |   3 ++
 src/hw/pcidevice.c |  24
 src/hw/pcidevice.h |   1 -
Re: [SeaBIOS] [PATCH v7 0/4] Generic PCIE-PCI Bridge
On 18/08/2017 2:36, Aleksandr Bezzubikov wrote: This series introduces a new device - Generic PCI Express to PCI bridge - and also makes all necessary changes to enable hotplug of the bridge itself and of any device into the bridge.

Hi,

Series Tested-by: Marcel Apfelbaum <mar...@redhat.com> (focused on changes from v6)

Michael, will Aleksandr need to re-send it after the freeze? I am asking because the GSOC project is ending in a week or so.

Thanks,
Marcel

Changes v6->v7:
Change the IO/MEM/PREF reservation properties' type to SIZE.

Changes v5->v6:
1. Fix indentation in the cap creation function (addresses Marcel's comment)
2. Simplify capability pref_mem_* fields assignment (addresses Marcel's comment)
3. Documentation fixes:
- fix the mutually exclusive fields definition (addresses Laszlo's comment)
- fix the pcie-pci-bridge usage example (addresses Marcel's comment)

Changes v4->v5:
1. Change the PCIE-PCI Bridge license (addresses Marcel's comment)
2. The capability layout changes (address Laszlo's comments):
- separate pref_mem into pref_mem_32 and pref_mem_64 fields (the SeaBIOS side has the same changes)
- accordingly change the Generic Root Port's properties
3. Do not add the capability to the root port if no valid values are provided (addresses Michael's comment)
4. Rename the capability type to 'RESOURCE_RESERVE' (addresses Marcel's comment)
5. Remove the shpc_present check function (addresses Marcel's comment)
6. Fix the 4th patch message (addresses Michael's comment)
7. The patch for SHPC enabling in the _OSC method has already been merged

Changes v3->v4:
1. PCIE-PCI Bridge device: "msi_enable"->"msi", "shpc"->"shpc_bar", remove local_err, make the "msi" property OnOffAuto; shpc_present() is still here to avoid SHPC_VMSTATE refactoring (address Marcel's comments).
2. Change the QEMU PCI capability layout (the SeaBIOS side has the same changes):
- change reservation field types: bus_res - uint32_t, others - uint64_t
- rename the 'non_pref' and 'pref' fields
- interpret a -1 value as 'ignore'
3.
Use parent_realize in the Generic PCI Express Root Port properly.
4. Fix documentation: fully replace the DMI-PCI bridge references with the new PCIE-PCI bridge, "PCIE"->"PCI Express", small mistakes and typos - address Laszlo's and Marcel's comments.
5. Rename the QEMU PCI cap creation function - addresses Marcel's comment.

Changes v2->v3:
(0). The 'do_not_use' capability field flag is still _not_ in here since we haven't come to a consensus on it yet.
1. Merge commits 5 (bus_reserve property creation) and 6 (property usage) together - addresses Michael's comment.
2. Add the 'bus_reserve' property and the QEMU PCI capability only to the Generic PCIE Root Port - addresses Michael's and Marcel's comments.
3. Change the 'bus_reserve' property's default value to 0 - addresses Michael's comment.
4. Rename the QEMU bridge-specific PCI capability creation function - addresses Michael's comment.
5. Init the whole QEMU PCI capability with zeroes - addresses Michael's and Laszlo's comments.
6. Change the QEMU PCI capability layout (the SeaBIOS side has the same changes):
- add a 'type' field to distinguish multiple RedHat-specific capabilities - addresses Michael's comment
- do not mimic the PCI Config space register layout, but use mutually exclusive, differently sized fields for IO and prefetchable memory limits - addresses Laszlo's comment
7. Correct error handling in the PCIE-PCI bridge realize function.
8. Replace a '2' constant with PCI_CAP_FLAGS in the capability creation function - addresses Michael's comment.
9. Remove a comment on _OSC which isn't correct anymore - addresses Marcel's comment.
10. Add documentation for the Generic PCIE-PCI Bridge and the QEMU PCI capability - addresses Michael's comment.

Changes v1->v2:
1. Enable SHPC for the bridge.
2. Enable SHPC support for the Q35 machine (ACPI stuff).
3. Introduce a PCI capability to help firmware at system init. This allows the bridge to be hotpluggable. For now it's supported only for pcie-root-port.
For now it's supposed to be used with SeaBIOS only; see the corresponding SeaBIOS series "Allow RedHat PCI bridges reserve more buses than necessary during init".

Aleksandr Bezzubikov (4):
  hw/pci: introduce pcie-pci-bridge device
  hw/pci: introduce bridge-only vendor-specific capability to provide some hints to firmware
  hw/pci: add QEMU-specific PCI capability to the Generic PCI Express Root Port
  docs: update documentation considering PCIE-PCI bridge

 docs/pcie.txt                      |  49 +-
 docs/pcie_pci_bridge.txt           | 114 ++
 hw/pci-bridge/Makefile.objs        |   2 +-
 hw/pci-bridge/gen_pcie_root_port.c |  36 +++
 hw/pci-bridge/pcie_pci_bridge.c    | 192 +
 hw/pci/pci_bridge.c                |  46 +
 include/hw/pci/pci.h               |   1
Re: [SeaBIOS] [PATCH v6 0/4] Generic PCIE-PCI Bridge
On 13/08/2017 18:49, Aleksandr Bezzubikov wrote: This series introduces a new device - Generic PCI Express to PCI bridge - and also makes all necessary changes to enable hotplug of the bridge itself and of any device into the bridge.

Hi Aleksandr,

Thanks for all the effort you put into this series. I tested it and succeeded to hotplug a pcie-pci-bridge into a pcie-root-port by reserving an extra bus number, then succeeded to hotplug a NIC into the pcie-pci-bridge (Windows and Linux guests). I also succeeded to reserve more mem/io using the pcie-root-port parameters.

One minor but important comment: while testing it I observed the "reserve" parameters are not "size" values, but integers, and that is not user friendly. It is better to be able to use:
-device pcie-root-port,io-reserve=1k,mem-reserve=4M
For bus numbers even a byte parameter is enough, but for the others "size" is much better. Sorry for not observing this earlier.

Marcel

Changes v5->v6:
1. Fix indentation in the cap creation function (addresses Marcel's comment)
2. Simplify capability pref_mem_* fields assignment (addresses Marcel's comment)
3. Documentation fixes:
- fix the mutually exclusive fields definition (addresses Laszlo's comment)
- fix the pcie-pci-bridge usage example (addresses Marcel's comment)

Changes v4->v5:
1. Change the PCIE-PCI Bridge license (addresses Marcel's comment)
2. The capability layout changes (address Laszlo's comments):
- separate pref_mem into pref_mem_32 and pref_mem_64 fields (the SeaBIOS side has the same changes)
- accordingly change the Generic Root Port's properties
3. Do not add the capability to the root port if no valid values are provided (addresses Michael's comment)
4. Rename the capability type to 'RESOURCE_RESERVE' (addresses Marcel's comment)
5. Remove the shpc_present check function (addresses Marcel's comment)
6. Fix the 4th patch message (addresses Michael's comment)
7. The patch for SHPC enabling in the _OSC method has already been merged

Changes v3->v4:
1.
PCIE-PCI Bridge device: "msi_enable"->"msi", "shpc"->"shpc_bar", remove local_err, make the "msi" property OnOffAuto; shpc_present() is still here to avoid SHPC_VMSTATE refactoring (address Marcel's comments).
2. Change the QEMU PCI capability layout (the SeaBIOS side has the same changes):
- change reservation field types: bus_res - uint32_t, others - uint64_t
- rename the 'non_pref' and 'pref' fields
- interpret a -1 value as 'ignore'
3. Use parent_realize in the Generic PCI Express Root Port properly.
4. Fix documentation: fully replace the DMI-PCI bridge references with the new PCIE-PCI bridge, "PCIE"->"PCI Express", small mistakes and typos - address Laszlo's and Marcel's comments.
5. Rename the QEMU PCI cap creation function - addresses Marcel's comment.

Changes v2->v3:
(0). The 'do_not_use' capability field flag is still _not_ in here since we haven't come to a consensus on it yet.
1. Merge commits 5 (bus_reserve property creation) and 6 (property usage) together - addresses Michael's comment.
2. Add the 'bus_reserve' property and the QEMU PCI capability only to the Generic PCIE Root Port - addresses Michael's and Marcel's comments.
3. Change the 'bus_reserve' property's default value to 0 - addresses Michael's comment.
4. Rename the QEMU bridge-specific PCI capability creation function - addresses Michael's comment.
5. Init the whole QEMU PCI capability with zeroes - addresses Michael's and Laszlo's comments.
6. Change the QEMU PCI capability layout (the SeaBIOS side has the same changes):
- add a 'type' field to distinguish multiple RedHat-specific capabilities - addresses Michael's comment
- do not mimic the PCI Config space register layout, but use mutually exclusive, differently sized fields for IO and prefetchable memory limits - addresses Laszlo's comment
7. Correct error handling in the PCIE-PCI bridge realize function.
8. Replace a '2' constant with PCI_CAP_FLAGS in the capability creation function - addresses Michael's comment.
9. Remove a comment on _OSC which isn't correct anymore - addresses Marcel's comment.
10. Add documentation for the Generic PCIE-PCI Bridge and the QEMU PCI capability - addresses Michael's comment.

Changes v1->v2:
1. Enable SHPC for the bridge.
2. Enable SHPC support for the Q35 machine (ACPI stuff).
3. Introduce a PCI capability to help firmware at system init. This allows the bridge to be hotpluggable. For now it's supported only for pcie-root-port.
For now it's supposed to be used with SeaBIOS only; see the corresponding SeaBIOS series "Allow RedHat PCI bridges reserve more buses than necessary during init".

Aleksandr Bezzubikov (4):
  hw/pci: introduce pcie-pci-bridge device
  hw/pci: introduce bridge-only vendor-specific capability to provide some hints to firmware
  hw/pci: add QEMU-specific PCI capability to the Generic PCI Express Root Port
  docs: update documentation considering PCIE-PCI bridge

 docs/pcie.txt            |  49 +-
 docs/pcie_pci_bridge.txt | 114
Re: [SeaBIOS] [PATCH v6 0/3] Red Hat PCI bridge resource reserve capability
On 13/08/2017 19:03, Aleksandr Bezzubikov wrote: Now PCI bridges get a bus number range at system init, based on currently plugged devices. That's why when one wants to hotplug another bridge, it needs a child bus, which the parent is unable to provide (speaking about a virtual device). The suggested workaround is to have a vendor-specific capability in Red Hat PCI bridges that contains the number of additional buses to reserve (as well as IO/MEM/PREF space limit hints) on BIOS PCI init. So this capability is intended only for pure QEMU->SeaBIOS usage. Considering all of the above, this series is directly connected with the QEMU series "Generic PCIE-PCI Bridge". Although the new PCI capability is supposed to contain various limits along with the bus number to reserve, for now only its full layout is proposed, and only the bus_reserve field is used in QEMU and BIOS. Limits usage is still a subject for implementation, as for now the main goal of this series is to provide the necessary support from the firmware side for PCIE-PCI bridge hotplug.

Changes v5->v6:
1. Remove unnecessary indents and line breaks (addresses Marcel's comments)
2. Count IO/MEM/PREF region size as the maximum of the necessary size and the one provided in the RESOURCE_RESERVE capability (addresses Marcel's comment).
3. Make the cap debug message more detailed (addresses Marcel's comment).
4. Change the pref_32 and pref_64 cap fields comment.

Changes v4->v5:
1. Rename capability-related #defines
2. Move capability IO/MEM/PREF field values usage to the regions creation stage (addresses Marcel's comment)
3. The capability layout change: separate pref_mem into pref_mem_32 and pref_mem_64 fields (the QEMU side has the same changes) (addresses Laszlo's comment)
4. Extract the capability lookup and check to a separate function (addresses Marcel's comment) - despite Marcel's comment, do not extract the field check for -1, since it increases code length and doesn't look nice because of the different field types
5. Fix the capability's comment (addresses Marcel's comment)
6. Fix the 3rd patch message

Changes v3->v4:
1. Use all QEMU PCI capability fields - addresses Michael's comment
2. Changes of the capability layout (the QEMU side has the same changes):
- change reservation field types: bus_res - uint32_t, others - uint64_t
- interpret a -1 value as 'ignore'

Changes v2->v3:
1. Merge commit 2 (Red Hat vendor ID) into commit 4 - addresses Marcel's comment, and add the Generic PCIE Root Port device ID - addresses Michael's comment.
2. Changes of the capability layout (the QEMU side has the same changes):
- add a 'type' field to distinguish multiple RedHat-specific capabilities - addresses Michael's comment
- do not mimic the PCI Config space register layout, but use mutually exclusive, differently sized fields for IO and prefetchable memory limits - addresses Laszlo's comment
- use defines instead of a structure and offsetof - addresses Michael's comment
3. Interpret the 'bus_reserve' field as a minimum necessary range to reserve - addresses Gerd's comment
4. pci_find_capability moved to pci.c - addresses Kevin's comment
5. Move the capability layout header to src/fw/dev-pci.h - addresses Kevin's comment
6. Add the capability documentation - addresses Michael's comment
7. Add capability length and bus_reserve field sanity checks - addresses Michael's comment

Changes v1->v2:
1. New #define for the Red Hat vendor added (addresses Konrad's comment).
2. Refactored the pci_find_capability function (addresses Marcel's comment).
3. Capability reworked:
- data type added;
- reserve space in a structure for IO, memory and prefetchable memory limits.
Aleksandr Bezzubikov (3):
  pci: refactor pci_find_capability to get bdf as the first argument instead of the whole pci_device
  pci: add QEMU-specific PCI capability structure
  pci: enable RedHat PCI bridges to reserve additional resources on PCI init

 src/fw/dev-pci.h    |  53 +++
 src/fw/pciinit.c    | 101 +---
 src/hw/pci.c        |  25 +
 src/hw/pci.h        |   1 +
 src/hw/pci_ids.h    |   3 ++
 src/hw/pcidevice.c  |  24 -
 src/hw/pcidevice.h  |   1 -
 src/hw/virtio-pci.c |   6 ++--
 8 files changed, 181 insertions(+), 33 deletions(-)
 create mode 100644 src/fw/dev-pci.h

Hi,

Series Tested-by: Marcel Apfelbaum <mar...@redhat.com>

Tested with Win10 and Fedora guests, verified the bus/io/mem hints are working correctly.

Thanks,
Marcel
___ SeaBIOS mailing list SeaBIOS@seabios.org https://mail.coreboot.org/mailman/listinfo/seabios
Re: [SeaBIOS] [PATCH v6 3/3] pci: enable RedHat PCI bridges to reserve additional resource on PCI init
p_size;
+}
+break;
+case PCI_REGION_TYPE_PREFMEM:
+    tmp_size = pci_config_readl(bdf, qemu_cap + RES_RESERVE_PREF_MEM_32);
+    tmp_size_64 = (pci_config_readl(bdf, qemu_cap + RES_RESERVE_PREF_MEM_64) |
+        (u64)pci_config_readl(bdf, qemu_cap + RES_RESERVE_PREF_MEM_64 + 4) << 32);
+    if (tmp_size != (u32)-1 && tmp_size_64 == (u64)-1) {
+        size = tmp_size;
+    } else if (tmp_size == (u32)-1 && tmp_size_64 != (u64)-1) {
+        size = tmp_size_64;
+    } else if (tmp_size != (u32)-1 && tmp_size_64 != (u64)-1) {
+        dprintf(1, "PCI: resource reserve cap PREF32 and PREF64"
+                " conflict\n");
+    }
+    break;
+default:
+    break;
+}
+}
 if (pci_region_align(&s->r[type]) > align)
     align = pci_region_align(&s->r[type]);
 u64 sum = pci_region_sum(&s->r[type]);
 int resource_optional = pcie_cap && (type == PCI_REGION_TYPE_IO);
 if (!sum && hotplug_support && !resource_optional)
     sum = align; /* reserve min size for hot-plug */
-u64 size = ALIGN(sum, align);
+if (size > sum) {
+    dprintf(1, "PCI: QEMU resource reserve cap: "
+            "size %08llx type %s\n",
+            size, region_type_name[type]);
+} else {
+    size = sum;
+}
+size = ALIGN(size, align);
 int is64 = pci_bios_bridge_region_is64(&s->r[type], s->bus_dev, type);
 // entry->bar is -1 if the entry represents a bridge region
diff --git a/src/hw/pci_ids.h b/src/hw/pci_ids.h
index 4ac73b4..38fa2ca 100644
--- a/src/hw/pci_ids.h
+++ b/src/hw/pci_ids.h
@@ -2263,6 +2263,9 @@
 #define PCI_DEVICE_ID_KORENIX_JETCARDF0 0x1600
 #define PCI_DEVICE_ID_KORENIX_JETCARDF1 0x16ff

+#define PCI_VENDOR_ID_REDHAT 0x1b36
+#define PCI_DEVICE_ID_REDHAT_ROOT_PORT 0x000C
+
 #define PCI_VENDOR_ID_TEKRAM 0x1de1
 #define PCI_DEVICE_ID_TEKRAM_DC290 0xdc29

Hi Aleksandr,

Reviewed-by: Marcel Apfelbaum <mar...@redhat.com>

Thanks,
Marcel
Re: [SeaBIOS] [PATCH v6 2/3] pci: add QEMU-specific PCI capability structure
On 13/08/2017 19:03, Aleksandr Bezzubikov wrote: On PCI init, PCI bridge devices may need some extra info about the bus number to reserve, and about IO, memory and prefetchable memory limits. QEMU can provide this with a special vendor-specific PCI capability. This capability is intended to be used only for Red Hat PCI bridges, i.e. with QEMU cooperation.

Reviewed-by: Marcel Apfelbaum <mar...@redhat.com>

Thanks,
Marcel

Signed-off-by: Aleksandr Bezzubikov <zuban...@gmail.com>
---
 src/fw/dev-pci.h | 53 +
 1 file changed, 53 insertions(+)
 create mode 100644 src/fw/dev-pci.h

diff --git a/src/fw/dev-pci.h b/src/fw/dev-pci.h
new file mode 100644
index 000..0dc5556
--- /dev/null
+++ b/src/fw/dev-pci.h
@@ -0,0 +1,53 @@
+#ifndef _PCI_CAP_H
+#define _PCI_CAP_H
+
+#include "types.h"
+
+/*
+ * QEMU-specific vendor(Red Hat)-specific capability.
+ * It's intended to provide some hints for firmware to init PCI devices.
+ *
+ * Its structure is shown below:
+ *
+ * Header:
+ *
+ * u8 id;       Standard PCI Capability Header field
+ * u8 next;     Standard PCI Capability Header field
+ * u8 len;      Standard PCI Capability Header field
+ * u8 type;     Red Hat vendor-specific capability type
+ *
+ * Data:
+ *
+ * u32 bus_res; minimum bus number to reserve;
+ *              this is necessary for PCI Express Root Ports
+ *              to support PCI bridge hotplug
+ * u64 io;      IO space to reserve
+ * u32 mem;     non-prefetchable memory to reserve
+ *
+ * At most one of the following two fields may be set to a value
+ * different from 0xFF...F:
+ * u32 prefetchable_mem_32; prefetchable memory to reserve (32-bit MMIO)
+ * u64 prefetchable_mem_64; prefetchable memory to reserve (64-bit MMIO)
+ *
+ * If any field value in the Data section is 0xFF...F,
+ * it means that such kind of reservation is not needed and must be ignored.
+ *
+*/
+
+/* Offset of vendor-specific capability type field */
+#define PCI_CAP_REDHAT_TYPE_OFFSET 3
+
+/* List of valid Red Hat vendor-specific capability types */
+#define REDHAT_CAP_RESOURCE_RESERVE 1
+
+
+/* Offsets of RESOURCE_RESERVE capability fields */
+#define RES_RESERVE_BUS_RES     4
+#define RES_RESERVE_IO          8
+#define RES_RESERVE_MEM         16
+#define RES_RESERVE_PREF_MEM_32 20
+#define RES_RESERVE_PREF_MEM_64 24
+#define RES_RESERVE_CAP_SIZE    32
+
+#endif /* _PCI_CAP_H */
+
Re: [SeaBIOS] [PATCH v5 3/3] pci: enable RedHat PCI bridges to reserve additional resource on PCI init
On 11/08/2017 2:21, Aleksandr Bezzubikov wrote: In the case of a Red Hat Generic PCIE Root Port, reserve additional buses and/or IO/MEM/PREF space, whose values are provided in a vendor-specific capability.

Signed-off-by: Aleksandr Bezzubikov
---
 src/fw/dev-pci.h |   2 +-
 src/fw/pciinit.c | 125 +--
 src/hw/pci_ids.h |   3 ++
 3 files changed, 116 insertions(+), 14 deletions(-)

diff --git a/src/fw/dev-pci.h b/src/fw/dev-pci.h
index cf16b2e..99ccc12 100644
--- a/src/fw/dev-pci.h
+++ b/src/fw/dev-pci.h
@@ -38,7 +38,7 @@
 #define PCI_CAP_REDHAT_TYPE_OFFSET 3

 /* List of valid Red Hat vendor-specific capability types */
-#define REDHAT_CAP_RESOURCE_RESERVE1
+#define REDHAT_CAP_RESOURCE_RESERVE 1

 /* Offsets of RESOURCE_RESERVE capability fields */
diff --git a/src/fw/pciinit.c b/src/fw/pciinit.c
index 864954f..d9aef56 100644
--- a/src/fw/pciinit.c
+++ b/src/fw/pciinit.c
@@ -15,6 +15,7 @@
 #include "hw/pcidevice.h" // pci_probe_devices
 #include "hw/pci_ids.h" // PCI_VENDOR_ID_INTEL
 #include "hw/pci_regs.h" // PCI_COMMAND
+#include "fw/dev-pci.h" // REDHAT_CAP_RESOURCE_RESERVE
 #include "list.h" // struct hlist_node
 #include "malloc.h" // free
 #include "output.h" // dprintf
@@ -522,6 +523,32 @@ static void pci_bios_init_platform(void)
 }
 }

+static u8 pci_find_resource_reserve_capability(u16 bdf)
+{
+    if (pci_config_readw(bdf, PCI_VENDOR_ID) == PCI_VENDOR_ID_REDHAT &&
+        pci_config_readw(bdf, PCI_DEVICE_ID) ==
+            PCI_DEVICE_ID_REDHAT_ROOT_PORT) {
+        u8 cap = 0;
+        do {
+            cap = pci_find_capability(bdf, PCI_CAP_ID_VNDR, cap);
+        } while (cap &&
+                 pci_config_readb(bdf, cap + PCI_CAP_REDHAT_TYPE_OFFSET) !=
+                     REDHAT_CAP_RESOURCE_RESERVE);
+        if (cap) {
+            u8 cap_len = pci_config_readb(bdf, cap + PCI_CAP_FLAGS);
+            if (cap_len < RES_RESERVE_CAP_SIZE) {
+                dprintf(1, "PCI: QEMU resource reserve cap length %d is invalid\n",
+                        cap_len);
+            }
+        } else {
+            dprintf(1, "PCI: invalid QEMU resource reserve cap offset\n");
+        }
+        return cap;
+    } else {
+        dprintf(1, "PCI: QEMU resource reserve cap not found\n");
+        return 0;
+    }
+}

 /*
  * Bus initialization
  */
@@ -578,9 +605,28 @@ pci_bios_init_bus_rec(int bus, u8 *pci_bus)
 pci_bios_init_bus_rec(secbus, pci_bus);
 if (subbus != *pci_bus) {
+    u8 res_bus = 0;
+    u8 cap = pci_find_resource_reserve_capability(bdf);
+
+    if (cap) {
+        u32 tmp_res_bus = pci_config_readl(bdf,
+                                           cap + RES_RESERVE_BUS_RES);
+        if (tmp_res_bus != (u32)-1) {
+            res_bus = tmp_res_bus & 0xFF;
+            if ((u8)(res_bus + secbus) < secbus ||
+                (u8)(res_bus + secbus) < res_bus) {
+                dprintf(1, "PCI: bus_reserve value %d is invalid\n",
+                        res_bus);
+                res_bus = 0;
+            }
+        }
+        res_bus = (*pci_bus > secbus + res_bus) ? *pci_bus
+                                                : secbus + res_bus;
+    }
     dprintf(1, "PCI: subordinate bus = 0x%x -> 0x%x\n",
-            subbus, *pci_bus);
-    subbus = *pci_bus;
+            subbus, res_bus);
+    subbus = res_bus;
+    *pci_bus = res_bus;
 } else {
     dprintf(1, "PCI: subordinate bus = 0x%x\n", subbus);
 }
@@ -844,22 +890,74 @@ static int pci_bios_check_devices(struct pci_bus *busses)
 */
 parent = [0];
 int type;
-u8 pcie_cap = pci_find_capability(s->bus_dev->bdf, PCI_CAP_ID_EXP, 0);
+u16 bdf = s->bus_dev->bdf;
+u8 pcie_cap = pci_find_capability(bdf, PCI_CAP_ID_EXP, 0);
+u8 qemu_cap = pci_find_resource_reserve_capability(bdf);
+
 int hotplug_support = pci_bus_hotplug_support(s, pcie_cap);
 for (type = 0; type < PCI_REGION_TYPE_COUNT; type++) {
 u64 align = (type == PCI_REGION_TYPE_IO) ?
-PCI_BRIDGE_IO_MIN : PCI_BRIDGE_MEM_MIN;
+    PCI_BRIDGE_IO_MIN : PCI_BRIDGE_MEM_MIN;
 if (!pci_bridge_has_region(s->bus_dev, type))
     continue;
-if (pci_region_align(&s->r[type]) > align)
-    align = pci_region_align(&s->r[type]);
-u64 sum = pci_region_sum(&s->r[type]);
-int resource_optional = pcie_cap && (type == PCI_REGION_TYPE_IO);
-if (!sum && hotplug_support && !resource_optional)
-    sum = align; /* reserve min size for hot-plug */
-u64 size = ALIGN(sum, align);
-int is64 =
Re: [SeaBIOS] [PATCH v5 3/3] pci: enable RedHat PCI bridges to reserve additional resource on PCI init
On 11/08/2017 2:21, Aleksandr Bezzubikov wrote: In case of Red Hat Generic PCIE Root Port reserve additional buses and/or IO/MEM/PREF space, which values are provided in a vendor-specific capability. Hi Aleksandr, Signed-off-by: Aleksandr Bezzubikov--- src/fw/dev-pci.h | 2 +- src/fw/pciinit.c | 125 +-- src/hw/pci_ids.h | 3 ++ 3 files changed, 116 insertions(+), 14 deletions(-) diff --git a/src/fw/dev-pci.h b/src/fw/dev-pci.h index cf16b2e..99ccc12 100644 --- a/src/fw/dev-pci.h +++ b/src/fw/dev-pci.h @@ -38,7 +38,7 @@ #define PCI_CAP_REDHAT_TYPE_OFFSET 3 /* List of valid Red Hat vendor-specific capability types */ -#define REDHAT_CAP_RESOURCE_RESERVE1 +#define REDHAT_CAP_RESOURCE_RESERVE 1 Do you need the above chunk? If not, please get rid if it. /* Offsets of RESOURCE_RESERVE capability fields */ diff --git a/src/fw/pciinit.c b/src/fw/pciinit.c index 864954f..d9aef56 100644 --- a/src/fw/pciinit.c +++ b/src/fw/pciinit.c @@ -15,6 +15,7 @@ #include "hw/pcidevice.h" // pci_probe_devices #include "hw/pci_ids.h" // PCI_VENDOR_ID_INTEL #include "hw/pci_regs.h" // PCI_COMMAND +#include "fw/dev-pci.h" // REDHAT_CAP_RESOURCE_RESERVE #include "list.h" // struct hlist_node #include "malloc.h" // free #include "output.h" // dprintf @@ -522,6 +523,32 @@ static void pci_bios_init_platform(void) } } +static u8 pci_find_resource_reserve_capability(u16 bdf) +{ +if (pci_config_readw(bdf, PCI_VENDOR_ID) == PCI_VENDOR_ID_REDHAT && +pci_config_readw(bdf, PCI_DEVICE_ID) == +PCI_DEVICE_ID_REDHAT_ROOT_PORT) { +u8 cap = 0; +do { +cap = pci_find_capability(bdf, PCI_CAP_ID_VNDR, cap); +} while (cap && + pci_config_readb(bdf, cap + PCI_CAP_REDHAT_TYPE_OFFSET) != +REDHAT_CAP_RESOURCE_RESERVE); +if (cap) { +u8 cap_len = pci_config_readb(bdf, cap + PCI_CAP_FLAGS); +if (cap_len < RES_RESERVE_CAP_SIZE) { +dprintf(1, "PCI: QEMU resource reserve cap length %d is invalid\n", +cap_len); +} +} else { +dprintf(1, "PCI: invalid QEMU resource reserve cap offset\n"); +} +return cap; +} else { +dprintf(1, 
"PCI: QEMU resource reserve cap not found\n"); +return 0; +} +} / * Bus initialization @@ -578,9 +605,28 @@ pci_bios_init_bus_rec(int bus, u8 *pci_bus) pci_bios_init_bus_rec(secbus, pci_bus); if (subbus != *pci_bus) { +u8 res_bus = 0; +u8 cap = pci_find_resource_reserve_capability(bdf); + +if (cap) { +u32 tmp_res_bus = pci_config_readl(bdf, +cap + RES_RESERVE_BUS_RES); +if (tmp_res_bus != (u32)-1) { +res_bus = tmp_res_bus & 0xFF; +if ((u8)(res_bus + secbus) < secbus || +(u8)(res_bus + secbus) < res_bus) { +dprintf(1, "PCI: bus_reserve value %d is invalid\n", +res_bus); +res_bus = 0; +} +} +res_bus = (*pci_bus > secbus + res_bus) ? *pci_bus +: secbus + res_bus; +} dprintf(1, "PCI: subordinate bus = 0x%x -> 0x%x\n", -subbus, *pci_bus); -subbus = *pci_bus; +subbus, res_bus); +subbus = res_bus; +*pci_bus = res_bus; } else { dprintf(1, "PCI: subordinate bus = 0x%x\n", subbus); } @@ -844,22 +890,74 @@ static int pci_bios_check_devices(struct pci_bus *busses) */ parent = [0]; int type; -u8 pcie_cap = pci_find_capability(s->bus_dev->bdf, PCI_CAP_ID_EXP, 0); +u16 bdf = s->bus_dev->bdf; +u8 pcie_cap = pci_find_capability(bdf, PCI_CAP_ID_EXP, 0); +u8 qemu_cap = pci_find_resource_reserve_capability(bdf); + int hotplug_support = pci_bus_hotplug_support(s, pcie_cap); for (type = 0; type < PCI_REGION_TYPE_COUNT; type++) { u64 align = (type == PCI_REGION_TYPE_IO) ? -PCI_BRIDGE_IO_MIN : PCI_BRIDGE_MEM_MIN; +PCI_BRIDGE_IO_MIN : PCI_BRIDGE_MEM_MIN; The above chunk is also not needed. if (!pci_bridge_has_region(s->bus_dev, type)) continue; -if (pci_region_align(>r[type]) > align) - align = pci_region_align(>r[type]); -u64 sum = pci_region_sum(>r[type]); -int resource_optional = pcie_cap && (type == PCI_REGION_TYPE_IO); -if (!sum && hotplug_support && !resource_optional) -sum = align; /*
Re: [SeaBIOS] [PATCH v5 4/4] docs: update documentation considering PCIE-PCI bridge
e.0,id=rp3,bus-reserve=1 \
+-device pcie-pci-bridge,id=br1,bus=rp1 \
+-device pcie-pci-bridge,id=br2,bus=rp2 \
+-device e1000,bus=br1,addr=8
+
+Then in the monitor it's OK to execute the next commands:
+device_add pcie-pci-bridge,id=br3,bus=rp3

Please add '\' at the end of the line

+device_add e1000,bus=br2,addr=1
+device_add e1000,bus=br3,addr=1
+
+Here you have:
+ (1) Cold-plugged:
+- Root ports: 1 QEMU generic root port with the capability mentioned above,
+  2 ioh3420 root ports;
+- 2 PCIE-PCI bridges plugged into 2 different root ports;
+- e1000 plugged into the first bridge.
+ (2) Hot-plugged:
+- PCIE-PCI bridge, plugged into the QEMU generic root port;
+- 2 e1000 cards, one plugged into the cold-plugged PCIE-PCI bridge,
+  another plugged into the hot-plugged bridge.
+
+Limitations
+===
+The PCIE-PCI bridge can be hot-plugged only into a pcie-root-port that
+has a proper 'bus-reserve' property value to provide a secondary bus for the
+hot-plugged bridge.
+
+Windows 7 and older versions don't support hot-plugging devices into the PCIE-PCI bridge.
+To enable device hot-plug into the bridge on Linux there are three ways:
+1) Build the shpchp module with this patch http://www.spinics.net/lists/linux-pci/msg63052.html
+2) Use kernel 4.14+ where the patch mentioned above is already merged.
+3) Set the 'msi' property to off - this forces the bridge to use legacy INTx,
+which allows the bridge to notify the OS about hot-plug events without having
+BUSMASTER set.
+
+Implementation
+==
+The PCIE-PCI bridge is based on the PCI-PCI bridge, but also accumulates PCI Express
+features as a PCI Express device (is_express=1).
+

After addressing my and Laszlo's comments:
Reviewed-by: Marcel Apfelbaum <mar...@redhat.com>

Thanks,
Marcel
Re: [SeaBIOS] [PATCH v5 2/4] hw/pci: introduce bridge-only vendor-specific capability to provide some hints to firmware
On 11/08/2017 2:31, Aleksandr Bezzubikov wrote: On PCI init PCI bridges may need some extra info about bus number, IO, memory and prefetchable memory to reserve. QEMU can provide this with a special vendor-specific PCI capability. Hi Aleksandr, I only have a few very small comments, other than that it looks OK to me. Signed-off-by: Aleksandr Bezzubikov <zuban...@gmail.com> Reviewed-by: Marcel Apfelbaum <mar...@redhat.com> --- hw/pci/pci_bridge.c | 54 + include/hw/pci/pci_bridge.h | 24 2 files changed, 78 insertions(+) diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c index 720119b..2495a51 100644 --- a/hw/pci/pci_bridge.c +++ b/hw/pci/pci_bridge.c @@ -408,6 +408,60 @@ void pci_bridge_map_irq(PCIBridge *br, const char* bus_name, br->bus_name = bus_name; } + +int pci_bridge_qemu_reserve_cap_init(PCIDevice *dev, int cap_offset, > + uint32_t bus_reserve, uint64_t io_reserve, Please pay attention to indentation, the above line should be aligned with the above (" + uint32_t mem_non_pref_reserve, + uint32_t mem_pref_32_reserve, + uint64_t mem_pref_64_reserve, + Error **errp) +{ +if (mem_pref_32_reserve != (uint32_t)-1 && +mem_pref_64_reserve != (uint64_t) -1) { Same here +error_setg(errp, + "PCI resource reserve cap: PREF32 and PREF64 conflict"); +return -EINVAL; +} + +if (bus_reserve == (uint32_t)-1 && +io_reserve == (uint64_t)-1 && +mem_non_pref_reserve == (uint32_t)-1 && +mem_pref_32_reserve == (uint32_t)-1 && +mem_pref_64_reserve == (uint64_t)-1) { and here (please go over all the file) +return 0; +} + +size_t cap_len = sizeof(PCIBridgeQemuCap); +PCIBridgeQemuCap cap = { +.len = cap_len, +.type = REDHAT_PCI_CAP_RESOURCE_RESERVE, +.bus_res = bus_reserve, +.io = io_reserve, +.mem = mem_non_pref_reserve, +.mem_pref_32 = (uint32_t)-1, +.mem_pref_64 = (uint64_t)-1 Why not use the values of mem_pref_32_reserve and mem_pref_64_reserve ? You already have checked they are mutually exclusive. 
+}; + +if (mem_pref_32_reserve != (uint32_t)-1 && +mem_pref_64_reserve == (uint64_t)-1) { +cap.mem_pref_32 = mem_pref_32_reserve; +} else if (mem_pref_32_reserve == (uint32_t)-1 && +mem_pref_64_reserve != (uint64_t)-1) { +cap.mem_pref_64 = mem_pref_64_reserve; +} So it seems you don't need the above code at all, right? With the above minor comments, please keep my R-b tag. Thanks, Marcel + +int offset = pci_add_capability(dev, PCI_CAP_ID_VNDR, +cap_offset, cap_len, errp); +if (offset < 0) { +return offset; +} + +memcpy(dev->config + offset + PCI_CAP_FLAGS, +(char *)&cap + PCI_CAP_FLAGS, +cap_len - PCI_CAP_FLAGS); +return 0; +} + static const TypeInfo pci_bridge_type_info = { .name = TYPE_PCI_BRIDGE, .parent = TYPE_PCI_DEVICE, diff --git a/include/hw/pci/pci_bridge.h b/include/hw/pci/pci_bridge.h index ff7cbaa..2d8c635 100644 --- a/include/hw/pci/pci_bridge.h +++ b/include/hw/pci/pci_bridge.h @@ -67,4 +67,28 @@ void pci_bridge_map_irq(PCIBridge *br, const char* bus_name, #define PCI_BRIDGE_CTL_DISCARD_STATUS 0x400 /* Discard timer status */ #define PCI_BRIDGE_CTL_DISCARD_SERR 0x800 /* Discard timer SERR# enable */ +typedef struct PCIBridgeQemuCap { +uint8_t id; /* Standard PCI capability header field */ +uint8_t next; /* Standard PCI capability header field */ +uint8_t len; /* Standard PCI vendor-specific capability header field */ +uint8_t type; /* Red Hat vendor-specific capability type. 
+ Types are defined with REDHAT_PCI_CAP_ prefix */ + +uint32_t bus_res; /* Minimum number of buses to reserve */ +uint64_t io; /* IO space to reserve */ +uint32_t mem; /* Non-prefetchable memory to reserve */ +/* These two fields are mutually exclusive */ +uint32_t mem_pref_32; /* Prefetchable memory to reserve (32-bit MMIO) */ +uint64_t mem_pref_64; /* Prefetchable memory to reserve (64-bit MMIO) */ +} PCIBridgeQemuCap; + +#define REDHAT_PCI_CAP_RESOURCE_RESERVE 1 + +int pci_bridge_qemu_reserve_cap_init(PCIDevice *dev, int cap_offset, + uint32_t bus_reserve, uint64_t io_reserve, + uint32_t mem_non_pref_reserve, + uint32_t mem_pref_32_reserve, + uint64_t mem
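As a reading aid for the v5 patch quoted above, here is a minimal standalone sketch (not QEMU source) of the `PCIBridgeQemuCap` layout and of the pref32/pref64 mutual-exclusion rule that `pci_bridge_qemu_reserve_cap_init()` enforces. The helper `prefetch_fields_valid()` is made up for illustration; the struct fields and the use of all-ones as "unset" come from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout as quoted in the v5 patch: 4 header bytes, then the hint fields.
 * With natural alignment this gives the 32-byte capability body. */
typedef struct PCIBridgeQemuCap {
    uint8_t  id;          /* Standard PCI capability header field */
    uint8_t  next;        /* Standard PCI capability header field */
    uint8_t  len;         /* Vendor-specific capability length */
    uint8_t  type;        /* REDHAT_PCI_CAP_RESOURCE_RESERVE */
    uint32_t bus_res;
    uint64_t io;
    uint32_t mem;
    uint32_t mem_pref_32; /* mutually exclusive with mem_pref_64 */
    uint64_t mem_pref_64;
} PCIBridgeQemuCap;

/* Hypothetical helper mirroring the check in the patch: at most one of the
 * two prefetchable fields may differ from the all-ones "unset" value. */
static int prefetch_fields_valid(uint32_t pref32, uint64_t pref64)
{
    return !(pref32 != (uint32_t)-1 && pref64 != (uint64_t)-1);
}
```

The `offsetof` values line up with the SeaBIOS-side `QEMU_PCI_CAP_*` offsets discussed later in this thread.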
Re: [SeaBIOS] [PATCH v5 1/4] hw/pci: introduce pcie-pci-bridge device
otplug_dev)) { +error_setg(errp, "standard hotplug controller has been disabled for " + "this %s", TYPE_PCIE_PCI_BRIDGE_DEV); +return; +} +shpc_device_hotplug_cb(hotplug_dev, dev, errp); +} + +static void pcie_pci_bridge_hot_unplug_request_cb(HotplugHandler *hotplug_dev, + DeviceState *dev, + Error **errp) +{ +PCIDevice *pci_hotplug_dev = PCI_DEVICE(hotplug_dev); + +if (!shpc_present(pci_hotplug_dev)) { +error_setg(errp, "standard hotplug controller has been disabled for " + "this %s", TYPE_PCIE_PCI_BRIDGE_DEV); +return; +} +shpc_device_hot_unplug_request_cb(hotplug_dev, dev, errp); +} + +static void pcie_pci_bridge_class_init(ObjectClass *klass, void *data) +{ +PCIDeviceClass *k = PCI_DEVICE_CLASS(klass); +DeviceClass *dc = DEVICE_CLASS(klass); +HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(klass); + +k->is_express = 1; +k->is_bridge = 1; +k->vendor_id = PCI_VENDOR_ID_REDHAT; +k->device_id = PCI_DEVICE_ID_REDHAT_PCIE_BRIDGE; +k->realize = pcie_pci_bridge_realize; +k->exit = pcie_pci_bridge_exit; +k->config_write = pcie_pci_bridge_write_config; +dc->vmsd = &pcie_pci_bridge_dev_vmstate; +dc->props = pcie_pci_bridge_dev_properties; +dc->vmsd = &pcie_pci_bridge_dev_vmstate; +dc->reset = &pcie_pci_bridge_reset; +set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories); +hc->plug = pcie_pci_bridge_hotplug_cb; +hc->unplug_request = pcie_pci_bridge_hot_unplug_request_cb; +} + +static const TypeInfo pcie_pci_bridge_info = { +.name = TYPE_PCIE_PCI_BRIDGE_DEV, +.parent = TYPE_PCI_BRIDGE, +.instance_size = sizeof(PCIEPCIBridge), +.class_init = pcie_pci_bridge_class_init, +.interfaces = (InterfaceInfo[]) { +{ TYPE_HOTPLUG_HANDLER }, +{ }, +} +}; + +static void pciepci_register(void) +{ +type_register_static(&pcie_pci_bridge_info); +} + +type_init(pciepci_register); diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h index e598b09..b33a34f 100644 --- a/include/hw/pci/pci.h +++ b/include/hw/pci/pci.h @@ -98,6 +98,7 @@ #define PCI_DEVICE_ID_REDHAT_PXB_PCIE 0x000b #define PCI_DEVICE_ID_REDHAT_PCIE_RP 
0x000c #define PCI_DEVICE_ID_REDHAT_XHCI 0x000d +#define PCI_DEVICE_ID_REDHAT_PCIE_BRIDGE 0x000e #define PCI_DEVICE_ID_REDHAT_QXL 0x0100 #define FMT_PCIBUS PRIx64 Reviewed-by: Marcel Apfelbaum <mar...@redhat.com> Thanks, Marcel ___ SeaBIOS mailing list SeaBIOS@seabios.org https://mail.coreboot.org/mailman/listinfo/seabios
Re: [SeaBIOS] [Qemu-devel] >256 Virtio-net-pci hotplug Devices
On 07/08/2017 22:00, Kinsella, Ray wrote: Hi Marcel, Hi Ray, Please have a look at this thread, I think Laszlo and Paolo found the root cause. https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg01368.html It seems hot-plugging the devices would not help. Thanks, Marcel Yup - I am using Seabios by default. I took all the measures from the Kernel time reported in syslog. As Seabios wasn't exhibiting any obvious scaling problem. Ray K -Original Message- From: Marcel Apfelbaum [mailto:mar...@redhat.com] Sent: Wednesday, August 2, 2017 5:43 AM To: Kinsella, Ray <ray.kinse...@intel.com>; Kevin O'Connor <ke...@koconnor.net> Cc: Tan, Jianfeng <jianfeng@intel.com>; seabios@seabios.org; Michael Tsirkin <m...@redhat.com>; qemu-de...@nongnu.org; Gerd Hoffmann <kra...@redhat.com> Subject: Re: [Qemu-devel] >256 Virtio-net-pci hotplug Devices It is an issue worth looking into, one more question, all the measurements are from OS boot? Do you use SeaBIOS? No problems with the firmware? Thanks, Marcel
Re: [SeaBIOS] [PATCH v4 2/5] hw/pci: introduce pcie-pci-bridge device
On 07/08/2017 19:42, Alexander Bezzubikov wrote: 2017-08-07 19:39 GMT+03:00 Marcel Apfelbaum <mar...@redhat.com>: On 05/08/2017 23:27, Aleksandr Bezzubikov wrote: Introduce a new PCIExpress-to-PCI Bridge device, which is a hot-pluggable PCI Express device and supports devices hot-plug with SHPC. Hi Aleksandr, This device is intended to replace the DMI-to-PCI Bridge in an overwhelming majority of use-cases. Please drop the last line ( ... majority...) It simply replaces it. Signed-off-by: Aleksandr Bezzubikov <zuban...@gmail.com> --- hw/pci-bridge/Makefile.objs | 2 +- hw/pci-bridge/pcie_pci_bridge.c | 212 include/hw/pci/pci.h| 1 + 3 files changed, 214 insertions(+), 1 deletion(-) create mode 100644 hw/pci-bridge/pcie_pci_bridge.c diff --git a/hw/pci-bridge/Makefile.objs b/hw/pci-bridge/Makefile.objs index c4683cf..666db37 100644 --- a/hw/pci-bridge/Makefile.objs +++ b/hw/pci-bridge/Makefile.objs @@ -1,4 +1,4 @@ -common-obj-y += pci_bridge_dev.o +common-obj-y += pci_bridge_dev.o pcie_pci_bridge.o common-obj-$(CONFIG_PCIE_PORT) += pcie_root_port.o gen_pcie_root_port.o common-obj-$(CONFIG_PXB) += pci_expander_bridge.o common-obj-$(CONFIG_XIO3130) += xio3130_upstream.o xio3130_downstream.o diff --git a/hw/pci-bridge/pcie_pci_bridge.c b/hw/pci-bridge/pcie_pci_bridge.c new file mode 100644 index 000..4127725 --- /dev/null +++ b/hw/pci-bridge/pcie_pci_bridge.c @@ -0,0 +1,212 @@ +/* + * QEMU Generic PCIE-PCI Bridge + * + * Copyright (c) 2017 Aleksandr Bezzubikov + * Please replace the below license with: " This work is licensed under the terms of the GNU GPL, version 2 or later. See the COPYING file in the top-level directory." to be sure it matches QEMU's license. 
+ * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this software and associated documentation files (the "Software"), to deal + * in the Software without restriction, including without limitation the rights + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + * copies of the Software, and to permit persons to whom the Software is + * furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN + * THE SOFTWARE. 
+ */ + +#include "qemu/osdep.h" +#include "qapi/error.h" +#include "hw/pci/pci.h" +#include "hw/pci/pci_bus.h" +#include "hw/pci/pci_bridge.h" +#include "hw/pci/msi.h" +#include "hw/pci/shpc.h" +#include "hw/pci/slotid_cap.h" + +typedef struct PCIEPCIBridge { +/*< private >*/ +PCIBridge parent_obj; + +OnOffAuto msi; +MemoryRegion shpc_bar; +/*< public >*/ +} PCIEPCIBridge; + +#define TYPE_PCIE_PCI_BRIDGE_DEV "pcie-pci-bridge" +#define PCIE_PCI_BRIDGE_DEV(obj) \ +OBJECT_CHECK(PCIEPCIBridge, (obj), TYPE_PCIE_PCI_BRIDGE_DEV) + +static void pcie_pci_bridge_realize(PCIDevice *d, Error **errp) +{ +PCIBridge *br = PCI_BRIDGE(d); +PCIEPCIBridge *pcie_br = PCIE_PCI_BRIDGE_DEV(d); +int rc, pos; + +pci_bridge_initfn(d, TYPE_PCI_BUS); + +d->config[PCI_INTERRUPT_PIN] = 0x1; +memory_region_init(&pcie_br->shpc_bar, OBJECT(d), "shpc-bar", + shpc_bar_size(d)); +rc = shpc_init(d, &br->sec_bus, &pcie_br->shpc_bar, 0, errp); +if (rc) { +goto error; +} + +rc = pcie_cap_init(d, 0, PCI_EXP_TYPE_PCI_BRIDGE, 0, errp); +if (rc < 0) { +goto cap_error; +} + +pos = pci_add_capability(d, PCI_CAP_ID_PM, 0, PCI_PM_SIZEOF, errp); +if (pos < 0) { +goto pm_error; +} +d->exp.pm_cap = pos; +pci_set_word(d->config + pos + PCI_PM_PMC, 0x3); + +pcie_cap_arifwd_init(d); +pcie_cap_deverr_init(d); + +rc = pcie_aer_init(d, PCI_ERR_VER, 0x100, PCI_ERR_SIZEOF, errp); +if (rc < 0) { +goto aer_error; +} + +if (pcie_br->msi != ON_OFF_AUTO_OFF) { +rc = msi_init(d, 0, 1, true, true, errp); +if (rc < 0) { +goto msi_error; +} +} +pci_register_bar(d, 0, PCI_BASE_ADDRESS_SPACE_MEMORY | + PCI_BASE_ADDRESS_MEM_TYPE_64, &pcie_br->shpc_bar); +return; + +msi_error: +pcie_aer_exit(d); +aer_err
Re: [SeaBIOS] [PATCH v4 3/5] hw/pci: introduce bridge-only vendor-specific capability to provide some hints to firmware
On 05/08/2017 23:27, Aleksandr Bezzubikov wrote: On PCI init PCI bridges may need some extra info about bus number, IO, memory and prefetchable memory to reserve. QEMU can provide this with a special vendor-specific PCI capability. Hi Aleksandr, Signed-off-by: Aleksandr Bezzubikov <zuban...@gmail.com> --- hw/pci/pci_bridge.c | 29 + include/hw/pci/pci_bridge.h | 21 + 2 files changed, 50 insertions(+) diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c index 720119b..889950d 100644 --- a/hw/pci/pci_bridge.c +++ b/hw/pci/pci_bridge.c @@ -408,6 +408,35 @@ void pci_bridge_map_irq(PCIBridge *br, const char* bus_name, br->bus_name = bus_name; } + +int pci_bridge_qemu_reserve_cap_init(PCIDevice *dev, int cap_offset, + uint32_t bus_reserve, uint64_t io_reserve, + uint64_t non_pref_mem_reserve, + uint64_t pref_mem_reserve, + Error **errp) +{ +size_t cap_len = sizeof(PCIBridgeQemuCap); +PCIBridgeQemuCap cap = { +.len = cap_len, +.type = REDHAT_PCI_CAP_QEMU_RESERVE, I would change the type to: REDHAT_PCI_CAP_RESOURCE_RESERVE QEMU is less important here (I think) than "resource". 
+.bus_res = bus_reserve, +.io = io_reserve, +.mem = non_pref_mem_reserve, +.mem_pref = pref_mem_reserve +}; + +int offset = pci_add_capability(dev, PCI_CAP_ID_VNDR, +cap_offset, cap_len, errp); +if (offset < 0) { +return offset; +} + +memcpy(dev->config + offset + PCI_CAP_FLAGS, +(char *)&cap + PCI_CAP_FLAGS, +cap_len - PCI_CAP_FLAGS); +return 0; +} + static const TypeInfo pci_bridge_type_info = { .name = TYPE_PCI_BRIDGE, .parent = TYPE_PCI_DEVICE, diff --git a/include/hw/pci/pci_bridge.h b/include/hw/pci/pci_bridge.h index ff7cbaa..be565f7 100644 --- a/include/hw/pci/pci_bridge.h +++ b/include/hw/pci/pci_bridge.h @@ -67,4 +67,25 @@ void pci_bridge_map_irq(PCIBridge *br, const char* bus_name, #define PCI_BRIDGE_CTL_DISCARD_STATUS 0x400 /* Discard timer status */ #define PCI_BRIDGE_CTL_DISCARD_SERR 0x800 /* Discard timer SERR# enable */ +typedef struct PCIBridgeQemuCap { +uint8_t id; /* Standard PCI capability header field */ +uint8_t next; /* Standard PCI capability header field */ +uint8_t len; /* Standard PCI vendor-specific capability header field */ +uint8_t type; /* Red Hat vendor-specific capability type. + Types are defined with REDHAT_PCI_CAP_ prefix */ + +uint32_t bus_res; /* Minimum number of buses to reserve */ +uint64_t io; /* IO space to reserve */ +uint64_t mem; /* Non-prefetchable memory to reserve */ +uint64_t mem_pref; /* Prefetchable memory to reserve */ +} PCIBridgeQemuCap; + +#define REDHAT_PCI_CAP_QEMU_RESERVE 1 + +int pci_bridge_qemu_reserve_cap_init(PCIDevice *dev, int cap_offset, + uint32_t bus_reserve, uint64_t io_reserve, + uint64_t non_pref_mem_reserve, + uint64_t pref_mem_reserve, + Error **errp); + #endif /* QEMU_PCI_BRIDGE_H */ With the name change, the layout looks good to me: Reviewed-by: Marcel Apfelbaum <mar...@redhat.com> Thanks, Marcel
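A detail of the patch above worth spelling out: the `memcpy` starts at `PCI_CAP_FLAGS` (byte 2) on both sides because `pci_add_capability()` itself fills in the `id`/`next` header bytes in config space, so only the capability body is copied. Below is a minimal self-contained sketch of that idiom (not QEMU source); the `MiniCap` struct and `copy_cap_body()` helper are made up for illustration, and the byte array stands in for QEMU's `dev->config`.

```c
#include <stdint.h>
#include <string.h>

#define PCI_CAP_FLAGS 2  /* first byte after the id/next header pair */

typedef struct {
    uint8_t  id, next;   /* written by pci_add_capability(), never copied */
    uint8_t  len, type;
    uint32_t bus_res;
} MiniCap;

/* Copy everything after the 2-byte capability header into config space,
 * exactly as the patch does with "(char *)&cap + PCI_CAP_FLAGS". */
static void copy_cap_body(uint8_t *config, int offset,
                          const MiniCap *cap, size_t cap_len)
{
    memcpy(config + offset + PCI_CAP_FLAGS,
           (const char *)cap + PCI_CAP_FLAGS,
           cap_len - PCI_CAP_FLAGS);
}
```

Copying from byte 2 keeps the capability chain (the `next` pointers) entirely under the control of the generic capability-allocation code.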
Re: [SeaBIOS] [PATCH v4 2/5] hw/pci: introduce pcie-pci-bridge device
On 05/08/2017 23:27, Aleksandr Bezzubikov wrote: Introduce a new PCIExpress-to-PCI Bridge device, which is a hot-pluggable PCI Express device and supports devices hot-plug with SHPC. Hi Aleksandr, This device is intended to replace the DMI-to-PCI Bridge in an overwhelming majority of use-cases. Please drop the last line ( ... majority...) It simply replaces it. Signed-off-by: Aleksandr Bezzubikov--- hw/pci-bridge/Makefile.objs | 2 +- hw/pci-bridge/pcie_pci_bridge.c | 212 include/hw/pci/pci.h| 1 + 3 files changed, 214 insertions(+), 1 deletion(-) create mode 100644 hw/pci-bridge/pcie_pci_bridge.c diff --git a/hw/pci-bridge/Makefile.objs b/hw/pci-bridge/Makefile.objs index c4683cf..666db37 100644 --- a/hw/pci-bridge/Makefile.objs +++ b/hw/pci-bridge/Makefile.objs @@ -1,4 +1,4 @@ -common-obj-y += pci_bridge_dev.o +common-obj-y += pci_bridge_dev.o pcie_pci_bridge.o common-obj-$(CONFIG_PCIE_PORT) += pcie_root_port.o gen_pcie_root_port.o common-obj-$(CONFIG_PXB) += pci_expander_bridge.o common-obj-$(CONFIG_XIO3130) += xio3130_upstream.o xio3130_downstream.o diff --git a/hw/pci-bridge/pcie_pci_bridge.c b/hw/pci-bridge/pcie_pci_bridge.c new file mode 100644 index 000..4127725 --- /dev/null +++ b/hw/pci-bridge/pcie_pci_bridge.c @@ -0,0 +1,212 @@ +/* + * QEMU Generic PCIE-PCI Bridge + * + * Copyright (c) 2017 Aleksandr Bezzubikov + * Please replace the below license with: " This work is licensed under the terms of the GNU GPL, version 2 or later. See the COPYING file in the top-level directory." to be sure it matches QEMU's license. 
+ * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this software and associated documentation files (the "Software"), to deal + * in the Software without restriction, including without limitation the rights + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + * copies of the Software, and to permit persons to whom the Software is + * furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN + * THE SOFTWARE. 
+ */ + +#include "qemu/osdep.h" +#include "qapi/error.h" +#include "hw/pci/pci.h" +#include "hw/pci/pci_bus.h" +#include "hw/pci/pci_bridge.h" +#include "hw/pci/msi.h" +#include "hw/pci/shpc.h" +#include "hw/pci/slotid_cap.h" + +typedef struct PCIEPCIBridge { +/*< private >*/ +PCIBridge parent_obj; + +OnOffAuto msi; +MemoryRegion shpc_bar; +/*< public >*/ +} PCIEPCIBridge; + +#define TYPE_PCIE_PCI_BRIDGE_DEV "pcie-pci-bridge" +#define PCIE_PCI_BRIDGE_DEV(obj) \ +OBJECT_CHECK(PCIEPCIBridge, (obj), TYPE_PCIE_PCI_BRIDGE_DEV) + +static void pcie_pci_bridge_realize(PCIDevice *d, Error **errp) +{ +PCIBridge *br = PCI_BRIDGE(d); +PCIEPCIBridge *pcie_br = PCIE_PCI_BRIDGE_DEV(d); +int rc, pos; + +pci_bridge_initfn(d, TYPE_PCI_BUS); + +d->config[PCI_INTERRUPT_PIN] = 0x1; +memory_region_init(&pcie_br->shpc_bar, OBJECT(d), "shpc-bar", + shpc_bar_size(d)); +rc = shpc_init(d, &br->sec_bus, &pcie_br->shpc_bar, 0, errp); +if (rc) { +goto error; +} + +rc = pcie_cap_init(d, 0, PCI_EXP_TYPE_PCI_BRIDGE, 0, errp); +if (rc < 0) { +goto cap_error; +} + +pos = pci_add_capability(d, PCI_CAP_ID_PM, 0, PCI_PM_SIZEOF, errp); +if (pos < 0) { +goto pm_error; +} +d->exp.pm_cap = pos; +pci_set_word(d->config + pos + PCI_PM_PMC, 0x3); + +pcie_cap_arifwd_init(d); +pcie_cap_deverr_init(d); + +rc = pcie_aer_init(d, PCI_ERR_VER, 0x100, PCI_ERR_SIZEOF, errp); +if (rc < 0) { +goto aer_error; +} + +if (pcie_br->msi != ON_OFF_AUTO_OFF) { +rc = msi_init(d, 0, 1, true, true, errp); +if (rc < 0) { +goto msi_error; +} +} +pci_register_bar(d, 0, PCI_BASE_ADDRESS_SPACE_MEMORY | + PCI_BASE_ADDRESS_MEM_TYPE_64, &pcie_br->shpc_bar); +return; + +msi_error: +pcie_aer_exit(d); +aer_error: +pm_error: +pcie_cap_exit(d); +cap_error: +shpc_free(d); +error: +pci_bridge_exitfn(d); +} + +static void pcie_pci_bridge_exit(PCIDevice *d) +{ +PCIEPCIBridge *bridge_dev = PCIE_PCI_BRIDGE_DEV(d); +pcie_cap_exit(d); +shpc_cleanup(d, &bridge_dev->shpc_bar); +pci_bridge_exitfn(d); +} + +static void
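The `realize` function quoted above uses the cascading-label error-unwind idiom: each failed init step jumps to a label that tears down only the steps that already succeeded, in reverse order. Here is a minimal simulation of that control flow (not QEMU source — the "hardware" is just a log string, and `bridge_realize_sim()` is a made-up stand-in whose step names loosely mirror the patch):

```c
#include <string.h>

static char log_buf[128];
static void note(const char *s) { strcat(log_buf, s); }

/* fail_at selects which init step fails (1-based); 0 means all succeed.
 * On failure, cleanup runs through the labels in reverse init order. */
static int bridge_realize_sim(int fail_at)
{
    log_buf[0] = '\0';
    if (fail_at == 1) goto error;
    note("shpc ");
    if (fail_at == 2) goto cap_error;
    note("pcie-cap ");
    if (fail_at == 3) goto aer_error;
    note("aer ");
    if (fail_at == 4) goto msi_error;
    note("msi ");
    return 0;

msi_error:
    note("aer-exit ");       /* undo step 3 */
aer_error:
    note("pcie-cap-exit ");  /* undo step 2 */
cap_error:
    note("shpc-free ");      /* undo step 1 */
error:
    note("bridge-exit");     /* undo pci_bridge_initfn() */
    return -1;
}
```

The key property: a failure at step N falls through every label below it, so cleanup order is always the exact reverse of init order.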
Re: [SeaBIOS] [PATCH v4 2/3] pci: add QEMU-specific PCI capability structure
On 07/08/2017 19:02, Alexander Bezzubikov wrote: 2017-08-07 18:52 GMT+03:00 Marcel Apfelbaum <mar...@redhat.com>: On 05/08/2017 23:29, Aleksandr Bezzubikov wrote: On PCI init PCI bridge devices may need some extra info about bus number to reserve, IO, memory and prefetchable memory limits. QEMU can provide this with special vendor-specific PCI capability. This capability is intended to be used only for Red Hat PCI bridges, i.e. QEMU cooperation. Signed-off-by: Aleksandr Bezzubikov <zuban...@gmail.com> --- src/fw/dev-pci.h | 50 ++ 1 file changed, 50 insertions(+) create mode 100644 src/fw/dev-pci.h diff --git a/src/fw/dev-pci.h b/src/fw/dev-pci.h new file mode 100644 index 000..2c8ddb0 Hi Aleksandr, Hi Marcel, --- /dev/null +++ b/src/fw/dev-pci.h @@ -0,0 +1,50 @@ +#ifndef _PCI_CAP_H +#define _PCI_CAP_H + +#include "types.h" + +/* + Please use the standard comment: /* * * */ +QEMU-specific vendor(Red Hat)-specific capability. +It's intended to provide some hints for firmware to init PCI devices. + +Its structure is shown below: + +Header: + +u8 id; Standard PCI Capability Header field +u8 next; Standard PCI Capability Header field +u8 len; Standard PCI Capability Header field +u8 type; Red Hat vendor-specific capability type: + now only REDHAT_CAP_TYP_QEMU=1 exists Typo o the line before, but I think you don't need it there. +Data: + +u32 bus_res;minimum bus number to reserve; +this is necessary for PCI Express Root Ports +to support PCIE-to-PCI bridge hotplug I would add a broader class of usage: necessary for nesting PCI bridges hotplug. +u64 io; IO space to reserve +u64 mem;non-prefetchable memory space to reserve +u64 prefetchable_mem; prefetchable memory space to reserve + Layout looks good to me. +If any field value in Data section is -1, +it means that such kind of reservation +is not needed and must be ignored. + -1 is not a valid value for unsigned fields, you may want to say 0xff..f or some other way. 
I meant cast to unsigned here (because we still use unsigned types), but if it can mislead someone I will change this. We should not document signed values for unsigned fields, even if the reason is "best practices." +*/ + +/* Offset of vendor-specific capability type field */ +#define PCI_CAP_REDHAT_TYPE 3 May I ask why '3'? I am not against it, I just want to understand the number. This is actually an offset to this field OK, so it should end with 'offset' to be clear. I was misled. + +/* List of valid Red Hat vendor-specific capability types */ +#define REDHAT_CAP_TYPE_QEMU 1 I think I pointed this in another thread, the name is too vague, please change it to something like: REDHAT_CAP_RES_RESERVE_QEMU that narrows down the intent. What does the first 'RES' mean? Resource. I don't mind you change it how you see fit, just make it clear what it does. It is about resource reservation, not a general capability. Thanks, Marcel + + +/* Offsets of QEMU capability fields */ +#define QEMU_PCI_CAP_BUS_RES 4 +#define QEMU_PCI_CAP_LIMITS_OFFSET 8 +#define QEMU_PCI_CAP_IO 8 +#define QEMU_PCI_CAP_MEM 16 +#define QEMU_PCI_CAP_PREF_MEM 24 +#define QEMU_PCI_CAP_SIZE 32 + +#endif /* _PCI_CAP_H */ The layout looks good to me. Thanks, Marcel
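To make the SeaBIOS-side offsets in the quoted `src/fw/dev-pci.h` concrete, here is a small sketch of reading the capability fields out of a raw little-endian config-space dump. The `read_le*()` helpers and `cap_bus_res()` are made up for illustration (SeaBIOS would use its `pci_config_read*()` accessors instead); the `#define` values come from the patch.

```c
#include <stdint.h>

#define PCI_CAP_REDHAT_TYPE        3   /* offset of the type byte */
#define REDHAT_CAP_TYPE_QEMU       1
#define QEMU_PCI_CAP_BUS_RES       4
#define QEMU_PCI_CAP_IO            8
#define QEMU_PCI_CAP_MEM           16
#define QEMU_PCI_CAP_PREF_MEM      24
#define QEMU_PCI_CAP_SIZE          32

/* Config space is little-endian regardless of host byte order */
static uint32_t read_le32(const uint8_t *p)
{
    return p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static uint64_t read_le64(const uint8_t *p)
{
    return read_le32(p) | ((uint64_t)read_le32(p + 4) << 32);
}

/* Returns the bus_res hint, or 0 if the cap is not the QEMU reserve type */
static uint32_t cap_bus_res(const uint8_t *cap)
{
    if (cap[PCI_CAP_REDHAT_TYPE] != REDHAT_CAP_TYPE_QEMU)
        return 0;
    return read_le32(cap + QEMU_PCI_CAP_BUS_RES);
}
```

Note that `QEMU_PCI_CAP_LIMITS_OFFSET` equals `QEMU_PCI_CAP_IO`: the three 64-bit limits form a contiguous block starting at byte 8, which is what lets the patch later index them as `LIMITS_OFFSET + entry->type * 8`.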
Re: [SeaBIOS] [PATCH v4 3/3] pci: enable RedHat PCI bridges to reserve additional buses on PCI init
On 05/08/2017 23:29, Aleksandr Bezzubikov wrote: In case of Red Hat Generic PCIE Root Port reserve additional buses, which number is provided in a vendor-specific capability. Hi Aleksandr, It seems the subject/commit description does not cover all that the patch does, not it also deals with other resources as well. Signed-off-by: Aleksandr Bezzubikov--- src/fw/pciinit.c | 69 src/hw/pci_ids.h | 3 +++ 2 files changed, 68 insertions(+), 4 deletions(-) diff --git a/src/fw/pciinit.c b/src/fw/pciinit.c index 864954f..d241d66 100644 --- a/src/fw/pciinit.c +++ b/src/fw/pciinit.c @@ -15,6 +15,7 @@ #include "hw/pcidevice.h" // pci_probe_devices #include "hw/pci_ids.h" // PCI_VENDOR_ID_INTEL #include "hw/pci_regs.h" // PCI_COMMAND +#include "fw/dev-pci.h" // qemu_pci_cap #include "list.h" // struct hlist_node #include "malloc.h" // free #include "output.h" // dprintf @@ -578,9 +579,42 @@ pci_bios_init_bus_rec(int bus, u8 *pci_bus) pci_bios_init_bus_rec(secbus, pci_bus); if (subbus != *pci_bus) { +u8 res_bus = 0; +if (pci_config_readw(bdf, PCI_VENDOR_ID) == PCI_VENDOR_ID_REDHAT && +pci_config_readw(bdf, PCI_DEVICE_ID) == +PCI_DEVICE_ID_REDHAT_ROOT_PORT) { I think I already pointed out you should extract the code receiving the limit into a different function. Also now you have a chance to re-use the code for IO/MEM resources. +u8 cap; +do { +cap = pci_find_capability(bdf, PCI_CAP_ID_VNDR, 0); Maybe I missed something, but how would the do-while will work if you always use pci_find_capability with offset 0. It will always start the search from 0 and find the same (first) capability, right? 
Maybe you need: cap = pci_find_capability(bdf, PCI_CAP_ID_VNDR, cap); +} while (cap && + pci_config_readb(bdf, cap + PCI_CAP_REDHAT_TYPE) != +REDHAT_CAP_TYPE_QEMU); +if (cap) { +u8 cap_len = pci_config_readb(bdf, cap + PCI_CAP_FLAGS); +if (cap_len != QEMU_PCI_CAP_SIZE) { +dprintf(1, "PCI: QEMU cap length %d is invalid\n", +cap_len); +} else { +u32 tmp_res_bus = pci_config_readl(bdf, + cap + QEMU_PCI_CAP_BUS_RES); +if (tmp_res_bus != (u32)-1) { I would extract the above check into a separate function to make code more readable pci_qemu_res_cap_set(cap) { return cap != (u32)-1 } then the code will look like: if(pci_qemu_res_cap_set(res_bus)) { +res_bus = tmp_res_bus & 0xFF; +if ((u8)(res_bus + secbus) < secbus || +(u8)(res_bus + secbus) < res_bus) { +dprintf(1, "PCI: bus_reserve value %d is invalid\n", +res_bus); +res_bus = 0; +} +} +} +} +res_bus = (*pci_bus > secbus + res_bus) ? *pci_bus + : secbus + res_bus; +} dprintf(1, "PCI: subordinate bus = 0x%x -> 0x%x\n", -subbus, *pci_bus); -subbus = *pci_bus; +subbus, res_bus); +subbus = res_bus; +*pci_bus = res_bus; } else { dprintf(1, "PCI: subordinate bus = 0x%x\n", subbus); } @@ -951,11 +985,38 @@ pci_region_map_one_entry(struct pci_region_entry *entry, u64 addr) u16 bdf = entry->dev->bdf; u64 limit = addr + entry->size - 1; + +if (pci_config_readw(bdf, PCI_VENDOR_ID) == PCI_VENDOR_ID_REDHAT && +pci_config_readw(bdf, PCI_DEVICE_ID) == +PCI_DEVICE_ID_REDHAT_ROOT_PORT) { +u8 cap; +do { +cap = pci_find_capability(bdf, PCI_CAP_ID_VNDR, 0); +} while (cap && + pci_config_readb(bdf, cap + PCI_CAP_REDHAT_TYPE) != +REDHAT_CAP_TYPE_QEMU); +if (cap) { +u8 cap_len = pci_config_readb(bdf, cap + PCI_CAP_FLAGS); +if (cap_len != QEMU_PCI_CAP_SIZE) { +dprintf(1, "PCI: QEMU cap length %d is invalid\n", +cap_len); The above code should be re-used. 
+} else { +u32 offset = cap + QEMU_PCI_CAP_LIMITS_OFFSET + entry->type * 8; +u64 tmp_limit = (pci_config_readl(bdf, offset) | +(u64)pci_config_readl(bdf, offset + 4) << 32); +if (tmp_limit != (u64)-1) { +
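The do-while issue Marcel flags above is worth demonstrating: calling `pci_find_capability(bdf, PCI_CAP_ID_VNDR, 0)` inside the loop restarts the search from the top every iteration, so it re-finds the same first vendor capability forever; passing the previous result advances the walk. Below is a self-contained sketch where `find_cap()` is a mock of the SeaBIOS helper over a fake config-space array (the offsets and chain layout are invented for the test, not from any real device):

```c
#include <stdint.h>

#define PCI_CAP_ID_VNDR 0x09

/* Mock of pci_find_capability(): walk the standard capability list where
 * config[off] is the cap id and config[off+1] the next pointer; a start
 * of 0 means "begin at the capabilities pointer at 0x34". */
static uint8_t find_cap(const uint8_t *config, uint8_t cap_id, uint8_t start)
{
    uint8_t off = start ? config[start + 1] : config[0x34];
    while (off) {
        if (config[off] == cap_id)
            return off;
        off = config[off + 1];
    }
    return 0;
}

/* Corrected loop per the review: feed the previous hit back in as the
 * start offset until a vendor cap with the wanted Red Hat type byte
 * (at offset 3 inside the cap) is found. */
static uint8_t find_redhat_cap(const uint8_t *config, uint8_t type)
{
    uint8_t cap = 0;
    do {
        cap = find_cap(config, PCI_CAP_ID_VNDR, cap);
    } while (cap && config[cap + 3] != type);
    return cap;
}
```

With two vendor capabilities in the chain, the corrected loop skips the first (wrong-typed) one, whereas restarting from 0 can never get past it.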
Re: [SeaBIOS] [PATCH v4 2/3] pci: add QEMU-specific PCI capability structure
On 05/08/2017 23:29, Aleksandr Bezzubikov wrote: On PCI init PCI bridge devices may need some extra info about bus number to reserve, IO, memory and prefetchable memory limits. QEMU can provide this with special vendor-specific PCI capability. This capability is intended to be used only for Red Hat PCI bridges, i.e. QEMU cooperation. Signed-off-by: Aleksandr Bezzubikov--- src/fw/dev-pci.h | 50 ++ 1 file changed, 50 insertions(+) create mode 100644 src/fw/dev-pci.h diff --git a/src/fw/dev-pci.h b/src/fw/dev-pci.h new file mode 100644 index 000..2c8ddb0 Hi Aleksandr, --- /dev/null +++ b/src/fw/dev-pci.h @@ -0,0 +1,50 @@ +#ifndef _PCI_CAP_H +#define _PCI_CAP_H + +#include "types.h" + +/* + Please use the standard comment: /* * * */ +QEMU-specific vendor(Red Hat)-specific capability. +It's intended to provide some hints for firmware to init PCI devices. + +Its structure is shown below: + +Header: + +u8 id; Standard PCI Capability Header field +u8 next; Standard PCI Capability Header field +u8 len; Standard PCI Capability Header field +u8 type; Red Hat vendor-specific capability type: + now only REDHAT_CAP_TYP_QEMU=1 exists Typo o the line before, but I think you don't need it there. +Data: + +u32 bus_res;minimum bus number to reserve; +this is necessary for PCI Express Root Ports +to support PCIE-to-PCI bridge hotplug I would add a broader class of usage: necessary for nesting PCI bridges hotplug. +u64 io; IO space to reserve +u64 mem;non-prefetchable memory space to reserve +u64 prefetchable_mem; prefetchable memory space to reserve + Layout looks good to me. +If any field value in Data section is -1, +it means that such kind of reservation +is not needed and must be ignored. + -1 is not a valid value for unsigned fields, you may want to say 0xff..f or some other way. +*/ + +/* Offset of vendor-specific capability type field */ +#define PCI_CAP_REDHAT_TYPE 3 May I ask why why '3'? I am not against it, I just want to understand the number. 
+ +/* List of valid Red Hat vendor-specific capability types */ +#define REDHAT_CAP_TYPE_QEMU 1 I think I pointed this in another thread, the name is too vague, please change it to something like: REDHAT_CAP_RES_RESERVE_QEMU that narrows down the intent. + + +/* Offsets of QEMU capability fields */ +#define QEMU_PCI_CAP_BUS_RES 4 +#define QEMU_PCI_CAP_LIMITS_OFFSET 8 +#define QEMU_PCI_CAP_IO 8 +#define QEMU_PCI_CAP_MEM 16 +#define QEMU_PCI_CAP_PREF_MEM 24 +#define QEMU_PCI_CAP_SIZE 32 + +#endif /* _PCI_CAP_H */ The layout looks good to me. Thanks, Marcel
Re: [SeaBIOS] [PATCH v3 2/3] pci: add QEMU-specific PCI capability structure
On 04/08/2017 23:47, Alexander Bezzubikov wrote: 2017-08-04 23:28 GMT+03:00 Laszlo Ersek <ler...@redhat.com>: On 08/04/17 20:59, Alexander Bezzubikov wrote: 2017-08-01 20:28 GMT+03:00 Alexander Bezzubikov <zuban...@gmail.com>: 2017-08-01 16:38 GMT+03:00 Marcel Apfelbaum <mar...@redhat.com>: On 31/07/2017 22:01, Alexander Bezzubikov wrote: 2017-07-31 21:57 GMT+03:00 Michael S. Tsirkin <m...@redhat.com>: On Mon, Jul 31, 2017 at 09:54:55PM +0300, Alexander Bezzubikov wrote: 2017-07-31 17:09 GMT+03:00 Marcel Apfelbaum <mar...@redhat.com>: On 31/07/2017 17:00, Michael S. Tsirkin wrote: On Sat, Jul 29, 2017 at 02:34:31AM +0300, Aleksandr Bezzubikov wrote: On PCI init PCI bridge devices may need some extra info about bus number to reserve, IO, memory and prefetchable memory limits. QEMU can provide this with special vendor-specific PCI capability. This capability is intended to be used only for Red Hat PCI bridges, i.e. QEMU cooperation. Signed-off-by: Aleksandr Bezzubikov <zuban...@gmail.com> --- src/fw/dev-pci.h | 62 1 file changed, 62 insertions(+) create mode 100644 src/fw/dev-pci.h diff --git a/src/fw/dev-pci.h b/src/fw/dev-pci.h new file mode 100644 index 000..fbd49ed --- /dev/null +++ b/src/fw/dev-pci.h @@ -0,0 +1,62 @@ +#ifndef _PCI_CAP_H +#define _PCI_CAP_H + +#include "types.h" + +/* + +QEMU-specific vendor(Red Hat)-specific capability. +It's intended to provide some hints for firmware to init PCI devices. 
+ +Its structure is shown below: + +Header: + +u8 id; Standard PCI Capability Header field +u8 next; Standard PCI Capability Header field +u8 len; Standard PCI Capability Header field +u8 type; Red Hat vendor-specific capability type: + now only REDHAT_QEMU_CAP 1 exists +Data: + +u16 non_prefetchable_16; non-prefetchable memory limit + +u8 bus_res; minimum bus number to reserve; + this is necessary for PCI Express Root Ports + to support PCIE-to-PCI bridge hotplug + +u8 io_8; IO limit in case of 8-bit limit value +u32 io_32; IO limit in case of 16-bit limit value + io_8 and io_32 are mutually exclusive, in other words, + they can't be non-zero simultaneously + +u32 prefetchable_32; non-prefetchable memory limit + in case of 32-bit limit value +u64 prefetchable_64; non-prefetchable memory limit + in case of 64-bit limit value + prefetchable_32 and prefetchable_64 are + mutually exclusive, in other words, + they can't be non-zero simultaneously +If any field in Data section is 0, +it means that such kind of reservation +is not needed. I really don't like this 'mutually exclusive' fields approach because IMHO it increases the confusion level when understanding this capability structure. But - if we came to consensus on that, then IO fields should be used in the same way, because as I understand, this 'mutual exclusivity' serves to distinguish cases when we employ only *_LIMIT register and both *_LIMIT and UPPER_*_LIMIT registers. And this is how both IO and PREFETCHABLE work, isn't it? I would just use simple 64 bit registers. PCI spec has an ugly format with fields spread all over the place but that is because of compatibility concerns. It makes no sense to spend cycles just to be similarly messy. Then I suggest to use exactly one field of a maximum possible size for each reserving object, and get rid of mutually exclusive fields. 
Then it can be something like that (order and names can be changed): u8 bus; u16 non_pref; u32 io; u64 pref; I think Michael suggested: u64 bus_res; u64 mem_res; u64 io_res; u64 mem_pref_res; OR: u32 bus_res; u32 mem_res; u32 io_res; u64 mem_pref_res; We can use 0XFFF..F as "not-set" value "merging" Gerd's and Michael's requests. Let's dwell on the second option (with -1 as 'ignore' sign), if no new objections. BTW, talking about limit values provided in the capability - do we want to completely override existing PCI resources allocation mechanism being used in SeaBIOS, I mean, to assign capability values hardly, not taking into consideration any existing checks, or somehow make this process soft (not an obvious way, can lead to another big discussion)? In other words, how do we plan to use IO/MEM/PREF limits provided in this capability in application to the PCIE Root Port, what result is this supposed to achieve? I think Gerd spoke about this earlier: when determining a given kind of aperture for a given bridge, pick the maximum of: - the actual cumulative need of the devices behind the bridge, and - the hint for the given kind of aperture. So basically, do the same thing as before, but if the hint is larger, grow the
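The sizing rule discussed above — take the maximum of the computed need and the firmware hint, with an all-ones hint meaning "no reservation requested" — fits in a few lines. This sketch is an illustration of that policy only, not SeaBIOS code; `window_size()` is a made-up name:

```c
#include <stdint.h>

/* For one bridge aperture (IO, MEM, or prefetchable MEM):
 * - hint == 0xFF..F means the field was not set, use the need as-is;
 * - otherwise the window is grown to the hint, never shrunk below the
 *   cumulative need of the devices actually behind the bridge. */
static uint64_t window_size(uint64_t computed_need, uint64_t hint)
{
    if (hint == (uint64_t)-1)
        return computed_need;
    return computed_need > hint ? computed_need : hint;
}
```

The point of the "max" rather than a plain override is that a too-small hint can never break enumeration of the devices that are already present.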
Re: [SeaBIOS] [Qemu-devel] [PATCH v3 5/5] docs: update documentation considering PCIE-PCI bridge
On 03/08/2017 16:58, Laine Stump wrote: On 08/03/2017 06:29 AM, Marcel Apfelbaum wrote: On 03/08/2017 5:41, Laine Stump wrote: On 08/02/2017 01:58 PM, Marcel Apfelbaum wrote: On 02/08/2017 19:26, Michael S. Tsirkin wrote: On Wed, Aug 02, 2017 at 06:36:29PM +0300, Marcel Apfelbaum wrote: Can dmi-pci support shpc? why doesn't it? For compatibility? I don't know why, but the fact that it doesn't is the reason libvirt settled on auto-creating a dmi-pci bridge and a pci-pci bridge under that for Q35. The reasoning was (IIRC Laine's words correctly) that the dmi-pci bridge cannot receive hotplugged devices, while the pci-pci bridge cannot be connected to the root complex. So both were needed. Hi Laine, At least that's what I was told :-) (seriously, 50% of the convoluted rules encoded into libvirt's PCI bus topology construction and connection rules come from trial and error, and the other 50% come from advice and recommendations from others who (unlike me) actually know something about PCI.) Of course the whole setup of plugging a pci-bridge into a dmi-to-pci-bridge was (at the time at least) an exercise in futility, since hotplug didn't work properly on pci-bridge+Q35 anyway (that initially wasn't explained to me; it was only after I had constructed the odd bus topology and it was in released code that someone told me "Oh, by the way, hotplug to pci-bridge doesn't work on Q35". At first it was described as a bug, then later reclassified as a future feature.) (I guess the upside is that all of the horrible complex/confusing code needed to automatically add two controllers just to plug in a single endpoint is now already in the code, and will "just work" if/when needed). 
Now that I go back to look at this thread (qemu-devel is just too much for me to try and read unless something has been Cc'ed to me - I really don't know how you guys manage it!), I see that pcie-pci-bridge has been implemented, and we (libvirt) will want to use that instead of dmi-to-pci-bridge when available. And pcie-pci-bridge itself can have endpoints hotplugged into it, correct?

Yes.

This means there will need to be patches for libvirt that check for the presence of pcie-pci-bridge, and if it's found they will replace any auto-added dmi-to-pci-bridge+pci-bridge with a lone pcie-pci-bridge.

The PCIe-PCI bridge is to be plugged into a PCIe Root Port, and then you can add PCI devices to it. The devices can be hot-plugged into it (see the limitations below) and even the bridge itself can be hot-plugged (old OSes might not support it). So the device will replace the dmi-pci-bridge + pci-pci bridge combination completely.

libvirt will have 2 options:
1. Start with a pcie-pci bridge attached to a PCIe Root Port; all legacy PCI devices should land there (or on bus 0). (You can use the "auto" device addressing: add PCI devices automatically to this device until the bridge is full, then use the last slot to add a pci bridge, or use another pcie-pci bridge.)
2. Leave a PCIe Root Port empty and configure it with hints for the firmware that we might want to hotplug a pcie-pci bridge into it. If a PCI device is needed, hotplug the pcie-pci bridge first, then the device.

The above model gives you enough elasticity, so if you:
1. don't need PCI devices -> create the machine with no pci controllers
2. need PCI devices -> add a pcie-pci bridge and you get a legacy PCI bus supporting hotplug
3. might need PCI devices -> leave a PCIe Root Port empty (+ hints)

I'm not sure what to do in libvirt about (3). Right now if an unused root port is found in the config when adding a new endpoint device with no PCI address, the new endpoint will be attached to that existing root port.
In order for one of the "save it for later" root ports to work, I guess we will need to count that root port as unavailable when setting PCI addresses on an inactive guest, but then allow hotplugging into it. For Q35 you need such policy anyway. The only way to allow PCI Express Hotplug (I am not referring now to our legacy PCI hotplug) you need to leave a few PCIe Root Ports empty. How many is an interesting question. Maybe a domain property (free-slots=x) ? For our scenario the only difference is the empty Root Port has a hint/a few hints for the firmware. Maybe all free Root Ports should behave the same. But what if someone wants to hotplug a PCI Express endpoint, and the only root-port that's available is this one that's marked to allow plugging in a pcie-pci-bridge? Do we fail the endpoint hotplug (even though it could have succeeded)? First come, first served. Or do we allow it, and then later potentially fail an attempt to hotplug a pcie-pci-bridge? (To be clear - I don't think there's really anything better that qemu could do to help this situation; I'm just thinking out loud about how libvirt can best deal with it) I think it would be very diffi
Re: [SeaBIOS] [PATCH v3 1/5] hw/i386: allow SHPC for Q35 machine
On 03/08/2017 15:52, Michael S. Tsirkin wrote: On Sat, Jul 29, 2017 at 02:37:49AM +0300, Aleksandr Bezzubikov wrote:

Unmask previously masked SHPC feature in _OSC method.

Signed-off-by: Aleksandr Bezzubikov

Hi Michael,

This does not do what the subject says - it enables SHPC unconditionally. And I think it will actually break ACPI hotplug for the PC unless we add an interface to disable ACPI hotplug and enable SHPC. Pls limit to Q35 only.

The code is inside build_q35_osc_method, I don't understand how it affects the PC machine.

Thanks, Marcel

---
hw/i386/acpi-build.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 6b7bade..2ab32f9 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -1848,9 +1848,9 @@ static Aml *build_q35_osc_method(void)
 /*
  * Always allow native PME, AER (no dependencies)
- * Never allow SHPC (no SHPC controller in this system)
+ * Allow SHPC (PCI bridges can have SHPC controller)
  */
-aml_append(if_ctx, aml_and(a_ctrl, aml_int(0x1D), a_ctrl));
+aml_append(if_ctx, aml_and(a_ctrl, aml_int(0x1F), a_ctrl));
 if_ctx2 = aml_if(aml_lnot(aml_equal(aml_arg(1), aml_int(1)))); /* Unknown revision */
--
2.7.4
___ SeaBIOS mailing list SeaBIOS@seabios.org https://mail.coreboot.org/mailman/listinfo/seabios
Re: [SeaBIOS] [Qemu-devel] [PATCH v3 5/5] docs: update documentation considering PCIE-PCI bridge
On 03/08/2017 5:41, Laine Stump wrote: On 08/02/2017 01:58 PM, Marcel Apfelbaum wrote: On 02/08/2017 19:26, Michael S. Tsirkin wrote: On Wed, Aug 02, 2017 at 06:36:29PM +0300, Marcel Apfelbaum wrote: Can dmi-pci support shpc? why doesn't it? For compatibility? I don't know why, but the fact that it doesn't is the reason libvirt settled on auto-creating a dmi-pci bridge and a pci-pci bridge under that for Q35. The reasoning was (IIRC Laine's words correctly) that the dmi-pci bridge cannot receive hotplugged devices, while the pci-pci bridge cannot be connected to the root complex. So both were needed. Hi Laine, At least that's what I was told :-) (seriously, 50% of the convoluted rules encoded into libvirt's PCI bus topology construction and connection rules come from trial and error, and the other 50% come from advice and recommendations from others who (unlike me) actually know something about PCI.) Of course the whole setup of plugging a pci-bridge into a dmi-to-pci-bridge was (at the time at least) an exercise in futility, since hotplug didn't work properly on pci-bridge+Q35 anyway (that initially wasn't explained to me; it was only after I had constructed the odd bus topology and it was in released code that someone told me "Oh, by the way, hotplug to pci-bridge doesn't work on Q35". At first it was described as a bug, then later reclassified as a future feature.) (I guess the upside is that all of the horrible complex/confusing code needed to automatically add two controllers just to plug in a single endpoint is now already in the code, and will "just work" if/when needed). Now that I go back to look at this thread (qemu-devel is just too much for me to try and read unless something has been Cc'ed to me - I really don't know how you guys manage it!), I see that pcie-pci-bridge has been implemented, and we (libvirt) will want to use that instead of dmi-to-pci-bridge when available. And pcie-pci-bridge itself can have endpoints hotplugged into it, correct? Yes. 
This means there will need to be patches for libvirt that check for the presence of pcie-pci-bridge, and if it's found they will replace any auto-added dmi-to-pci-bridge+pci-bridge with a lone pcie-pci-bridge.

The PCIe-PCI bridge is to be plugged into a PCIe Root Port, and then you can add PCI devices to it. The devices can be hot-plugged into it (see the limitations below) and even the bridge itself can be hot-plugged (old OSes might not support it). So the device will replace the dmi-pci-bridge + pci-pci bridge combination completely.

libvirt will have 2 options:
1. Start with a pcie-pci bridge attached to a PCIe Root Port; all legacy PCI devices should land there (or on bus 0). (You can use the "auto" device addressing: add PCI devices automatically to this device until the bridge is full, then use the last slot to add a pci bridge, or use another pcie-pci bridge.)
2. Leave a PCIe Root Port empty and configure it with hints for the firmware that we might want to hotplug a pcie-pci bridge into it. If a PCI device is needed, hotplug the pcie-pci bridge first, then the device.

The above model gives you enough elasticity, so if you:
1. don't need PCI devices -> create the machine with no pci controllers
2. need PCI devices -> add a pcie-pci bridge and you get a legacy PCI bus supporting hotplug
3. might need PCI devices -> leave a PCIe Root Port empty (+ hints)

Thanks Laszlo

OK. Is it true that dmi-pci + pci-pci under it will allow hotplug on Q35 if we just flip the bit in _OSC? Marcel, what say you?... :)

Good news, works with:

-device i82801b11-bridge,id=b1
-device pci-bridge,id=b2,bus=b1,chassis_nr=1,msi=off

And presumably it works for modern windows? OK, so it looks like patch 1 is merely a bugfix, I'll merge it for 2.10.

Tested with Win10, I think it is OK to merge it for 2.10. Notice the bridge's msi=off until the following kernel bug is merged: https://www.spinics.net/lists/linux-pci/msg63052.html

Does libvirt support msi=off as a work-around?
We have no explicit setting for msi on pci controllers. The only place we explicitly set that is on the ivshmem device.

We need msi=off because of a bug in the Linux kernel. Even if the bug is fixed (there is already a patch upstream), we don't know when it will get in (actually 4.14) and what versions will include it.

That doesn't mean that we couldn't add it. However, if we were going to do it manually, that would mean adding another knob that we have to support forever. And even if we wanted to do it automatically, we would not only need to find something in qemu to key off of when deciding whether or not to set it, but we would *still* have to explicitly store the setting in the config so that migrations between hosts using differing versions of qemu would preserve guest ABI.

It is not even something QEMU can be queried about. It depends on the guest OS.

Are there real
Re: [SeaBIOS] [Qemu-devel] [PATCH v3 5/5] docs: update documentation considering PCIE-PCI bridge
On 02/08/2017 19:26, Michael S. Tsirkin wrote: On Wed, Aug 02, 2017 at 06:36:29PM +0300, Marcel Apfelbaum wrote:

Can dmi-pci support shpc? why doesn't it? For compatibility?

I don't know why, but the fact that it doesn't is the reason libvirt settled on auto-creating a dmi-pci bridge and a pci-pci bridge under that for Q35. The reasoning was (IIRC Laine's words correctly) that the dmi-pci bridge cannot receive hotplugged devices, while the pci-pci bridge cannot be connected to the root complex. So both were needed.

Thanks Laszlo

OK. Is it true that dmi-pci + pci-pci under it will allow hotplug on Q35 if we just flip the bit in _OSC? Marcel, what say you?... :)

Good news, works with:

-device i82801b11-bridge,id=b1
-device pci-bridge,id=b2,bus=b1,chassis_nr=1,msi=off

And presumably it works for modern windows? OK, so it looks like patch 1 is merely a bugfix, I'll merge it for 2.10.

Tested with Win10, I think it is OK to merge it for 2.10. Notice the bridge's msi=off until the following kernel bug is merged: https://www.spinics.net/lists/linux-pci/msg63052.html

Does libvirt support msi=off as a work-around? Adding Laine, maybe he has the answer.

Thanks, Marcel
Re: [SeaBIOS] [Qemu-devel] [PATCH v3 5/5] docs: update documentation considering PCIE-PCI bridge
On 02/08/2017 17:21, Marcel Apfelbaum wrote: On 02/08/2017 17:16, Laszlo Ersek wrote: On 08/02/17 15:47, Michael S. Tsirkin wrote: On Wed, Aug 02, 2017 at 12:23:46AM +0200, Laszlo Ersek wrote: On 08/01/17 23:39, Michael S. Tsirkin wrote: On Wed, Aug 02, 2017 at 12:33:12AM +0300, Alexander Bezzubikov wrote: 2017-08-01 23:31 GMT+03:00 Laszlo Ersek <ler...@redhat.com>: (Whenever my comments conflict with Michael's or Marcel's, I defer to them.) On 07/29/17 01:37, Aleksandr Bezzubikov wrote: Signed-off-by: Aleksandr Bezzubikov <zuban...@gmail.com> --- docs/pcie.txt| 46 ++ docs/pcie_pci_bridge.txt | 121 +++ 2 files changed, 147 insertions(+), 20 deletions(-) create mode 100644 docs/pcie_pci_bridge.txt diff --git a/docs/pcie.txt b/docs/pcie.txt index 5bada24..338b50e 100644 --- a/docs/pcie.txt +++ b/docs/pcie.txt @@ -46,7 +46,7 @@ Place only the following kinds of devices directly on the Root Complex: (2) PCI Express Root Ports (ioh3420), for starting exclusively PCI Express hierarchies. -(3) DMI-PCI Bridges (i82801b11-bridge), for starting legacy PCI +(3) PCIE-PCI Bridge (pcie-pci-bridge), for starting legacy PCI hierarchies. (4) Extra Root Complexes (pxb-pcie), if multiple PCI Express Root Buses When reviewing previous patches modifying / adding this file, I requested that we spell out "PCI Express" every single time. I'd like to see the same in this patch, if possible. OK, I didn't know it. 
@@ -55,18 +55,18 @@ Place only the following kinds of devices directly on the Root Complex: pcie.0 bus || | | - --- -- -- -- - | PCI Dev | | PCIe Root Port | | DMI-PCI Bridge | | pxb-pcie | - --- -- -- -- + --- -- --- -- + | PCI Dev | | PCIe Root Port | | PCIE-PCI Bridge | | pxb-pcie | + --- -- --- -- 2.1.1 To plug a device into pcie.0 as a Root Complex Integrated Endpoint use: -device [,bus=pcie.0] 2.1.2 To expose a new PCI Express Root Bus use: -device pxb-pcie,id=pcie.1,bus_nr=x[,numa_node=y][,addr=z] - Only PCI Express Root Ports and DMI-PCI bridges can be connected + Only PCI Express Root Ports, PCIE-PCI bridges and DMI-PCI bridges can be connected It would be nice if we could keep the flowing text wrapped to 80 chars. Also, here you add the "PCI Express-PCI" bridge to the list of allowed controllers (and you keep DMI-PCI as permitted), but above DMI was replaced. I think these should be made consistent -- we should make up our minds if we continue to recommend the DMI-PCI bridge or not. If not, then we should eradicate all traces of it. If we want to keep it at least for compatibility, then it should remain as fully documented as it is now. Now I'm beginning to think that we shouldn't keep the DMI-PCI bridge even for compatibility and may want to use a new PCIE-PCI bridge everywhere (of course, except some cases when users are sure they need exactly DMI-PCI bridge for some reason) Can dmi-pci support shpc? why doesn't it? For compatibility? I don't know why, but the fact that it doesn't is the reason libvirt settled on auto-creating a dmi-pci bridge and a pci-pci bridge under that for Q35. The reasoning was (IIRC Laine's words correctly) that the dmi-pci bridge cannot receive hotplugged devices, while the pci-pci bridge cannot be connected to the root complex. So both were needed. Thanks Laszlo OK. Is it true that dmi-pci + pci-pci under it will allow hotplug on Q35 if we just flip the bit in _OSC? Marcel, what say you?... 
:) Good news, works with:

-device i82801b11-bridge,id=b1
-device pci-bridge,id=b2,bus=b1,chassis_nr=1,msi=off

Notice the bridge's msi=off until the following kernel bug is merged: https://www.spinics.net/lists/linux-pci/msg63052.html

Thanks, Marcel

Will test and get back to you (it may actually work)

Thanks, Marcel
Re: [SeaBIOS] [Qemu-devel] [PATCH v3 5/5] docs: update documentation considering PCIE-PCI bridge
On 02/08/2017 17:16, Laszlo Ersek wrote: On 08/02/17 15:47, Michael S. Tsirkin wrote: On Wed, Aug 02, 2017 at 12:23:46AM +0200, Laszlo Ersek wrote: On 08/01/17 23:39, Michael S. Tsirkin wrote: On Wed, Aug 02, 2017 at 12:33:12AM +0300, Alexander Bezzubikov wrote: 2017-08-01 23:31 GMT+03:00 Laszlo Ersek: (Whenever my comments conflict with Michael's or Marcel's, I defer to them.) On 07/29/17 01:37, Aleksandr Bezzubikov wrote: Signed-off-by: Aleksandr Bezzubikov --- docs/pcie.txt| 46 ++ docs/pcie_pci_bridge.txt | 121 +++ 2 files changed, 147 insertions(+), 20 deletions(-) create mode 100644 docs/pcie_pci_bridge.txt diff --git a/docs/pcie.txt b/docs/pcie.txt index 5bada24..338b50e 100644 --- a/docs/pcie.txt +++ b/docs/pcie.txt @@ -46,7 +46,7 @@ Place only the following kinds of devices directly on the Root Complex: (2) PCI Express Root Ports (ioh3420), for starting exclusively PCI Express hierarchies. -(3) DMI-PCI Bridges (i82801b11-bridge), for starting legacy PCI +(3) PCIE-PCI Bridge (pcie-pci-bridge), for starting legacy PCI hierarchies. (4) Extra Root Complexes (pxb-pcie), if multiple PCI Express Root Buses When reviewing previous patches modifying / adding this file, I requested that we spell out "PCI Express" every single time. I'd like to see the same in this patch, if possible. OK, I didn't know it. 
@@ -55,18 +55,18 @@ Place only the following kinds of devices directly on the Root Complex: pcie.0 bus ||| | - --- -- -- -- - | PCI Dev | | PCIe Root Port | | DMI-PCI Bridge | | pxb-pcie | - --- -- -- -- + --- -- --- -- + | PCI Dev | | PCIe Root Port | | PCIE-PCI Bridge | | pxb-pcie | + --- -- --- -- 2.1.1 To plug a device into pcie.0 as a Root Complex Integrated Endpoint use: -device [,bus=pcie.0] 2.1.2 To expose a new PCI Express Root Bus use: -device pxb-pcie,id=pcie.1,bus_nr=x[,numa_node=y][,addr=z] - Only PCI Express Root Ports and DMI-PCI bridges can be connected + Only PCI Express Root Ports, PCIE-PCI bridges and DMI-PCI bridges can be connected It would be nice if we could keep the flowing text wrapped to 80 chars. Also, here you add the "PCI Express-PCI" bridge to the list of allowed controllers (and you keep DMI-PCI as permitted), but above DMI was replaced. I think these should be made consistent -- we should make up our minds if we continue to recommend the DMI-PCI bridge or not. If not, then we should eradicate all traces of it. If we want to keep it at least for compatibility, then it should remain as fully documented as it is now. Now I'm beginning to think that we shouldn't keep the DMI-PCI bridge even for compatibility and may want to use a new PCIE-PCI bridge everywhere (of course, except some cases when users are sure they need exactly DMI-PCI bridge for some reason) Can dmi-pci support shpc? why doesn't it? For compatibility? I don't know why, but the fact that it doesn't is the reason libvirt settled on auto-creating a dmi-pci bridge and a pci-pci bridge under that for Q35. The reasoning was (IIRC Laine's words correctly) that the dmi-pci bridge cannot receive hotplugged devices, while the pci-pci bridge cannot be connected to the root complex. So both were needed. Thanks Laszlo OK. Is it true that dmi-pci + pci-pci under it will allow hotplug on Q35 if we just flip the bit in _OSC? Marcel, what say you?... 
:) Will test and get back to you (it may actually work)

Thanks, Marcel
Re: [SeaBIOS] [Qemu-devel] >256 Virtio-net-pci hotplug Devices
On 25/07/2017 21:00, Kinsella, Ray wrote:

Hi Marcel,

Hi Ray,

On 24/07/2017 00:14, Marcel Apfelbaum wrote: On 24/07/2017 7:53, Kinsella, Ray wrote:

Even if I am not aware of how much time it would take to init a bare-metal PCIe Root Port, it seems too much.

So I repeated the testing for 64, 128, 256 and 512 ports. I ensured the configuration was sane, that 128 was twice the number of root ports and virtio-pci-net devices as 64. I got the following results - shown in seconds; as you can see it is non-linear but not exponential, there is something that is not scaling well.

                    64    128    256    512
PCIe Root Ports     14     72    430   2672
ACPI                 4     35    342   3863
Loading Drivers      1     13     16     21
Total Boot          34    137    890   7516

( I did try to test 1024 devices, but it just dies silently )

Ray K

It is an issue worth looking into. One more question: are all the measurements from OS boot? Do you use SeaBIOS? No problems with the firmware?

Thanks, Marcel
Re: [SeaBIOS] [Qemu-devel] [PATCH v3 5/5] docs: update documentation considering PCIE-PCI bridge
On 02/08/2017 1:23, Laszlo Ersek wrote: On 08/01/17 23:39, Michael S. Tsirkin wrote: On Wed, Aug 02, 2017 at 12:33:12AM +0300, Alexander Bezzubikov wrote: 2017-08-01 23:31 GMT+03:00 Laszlo Ersek: (Whenever my comments conflict with Michael's or Marcel's, I defer to them.) On 07/29/17 01:37, Aleksandr Bezzubikov wrote: Signed-off-by: Aleksandr Bezzubikov --- docs/pcie.txt| 46 ++ docs/pcie_pci_bridge.txt | 121 +++ 2 files changed, 147 insertions(+), 20 deletions(-) create mode 100644 docs/pcie_pci_bridge.txt diff --git a/docs/pcie.txt b/docs/pcie.txt index 5bada24..338b50e 100644 --- a/docs/pcie.txt +++ b/docs/pcie.txt @@ -46,7 +46,7 @@ Place only the following kinds of devices directly on the Root Complex: (2) PCI Express Root Ports (ioh3420), for starting exclusively PCI Express hierarchies. -(3) DMI-PCI Bridges (i82801b11-bridge), for starting legacy PCI +(3) PCIE-PCI Bridge (pcie-pci-bridge), for starting legacy PCI hierarchies. (4) Extra Root Complexes (pxb-pcie), if multiple PCI Express Root Buses When reviewing previous patches modifying / adding this file, I requested that we spell out "PCI Express" every single time. I'd like to see the same in this patch, if possible. OK, I didn't know it. @@ -55,18 +55,18 @@ Place only the following kinds of devices directly on the Root Complex: pcie.0 bus ||| | - --- -- -- -- - | PCI Dev | | PCIe Root Port | | DMI-PCI Bridge | | pxb-pcie | - --- -- -- -- + --- -- --- -- + | PCI Dev | | PCIe Root Port | | PCIE-PCI Bridge | | pxb-pcie | + --- -- --- -- 2.1.1 To plug a device into pcie.0 as a Root Complex Integrated Endpoint use: -device [,bus=pcie.0] 2.1.2 To expose a new PCI Express Root Bus use: -device pxb-pcie,id=pcie.1,bus_nr=x[,numa_node=y][,addr=z] - Only PCI Express Root Ports and DMI-PCI bridges can be connected + Only PCI Express Root Ports, PCIE-PCI bridges and DMI-PCI bridges can be connected It would be nice if we could keep the flowing text wrapped to 80 chars. 
Also, here you add the "PCI Express-PCI" bridge to the list of allowed controllers (and you keep DMI-PCI as permitted), but above DMI was replaced. I think these should be made consistent -- we should make up our minds about whether we continue to recommend the DMI-PCI bridge or not. If not, then we should eradicate all traces of it. If we want to keep it at least for compatibility, then it should remain as fully documented as it is now.

Now I'm beginning to think that we shouldn't keep the DMI-PCI bridge even for compatibility and may want to use a new PCIE-PCI bridge everywhere (of course, except in some cases when users are sure they need exactly the DMI-PCI bridge for some reason).

Can dmi-pci support shpc? why doesn't it? For compatibility?

Yes, mainly because, as far as I know, the Intel device doesn't have an SHPC controller. It may be possible to make it work with one, but right now we don't have a reason to, since we have the PCIe-PCI bridge.

I don't know why, but the fact that it doesn't is the reason libvirt settled on auto-creating a dmi-pci bridge and a pci-pci bridge under that for Q35.

And hotplug doesn't work even for this configuration! (last time I checked)

Thanks, Marcel

The reasoning was (IIRC Laine's words correctly) that the dmi-pci bridge cannot receive hotplugged devices, while the pci-pci bridge cannot be connected to the root complex. So both were needed.

Thanks Laszlo
Re: [SeaBIOS] [PATCH v3 2/5] hw/pci: introduce pcie-pci-bridge device
On 01/08/2017 18:51, Michael S. Tsirkin wrote: On Tue, Aug 01, 2017 at 06:45:13PM +0300, Marcel Apfelbaum wrote: On 01/08/2017 18:32, Michael S. Tsirkin wrote: On Mon, Jul 31, 2017 at 09:40:41PM +0300, Alexander Bezzubikov wrote:

+typedef struct PCIEPCIBridge {
+/*< private >*/
+PCIBridge parent_obj;
+
+bool msi_enable;

Please rename the msi_enable property to "msi" in order to be aligned with the existent PCIBridgeDev, and consider making it OnOffAuto for the same reason. (I am not sure about the last part though, we have no meaning for "auto" here)

Agreed about "msi", but OnOffAuto looks weird to me as we always want MSI to be enabled.

Hi Michael,

Why even have a property then? Can't you enable it unconditionally?

Because of a current bug in the Linux kernel: https://www.spinics.net/lists/linux-pci/msg63052.html

msi will not work until the patch is merged. Even when it is merged, not all Linux kernels will contain the patch.

You should Cc stable to make sure they all gain it eventually.

Right! Thanks, we missed cc-ing stable. Added stable to the mail thread. Marcel

Disabling msi is a workaround for the above case.

Thanks, Marcel

Really, enabling MSI without bus master is a bug that I'm not 100% sure is even worth working around. But I guess it's not too bad to have a work-around given it's this simple.
Re: [SeaBIOS] [PATCH v3 2/5] hw/pci: introduce pcie-pci-bridge device
On 01/08/2017 18:32, Michael S. Tsirkin wrote: On Mon, Jul 31, 2017 at 09:40:41PM +0300, Alexander Bezzubikov wrote:

+typedef struct PCIEPCIBridge {
+/*< private >*/
+PCIBridge parent_obj;
+
+bool msi_enable;

Please rename the msi_enable property to "msi" in order to be aligned with the existent PCIBridgeDev, and consider making it OnOffAuto for the same reason. (I am not sure about the last part though, we have no meaning for "auto" here)

Agreed about "msi", but OnOffAuto looks weird to me as we always want MSI to be enabled.

Hi Michael,

Why even have a property then? Can't you enable it unconditionally?

Because of a current bug in the Linux kernel: https://www.spinics.net/lists/linux-pci/msg63052.html

msi will not work until the patch is merged. Even when it is merged, not all Linux kernels will contain the patch. Disabling msi is a workaround for the above case.

Thanks, Marcel
Re: [SeaBIOS] [PATCH v3 2/3] pci: add QEMU-specific PCI capability structure
On 31/07/2017 22:01, Alexander Bezzubikov wrote: 2017-07-31 21:57 GMT+03:00 Michael S. Tsirkin <m...@redhat.com>: On Mon, Jul 31, 2017 at 09:54:55PM +0300, Alexander Bezzubikov wrote: 2017-07-31 17:09 GMT+03:00 Marcel Apfelbaum <mar...@redhat.com>: On 31/07/2017 17:00, Michael S. Tsirkin wrote: On Sat, Jul 29, 2017 at 02:34:31AM +0300, Aleksandr Bezzubikov wrote: On PCI init PCI bridge devices may need some extra info about bus number to reserve, IO, memory and prefetchable memory limits. QEMU can provide this with special vendor-specific PCI capability. This capability is intended to be used only for Red Hat PCI bridges, i.e. QEMU cooperation. Signed-off-by: Aleksandr Bezzubikov <zuban...@gmail.com> --- src/fw/dev-pci.h | 62 1 file changed, 62 insertions(+) create mode 100644 src/fw/dev-pci.h diff --git a/src/fw/dev-pci.h b/src/fw/dev-pci.h new file mode 100644 index 000..fbd49ed --- /dev/null +++ b/src/fw/dev-pci.h @@ -0,0 +1,62 @@ +#ifndef _PCI_CAP_H +#define _PCI_CAP_H + +#include "types.h" + +/* + +QEMU-specific vendor(Red Hat)-specific capability. +It's intended to provide some hints for firmware to init PCI devices. 
+
+It is shown below:
+
+Header:
+
+u8 id;       Standard PCI Capability Header field
+u8 next;     Standard PCI Capability Header field
+u8 len;      Standard PCI Capability Header field
+u8 type;     Red Hat vendor-specific capability type:
+             now only REDHAT_QEMU_CAP 1 exists
+
+Data:
+
+u16 non_prefetchable_16;  non-prefetchable memory limit
+
+u8 bus_res;               minimum bus number to reserve;
+                          this is necessary for PCI Express Root Ports
+                          to support PCIE-to-PCI bridge hotplug
+
+u8 io_8;                  IO limit in case of 8-bit limit value
+u32 io_32;                IO limit in case of 32-bit limit value
+                          io_8 and io_32 are mutually exclusive, in other
+                          words, they can't be non-zero simultaneously
+
+u32 prefetchable_32;      prefetchable memory limit
+                          in case of 32-bit limit value
+u64 prefetchable_64;      prefetchable memory limit
+                          in case of 64-bit limit value
+                          prefetchable_32 and prefetchable_64 are
+                          mutually exclusive, in other words,
+                          they can't be non-zero simultaneously
+
+If any field in the Data section is 0, it means that this kind of
+reservation is not needed.

I really don't like this 'mutually exclusive' fields approach, because IMHO it increases the confusion level when understanding this capability structure. But - if we come to a consensus on that, then the IO fields should be used in the same way, because as I understand it, this 'mutual exclusivity' serves to distinguish the cases where we employ only the *_LIMIT register from those where we use both the *_LIMIT and UPPER_*_LIMIT registers. And this is how both IO and PREFETCHABLE work, isn't it?

I would just use simple 64 bit registers. The PCI spec has an ugly format with fields spread all over the place, but that is because of compatibility concerns. It makes no sense to spend cycles just to be similarly messy.

Then I suggest using exactly one field of the maximum possible size for each reserved object, and getting rid of the mutually exclusive fields.
Then it could be something like this (order and names can be changed):

u8 bus;
u16 non_pref;
u32 io;
u64 pref;

I think Michael suggested:

u64 bus_res;
u64 mem_res;
u64 io_res;
u64 mem_pref_res;

OR:

u32 bus_res;
u32 mem_res;
u32 io_res;
u64 mem_pref_res;

We can use 0xFFF..F as the "not-set" value, "merging" Gerd's and Michael's requests.

Thanks, Marcel

Hi Michael,

We also want a way to say "no hint for this type". One way to achieve this would be to instead have multiple vendor-specific capabilities, one for each of bus#/io/mem/prefetch. 0 would mean do not reserve anything; absence of the capability would mean "no info, up to firmware".

The first version of the series was implemented exactly like you propose, however Gerd preferred only one capability with multiple fields. I personally like the simplicity of a vendor cap per io/mem/bus, even if it comes at the expense of the limited PCI Config space.

Personally I agree with Marcel, since all these fields express reservations of some objects.

We need a consensus here :)

Absolutely :)

Thanks, Marcel

+
+*/
+
+/* Offset of vendor-specific capability type field */
+#define PCI_CAP_VNDR_SPEC_TYPE 3

This is a QEMU specific thing. Please name it as such.

+
+/* List of valid Red Hat vendor-specific capability types */
+#define REDHAT_CAP_TYPE_QEMU 1
+
+
+/* Offsets of QEMU capability fields */
+#define QEMU_PCI_CAP_NON_PREF 4
+#define QEMU_PCI_CAP_BUS_RES 6
+#define QEMU_PCI_CAP_IO_8 7
+#define QEMU_PCI_CAP_IO_32 8
+#define QEMU_PCI_CAP_PREF_32 12
+#d
Re: [SeaBIOS] [PATCH v3 2/3] pci: add QEMU-specific PCI capability structure
On 31/07/2017 17:00, Michael S. Tsirkin wrote: On Sat, Jul 29, 2017 at 02:34:31AM +0300, Aleksandr Bezzubikov wrote:

On PCI init, PCI bridge devices may need some extra info about the bus numbers to reserve and the IO, memory and prefetchable memory limits. QEMU can provide this with a special vendor-specific PCI capability. This capability is intended to be used only for Red Hat PCI bridges, i.e. in cooperation with QEMU.

Signed-off-by: Aleksandr Bezzubikov
---
src/fw/dev-pci.h | 62
1 file changed, 62 insertions(+)
create mode 100644 src/fw/dev-pci.h

diff --git a/src/fw/dev-pci.h b/src/fw/dev-pci.h
new file mode 100644
index 000..fbd49ed
--- /dev/null
+++ b/src/fw/dev-pci.h
@@ -0,0 +1,62 @@
+#ifndef _PCI_CAP_H
+#define _PCI_CAP_H
+
+#include "types.h"
+
+/*
+
+QEMU-specific vendor(Red Hat)-specific capability.
+It's intended to provide some hints for firmware to init PCI devices.
+
+It is shown below:
+
+Header:
+
+u8 id;       Standard PCI Capability Header field
+u8 next;     Standard PCI Capability Header field
+u8 len;      Standard PCI Capability Header field
+u8 type;     Red Hat vendor-specific capability type:
+             now only REDHAT_QEMU_CAP 1 exists
+
+Data:
+
+u16 non_prefetchable_16;  non-prefetchable memory limit
+
+u8 bus_res;               minimum bus number to reserve;
+                          this is necessary for PCI Express Root Ports
+                          to support PCIE-to-PCI bridge hotplug
+
+u8 io_8;                  IO limit in case of 8-bit limit value
+u32 io_32;                IO limit in case of 32-bit limit value
+                          io_8 and io_32 are mutually exclusive, in other
+                          words, they can't be non-zero simultaneously
+
+u32 prefetchable_32;      prefetchable memory limit
+                          in case of 32-bit limit value
+u64 prefetchable_64;      prefetchable memory limit
+                          in case of 64-bit limit value
+                          prefetchable_32 and prefetchable_64 are
+                          mutually exclusive, in other words,
+                          they can't be non-zero simultaneously
+
+If any field in the Data section is 0, it means that this kind of
+reservation is not needed.

Hi Michael,

We also want a way to say "no hint for this type".
One way to achieve this would be to have instead multiple vendor specific capabilities, one for each of bus#/io/mem/prefetch. 0 would mean do not reserve anything, absence of capability would mean "no info, up to firmware". First version of the series was implemented exactly like you propose, however Gerd preferred only one capability with multiple fields. I personally like the simplicity of a vendor cap per io/mem/bus, even if it is at the expense of the limited PCI config space. We need a consensus here :) Thanks, Marcel + +*/ + +/* Offset of vendor-specific capability type field */ +#define PCI_CAP_VNDR_SPEC_TYPE 3 This is a QEMU specific thing. Please name it as such. + +/* List of valid Red Hat vendor-specific capability types */ +#define REDHAT_CAP_TYPE_QEMU 1 + + +/* Offsets of QEMU capability fields */ +#define QEMU_PCI_CAP_NON_PREF 4 +#define QEMU_PCI_CAP_BUS_RES 6 +#define QEMU_PCI_CAP_IO_8 7 +#define QEMU_PCI_CAP_IO_32 8 +#define QEMU_PCI_CAP_PREF_32 12 +#define QEMU_PCI_CAP_PREF_64 16 +#define QEMU_PCI_CAP_SIZE 24 + +#endif /* _PCI_CAP_H */ -- 2.7.4 ___ SeaBIOS mailing list SeaBIOS@seabios.org https://mail.coreboot.org/mailman/listinfo/seabios
Re: [SeaBIOS] [PATCH v3 5/5] docs: update documentation considering PCIE-PCI bridge
On 29/07/2017 2:37, Aleksandr Bezzubikov wrote: Signed-off-by: Aleksandr Bezzubikov--- docs/pcie.txt| 46 ++ docs/pcie_pci_bridge.txt | 121 +++ 2 files changed, 147 insertions(+), 20 deletions(-) create mode 100644 docs/pcie_pci_bridge.txt diff --git a/docs/pcie.txt b/docs/pcie.txt index 5bada24..338b50e 100644 --- a/docs/pcie.txt +++ b/docs/pcie.txt @@ -46,7 +46,7 @@ Place only the following kinds of devices directly on the Root Complex: (2) PCI Express Root Ports (ioh3420), for starting exclusively PCI Express hierarchies. -(3) DMI-PCI Bridges (i82801b11-bridge), for starting legacy PCI +(3) PCIE-PCI Bridge (pcie-pci-bridge), for starting legacy PCI hierarchies. (4) Extra Root Complexes (pxb-pcie), if multiple PCI Express Root Buses @@ -55,18 +55,18 @@ Place only the following kinds of devices directly on the Root Complex: pcie.0 bus ||| | - --- -- -- -- - | PCI Dev | | PCIe Root Port | | DMI-PCI Bridge | | pxb-pcie | - --- -- -- -- + --- -- --- -- + | PCI Dev | | PCIe Root Port | | PCIE-PCI Bridge | | pxb-pcie | + --- -- --- -- 2.1.1 To plug a device into pcie.0 as a Root Complex Integrated Endpoint use: -device [,bus=pcie.0] 2.1.2 To expose a new PCI Express Root Bus use: -device pxb-pcie,id=pcie.1,bus_nr=x[,numa_node=y][,addr=z] - Only PCI Express Root Ports and DMI-PCI bridges can be connected + Only PCI Express Root Ports, PCIE-PCI bridges and DMI-PCI bridges can be connected to the pcie.1 bus: -device ioh3420,id=root_port1[,bus=pcie.1][,chassis=x][,slot=y][,addr=z] \ - -device i82801b11-bridge,id=dmi_pci_bridge1,bus=pcie.1 + -device pcie-pci-bridge,id=pcie_pci_bridge1,bus=pcie.1 2.2 PCI Express only hierarchy @@ -130,21 +130,25 @@ Notes: Legacy PCI devices can be plugged into pcie.0 as Integrated Endpoints, but, as mentioned in section 5, doing so means the legacy PCI device in question will be incapable of hot-unplugging. 
-Besides that use DMI-PCI Bridges (i82801b11-bridge) in combination +Besides that use PCIE-PCI Bridges (pcie-pci-bridge) in combination with PCI-PCI Bridges (pci-bridge) to start PCI hierarchies. +Instead of the PCIE-PCI Bridge a DMI-PCI one can be used, +but it doesn't support hot-plug, is not cross-platform, and is +therefore obsolete and deprecated. Use the PCIE-PCI Bridge if you're not +absolutely sure you need the DMI-PCI Bridge. -Prefer flat hierarchies. For most scenarios a single DMI-PCI Bridge +Prefer flat hierarchies. For most scenarios a single PCIE-PCI Bridge (having 32 slots) and several PCI-PCI Bridges attached to it (each supporting also 32 slots) will support hundreds of legacy devices. -The recommendation is to populate one PCI-PCI Bridge under the DMI-PCI Bridge +The recommendation is to populate one PCI-PCI Bridge under the PCIE-PCI Bridge until it is full and then plug a new PCI-PCI Bridge... pcie.0 bus -- || - --- -- - | PCI Dev | | DMI-PCI BRIDGE | - ---- + --- --- + | PCI Dev | | PCIE-PCI BRIDGE | + ----- || ---- | PCI-PCI Bridge || PCI-PCI Bridge | ... @@ -157,11 +161,11 @@ until it is full and then plug a new PCI-PCI Bridge... 2.3.1 To plug a PCI device into pcie.0 as an Integrated Endpoint use: -device [,bus=pcie.0] 2.3.2 Plugging a PCI device into a PCI-PCI Bridge: - -device i82801b11-bridge,id=dmi_pci_bridge1[,bus=pcie.0] \ - -device pci-bridge,id=pci_bridge1,bus=dmi_pci_bridge1[,chassis_nr=x][,addr=y] \ + -device pcie-pci-bridge,id=pcie_pci_bridge1[,bus=pcie.0] \ + -device pci-bridge,id=pci_bridge1,bus=pcie_pci_bridge1[,chassis_nr=x][,addr=y] \ -device ,bus=pci_bridge1[,addr=x] Note that 'addr' cannot be 0 unless shpc=off parameter is passed to - the PCI Bridge. + the PCI Bridge, and can never be 0 when plugging into the PCIE-PCI Bridge. A simpler "to the PCI Bridge/PCIe-PCI bridge" finish is enough. 3. IO space issues
Re: [SeaBIOS] [PATCH v3 4/5] hw/pci: add QEMU-specific PCI capability to Generic PCI Express Root Port
On 29/07/2017 2:37, Aleksandr Bezzubikov wrote: From: Aleksandr Bezzubikov To enable hotplugging of a newly created pcie-pci-bridge, we need to tell firmware (SeaBIOS in this case) Not only SeaBIOS, also OVMF - so all guest firmware to reserve additional buses for pcie-root-port, that allows us to hotplug pcie-pci-bridge into this root port. The number of buses to reserve is provided to the device via a corresponding property, and to the firmware via a new PCI capability. The property's default value is 0 to keep default behavior unchanged. Signed-off-by: Aleksandr Bezzubikov --- hw/pci-bridge/gen_pcie_root_port.c | 23 +++ hw/pci-bridge/pcie_root_port.c | 2 +- include/hw/pci/pcie_port.h | 2 ++ 3 files changed, 26 insertions(+), 1 deletion(-) diff --git a/hw/pci-bridge/gen_pcie_root_port.c b/hw/pci-bridge/gen_pcie_root_port.c index cb694d6..da3caa1 100644 --- a/hw/pci-bridge/gen_pcie_root_port.c +++ b/hw/pci-bridge/gen_pcie_root_port.c @@ -16,6 +16,8 @@ #include "hw/pci/pcie_port.h" #define TYPE_GEN_PCIE_ROOT_PORT "pcie-root-port" +#define GEN_PCIE_ROOT_PORT(obj) \ +OBJECT_CHECK(GenPCIERootPort, (obj), TYPE_GEN_PCIE_ROOT_PORT) #define GEN_PCIE_ROOT_PORT_AER_OFFSET 0x100 #define GEN_PCIE_ROOT_PORT_MSIX_NR_VECTOR 1 @@ -26,6 +28,9 @@ typedef struct GenPCIERootPort { /*< public >*/ bool migrate_msix; + +/* additional buses to reserve on firmware init */ +uint8_t bus_reserve; } GenPCIERootPort; static uint8_t gen_rp_aer_vector(const PCIDevice *d) @@ -60,6 +65,21 @@ static bool gen_rp_test_migrate_msix(void *opaque, int version_id) return rp->migrate_msix; } +static void gen_rp_realize(PCIDevice *d, Error **errp) +{ +rp_realize(d, errp); +PCIESlot *s = PCIE_SLOT(d); +GenPCIERootPort *grp = GEN_PCIE_ROOT_PORT(d); + +int rc = pci_bridge_qemu_cap_init(d, 0, grp->bus_reserve, 0, 0, 0, errp); +if (rc < 0) { +pcie_chassis_del_slot(s); +pcie_cap_exit(d); +gen_rp_interrupts_uninit(d); +pci_bridge_exitfn(d); +} +} + static const VMStateDescription vmstate_rp_dev = { .name =
"pcie-root-port", .version_id = 1, @@ -78,6 +98,7 @@ static const VMStateDescription vmstate_rp_dev = { static Property gen_rp_props[] = { DEFINE_PROP_BOOL("x-migrate-msix", GenPCIERootPort, migrate_msix, true), +DEFINE_PROP_UINT8("bus-reserve", GenPCIERootPort, bus_reserve, 0), DEFINE_PROP_END_OF_LIST() }; @@ -89,6 +110,8 @@ static void gen_rp_dev_class_init(ObjectClass *klass, void *data) k->vendor_id = PCI_VENDOR_ID_REDHAT; k->device_id = PCI_DEVICE_ID_REDHAT_PCIE_RP; +k->realize = gen_rp_realize; + dc->desc = "PCI Express Root Port"; dc->vmsd = &vmstate_rp_dev; dc->props = gen_rp_props; diff --git a/hw/pci-bridge/pcie_root_port.c b/hw/pci-bridge/pcie_root_port.c index 4d588cb..2f3bcb1 100644 --- a/hw/pci-bridge/pcie_root_port.c +++ b/hw/pci-bridge/pcie_root_port.c @@ -52,7 +52,7 @@ static void rp_reset(DeviceState *qdev) pci_bridge_disable_base_limit(d); } -static void rp_realize(PCIDevice *d, Error **errp) +void rp_realize(PCIDevice *d, Error **errp) { PCIEPort *p = PCIE_PORT(d); PCIESlot *s = PCIE_SLOT(d); diff --git a/include/hw/pci/pcie_port.h b/include/hw/pci/pcie_port.h index 1333266..febd96a 100644 --- a/include/hw/pci/pcie_port.h +++ b/include/hw/pci/pcie_port.h @@ -63,6 +63,8 @@ void pcie_chassis_del_slot(PCIESlot *s); #define PCIE_ROOT_PORT_GET_CLASS(obj) \ OBJECT_GET_CLASS(PCIERootPortClass, (obj), TYPE_PCIE_ROOT_PORT) +void rp_realize(PCIDevice *d, Error **errp); This is not how QEMU re-uses parent's realize function. You can grep for "parent_realize" in the project, it goes something like this: 1. You add "DeviceRealize parent_realize" to GenPCIERootPort class. 2. In class_init you save parent's realize and replace it with your own: grpc->parent_realize = dc->realize; dc->realize = gen_rp_realize; 3.
In gen_rp_realize, first call parent_realize: rpc->parent_realize(dev, errp); - your code here - if (err) rpc->exit() Thanks, Marcel + typedef struct PCIERootPortClass { PCIDeviceClass parent_class;
Re: [SeaBIOS] [PATCH v3 3/5] hw/pci: introduce bridge-only vendor-specific capability to provide some hints to firmware
On 29/07/2017 2:37, Aleksandr Bezzubikov wrote: On PCI init PCI bridges may need some extra info about bus number to reserve, IO, memory and prefetchable memory limits. QEMU can provide this with a special vendor-specific PCI capability. Signed-off-by: Aleksandr Bezzubikov --- hw/pci/pci_bridge.c | 37 + include/hw/pci/pci_bridge.h | 28 2 files changed, 65 insertions(+) diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c index 720119b..e9f12d6 100644 --- a/hw/pci/pci_bridge.c +++ b/hw/pci/pci_bridge.c @@ -408,6 +408,43 @@ void pci_bridge_map_irq(PCIBridge *br, const char* bus_name, br->bus_name = bus_name; } Hi Aleksander, + +int pci_bridge_qemu_cap_init(PCIDevice *dev, int cap_offset, + uint8_t bus_reserve, uint32_t io_reserve, + uint16_t non_pref_reserve, uint64_t pref_reserve, + Error **errp) Maybe we should change it to something like pci_bridge_res_reserve_cap_init ? Maybe we will have other caps in the future. +{ +size_t cap_len = sizeof(PCIBridgeQemuCap); +PCIBridgeQemuCap cap = { +.len = cap_len, +.type = REDHAT_PCI_CAP_QEMU, +.bus_res = bus_reserve, +.non_pref_16 = non_pref_reserve +}; + +if ((uint8_t)io_reserve == io_reserve) { +cap.io_8 = io_reserve; +} else { +cap.io_32 = io_reserve; +} +if ((uint16_t)pref_reserve == pref_reserve) { +cap.pref_32 = pref_reserve; +} else { +cap.pref_64 = pref_reserve; +} + +int offset = pci_add_capability(dev, PCI_CAP_ID_VNDR, +cap_offset, cap_len, errp); +if (offset < 0) { +return offset; +} + +memcpy(dev->config + offset + PCI_CAP_FLAGS, +(char *)&cap + PCI_CAP_FLAGS, +cap_len - PCI_CAP_FLAGS); +return 0; +} + static const TypeInfo pci_bridge_type_info = { .name = TYPE_PCI_BRIDGE, .parent = TYPE_PCI_DEVICE, diff --git a/include/hw/pci/pci_bridge.h b/include/hw/pci/pci_bridge.h index ff7cbaa..e9b7cf4 100644 --- a/include/hw/pci/pci_bridge.h +++ b/include/hw/pci/pci_bridge.h @@ -67,4 +67,32 @@ void pci_bridge_map_irq(PCIBridge *br, const char* bus_name, #define PCI_BRIDGE_CTL_DISCARD_STATUS 0x400 /* Discard timer status */
#define PCI_BRIDGE_CTL_DISCARD_SERR 0x800 /* Discard timer SERR# enable */ +typedef struct PCIBridgeQemuCap { +uint8_t id; /* Standard PCI capability header field */ +uint8_t next; /* Standard PCI capability header field */ +uint8_t len; /* Standard PCI vendor-specific capability header field */ +uint8_t type; /* Red Hat vendor-specific capability type. + Types are defined with REDHAT_PCI_CAP_ prefix */ + +uint16_t non_pref_16; /* Non-prefetchable memory limit */ +uint8_t bus_res; /* Minimum number of buses to reserve */ +uint8_t io_8; /* IO space limit in case of 8-bit value */ +uint32_t io_32; /* IO space limit in case of 32-bit value + These 2 values are mutually exclusive, + i.e. they can't both be >0 */ +uint32_t pref_32; /* Prefetchable memory limit + in case of 32-bit value */ +uint64_t pref_64; /* Prefetchable memory limit + in case of 64-bit value + These 2 values are mutually exclusive (just as + the IO limit), i.e. they can't both be >0 */ The same comments as for the SeaBIOS series, can the capability be simpler? uint32_t bus_res uint32_t io_res uint32_t mem_res uint32_t pref_32_res uint64_t pref_64_res It is possible I missed some arguments, I'll have another look at the thread. Thanks, Marcel +} PCIBridgeQemuCap; + +#define REDHAT_PCI_CAP_QEMU 1 + +int pci_bridge_qemu_cap_init(PCIDevice *dev, int cap_offset, + uint8_t bus_reserve, uint32_t io_reserve, + uint16_t mem_reserve, uint64_t pref_reserve, + Error **errp); + #endif /* QEMU_PCI_BRIDGE_H */
Re: [SeaBIOS] [PATCH v3 1/5] hw/i386: allow SHPC for Q35 machine
On 29/07/2017 2:37, Aleksandr Bezzubikov wrote: Unmask previously masked SHPC feature in _OSC method. Signed-off-by: Aleksandr Bezzubikov <zuban...@gmail.com> --- hw/i386/acpi-build.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index 6b7bade..2ab32f9 100644 --- a/hw/i386/acpi-build.c +++ b/hw/i386/acpi-build.c @@ -1848,9 +1848,9 @@ static Aml *build_q35_osc_method(void) /* * Always allow native PME, AER (no dependencies) - * Never allow SHPC (no SHPC controller in this system) + * Allow SHPC (PCI bridges can have SHPC controller) */ -aml_append(if_ctx, aml_and(a_ctrl, aml_int(0x1D), a_ctrl)); +aml_append(if_ctx, aml_and(a_ctrl, aml_int(0x1F), a_ctrl)); if_ctx2 = aml_if(aml_lnot(aml_equal(aml_arg(1), aml_int(1)))); /* Unknown revision */ Reviewed-by: Marcel Apfelbaum <mar...@redhat.com> Thanks, Marcel
Re: [SeaBIOS] [PATCH v3 3/3] pci: enable RedHat PCI bridges to reserve additional buses on PCI init
On 29/07/2017 2:34, Aleksandr Bezzubikov wrote: In case of Red Hat Generic PCIE Root Port reserve additional buses, which number is provided in a vendor-specific capability. Signed-off-by: Aleksandr Bezzubikov--- src/fw/pciinit.c | 37 +++-- src/hw/pci_ids.h | 3 +++ src/types.h | 2 ++ 3 files changed, 40 insertions(+), 2 deletions(-) diff --git a/src/fw/pciinit.c b/src/fw/pciinit.c index 864954f..a302a85 100644 --- a/src/fw/pciinit.c +++ b/src/fw/pciinit.c @@ -15,6 +15,7 @@ #include "hw/pcidevice.h" // pci_probe_devices #include "hw/pci_ids.h" // PCI_VENDOR_ID_INTEL #include "hw/pci_regs.h" // PCI_COMMAND +#include "fw/dev-pci.h" // qemu_pci_cap #include "list.h" // struct hlist_node #include "malloc.h" // free #include "output.h" // dprintf @@ -578,9 +579,41 @@ pci_bios_init_bus_rec(int bus, u8 *pci_bus) pci_bios_init_bus_rec(secbus, pci_bus); if (subbus != *pci_bus) { +u8 res_bus = 0; +if (pci_config_readw(bdf, PCI_VENDOR_ID) == PCI_VENDOR_ID_REDHAT && +pci_config_readw(bdf, PCI_DEVICE_ID) == +PCI_DEVICE_ID_REDHAT_ROOT_PORT) { +u8 cap; +do { +cap = pci_find_capability(bdf, PCI_CAP_ID_VNDR, 0); +} while (cap && + pci_config_readb(bdf, cap + PCI_CAP_VNDR_SPEC_TYPE) != +REDHAT_CAP_TYPE_QEMU); I suggest to extract the bus_reserve computation in a different function. +if (cap) { +u8 cap_len = pci_config_readb(bdf, cap + PCI_CAP_FLAGS); +if (cap_len != QEMU_PCI_CAP_SIZE) { +dprintf(1, "PCI: QEMU cap length %d is invalid\n", +cap_len); +} else { +res_bus = pci_config_readb(bdf, + cap + QEMU_PCI_CAP_BUS_RES); +if ((u8)(res_bus + secbus) < secbus || +(u8)(res_bus + secbus) < res_bus) { What do you check here, "garbage" values? Even so, all values are unsigned, are you checking for overflow? +dprintf(1, "PCI: bus_reserve value %d is invalid\n", +res_bus); +res_bus = 0; +} else { +dprintf(1, "PCI: QEMU cap is found, value = %u\n", +res_bus); +} +} +} +res_bus = MAX(*pci_bus, secbus + res_bus); Did you re-check the reboot "issue"? 
Thanks, Marcel +} dprintf(1, "PCI: subordinate bus = 0x%x -> 0x%x\n", -subbus, *pci_bus); -subbus = *pci_bus; +subbus, res_bus); +subbus = res_bus; +*pci_bus = res_bus; } else { dprintf(1, "PCI: subordinate bus = 0x%x\n", subbus); } diff --git a/src/hw/pci_ids.h b/src/hw/pci_ids.h index 4ac73b4..38fa2ca 100644 --- a/src/hw/pci_ids.h +++ b/src/hw/pci_ids.h @@ -2263,6 +2263,9 @@ #define PCI_DEVICE_ID_KORENIX_JETCARDF0 0x1600 #define PCI_DEVICE_ID_KORENIX_JETCARDF1 0x16ff +#define PCI_VENDOR_ID_REDHAT 0x1b36 +#define PCI_DEVICE_ID_REDHAT_ROOT_PORT 0x000C + #define PCI_VENDOR_ID_TEKRAM 0x1de1 #define PCI_DEVICE_ID_TEKRAM_DC290 0xdc29 diff --git a/src/types.h b/src/types.h index 19d9f6c..75d9108 100644 --- a/src/types.h +++ b/src/types.h @@ -122,6 +122,8 @@ extern void __force_link_error__only_in_16bit(void) __noreturn; typeof(divisor) __divisor = divisor;\ (((x) + ((__divisor) / 2)) / (__divisor)); \ }) +#define MIN(a, b) (((a) < (b)) ? (a) : (b)) +#define MAX(a, b) (((a) > (b)) ? (a) : (b)) #define ALIGN(x,a) __ALIGN_MASK(x,(typeof(x))(a)-1) #define __ALIGN_MASK(x,mask)(((x)+(mask))&~(mask)) #define ALIGN_DOWN(x,a) ((x) & ~((typeof(x))(a)-1))
Re: [SeaBIOS] [PATCH v3 2/3] pci: add QEMU-specific PCI capability structure
On 29/07/2017 2:34, Aleksandr Bezzubikov wrote: On PCI init PCI bridge devices may need some extra info about bus number to reserve, IO, memory and prefetchable memory limits. QEMU can provide this with a special vendor-specific PCI capability. This capability is intended to be used only for Red Hat PCI bridges, i.e. QEMU cooperation. Signed-off-by: Aleksandr Bezzubikov --- src/fw/dev-pci.h | 62 1 file changed, 62 insertions(+) create mode 100644 src/fw/dev-pci.h diff --git a/src/fw/dev-pci.h b/src/fw/dev-pci.h new file mode 100644 index 000..fbd49ed --- /dev/null +++ b/src/fw/dev-pci.h @@ -0,0 +1,62 @@ +#ifndef _PCI_CAP_H +#define _PCI_CAP_H + +#include "types.h" + +/* + Hi Aleksander, +QEMU-specific vendor(Red Hat)-specific capability. +It's intended to provide some hints for firmware to init PCI devices. + +Its layout is shown below: + +Header: + +u8 id; Standard PCI Capability Header field +u8 next; Standard PCI Capability Header field +u8 len; Standard PCI Capability Header field +u8 type; Red Hat vendor-specific capability type: + now only REDHAT_QEMU_CAP 1 exists +Data: + +u16 non_prefetchable_16; non-prefetchable memory limit + Maybe we should name it "mem". And if I remember right Gerd suggested keeping them all 32 bits: u32 mem_res +u8 bus_res; minimum bus number to reserve; + this is necessary for PCI Express Root Ports + to support PCIE-to-PCI bridge hotplug + +u8 io_8; IO limit in case of 8-bit limit value I must have missed it, but why do we need io_8 field? +u32 io_32; IO limit in case of 16-bit limit value + io_8 and io_16 are mutually exclusive, in other words, + they can't be non-zero simultaneously I don't see any io_16 field.
Maybe only one field: u32 io_res + +u32 prefetchable_32; non-prefetchable memory limit + in case of 32-bit limit value Name and comment mismatch +u64 prefetchable_64; non-prefetchable memory limit + in case of 64-bit limit value + prefetchable_32 and prefetchable_64 are + mutually exclusive, in other words, + they can't be non-zero simultaneously Name and comment mismatch It should look like: - u32 bus_res - u32 io_res - u32 mem_res, - u32 mem_prefetchable_32, - u64 mem_prefetchable_64, (mutually exclusive with the above) Does it look right to all? +If any field in Data section is 0, +it means that such kind of reservation +is not needed. + +*/ + +/* Offset of vendor-specific capability type field */ +#define PCI_CAP_VNDR_SPEC_TYPE 3 + +/* List of valid Red Hat vendor-specific capability types */ +#define REDHAT_CAP_TYPE_QEMU 1 Maybe we should be more concrete: REDHAT_CAP_TYPE_RES_RESERVE + + +/* Offsets of QEMU capability fields */ +#define QEMU_PCI_CAP_NON_PREF 4 +#define QEMU_PCI_CAP_BUS_RES 6 +#define QEMU_PCI_CAP_IO_8 7 +#define QEMU_PCI_CAP_IO_32 8 +#define QEMU_PCI_CAP_PREF_32 12 +#define QEMU_PCI_CAP_PREF_64 16 +#define QEMU_PCI_CAP_SIZE 24 + +#endif /* _PCI_CAP_H */ I know the exact layout is less important for your current project, but it is important to get it right the first time. Thanks, Marcel
Re: [SeaBIOS] [RFC PATCH v2 4/6] hw/pci: introduce bridge-only vendor-specific capability to provide some hints to firmware
On 29/07/2017 2:12, Michael S. Tsirkin wrote: On Thu, Jul 27, 2017 at 12:39:54PM +0300, Marcel Apfelbaum wrote: On 27/07/2017 2:28, Michael S. Tsirkin wrote: On Thu, Jul 27, 2017 at 12:54:07AM +0300, Alexander Bezzubikov wrote: 2017-07-26 22:43 GMT+03:00 Michael S. Tsirkin <m...@redhat.com>: On Sun, Jul 23, 2017 at 01:15:41AM +0300, Aleksandr Bezzubikov wrote: On PCI init PCI bridges may need some extra info about bus number to reserve, IO, memory and prefetchable memory limits. QEMU can provide this with a special vendor-specific PCI capability. Sizes of limits match ones from PCI Type 1 Configuration Space Header, number of buses to reserve occupies only 1 byte since it is the size of Subordinate Bus Number register. Signed-off-by: Aleksandr Bezzubikov <zuban...@gmail.com> --- hw/pci/pci_bridge.c | 27 +++ include/hw/pci/pci_bridge.h | 18 ++ 2 files changed, 45 insertions(+) diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c index 720119b..8ec6c2c 100644 --- a/hw/pci/pci_bridge.c +++ b/hw/pci/pci_bridge.c @@ -408,6 +408,33 @@ void pci_bridge_map_irq(PCIBridge *br, const char* bus_name, br->bus_name = bus_name; } + +int pci_bridge_help_cap_init(PCIDevice *dev, int cap_offset, help? should be qemu_cap_init? + uint8_t bus_reserve, uint32_t io_limit, + uint16_t mem_limit, uint64_t pref_limit, + Error **errp) +{ +size_t cap_len = sizeof(PCIBridgeQemuCap); +PCIBridgeQemuCap cap; This leaks info to guest. You want to init all fields here: cap = { .len = }; I surely can do this for len field, but as Laszlo proposed we can use mutually exclusive fields, e.g. pref_32 and pref_64, the only way I have left is to use ternary operator (if we surely need this big initializer). Keeping some if's would look better, I think.
+ +cap.len = cap_len; +cap.bus_res = bus_reserve; +cap.io_lim = io_limit & 0xFF; +cap.io_lim_upper = io_limit >> 8 & 0xFFFF; +cap.mem_lim = mem_limit; +cap.pref_lim = pref_limit & 0xFFFF; +cap.pref_lim_upper = pref_limit >> 16 & 0xFFFFFFFF; Please use pci_set_word etc or cpu_to_leXX. Since now we've decided to avoid fields separation into + , this bitmask along with pci_set_word are no longer needed. I think it's easiest to replace struct with a set of macros then pci_set_word does the work for you. I don't really want to use macros here because the structure shows us the whole capability layout and this can decrease documenting efforts. More than that, memcpy usage is very convenient here, and I wouldn't like to lose it. + +int offset = pci_add_capability(dev, PCI_CAP_ID_VNDR, +cap_offset, cap_len, errp); +if (offset < 0) { +return offset; +} + +memcpy(dev->config + offset + 2, (char *)&cap + 2, cap_len - 2); +2 is yacky. See how virtio does it: memcpy(dev->config + offset + PCI_CAP_FLAGS, &cap->cap_len, cap->cap_len - PCI_CAP_FLAGS); OK. +return 0; +} + static const TypeInfo pci_bridge_type_info = { .name = TYPE_PCI_BRIDGE, .parent = TYPE_PCI_DEVICE, diff --git a/include/hw/pci/pci_bridge.h b/include/hw/pci/pci_bridge.h index ff7cbaa..c9f642c 100644 --- a/include/hw/pci/pci_bridge.h +++ b/include/hw/pci/pci_bridge.h @@ -67,4 +67,22 @@ void pci_bridge_map_irq(PCIBridge *br, const char* bus_name, #define PCI_BRIDGE_CTL_DISCARD_STATUS 0x400 /* Discard timer status */ #define PCI_BRIDGE_CTL_DISCARD_SERR 0x800 /* Discard timer SERR# enable */ +typedef struct PCIBridgeQemuCap { +uint8_t id; /* Standard PCI capability header field */ +uint8_t next; /* Standard PCI capability header field */ +uint8_t len; /* Standard PCI vendor-specific capability header field */ +uint8_t bus_res; +uint32_t pref_lim_upper; Big endian? Ugh. Agreed, and this is going to disappear with the new layout. +uint16_t pref_lim; +uint16_t mem_lim; I'd say we need 64 bit for memory. Why?
Non-prefetchable MEMORY_LIMIT register is 16 bits long. Hmm ok, but e.g. for io there are bridges that have extra registers to specify non-standard non-aligned registers. +uint16_t io_lim_upper; +uint8_t io_lim; +uint8_t padding; IMHO each type should have a special "don't care" flag that would mean "I don't know". Don't know what? Now 0 is an indicator to do nothing with this field. In that case how do you say "don't allocate any memory"? We can keep the MEM/Limit registers read-only for such cases, as they are optional registers. Thanks, Marcel Hi Michael, I don't believe they are - from the spec (1.2): The Memory Base and Memory Limit registers are both required registers. The rest of the ranges are indeed optional.
Re: [SeaBIOS] [RFC PATCH v2 0/4] Allow RedHat PCI bridges reserve more buses than necessary during init
On 26/07/2017 21:49, Michael S. Tsirkin wrote: On Wed, Jul 26, 2017 at 07:22:42PM +0300, Marcel Apfelbaum wrote: On 26/07/2017 18:20, Laszlo Ersek wrote: On 07/26/17 08:48, Marcel Apfelbaum wrote: On 25/07/2017 18:46, Laszlo Ersek wrote: [snip] (2) Bus range reservation, and hotplugging bridges. What's the motivation? Our recommendations in "docs/pcie.txt" suggest flat hierarchies. It remains flat. You have one single PCIE-PCI bridge plugged into a PCIe Root Port, no deep nesting. The reason is to be able to support legacy PCI devices without "committing" with a DMI-PCI bridge in advance. (Keep Q35 without legacy hw.) The only way to support PCI devices in Q35 is to have them cold-plugged into the pcie.0 bus, which is good, but not enough for expanding the Q35 usability in order to make it eventually the default QEMU x86 machine (I know this is another discussion and I am in the minority, at least for now). The plan is: Start Q35 machine as usual, but one of the PCIe Root Ports includes hints for firmware needed to support legacy PCI devices. (IO Ports range, extra bus,...) Once a pci device is needed you have 2 options: 1. Plug a PCIe-PCI bridge into a PCIe Root Port and the PCI device in the bridge. 2. Hotplug a PCIe-PCI bridge into a PCIe Root Port and then hotplug a PCI device into the bridge. Hi Laszlo, Thank you for the explanation, it makes the intent a lot clearer. However, what does the hot-pluggability of the PCIe-PCI bridge buy us? In other words, what does it buy us when we do not add the PCIe-PCI bridge immediately at guest startup, as an integrated device? Why is it a problem to "commit" in advance? I understand that we might not like the DMI-PCI bridge (due to it being legacy), but what speaks against cold-plugging the PCIe-PCI bridge either as an integrated device in pcie.0 (assuming that is permitted), or cold-plugging the PCIe-PCI bridge in a similarly cold-plugged PCIe root port?
We want to keep Q35 clean, and for most cases we don't want any legacy PCI stuff if not especially required. BTW, what are the PCI devices that we actually need? It's not about what we need: if Q35 will become a "transition" machine, any existing emulated PCI device is fair game, since we would want to run pc configurations on Q35 as well. Thanks, Marcel
Re: [SeaBIOS] [RFC PATCH v2 0/4] Allow RedHat PCI bridges reserve more buses than necessary during init
On 26/07/2017 21:31, Laszlo Ersek wrote: On 07/26/17 18:22, Marcel Apfelbaum wrote: On 26/07/2017 18:20, Laszlo Ersek wrote: [snip] However, what does the hot-pluggability of the PCIe-PCI bridge buy us? In other words, what does it buy us when we do not add the PCIe-PCI bridge immediately at guest startup, as an integrated device? Why is it a problem to "commit" in advance? I understand that we might not like the DMI-PCI bridge (due to it being legacy), but what speaks against cold-plugging the PCIe-PCI bridge either as an integrated device in pcie.0 (assuming that is permitted), or cold-plugging the PCIe-PCI bridge in a similarly cold-plugged PCIe root port? We want to keep Q35 clean, and for most cases we don't want any legacy PCI stuff if not especially required. I mean, in the cold-plugged case, you use up two bus numbers at the most, one for the root port, and another for the PCIe-PCI bridge. In the hot-plugged case, you have to start with the cold-plugged root port just the same (so that you can communicate the bus number reservation *at all*), and then reserve (= use up in advance) the bus number, the IO space, and the MMIO space(s). I don't see the difference; hot-plugging the PCIe-PCI bridge (= not committing in advance) doesn't seem to save any resources. It's not about resources, more about usage model. I guess I would see a difference if we reserved more than one bus number in the hotplug case, namely in order to support recursive hotplug under the PCIe-PCI bridge. But, you confirmed that we intend to keep the flat hierarchy (i.e. the exercise is only for enabling legacy PCI endpoints, not for recursive hotplug). The PCIe-PCI bridge isn't a device that does anything at all on its own, so why not just coldplug it? Its resources have to be reserved in advance anyway. Even if we prefer flat hierarchies, we should allow a sane nested bridges configuration, so we will sometimes reserve more than one.
So, thus far I would say "just cold-plug the PCIe-PCI bridge at startup, possibly even make it an integrated device, and then you don't need to reserve bus numbers (and other apertures)". Where am I wrong? Nothing wrong, I am just looking for feature parity Q35 vs PC. Users may want to continue using [nested] PCI bridges, and we want the Q35 machine to be used by more users in order to make it reliable faster, while keeping it clean by default. We had a discussion on this matter on last year KVM forum and the hot-pluggable PCIe-PCI bridge was the general consensus. OK. I don't want to question or go back on that consensus now; I'd just like to point out that all that you describe (nested bridges, and enabling legacy PCI with PCIe-PCI bridges, *on demand*) is still possible with cold-plugging. I.e., the default setup of Q35 does not need to include legacy PCI bridges. It's just that the pre-launch configuration effort for a Q35 user to *reserve* resources for legacy PCI is the exact same as the pre-launch configuration effort to *actually cold-plug* the bridge. [snip] The PI spec says, [...] For all the root HPCs and the nonroot HPCs, call EFI_PCI_HOT_PLUG_INIT_PROTOCOL.GetResourcePadding() to obtain the amount of overallocation and add that amount to the requests from the physical devices. Reprogram the bus numbers by taking into account the bus resource padding information. [...] However, according to my interpretation of the source code, PciBusDxe does not consider bus number padding for non-root HPCs (which are "all" HPCs on QEMU). Theoretically speaking, it is possible to change the behavior, right? Not just theoretically; in the past I have changed PciBusDxe -- it wouldn't identify QEMU's hotplug controllers (root port, downstream port etc) appropriately, and I managed to get some patches in. It's just that the less we understand the current code and the more intrusive/extensive the change is, the harder it is to sell the *idea*. 
PciBusDxe is platform-independent and shipped on many a physical system too. Understood, but from your explanation it sounds like the existing callback sites (hooks) are enough. That's the problem: they don't appear to be, if you consider bus number reservations. The existing callback sites seem fine regarding IO and MMIO, but the only callback site that honors bus number reservation is limited to "root" (in the previously defined sense) hotplug controllers. So this is something that will need investigation, and my most recent queries into the "hotplug preparation" parts of PciBusDxe indicate that those parts are quite... "forgotten". :) I guess this might be because on physical systems the level of PCI(e) hotpluggery that we plan to do is likely unheard of :) I admit it is possible that it looks a little "crazy" on bare-metal, but as long as we "color inside the lines" we are allowed to push it a little :) Thanks, Marcel Thanks! Laszlo
Re: [SeaBIOS] [RFC PATCH v2 4/6] hw/pci: introduce bridge-only vendor-specific capability to provide some hints to firmware
On 27/07/2017 2:28, Michael S. Tsirkin wrote: On Thu, Jul 27, 2017 at 12:54:07AM +0300, Alexander Bezzubikov wrote: 2017-07-26 22:43 GMT+03:00 Michael S. Tsirkin: On Sun, Jul 23, 2017 at 01:15:41AM +0300, Aleksandr Bezzubikov wrote: On PCI init PCI bridges may need some extra info about bus number to reserve, IO, memory and prefetchable memory limits. QEMU can provide this with a special vendor-specific PCI capability. Sizes of limits match the ones from the PCI Type 1 Configuration Space Header; the number of buses to reserve occupies only 1 byte since that is the size of the Subordinate Bus Number register. Signed-off-by: Aleksandr Bezzubikov --- hw/pci/pci_bridge.c | 27 +++ include/hw/pci/pci_bridge.h | 18 ++ 2 files changed, 45 insertions(+) diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c index 720119b..8ec6c2c 100644 --- a/hw/pci/pci_bridge.c +++ b/hw/pci/pci_bridge.c @@ -408,6 +408,33 @@ void pci_bridge_map_irq(PCIBridge *br, const char* bus_name, br->bus_name = bus_name; } + +int pci_bridge_help_cap_init(PCIDevice *dev, int cap_offset, help? should be qemu_cap_init? + uint8_t bus_reserve, uint32_t io_limit, + uint16_t mem_limit, uint64_t pref_limit, + Error **errp) +{ +size_t cap_len = sizeof(PCIBridgeQemuCap); +PCIBridgeQemuCap cap; This leaks info to guest. You want to init all fields here: cap = { .len = }; I surely can do this for the len field, but as Laszlo proposed we can use mutually exclusive fields, e.g. pref_32 and pref_64; the only way I have left is to use a ternary operator (if we surely need this big initializer). Keeping some if's would look better, I think. + +cap.len = cap_len; +cap.bus_res = bus_reserve; +cap.io_lim = io_limit & 0xFF; +cap.io_lim_upper = io_limit >> 8 & 0x; +cap.mem_lim = mem_limit; +cap.pref_lim = pref_limit & 0x; +cap.pref_lim_upper = pref_limit >> 16 & 0x; Please use pci_set_word etc or cpu_to_leXX. Since now we've decided to avoid fields separation into + , this bitmask along with pci_set_word are no longer needed. 
I think it's easiest to replace struct with a set of macros then pci_set_word does the work for you. I don't really want to use macros here because structure show us the whole capability layout and this can decrease documenting efforts. More than that, memcpy usage is very convenient here, and I wouldn't like to lose it. + +int offset = pci_add_capability(dev, PCI_CAP_ID_VNDR, +cap_offset, cap_len, errp); +if (offset < 0) { +return offset; +} + +memcpy(dev->config + offset + 2, (char *) + 2, cap_len - 2); +2 is yacky. See how virtio does it: memcpy(dev->config + offset + PCI_CAP_FLAGS, >cap_len, cap->cap_len - PCI_CAP_FLAGS); OK. +return 0; +} + static const TypeInfo pci_bridge_type_info = { .name = TYPE_PCI_BRIDGE, .parent = TYPE_PCI_DEVICE, diff --git a/include/hw/pci/pci_bridge.h b/include/hw/pci/pci_bridge.h index ff7cbaa..c9f642c 100644 --- a/include/hw/pci/pci_bridge.h +++ b/include/hw/pci/pci_bridge.h @@ -67,4 +67,22 @@ void pci_bridge_map_irq(PCIBridge *br, const char* bus_name, #define PCI_BRIDGE_CTL_DISCARD_STATUS 0x400 /* Discard timer status */ #define PCI_BRIDGE_CTL_DISCARD_SERR 0x800 /* Discard timer SERR# enable */ +typedef struct PCIBridgeQemuCap { +uint8_t id; /* Standard PCI capability header field */ +uint8_t next; /* Standard PCI capability header field */ +uint8_t len;/* Standard PCI vendor-specific capability header field */ +uint8_t bus_res; +uint32_t pref_lim_upper; Big endian? Ugh. Agreed, and this's gonna to disappear with the new layout. +uint16_t pref_lim; +uint16_t mem_lim; I'd say we need 64 bit for memory. Why? Non-prefetchable MEMORY_LIMIT register is 16 bits long. Hmm ok, but e.g. for io there are bridges that have extra registers to specify non-standard non-aligned registers. +uint16_t io_lim_upper; +uint8_t io_lim; +uint8_t padding; IMHO each type should have a special "don't care" flag that would mean "I don't know". Don't know what? Now 0 is an indicator to do nothing with this field. 
In that case how do you say "don't allocate any memory"? We can keep the MEM/Limit registers read-only for such cases, as they are optional registers. Thanks, Marcel +} PCIBridgeQemuCap; You don't really need this struct in the header. And pls document all fields. + +int pci_bridge_help_cap_init(PCIDevice *dev, int cap_offset, + uint8_t bus_reserve, uint32_t io_limit, + uint16_t mem_limit, uint64_t pref_limit, + Error **errp); + #endif /* QEMU_PCI_BRIDGE_H */
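To make the layout under discussion concrete, here is a sketch in C of what such a vendor-specific bridge capability could look like, assuming the split lower/upper register pairs are merged into plain fixed-width little-endian fields (as Gerd later suggested). The struct name, field names, and field widths are illustrative assumptions, not the final QEMU layout.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flat layout for the vendor-specific bridge capability.
 * Plain fixed-width fields instead of the split lower/upper register
 * pairs criticized in the review above.  0 in a field means "no hint". */
typedef struct __attribute__((packed)) {
    uint8_t  id;       /* PCI_CAP_ID_VNDR (0x09), standard header field */
    uint8_t  next;     /* next capability pointer, standard header field */
    uint8_t  len;      /* total capability length */
    uint8_t  bus_res;  /* extra buses to reserve behind the bridge */
    uint32_t io_lim;   /* IO window to reserve, in bytes */
    uint32_t mem_lim;  /* non-prefetchable MMIO window, in bytes */
    uint64_t pref_lim; /* prefetchable MMIO window, in bytes */
} BridgeResvCap;

/* Fill every field explicitly so no uninitialized bytes leak to the
 * guest -- the concern Michael raised about the original patch. */
static void bridge_resv_cap_init(BridgeResvCap *cap, uint8_t bus_res,
                                 uint32_t io, uint32_t mem, uint64_t pref)
{
    memset(cap, 0, sizeof(*cap));
    cap->id = 0x09;            /* vendor-specific capability ID */
    cap->len = sizeof(*cap);
    cap->bus_res = bus_res;
    cap->io_lim = io;
    cap->mem_lim = mem;
    cap->pref_lim = pref;
}
```

With flat fields the firmware can read each hint with a single aligned config-space read, and the memset addresses the information-leak concern without a large designated initializer.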
Re: [SeaBIOS] [RFC PATCH v2 0/4] Allow RedHat PCI bridges reserve more buses than necessary during init
On 26/07/2017 18:20, Laszlo Ersek wrote: On 07/26/17 08:48, Marcel Apfelbaum wrote: On 25/07/2017 18:46, Laszlo Ersek wrote: [snip] (2) Bus range reservation, and hotplugging bridges. What's the motivation? Our recommendations in "docs/pcie.txt" suggest flat hierarchies. It remains flat. You have one single PCIE-PCI bridge plugged into a PCIe Root Port, no deep nesting. The reason is to be able to support legacy PCI devices without "committing" with a DMI-PCI bridge in advance. (Keep Q35 without legacy hw.) The only way to support PCI devices in Q35 is to have them cold-plugged into the pcie.0 bus, which is good, but not enough for expanding the Q35 usability in order to make it eventually the default QEMU x86 machine (I know this is another discussion and I am in the minority, at least for now). The plan is: Start the Q35 machine as usual, but one of the PCIe Root Ports includes hints for firmware needed to support legacy PCI devices. (IO Ports range, extra bus,...) Once a pci device is needed you have 2 options: 1. Plug a PCIe-PCI bridge into a PCIe Root Port and the PCI device in the bridge. 2. Hotplug a PCIe-PCI bridge into a PCIe Root Port and then hotplug a PCI device into the bridge. Hi Laszlo, Thank you for the explanation, it makes the intent a lot clearer. However, what does the hot-pluggability of the PCIe-PCI bridge buy us? In other words, what does it buy us when we do not add the PCIe-PCI bridge immediately at guest startup, as an integrated device? Why is it a problem to "commit" in advance? I understand that we might not like the DMI-PCI bridge (due to it being legacy), but what speaks against cold-plugging the PCIe-PCI bridge either as an integrated device in pcie.0 (assuming that is permitted), or cold-plugging the PCIe-PCI bridge in a similarly cold-plugged PCIe root port? We want to keep Q35 clean, and for most cases we don't want any legacy PCI stuff if not especially required. 
I mean, in the cold-plugged case, you use up two bus numbers at the most, one for the root port, and another for the PCIe-PCI bridge. In the hot-plugged case, you have to start with the cold-plugged root port just the same (so that you can communicate the bus number reservation *at all*), and then reserve (= use up in advance) the bus number, the IO space, and the MMIO space(s). I don't see the difference; hot-plugging the PCIe-PCI bridge (= not committing in advance) doesn't seem to save any resources. It's not about resources, more about the usage model. I guess I would see a difference if we reserved more than one bus number in the hotplug case, namely in order to support recursive hotplug under the PCIe-PCI bridge. But, you confirmed that we intend to keep the flat hierarchy (ie the exercise is only for enabling legacy PCI endpoints, not for recursive hotplug). The PCIe-PCI bridge isn't a device that does anything at all on its own, so why not just coldplug it? Its resources have to be reserved in advance anyway. Even if we prefer flat hierarchies, we should allow a sane nested-bridge configuration, so we will sometimes reserve more than one. 
IIUC, Gerd suggests that the absolute aperture size should be specified (as a minimum), not the increment / reservation for hotplug purposes. The Platform Initialization Specification, v1.6, downloadable at <http://www.uefi.org/specs>, writes the following under EFI_PCI_HOT_PLUG_INIT_PROTOCOL.GetResourcePadding() in whose implementation I will have to parse the values from the capability structure, and return the appropriate representation to the platform-independent PciBusDxe driver (i.e., the enumeration / allocation agent): The padding is returned in the form of ACPI (2.0 & 3.0) resource descriptors. The exact definition of each of the fields is the same as in the EFI_PCI_HOST_BRIDGE_RESOURCE_ALLOCATION_PROTOCOL.SubmitResources() function. See the section 10.8 for the definition of this function. The PCI bus driver is responsible for adding this resource request to the resource requests by the physical PCI devices. If Attributes is EfiPaddingPciBus, the padding takes effect at the PCI bus level. If Attributes is EfiPaddingPciRootBridge, the required padding ta
Re: [SeaBIOS] [RFC PATCH v2 0/4] Allow RedHat PCI bridges reserve more buses than necessary during init
On 25/07/2017 18:46, Laszlo Ersek wrote: On 07/23/17 00:11, Aleksandr Bezzubikov wrote: Currently PCI bridges get a bus range number at system init, based on the devices plugged in at that time. That's why when one wants to hotplug another bridge, it needs its child bus, which the parent is unable to provide (speaking about a virtual device). The suggested workaround is to have a vendor-specific capability in Red Hat PCI bridges that contains the number of additional buses to reserve on BIOS PCI init. So this capability is intended only for pure QEMU->SeaBIOS usage. Considering all of the above, this series is directly connected with the QEMU RFC series (v2) "Generic PCIE-PCI Bridge". Although the new PCI capability is supposed to contain various limits along with the bus number to reserve, for now only its full layout is proposed, and only the bus_reserve field is used in QEMU and BIOS. Limits usage is still a subject for implementation, as for now the main goal of this series is to provide the necessary support from the firmware side for PCIE-PCI bridge hotplug. Changes v1->v2: 1. New #define for Red Hat vendor added (addresses Konrad's comment). 2. Refactored pci_find_capability function (addresses Marcel's comment). 3. Capability reworked: - data type added; - reserve space in a structure for IO, memory and prefetchable memory limits. 
Aleksandr Bezzubikov (4): pci: refactor pci_find_capapibilty to get bdf as the first argument instead of the whole pci_device pci: add RedHat vendor ID pci: add QEMU-specific PCI capability structure pci: enable RedHat PCI bridges to reserve additional buses on PCI init src/fw/pciinit.c| 18 ++ src/hw/pci_cap.h| 23 +++ src/hw/pci_ids.h| 2 ++ src/hw/pcidevice.c | 12 ++-- src/hw/pcidevice.h | 2 +- src/hw/virtio-pci.c | 4 ++-- 6 files changed, 48 insertions(+), 13 deletions(-) create mode 100644 src/hw/pci_cap.h Coming back from PTO, it's hard for me to follow up on all the comments that have been made across the v1 and v2 of this RFC series, so I'll just provide a brain dump here: Hi Laszlo, Thanks for the review. (1) Mentioned by Michael: documentation. That's the most important part. I haven't seen the QEMU patches, so perhaps they already include documentation. If not, please start this work with adding a detailed description to QEMU's docs/ or docs/specs/. I agree it is time. Aleksandr, please be sure to document the PCIE-PCI bridge in docs/pcie.txt. There are a number of preexistent documents that might be related, just search docs/ for filenames with "pci" in them. (2) Bus range reservation, and hotplugging bridges. What's the motivation? Our recommendations in "docs/pcie.txt" suggest flat hierarchies. It remains flat. You have one single PCIE-PCI bridge plugged into a PCIe Root Port, no deep nesting. The reason is to be able to support legacy PCI devices without "committing" with a DMI-PCI bridge in advance. (Keep Q35 without legacy hw.) The only way to support PCI devices in Q35 is to have them cold-plugged into the pcie.0 bus, which is good, but not enough for expanding the Q35 usability in order to make it eventually the default QEMU x86 machine (I know this is another discussion and I am in the minority, at least for now). The plan is: Start the Q35 machine as usual, but one of the PCIe Root Ports includes hints for firmware needed to support legacy PCI devices. 
(IO Ports range, extra bus,...) Once a pci device is needed you have 2 options: 1. Plug a PCIe-PCI bridge into a PCIe Root Port and the PCI device in the bridge. 2. Hotplug a PCIe-PCI bridge into a PCIe Root Port and then hotplug a PCI device into the bridge. If this use case is really necessary, I think it should be covered in "docs/pcie.txt". In particular it has a consequence for PXB as well (search "pcie.txt" for "bus_nr") -- if users employ extra root buses, then the bus number partitions that they specify must account for any bridges that they plan to hot-plug (and for the bus range reservations on the cold-plugged bridges behind those extra root buses). Agreed about the doc. (3) Regarding the contents and the format of the capability structure, I wrote up my thoughts earlier in https://bugzilla.redhat.com/show_bug.cgi?id=1434747#c8 Let me quote it here for ease of commenting: (In reply to Gerd Hoffmann from comment #7) So, now that the generic ports are there we can go on figure how to handle this best. I still think the best way to communicate window size hints would be to use a vendor specific pci capability (instead of setting the desired size on reset). The information will always be available then and we don't run into initialization order issues. This seems good to me -- I can't promise 100% without actually trying, but I think I should be able to parse the capability list in config space for this hint, in the GetResourcePadding() callback. I propose that we try to handle
Re: [SeaBIOS] [RFC PATCH v2 5/6] hw/pci: add bus_reserve property to pcie-root-port
On 25/07/2017 20:11, Alexander Bezzubikov wrote: Tue, 25 Jul 2017 at 19:10, Marcel Apfelbaum <mar...@redhat.com>: On 25/07/2017 17:09, Alexander Bezzubikov wrote: > 2017-07-25 16:53 GMT+03:00 Michael S. Tsirkin <m...@redhat.com>: >> On Tue, Jul 25, 2017 at 04:50:49PM +0300, Alexander Bezzubikov wrote: >>> 2017-07-25 16:43 GMT+03:00 Michael S. Tsirkin <m...@redhat.com>: >>>> On Sun, Jul 23, 2017 at 05:13:11PM +0300, Marcel Apfelbaum wrote: >>>>> On 23/07/2017 15:22, Michael S. Tsirkin wrote: >>>>>> On Sun, Jul 23, 2017 at 01:15:42AM +0300, Aleksandr Bezzubikov wrote: >>>>>>> To enable hotplugging of a newly created pcie-pci-bridge, >>>>>>> we need to tell firmware (SeaBIOS in this case) >>>>>> >>>>> >>>>> Hi Michael, >>>>> >>>>>> Presumably, EFI would need to support this too? >>>>>> >>>>> >>>>> Sure, Eduardo added to CC, but he is in PTO now. >>>>> >>>>>>> to reserve >>>>>>> additional buses for pcie-root-port, that allows us to >>>>>>> hotplug pcie-pci-bridge into this root port. >>>>>>> The number of buses to reserve is provided to the device via a corresponding >>>>>>> property, and to the firmware via a new PCI capability (next patch). >>>>>>> The property's default value is 1 as we want to hotplug at least 1 bridge. >>>>>> >>>>>> If so you should just teach firmware to allocate one bus # >>>>>> unconditionally. >>>>>> >>>>> >>>>> That would be a problem for the PCIe machines, since each PCIe >>>>> device is plugged into a different bus and we are already >>>>> limited to 256 PCIe devices. Allocating an extra bus always >>>>> would really limit the PCIe devices we can use. >>>> >>>> One of the declared advantages of PCIe is easy support for multiple roots. >>>> We really should look at that IMHO so we do not need to pile up hacks. >>>> >>>>>> But why would that be so? What's wrong with a device >>>>>> directly in the root port? 
>>>>>> >>>> To clarify, my point is we might be wasting bus numbers by reservation >>>> since someone might just want to put pcie devices there. >>> >>> I think, changing the default value to 0 can help us avoid this, >>> as no bus reservation by default. If one surely wants >>> to hotplug pcie-pci-bridge into this root port in future, >>> the property gives him such an opportunity. >>> So, sure need pcie-pci-bridge hotplug -> creating a root port with >>> bus_reserve > 0. Otherwise (and default) - just as now, no changes >>> in bus topology. >> >> I guess 0 should mean "do not reserve any buses". So I think we also >> need a flag to just avoid the capability altogether. Maybe -1? *That* >> should be the default. > > -1 might be useful if any limit value 0 is legal, but is it? > If not, we can set every field to 0 and > this is a sign of avoiding the capability since no legal > values are provided. > As Gerd suggested, this value is not a "delta" but the number of buses to be reserved behind the bridge. If I got it right, 0 is not a valid value, since the bridge by definition has at least one bus behind. Gerd's suggestion was to set min(cap_value, children_found). From such a point of view 0 can be a valid value. I am lost now :) How can we use the capability to reserve "more" buses since children_found will always be the smaller value? I think you should use max(cap_value, children_found) to ensure you always reserve enough buses for existing children. In this case 0 is actually an invalid value since children_found > 0 for a bridge. Thanks, Marcel Michael, would you be OK with that? Thanks, Marcel >> >>>> >>>>> First, plugging a legacy PCI device into a PCIe Root Port >>>>> looks strange a
Re: [SeaBIOS] [RFC PATCH v2 5/6] hw/pci: add bus_reserve property to pcie-root-port
On 25/07/2017 17:09, Alexander Bezzubikov wrote: 2017-07-25 16:53 GMT+03:00 Michael S. Tsirkin <m...@redhat.com>: On Tue, Jul 25, 2017 at 04:50:49PM +0300, Alexander Bezzubikov wrote: 2017-07-25 16:43 GMT+03:00 Michael S. Tsirkin <m...@redhat.com>: On Sun, Jul 23, 2017 at 05:13:11PM +0300, Marcel Apfelbaum wrote: On 23/07/2017 15:22, Michael S. Tsirkin wrote: On Sun, Jul 23, 2017 at 01:15:42AM +0300, Aleksandr Bezzubikov wrote: To enable hotplugging of a newly created pcie-pci-bridge, we need to tell firmware (SeaBIOS in this case) Hi Michael, Presumably, EFI would need to support this too? Sure, Eduardo added to CC, but he is in PTO now. to reserve additional buses for pcie-root-port, that allows us to hotplug pcie-pci-bridge into this root port. The number of buses to reserve is provided to the device via a corresponding property, and to the firmware via a new PCI capability (next patch). The property's default value is 1 as we want to hotplug at least 1 bridge. If so you should just teach firmware to allocate one bus # unconditionally. That would be a problem for the PCIe machines, since each PCIe device is plugged into a different bus and we are already limited to 256 PCIe devices. Allocating an extra bus always would really limit the PCIe devices we can use. One of the declared advantages of PCIe is easy support for multiple roots. We really should look at that IMHO so we do not need to pile up hacks. But why would that be so? What's wrong with a device directly in the root port? To clarify, my point is we might be wasting bus numbers by reservation since someone might just want to put pcie devices there. I think, changing the default value to 0 can help us avoid this, as no bus reservation by default. If one surely wants to hotplug pcie-pci-bridge into this root port in future, the property gives him such an opportunity. So, sure need pcie-pci-bridge hotplug -> creating a root port with bus_reserve > 0. 
Otherwise (and default) - just as now, no changes in bus topology. I guess 0 should mean "do not reserve any buses". So I think we also need a flag to just avoid the capability altogether. Maybe -1? *That* should be the default. -1 might be useful if any limit value 0 is legal, but is it? If not, we can set every field to 0 and this is a sign of avoiding the capability since no legal values are provided. As Gerd suggested, this value is not a "delta" but the number of buses to be reserved behind the bridge. If I got it right, 0 is not a valid value, since the bridge by definition has at least one bus behind. Michael, would you be OK with that? Thanks, Marcel First, plugging a legacy PCI device into a PCIe Root Port looks strange at least, and it can't be done on real HW anyway. (incompatible slots) Second (and more important), if we want 2 or more PCI devices we would lose both IO port space and bus numbers. Signed-off-by: Aleksandr Bezzubikov <zuban...@gmail.com> --- hw/pci-bridge/pcie_root_port.c | 1 + include/hw/pci/pcie_port.h | 3 +++ 2 files changed, 4 insertions(+) diff --git a/hw/pci-bridge/pcie_root_port.c b/hw/pci-bridge/pcie_root_port.c index 4d588cb..b0e49e1 100644 --- a/hw/pci-bridge/pcie_root_port.c +++ b/hw/pci-bridge/pcie_root_port.c @@ -137,6 +137,7 @@ static void rp_exit(PCIDevice *d) static Property rp_props[] = { DEFINE_PROP_BIT(COMPAT_PROP_PCP, PCIDevice, cap_present, QEMU_PCIE_SLTCAP_PCP_BITNR, true), +DEFINE_PROP_UINT8("bus_reserve", PCIEPort, bus_reserve, 1), DEFINE_PROP_END_OF_LIST() }; diff --git a/include/hw/pci/pcie_port.h b/include/hw/pci/pcie_port.h index 1333266..1b2dd1f 100644 --- a/include/hw/pci/pcie_port.h +++ b/include/hw/pci/pcie_port.h @@ -34,6 +34,9 @@ struct PCIEPort { /* pci express switch port */ uint8_t port; + +/* additional buses to reserve on firmware init */ +uint8_t bus_reserve; }; void pcie_port_init_reg(PCIDevice *d); So here is a property and it does not do anything. 
It makes it easier to work on series maybe, but review is harder since we do not see what it does at all. Please do not split up patches like this - you can maintain it split up in your branch if you like and merge before sending. Agreed, Alexandr please merge patches 4-5-6 for your next submission. Thanks, Marcel -- 2.7.4 -- Alexander Bezzubikov ___ SeaBIOS mailing list SeaBIOS@seabios.org https://mail.coreboot.org/mailman/listinfo/seabios
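The firmware-side consumption of such a hint, as discussed in this thread, can be sketched as follows. This is an illustrative mock, not SeaBIOS code: a plain byte array stands in for config-space reads (SeaBIOS would use its pci_config_readb-style accessors), the capability offsets in the test are made up, and only the two real constants (the capabilities pointer at 0x34 and the vendor-specific capability ID 0x09) are taken from the PCI spec. The reservation rule is the max(cap_value, children_found) semantics Marcel proposed above.

```c
#include <assert.h>
#include <stdint.h>

#define PCI_CAPABILITY_LIST 0x34  /* standard capabilities pointer register */
#define PCI_CAP_ID_VNDR     0x09  /* vendor-specific capability ID */

/* Walk the capability chain in a (simulated) 256-byte config space and
 * return the offset of the first vendor-specific capability, or 0 if
 * none is found.  A guard counter protects against malformed chains. */
static int find_vendor_cap(const uint8_t cfg[256])
{
    uint8_t pos = cfg[PCI_CAPABILITY_LIST];
    int guard = 48;
    while (pos && guard--) {
        if (cfg[pos] == PCI_CAP_ID_VNDR)
            return pos;
        pos = cfg[pos + 1];  /* "next" pointer of the current capability */
    }
    return 0;
}

/* The hint is an absolute number of buses, taken as a minimum: reserve
 * max(hint, child buses actually found), so existing children always fit. */
static unsigned buses_to_reserve(unsigned hint, unsigned children_found)
{
    return hint > children_found ? hint : children_found;
}
```

With these semantics, a hint of 0 on a bridge degenerates to "reserve exactly what is present", which matches the observation that children_found > 0 for any bridge.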
Re: [SeaBIOS] [RFC PATCH v2 6/6] hw/pci: add hint capability for additional bus reservation to pcie-root-port
On 25/07/2017 0:43, Alexander Bezzubikov wrote: 2017-07-24 23:43 GMT+03:00 Michael S. Tsirkin: On Sun, Jul 23, 2017 at 01:15:43AM +0300, Aleksandr Bezzubikov wrote: Signed-off-by: Aleksandr Bezzubikov --- hw/pci-bridge/pcie_root_port.c | 5 + 1 file changed, 5 insertions(+) diff --git a/hw/pci-bridge/pcie_root_port.c b/hw/pci-bridge/pcie_root_port.c index b0e49e1..ca92d85 100644 --- a/hw/pci-bridge/pcie_root_port.c +++ b/hw/pci-bridge/pcie_root_port.c @@ -106,6 +106,11 @@ static void rp_realize(PCIDevice *d, Error **errp) pcie_aer_root_init(d); rp_aer_vector_update(d); +rc = pci_bridge_help_cap_init(d, 0, p->bus_reserve, 0, 0, 0, errp); +if (rc < 0) { +goto err; +} + return; err: It looks like this will add the capability unconditionally to all pcie root ports. Two issues with it: 1. you can't add vendor properties to devices where vendor is not qemu as they might have their own concept of what it does. 2. this will break compatibility with old machine types, need to disable for these Actually the original idea was to add it for pcie-root-port exclusively (for now at least); looks like I've gotten a little confused with the file naming. Right, for the Generic PCIe Root Port and not for all the root ports. In the future we may want to add it to the PCI bridge so we can have nested bridges, but we are not there yet. Will add it for v3. Thanks, Marcel -- 2.7.4
Re: [SeaBIOS] [RFC PATCH v2 5/6] hw/pci: add bus_reserve property to pcie-root-port
On 25/07/2017 0:58, Michael S. Tsirkin wrote: On Tue, Jul 25, 2017 at 12:41:12AM +0300, Alexander Bezzubikov wrote: 2017-07-24 23:46 GMT+03:00 Michael S. Tsirkin <m...@redhat.com>: On Sun, Jul 23, 2017 at 05:13:11PM +0300, Marcel Apfelbaum wrote: On 23/07/2017 15:22, Michael S. Tsirkin wrote: On Sun, Jul 23, 2017 at 01:15:42AM +0300, Aleksandr Bezzubikov wrote: To enable hotplugging of a newly created pcie-pci-bridge, we need to tell firmware (SeaBIOS in this case) Hi Michael, Presumably, EFI would need to support this too? Sure, Eduardo added to CC, but he is in PTO now. to reserve additional buses for pcie-root-port, that allows us to hotplug pcie-pci-bridge into this root port. The number of buses to reserve is provided to the device via a corresponding property, and to the firmware via a new PCI capability (next patch). The property's default value is 1 as we want to hotplug at least 1 bridge. If so you should just teach firmware to allocate one bus # unconditionally. That would be a problem for the PCIe machines, since each PCIe device is plugged into a different bus and we are already limited to 256 PCIe devices. Allocating an extra bus always would really limit the PCIe devices we can use. But this is exactly what this patch does, as the property is added to all buses and defaults to 1 (1 extra bus). But why would that be so? What's wrong with a device directly in the root port? First, plugging a legacy PCI device into a PCIe Root Port looks strange at least, and it can't be done on real HW anyway. (incompatible slots) You can still plug in PCIe devices there. Second (and more important), if we want 2 or more PCI devices we would lose both IO port space and bus numbers. What I am saying is maybe the default should not be 1. Hi Michael, Alexander The only sensible variant left is 0. 
But as we want pcie-pci-bridge to be used for every legacy PCI device on the q35 machine, every time one hotplugs the bridge into the root port, he must be sure the rp's prop value is >0 (for Linux). I'm not sure that it is a very convenient way to utilize the bridge - one always has to remember to set the property. It's not for Linux only, it's for all guest OSes. I also think setting the property is OK; libvirt can always add a single PCIe Root Port with this property set, while upper layers can create flavors (whether the feature is needed or not for the current setup). That's what I'm saying then - if in your opinion the default is >0 anyway, tweak firmware to do it by default. Default should be 0 for sure - because of the hard limitation on the number of PCIe devices for a single PCI domain (the same as the number of buses, 256). For a positive value we should add a property "buses-reserve = x". Another way - we can set this to 0 by default, and to 1 for pcie-root-port, and recommend to use it for hotplugging of the pcie-pci-bridge itself. I wonder about something: imagine hotplugging a hierarchy of bridges below a root port. It seems that nothing prevents the guest from finding a free range of buses to cover this hierarchy and setting that as secondary/subordinate bus for this bridge. This does need support on the QEMU side to hotplug a hierarchy at once, and might need some fixes in Linux; on the plus side you can defer the management decision on how many are needed until you are actually adding something, and you don't need vendor-specific patches. We can teach the Linux kernel, that's for sure (OK, almost sure...) but what we don't want is to be dependent on specific guest Operating Systems. For example, most configurations are not supported by Windows guests. It is also a great opportunity to add PCI IO resource hints to guest FW, something we wanted to do for some time. 
Thanks, Marcel Signed-off-by: Aleksandr Bezzubikov <zuban...@gmail.com> --- hw/pci-bridge/pcie_root_port.c | 1 + include/hw/pci/pcie_port.h | 3 +++ 2 files changed, 4 insertions(+) diff --git a/hw/pci-bridge/pcie_root_port.c b/hw/pci-bridge/pcie_root_port.c index 4d588cb..b0e49e1 100644 --- a/hw/pci-bridge/pcie_root_port.c +++ b/hw/pci-bridge/pcie_root_port.c @@ -137,6 +137,7 @@ static void rp_exit(PCIDevice *d) static Property rp_props[] = { DEFINE_PROP_BIT(COMPAT_PROP_PCP, PCIDevice, cap_present, QEMU_PCIE_SLTCAP_PCP_BITNR, true), +DEFINE_PROP_UINT8("bus_reserve", PCIEPort, bus_reserve, 1), DEFINE_PROP_END_OF_LIST() }; diff --git a/include/hw/pci/pcie_port.h b/include/hw/pci/pcie_port.h index 1333266..1b2dd1f 100644 --- a/include/hw/pci/pcie_port.h +++ b/include/hw/pci/pcie_port.h @@ -34,6 +34,9 @@ struct PCIEPort { /* pci express switch port */ uint8_t port; + +/* additional buses to reserve on firmware init */ +uint8_t bus_reserve; }; void pcie_port_init_reg(PCIDevice *d); So here is a property and it does not do anything. It makes it easier to work on series maybe, but review is harde
Re: [SeaBIOS] [RFC PATCH v2 4/4] pci: enable RedHat PCI bridges to reserve additional buses on PCI init
On 24/07/2017 17:39, Alexander Bezzubikov wrote: 2017-07-24 12:42 GMT+03:00 Gerd Hoffmann: On Sun, 2017-07-23 at 22:44 +0300, Alexander Bezzubikov wrote: > By the way, any ideas on how to avoid 'bus overstealing' would > be greatly appreciated. > Static BIOS variable isn't applicable since its value isn't saved > across reboots. I think the reservation hints should be an absolute number, not an increment. i.e. if qemu suggests to reserve three extra bus numbers seabios should reserve three, no matter whether there are zero, one, two or three child busses present. And I guess seabios should interpret that as a minimum, so in case it finds five child busses it will allocate five bus numbers of course ... Personally I have nothing against it. Marcel, Michael, what do you think? Sounds good to me. Same with the other limit hints. If the hint says to allocate 16M, and existing device bars sum up to 4M, allocate 16M (and therefore leave 12M address space for hotplug). If the device bars sum up to 32M, allocate that. While being at it: I have my doubts the capability struct layout (which mimics register layout) buys us that much, seabios wouldn't blindly copy over the values anyway. Having regular u32 fields looks more useful to me. Again, if nobody has any objections, I can change it in v3. I also had my reservations about the previous layout, let's see if it will look cleaner. Thanks, Marcel cheers, Gerd -- Alexander Bezzubikov
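Gerd's "absolute number, treated as a minimum" semantics for the window hints can be written down in a few lines. This is a sketch of the agreed interpretation, not SeaBIOS code; the function name is invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define MiB (1024u * 1024u)

/* Hint semantics from the thread: the hint is an absolute window size
 * and a minimum, not an increment.  A hint of 0 means "no hint", so the
 * window is sized by the BARs actually present. */
static uint64_t window_size(uint64_t hint, uint64_t bars_present)
{
    return hint > bars_present ? hint : bars_present;
}
```

So a 16M hint over 4M of present BARs yields a 16M window (12M left for hotplug), while 32M of present BARs overrides the same hint.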
Re: [SeaBIOS] >256 Virtio-net-pci hotplug Devices
On 24/07/2017 7:53, Kinsella, Ray wrote: Hi Ray, Thank you for the details. So as it turns out, at 512 devices it is nothing to do with SeaBIOS, it was the Kernel again. It is taking quite a while to start up, a little over two hours (7489 seconds). The main culprits appear to be enumerating/initializing the PCI Express ports and enabling interrupts. The PCI Express Root Ports are taking a long time to enumerate/initialize: 42 minutes in total (2579 seconds); 64 ports in total, about 40 seconds each. Even if I am not aware of how much time it would take to init a bare-metal PCIe Root Port, it seems too much. [ 50.612822] pci_bus :80: root bus resource [bus 80-c1] [ 172.345361] pci :80:00.0: PCI bridge to [bus 81] ... [ 2724.734240] pci :80:08.0: PCI bridge to [bus c1] [ 2751.154702] ACPI: Enabled 2 GPEs in block 00 to 3F I assume the 1 hour (3827 seconds) below is being spent enabling interrupts. Assuming you are referring to legacy interrupts, maybe it is possible to disable them and use only MSI/MSI-X for PCIe Root Ports (based on user input; we can't disable INTx for all the ports) [ 2899.394288] ACPI: PCI Interrupt Link [GSIG] enabled at IRQ 22 [ 2899.531324] ACPI: PCI Interrupt Link [GSIH] enabled at IRQ 23 [ 2899.534778] ACPI: PCI Interrupt Link [GSIE] enabled at IRQ 20 [ 6726.914388] Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled [ 6726.937932] 00:04: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 115200) is a 16550A [ 6726.964699] Linux agpgart interface v0.103 Finally there is another 20 minutes to find in the boot. [ 7489.202589] virtio_net virtio515 enp193s0f0: renamed from eth513 Poky (Yocto Project Reference Distro) 2.3 qemux86-64 ttyS0 qemux86-64 login: root I will remove the virtio-net-pci devices and hotplug them instead. In theory it should improve boot time, at the expense of incurring some of these costs at runtime. I would appreciate it if you can share the results. 
Thanks, Marcel Ray K -Original Message- From: Kevin O'Connor [mailto:ke...@koconnor.net] Sent: Sunday, July 23, 2017 1:05 PM To: Marcel Apfelbaum <mar...@redhat.com>; Kinsella, Ray <ray.kinse...@intel.com> Cc: qemu-de...@nongnu.org; seabios@seabios.org; Gerd Hoffmann <kra...@redhat.com>; Michael Tsirkin <m...@redhat.com> Subject: Re: >256 Virtio-net-pci hotplug Devices On Sun, Jul 23, 2017 at 07:28:01PM +0300, Marcel Apfelbaum wrote: On 22/07/2017 2:57, Kinsella, Ray wrote: When scaling up to 512 Virtio-net devices SeaBIOS appears to really slow down when configuring PCI Config space - haven't manage to get this to work yet. If there is a slowdown in SeaBIOS, it would help to produce a log with timing information - see: https://www.seabios.org/Debugging#Timing_debug_messages It may also help to increase the debug level in SeaBIOS to get more fine grained timing reports. -Kevin ___ SeaBIOS mailing list SeaBIOS@seabios.org https://mail.coreboot.org/mailman/listinfo/seabios
Re: [SeaBIOS] >256 Virtio-net-pci hotplug Devices
On 22/07/2017 2:57, Kinsella, Ray wrote: Hi Marcel Hi Ray, On 21/07/2017 01:33, Marcel Apfelbaum wrote: On 20/07/2017 3:44, Kinsella, Ray wrote: That's strange. Please ensure the virtio devices are working in virtio 1.0 mode (disable-modern=0,disable-legacy=1). Let us know any problems you see. Not sure what yet, I will try scaling it with hotplugging tomorrow. Updates? I have managed to scale it to 128 devices. The kernel does complain about IO address space exhaustion. [ 83.697956] pci 0000:80:00.0: BAR 13: no space for [io size 0x1000] [ 83.700958] pci 0000:80:00.0: BAR 13: failed to assign [io size 0x1000] [ 83.701689] pci 0000:80:00.1: BAR 13: no space for [io size 0x1000] [ 83.702378] pci 0000:80:00.1: BAR 13: failed to assign [io size 0x1000] [ 83.703093] pci 0000:80:00.2: BAR 13: no space for [io size 0x1000] I was surprised that I am running out of IO address space, as I am disabling legacy virtio. I assumed that this would remove the need for SeaBIOS to allocate the PCI Express Root Port IO address space. Indeed, SeaBIOS does not reserve IO ports in this case, but the Linux kernel still decides "it knows better" and tries to allocate IO resources anyway. It does not affect the "modern" virtio-net devices because they don't need IO ports anyway. One way to work around the error message is to have the PCIe Root Port expose the corresponding IO headers as read-only, since IO support is optional. I tried this some time ago, I'll get back to it. In any case - it doesn't stop the virtio-net device coming up and working as expected. Right. [ 668.692081] virtio_net virtio103 enp141s0f4: renamed from eth101 [ 668.707114] virtio_net virtio130 enp144s0f7: renamed from eth128 [ 668.719795] virtio_net virtio129 enp144s0f6: renamed from eth127 I encountered some issues in vhost due to open file exhaustion, but resolved these with 'ulimit' in the usual way - burned a lot of time on that today.
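Marcel's workaround — making the root port's IO window registers read-only zero so the OS sees no IO forwarding support — could be sketched roughly like this (a hypothetical helper; the `wmask` per-byte write-enable array is in the style of QEMU's config-space emulation, not actual QEMU code):

```c
#include <stdint.h>
#include <string.h>

/* Type 1 (bridge) configuration header offsets for the IO window. */
#define PCI_IO_BASE  0x1C
#define PCI_IO_LIMIT 0x1D

/* Hypothetical sketch: a bridge advertises "no IO window" by hardwiring
 * IO Base/Limit to zero and ignoring guest writes to them.  'config'
 * is the config-space contents, 'wmask' a per-byte write-enable mask
 * (0 = writes ignored), as QEMU-style device models commonly use. */
void bridge_disable_io_window(uint8_t *config, uint8_t *wmask)
{
    config[PCI_IO_BASE] = 0;   /* reads back as 0: no IO forwarding */
    config[PCI_IO_LIMIT] = 0;
    wmask[PCI_IO_BASE] = 0;    /* writes to the IO window are ignored */
    wmask[PCI_IO_LIMIT] = 0;
}
```

Since the IO window is optional for bridges, an OS probing these registers and reading back zero should conclude the port forwards no IO space and skip the allocation.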
When scaling up to 512 Virtio-net devices SeaBIOS appears to really slow down when configuring PCI Config space - haven't managed to get this to work yet. Adding SeaBIOS mailing list and maintainers, maybe there is a known issue about 500+ PCI devices configuration. Not really. All you have to do is to add a property to the pxb-pcie/pxb devices: pci_domain=x; then update the ACPI table to include the pxb domain. You also have to tweak the pxb-pcie/pxb devices a little to not share the bus numbers if pci_domain > 0. Thanks for the information, will add to the list. Is also on my todo list :) Thanks, Marcel Ray K
Re: [SeaBIOS] [RFC PATCH v2 4/6] hw/pci: introduce bridge-only vendor-specific capability to provide some hints to firmware
On 23/07/2017 19:19, Alexander Bezzubikov wrote: 2017-07-23 18:57 GMT+03:00 Marcel Apfelbaum <mar...@redhat.com>: On 23/07/2017 1:15, Aleksandr Bezzubikov wrote: On PCI init PCI bridges may need some extra info about the bus number to reserve, IO, memory and prefetchable memory limits. QEMU can provide this with a special vendor-specific PCI capability. Sizes of limits match the ones from the PCI Type 1 Configuration Space Header; the number of buses to reserve occupies only 1 byte since it is the size of the Subordinate Bus Number register. Hi Alexandr, Signed-off-by: Aleksandr Bezzubikov <zuban...@gmail.com> --- hw/pci/pci_bridge.c | 27 +++ include/hw/pci/pci_bridge.h | 18 ++ 2 files changed, 45 insertions(+) diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c index 720119b..8ec6c2c 100644 --- a/hw/pci/pci_bridge.c +++ b/hw/pci/pci_bridge.c @@ -408,6 +408,33 @@ void pci_bridge_map_irq(PCIBridge *br, const char* bus_name, br->bus_name = bus_name; } + +int pci_bridge_help_cap_init(PCIDevice *dev, int cap_offset, Can you please rename to something like 'pci_bridge_qemu_cap_init' to be more specific? + uint8_t bus_reserve, uint32_t io_limit, + uint16_t mem_limit, uint64_t pref_limit, I am not sure about the "limit" suffix, this is a reservation, not a limitation. I'd like these field names to match the actual registers which are going to get the values. For this case I think it is better to have io_res..., to describe the parameter rather than match the registers.
+ Error **errp) +{ +size_t cap_len = sizeof(PCIBridgeQemuCap); +PCIBridgeQemuCap cap; + +cap.len = cap_len; +cap.bus_res = bus_reserve; +cap.io_lim = io_limit & 0xFF; +cap.io_lim_upper = io_limit >> 8 & 0xFFFF; +cap.mem_lim = mem_limit; +cap.pref_lim = pref_limit & 0xFFFF; +cap.pref_lim_upper = pref_limit >> 16 & 0xFFFFFFFF; + +int offset = pci_add_capability(dev, PCI_CAP_ID_VNDR, +cap_offset, cap_len, errp); +if (offset < 0) { +return offset; +} + +memcpy(dev->config + offset + 2, (char *)&cap + 2, cap_len - 2); +return 0; +} + static const TypeInfo pci_bridge_type_info = { .name = TYPE_PCI_BRIDGE, .parent = TYPE_PCI_DEVICE, diff --git a/include/hw/pci/pci_bridge.h b/include/hw/pci/pci_bridge.h index ff7cbaa..c9f642c 100644 --- a/include/hw/pci/pci_bridge.h +++ b/include/hw/pci/pci_bridge.h @@ -67,4 +67,22 @@ void pci_bridge_map_irq(PCIBridge *br, const char* bus_name, #define PCI_BRIDGE_CTL_DISCARD_STATUS 0x400 /* Discard timer status */ #define PCI_BRIDGE_CTL_DISCARD_SERR 0x800 /* Discard timer SERR# enable */ +typedef struct PCIBridgeQemuCap { +uint8_t id; /* Standard PCI capability header field */ +uint8_t next; /* Standard PCI capability header field */ +uint8_t len; /* Standard PCI vendor-specific capability header field */ +uint8_t bus_res; +uint32_t pref_lim_upper; +uint16_t pref_lim; +uint16_t mem_lim; This is 32-bit IOMEM, right? +uint16_t io_lim_upper; +uint8_t io_lim; Why do we need io_lim and io_lim_upper? The idea was to be able to directly move the capability field values to the registers when actually using it in (firmware) code. Secondly, it saves a little space by avoiding the use of 32-bit types when 24-bit ones are desired. And the same thing with prefetchable (48->64). But if it's more convenient not to split this value, I can do that. With a clear explanation (Mimic of the ) I personally don't mind keeping it like that.
Thanks, Marcel Thanks, Marcel +uint8_t padding; +} PCIBridgeQemuCap; + +int pci_bridge_help_cap_init(PCIDevice *dev, int cap_offset, + uint8_t bus_reserve, uint32_t io_limit, + uint16_t mem_limit, uint64_t pref_limit, + Error **errp); + #endif /* QEMU_PCI_BRIDGE_H */ -- Alexander Bezzubikov
Re: [SeaBIOS] [RFC PATCH v2 2/4] pci: add RedHat vendor ID
On 23/07/2017 1:11, Aleksandr Bezzubikov wrote: Signed-off-by: Aleksandr Bezzubikov --- src/hw/pci_ids.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/src/hw/pci_ids.h b/src/hw/pci_ids.h index 4ac73b4..db2e694 100644 --- a/src/hw/pci_ids.h +++ b/src/hw/pci_ids.h @@ -2263,6 +2263,8 @@ #define PCI_DEVICE_ID_KORENIX_JETCARDF0 0x1600 #define PCI_DEVICE_ID_KORENIX_JETCARDF1 0x16ff +#define PCI_VENDOR_ID_REDHAT 0x1b36 + #define PCI_VENDOR_ID_TEKRAM 0x1de1 #define PCI_DEVICE_ID_TEKRAM_DC290 0xdc29 I suggest to merge this patch with patch 4 that uses it. Thanks, Marcel
Re: [SeaBIOS] [RFC PATCH v2 1/4] pci: refactor pci_find_capability to get bdf as the first argument instead of the whole pci_device
Hi Alexandr, On 23/07/2017 1:11, Aleksandr Bezzubikov wrote: Refactor pci_find_capability function to get bdf instead of a whole pci_device* as the only necessary field for this function is still bdf. It greatly helps when we have bdf but not pci_device. You can drop the last sentence. Other than that: Reviewed-by: Marcel Apfelbaum <mar...@redhat.com> Thanks, Marcel Signed-off-by: Aleksandr Bezzubikov <zuban...@gmail.com> --- src/fw/pciinit.c | 4 ++-- src/hw/pcidevice.c | 12 ++-- src/hw/pcidevice.h | 2 +- src/hw/virtio-pci.c | 4 ++-- 4 files changed, 11 insertions(+), 11 deletions(-) diff --git a/src/fw/pciinit.c b/src/fw/pciinit.c index 08221e6..864954f 100644 --- a/src/fw/pciinit.c +++ b/src/fw/pciinit.c @@ -762,7 +762,7 @@ static int pci_bus_hotplug_support(struct pci_bus *bus, u8 pcie_cap) return downstream_port && slot_implemented; } -shpc_cap = pci_find_capability(bus->bus_dev, PCI_CAP_ID_SHPC, 0); +shpc_cap = pci_find_capability(bus->bus_dev->bdf, PCI_CAP_ID_SHPC, 0); return !!shpc_cap; } @@ -844,7 +844,7 @@ static int pci_bios_check_devices(struct pci_bus *busses) */ parent = &busses[0]; int type; -u8 pcie_cap = pci_find_capability(s->bus_dev, PCI_CAP_ID_EXP, 0); +u8 pcie_cap = pci_find_capability(s->bus_dev->bdf, PCI_CAP_ID_EXP, 0); int hotplug_support = pci_bus_hotplug_support(s, pcie_cap); for (type = 0; type < PCI_REGION_TYPE_COUNT; type++) { u64 align = (type == PCI_REGION_TYPE_IO) ?
diff --git a/src/hw/pcidevice.c b/src/hw/pcidevice.c index cfebf66..d01e27b 100644 --- a/src/hw/pcidevice.c +++ b/src/hw/pcidevice.c @@ -134,25 +134,25 @@ pci_find_init_device(const struct pci_device_id *ids, void *arg) return NULL; } -u8 pci_find_capability(struct pci_device *pci, u8 cap_id, u8 cap) +u8 pci_find_capability(u16 bdf, u8 cap_id, u8 cap) { int i; -u16 status = pci_config_readw(pci->bdf, PCI_STATUS); +u16 status = pci_config_readw(bdf, PCI_STATUS); if (!(status & PCI_STATUS_CAP_LIST)) return 0; if (cap == 0) { /* find first */ -cap = pci_config_readb(pci->bdf, PCI_CAPABILITY_LIST); +cap = pci_config_readb(bdf, PCI_CAPABILITY_LIST); } else { /* find next */ -cap = pci_config_readb(pci->bdf, cap + PCI_CAP_LIST_NEXT); +cap = pci_config_readb(bdf, cap + PCI_CAP_LIST_NEXT); } for (i = 0; cap && i <= 0xff; i++) { -if (pci_config_readb(pci->bdf, cap + PCI_CAP_LIST_ID) == cap_id) +if (pci_config_readb(bdf, cap + PCI_CAP_LIST_ID) == cap_id) return cap; -cap = pci_config_readb(pci->bdf, cap + PCI_CAP_LIST_NEXT); +cap = pci_config_readb(bdf, cap + PCI_CAP_LIST_NEXT); } return 0; diff --git a/src/hw/pcidevice.h b/src/hw/pcidevice.h index 354b549..adcc75a 100644 --- a/src/hw/pcidevice.h +++ b/src/hw/pcidevice.h @@ -69,7 +69,7 @@ int pci_init_device(const struct pci_device_id *ids , struct pci_device *pci, void *arg); struct pci_device *pci_find_init_device(const struct pci_device_id *ids , void *arg); -u8 pci_find_capability(struct pci_device *pci, u8 cap_id, u8 cap); +u8 pci_find_capability(u16 bdf, u8 cap_id, u8 cap); void pci_enable_busmaster(struct pci_device *pci); u16 pci_enable_iobar(struct pci_device *pci, u32 addr); void *pci_enable_membar(struct pci_device *pci, u32 addr); diff --git a/src/hw/virtio-pci.c b/src/hw/virtio-pci.c index e5c2c33..4e33033 100644 --- a/src/hw/virtio-pci.c +++ b/src/hw/virtio-pci.c @@ -381,7 +381,7 @@ fail: void vp_init_simple(struct vp_device *vp, struct pci_device *pci) { -u8 cap = pci_find_capability(pci, PCI_CAP_ID_VNDR, 0); 
+u8 cap = pci_find_capability(pci->bdf, PCI_CAP_ID_VNDR, 0); struct vp_cap *vp_cap; const char *mode; u32 offset, base, mul; @@ -479,7 +479,7 @@ void vp_init_simple(struct vp_device *vp, struct pci_device *pci) vp_cap->cap, type, vp_cap->bar, addr, offset, mode); } -cap = pci_find_capability(pci, PCI_CAP_ID_VNDR, cap); +cap = pci_find_capability(pci->bdf, PCI_CAP_ID_VNDR, cap); } if (vp->common.cap && vp->notify.cap && vp->isr.cap && vp->device.cap) {
Re: [SeaBIOS] [RFC PATCH v2 2/6] hw/i386: allow SHPC for Q35 machine
On 23/07/2017 1:15, Aleksandr Bezzubikov wrote: Unmask the previously masked SHPC feature in the _OSC method. Signed-off-by: Aleksandr Bezzubikov --- hw/i386/acpi-build.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index 6b7bade..0d99585 100644 --- a/hw/i386/acpi-build.c +++ b/hw/i386/acpi-build.c @@ -1850,7 +1850,7 @@ static Aml *build_q35_osc_method(void) * Always allow native PME, AER (no dependencies) * Never allow SHPC (no SHPC controller in this system) Seems the above comment is no longer correct :) Thanks, Marcel */ -aml_append(if_ctx, aml_and(a_ctrl, aml_int(0x1D), a_ctrl)); +aml_append(if_ctx, aml_and(a_ctrl, aml_int(0x1F), a_ctrl)); if_ctx2 = aml_if(aml_lnot(aml_equal(aml_arg(1), aml_int(1)))); /* Unknown revision */
Re: [SeaBIOS] [RFC PATCH v2 4/6] hw/pci: introduce bridge-only vendor-specific capability to provide some hints to firmware
On 23/07/2017 1:15, Aleksandr Bezzubikov wrote: On PCI init PCI bridges may need some extra info about bus number to reserve, IO, memory and prefetchable memory limits. QEMU can provide this with special vendor-specific PCI capability. Sizes of limits match ones from PCI Type 1 Configuration Space Header, number of buses to reserve occupies only 1 byte since it is the size of Subordinate Bus Number register. Hi Alexandr, Signed-off-by: Aleksandr Bezzubikov--- hw/pci/pci_bridge.c | 27 +++ include/hw/pci/pci_bridge.h | 18 ++ 2 files changed, 45 insertions(+) diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c index 720119b..8ec6c2c 100644 --- a/hw/pci/pci_bridge.c +++ b/hw/pci/pci_bridge.c @@ -408,6 +408,33 @@ void pci_bridge_map_irq(PCIBridge *br, const char* bus_name, br->bus_name = bus_name; } + +int pci_bridge_help_cap_init(PCIDevice *dev, int cap_offset, Can you please rename to something like 'pci_bridge_qemu_cap_init' to be more specific? + uint8_t bus_reserve, uint32_t io_limit, + uint16_t mem_limit, uint64_t pref_limit, I am not sure regarding "limit" suffix, this is a reservation, not a limitation. 
+ Error **errp) +{ +size_t cap_len = sizeof(PCIBridgeQemuCap); +PCIBridgeQemuCap cap; + +cap.len = cap_len; +cap.bus_res = bus_reserve; +cap.io_lim = io_limit & 0xFF; +cap.io_lim_upper = io_limit >> 8 & 0xFFFF; +cap.mem_lim = mem_limit; +cap.pref_lim = pref_limit & 0xFFFF; +cap.pref_lim_upper = pref_limit >> 16 & 0xFFFFFFFF; + +int offset = pci_add_capability(dev, PCI_CAP_ID_VNDR, +cap_offset, cap_len, errp); +if (offset < 0) { +return offset; +} + +memcpy(dev->config + offset + 2, (char *)&cap + 2, cap_len - 2); +return 0; +} + static const TypeInfo pci_bridge_type_info = { .name = TYPE_PCI_BRIDGE, .parent = TYPE_PCI_DEVICE, diff --git a/include/hw/pci/pci_bridge.h b/include/hw/pci/pci_bridge.h index ff7cbaa..c9f642c 100644 --- a/include/hw/pci/pci_bridge.h +++ b/include/hw/pci/pci_bridge.h @@ -67,4 +67,22 @@ void pci_bridge_map_irq(PCIBridge *br, const char* bus_name, #define PCI_BRIDGE_CTL_DISCARD_STATUS 0x400 /* Discard timer status */ #define PCI_BRIDGE_CTL_DISCARD_SERR 0x800 /* Discard timer SERR# enable */ +typedef struct PCIBridgeQemuCap { +uint8_t id; /* Standard PCI capability header field */ +uint8_t next; /* Standard PCI capability header field */ +uint8_t len; /* Standard PCI vendor-specific capability header field */ +uint8_t bus_res; +uint32_t pref_lim_upper; +uint16_t pref_lim; +uint16_t mem_lim; This is 32-bit IOMEM, right? +uint16_t io_lim_upper; +uint8_t io_lim; Why do we need io_lim and io_lim_upper? Thanks, Marcel +uint8_t padding; +} PCIBridgeQemuCap; + +int pci_bridge_help_cap_init(PCIDevice *dev, int cap_offset, + uint8_t bus_reserve, uint32_t io_limit, + uint16_t mem_limit, uint64_t pref_limit, + Error **errp); + #endif /* QEMU_PCI_BRIDGE_H */
Re: [SeaBIOS] [RFC PATCH v2 5/6] hw/pci: add bus_reserve property to pcie-root-port
On 23/07/2017 15:22, Michael S. Tsirkin wrote: On Sun, Jul 23, 2017 at 01:15:42AM +0300, Aleksandr Bezzubikov wrote: To enable hotplugging of a newly created pcie-pci-bridge, we need to tell firmware (SeaBIOS in this case) Hi Michael, Presumably, EFI would need to support this too? Sure, Eduardo added to CC, but he is on PTO now. to reserve additional buses for the pcie-root-port, which allows us to hotplug a pcie-pci-bridge into this root port. The number of buses to reserve is provided to the device via a corresponding property, and to the firmware via a new PCI capability (next patch). The property's default value is 1 as we want to hotplug at least 1 bridge. If so you should just teach firmware to allocate one bus # unconditionally. That would be a problem for the PCIe machines, since each PCIe device is plugged into a different bus and we are already limited to 256 PCIe devices. Always allocating an extra bus would really limit the PCIe devices we can use. But why would that be so? What's wrong with a device directly in the root port? First, plugging a legacy PCI device into a PCIe Root Port looks strange at least, and it can't be done on real HW anyway (incompatible slots). Second (and more important), if we want 2 or more PCI devices we would lose both IO ports space and bus numbers.
Signed-off-by: Aleksandr Bezzubikov--- hw/pci-bridge/pcie_root_port.c | 1 + include/hw/pci/pcie_port.h | 3 +++ 2 files changed, 4 insertions(+) diff --git a/hw/pci-bridge/pcie_root_port.c b/hw/pci-bridge/pcie_root_port.c index 4d588cb..b0e49e1 100644 --- a/hw/pci-bridge/pcie_root_port.c +++ b/hw/pci-bridge/pcie_root_port.c @@ -137,6 +137,7 @@ static void rp_exit(PCIDevice *d) static Property rp_props[] = { DEFINE_PROP_BIT(COMPAT_PROP_PCP, PCIDevice, cap_present, QEMU_PCIE_SLTCAP_PCP_BITNR, true), +DEFINE_PROP_UINT8("bus_reserve", PCIEPort, bus_reserve, 1), DEFINE_PROP_END_OF_LIST() }; diff --git a/include/hw/pci/pcie_port.h b/include/hw/pci/pcie_port.h index 1333266..1b2dd1f 100644 --- a/include/hw/pci/pcie_port.h +++ b/include/hw/pci/pcie_port.h @@ -34,6 +34,9 @@ struct PCIEPort { /* pci express switch port */ uint8_t port; + +/* additional buses to reserve on firmware init */ +uint8_t bus_reserve; }; void pcie_port_init_reg(PCIDevice *d); So here is a property and it does not do anything. It makes it easier to work on series maybe, but review is harder since we do not see what it does at all. Please do not split up patches like this - you can maintain it split up in your branch if you like and merge before sending. Agreed, Alexandr please merge patches 4-5-6 for your next submission. Thanks, Marcel -- 2.7.4 ___ SeaBIOS mailing list SeaBIOS@seabios.org https://mail.coreboot.org/mailman/listinfo/seabios
Re: [SeaBIOS] [RFC PATCH 0/2] Allow RedHat PCI bridges reserve more buses than necessary during init
On 21/07/2017 20:28, Kevin O'Connor wrote: On Fri, Jul 21, 2017 at 03:15:46PM +0300, Marcel Apfelbaum wrote: On 21/07/2017 13:04, Gerd Hoffmann wrote: I'd prefer to have a single vendor capability for all resource allocation hints provided by qemu. Sure, the capability looking something like: [flags: reserve-buses|reserve-IO|reserve-MEM|...] [extra-buses][IO-size][MEM-size] if reserve-buses -> use 'extra-buses' value and so on I don't have any objection to using a PCI capability, but I do wonder if fw_cfg would be a better fit. This information is purely qemu -> firmware, right? Hi Kevin, Right, but theoretically speaking a guest OS driver could also get the hint on hotplug, or simply because the OS chooses to re-assign resources on its own. A while ago we discussed the fw_cfg option, but Gerd preferred the vendor capability, and since the capability looked cleaner Aleksandr opted for it. He will send V2 soon together with the QEMU counterpart feature. Thanks, Marcel -Kevin
Re: [SeaBIOS] [RFC PATCH 0/2] Allow RedHat PCI bridges reserve more buses than necessary during init
On 21/07/2017 15:42, Gerd Hoffmann wrote: On Fri, 2017-07-21 at 15:15 +0300, Marcel Apfelbaum wrote: On 21/07/2017 13:04, Gerd Hoffmann wrote: Hi, What about window sizes? IIRC there was a plan to provide allocation hints for them too ... Yes, it is on my TODO list, however not as part of Aleksandr's series which aims to provide PCIe-PCI bridge hotplug support. I'd prefer to have a single vendor capability for all resource allocation hints provided by qemu. [Adding Laszlo] Sure, the capability looking something like: [flags: reserve-buses|reserve-IO|reserve-MEM|...] [extra-buses][IO-size][MEM-size] if reserve-buses -> use 'extra-buses' value and so on Do we need the flags? We can use "value == 0 -> no hint for you". Maybe we don't want to allocate IO at all, so value 0 would say: do not allocate. But we can disable IO by other means. Also what about prefetchable vs. non-prefetchable memory? I guess we want a size hint for both memory windows? Good point, thanks! Marcel cheers, Gerd
Re: [SeaBIOS] [RFC PATCH 0/2] Allow RedHat PCI bridges reserve more buses than necessary during init
On 21/07/2017 13:04, Gerd Hoffmann wrote: Hi, What about window sizes? IIRC there was a plan to provide allocation hints for them too ... Yes, it is on my TODO list, however not as part of Aleksandr's series which aims to provide PCIe-PCI bridge hotplug support. I'd prefer to have a single vendor capability for all resource allocation hints provided by qemu. [Adding Laszlo] Sure, the capability looking something like: [flags: reserve-buses|reserve-IO|reserve-MEM|...] [extra-buses][IO-size][MEM-size] if reserve-buses -> use 'extra-buses' value and so on would be OK? Thanks, Marcel cheers, Gerd
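The hint layout sketched above ([flags][extra-buses][IO-size][MEM-size], plus the prefetchable window raised later in the thread) could look roughly like this in C. The field names and widths here are purely illustrative assumptions, not the final QEMU capability layout:

```c
#include <stdint.h>

/* Illustrative layout only, not the final QEMU capability.  Following
 * the "value == 0 -> no hint" idea from the thread, a zero field means
 * the firmware falls back to its default sizing, so no separate flags
 * word is strictly needed. */
typedef struct QemuResourceHintCap {
    uint8_t  id;          /* PCI_CAP_ID_VNDR */
    uint8_t  next;        /* next capability pointer */
    uint8_t  len;         /* capability length */
    uint8_t  extra_buses; /* bus numbers to reserve, 0 = no hint */
    uint32_t io_size;     /* IO window size in bytes, 0 = no hint */
    uint32_t mem_size;    /* non-prefetchable MMIO size, 0 = no hint */
    uint64_t pref_size;   /* prefetchable MMIO size, 0 = no hint */
} QemuResourceHintCap;
```

Using plain fixed-width fields (rather than mimicking the Type 1 register split) matches Gerd's preference for regular u32 fields, since the firmware would not blindly copy the values into registers anyway.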
Re: [SeaBIOS] [Qemu-devel] Fwd: [RFC PATCH 0/2] Allow RedHat PCI bridges reserve more buses than necessary during init
On 19/07/2017 21:56, Konrad Rzeszutek Wilk wrote: On Wed, Jul 19, 2017 at 09:38:50PM +0300, Alexander Bezzubikov wrote: 2017-07-19 21:18 GMT+03:00 Konrad Rzeszutek Wilk: On Wed, Jul 19, 2017 at 05:14:41PM +0000, Alexander Bezzubikov wrote: Wed, 19 Jul 2017 at 16:57, Konrad Rzeszutek Wilk <konrad.w...@oracle.com>: On Wed, Jul 19, 2017 at 04:20:12PM +0300, Aleksandr Bezzubikov wrote: Now PCI bridges (and the PCIE root port too) get a bus range number at system init, based on the currently plugged devices. That's why when one wants to hotplug another bridge, it needs a child bus, which the parent is unable to provide. Could you explain how you trigger this? I'm trying to hotplug a pcie-pci bridge into a pcie root port, and Linux says 'cannot allocate bus number for device bla-bla'. This obviously does not allow me to use the bridge at all. The suggested workaround is to have a vendor-specific capability in the RedHat generic pcie-root-port that contains the number of additional buses to reserve on BIOS PCI init. But wouldn't the proper fix be for the PCI bridge to have the subordinate value extended to fit more bus ranges? What do you mean? This is what I'm trying to do. Do you propose to get rid of the vendor-specific cap and use the original register value instead? I would suggest a simple fix - each bridge has a number of bus devices it can use. You have up to 255 - so you split the number of northbridge numbers by the amount of NUMA nodes (if that is used) - so for example if you have 4 NUMA nodes, each bridge would cover 63 bus numbers. Meaning the root bridge would cover buses 0->63, 64->128, and so on. That gives you enough space to plug in your devices (up to 63). And if you need sub-bridges then carve out a specific range. Hi Konrad, The problem is that we don't know at init time how many sub-bridges we may need. It's possible the explanation was not clear and led to some miscommunication, and the explanation above does not clear it up either.
It just sets up at init time a range where you can plug in your new devices. But in a more uniform way such that you can also utilize this with NUMA and _PXM topology in the future. I fully agree with you, and actually QEMU has already implemented the exact idea you are describing here; it's called a pxb/pxb-pcie device, which can be "bound" to a specific NUMA node and has a subrange of bus numbers dedicated to it. However this problem is different. In a PCI Express machine you can hotplug PCIe devices only into PCIe Root Ports (or switch downstream ports, but those are not in the current scope). We want to be able to hotplug a PCIe-PCI bridge into a PCIe Root Port so we can then hot-plug legacy PCI devices. Since the PCIe Root Port is a type of PCI bridge, at boot time it only gets the bus sub-range (primary bus, subordinate bus] which is computed by firmware and leaves no bus number that can be used by a hot-plugged pci-bridge. And this obviously does not depend on how we arrange NUMA/proximities. We are also not looking for a fix for a specific guest OS, so reserving some extra bus numbers has minimal impact on the system. I do agree the problem may be solved differently, however we can't reach all guest OS vendors and ask them to support an alternative solution in a reasonable time frame. Thanks, Marcel and how deep the whole device tree will be. The key moment - PCI bridge hotplugging needs either a rescan of all buses on each bridge device addition, or space reserved in advance during BIOS init. It is more complex than that - you may need to move devices that are below you. And neither the Linux kernel nor any other OS can handle that. (They can during bootup.) In this series the second way was chosen.
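To make the bus-number pressure concrete, here is a toy model of the arithmetic behind the firmware's depth-first numbering (hypothetical, not SeaBIOS code): a bridge's subordinate bus is its own secondary bus plus whatever its children consumed, so a leaf root port without a reservation ends up with secondary == subordinate and nothing left for a hot-plugged bridge.

```c
#include <stdint.h>

/* Toy model: the subordinate bus number of a bridge whose own
 * (secondary) bus is 'secondary', whose children consumed
 * 'child_buses' further numbers, and which reserves 'reserve'
 * extra numbers for hotplug.  Illustrative only. */
uint8_t subordinate_bus(uint8_t secondary, uint8_t child_buses, uint8_t reserve)
{
    return secondary + child_buses + reserve;
}
```

An empty root port on bus 1 with no reservation gets subordinate_bus(1, 0, 0) == 1, i.e. no spare bus for a hot-plugged bridge; with reserve=1 it gets 2, leaving exactly one bus number free.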
Aleksandr Bezzubikov (2): pci: add support for direct usage of bdf for capability lookup pci: enable RedHat pci bridges to reserve more buses src/fw/pciinit.c | 12 ++-- src/hw/pcidevice.c | 24 src/hw/pcidevice.h | 1 + 3 files changed, 35 insertions(+), 2 deletions(-) -- 2.7.4 -- Alexander Bezzubikov -- Alexander Bezzubikov
Re: [SeaBIOS] [Qemu-devel] [RFC PATCH 2/2] pci: enable RedHat pci bridges to reserve more buses
On 19/07/2017 16:56, Konrad Rzeszutek Wilk wrote: On Wed, Jul 19, 2017 at 04:20:14PM +0300, Aleksandr Bezzubikov wrote: In case of RedHat PCI bridges reserve additional buses, which number is provided It is "Red Hat" in a vendor-specific capability. And perhaps also a #define ? Right, please add it to src/hw/pci_ids.h. Thanks, Marcel Signed-off-by: Aleksandr Bezzubikov --- src/fw/pciinit.c | 12 ++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/src/fw/pciinit.c b/src/fw/pciinit.c index 08221e6..b6f3a01 100644 --- a/src/fw/pciinit.c +++ b/src/fw/pciinit.c @@ -578,9 +578,17 @@ pci_bios_init_bus_rec(int bus, u8 *pci_bus) pci_bios_init_bus_rec(secbus, pci_bus); if (subbus != *pci_bus) { +u16 vendor = pci_config_readw(bdf, PCI_VENDOR_ID); +u8 res_bus = 0; +if (vendor == 0x1b36) { +u8 cap = pci_find_capability_bdf(bdf, PCI_CAP_ID_VNDR, 0); +if (cap) { +res_bus = pci_config_readb(bdf, cap + 16); +} +} dprintf(1, "PCI: subordinate bus = 0x%x -> 0x%x\n", -subbus, *pci_bus); -subbus = *pci_bus; +subbus, *pci_bus + res_bus); +subbus = *pci_bus + res_bus; } else { dprintf(1, "PCI: subordinate bus = 0x%x\n", subbus); } -- 2.7.4
Re: [SeaBIOS] [RFC PATCH 1/2] pci: add support for direct usage of bdf for capability lookup
On 19/07/2017 16:20, Aleksandr Bezzubikov wrote: Add a capability lookup function which gets bdf instead of pci_device as its first argument. It may be useful when we have bdf, but don't have the whole pci_device structure. Signed-off-by: Aleksandr Bezzubikov--- src/hw/pcidevice.c | 24 src/hw/pcidevice.h | 1 + 2 files changed, 25 insertions(+) diff --git a/src/hw/pcidevice.c b/src/hw/pcidevice.c index cfebf66..3fa240e 100644 --- a/src/hw/pcidevice.c +++ b/src/hw/pcidevice.c @@ -158,6 +158,30 @@ u8 pci_find_capability(struct pci_device *pci, u8 cap_id, u8 cap) return 0; } Hi Aleksandr, +u8 pci_find_capability_bdf(int bdf, u8 cap_id, u8 cap) +{ Please do not duplicate the code for the 'pci_find_capability'. In this case you can reuse the code by making 'pci_find_capability' call 'pci_find_capability_bdf', or better, instead of a new function, change the signature as you proposed and make the calling code use bdf instead of the pci device, since is the only info needed. Thanks, Marcel +int i; +u16 status = pci_config_readw(bdf, PCI_STATUS); + +if (!(status & PCI_STATUS_CAP_LIST)) +return 0; + +if (cap == 0) { +/* find first */ +cap = pci_config_readb(bdf, PCI_CAPABILITY_LIST); +} else { +/* find next */ +cap = pci_config_readb(bdf, cap + PCI_CAP_LIST_NEXT); +} +for (i = 0; cap && i <= 0xff; i++) { +if (pci_config_readb(bdf, cap + PCI_CAP_LIST_ID) == cap_id) +return cap; +cap = pci_config_readb(bdf, cap + PCI_CAP_LIST_NEXT); +} + +return 0; +} + // Enable PCI bus-mastering (ie, DMA) support on a pci device void pci_enable_busmaster(struct pci_device *pci) diff --git a/src/hw/pcidevice.h b/src/hw/pcidevice.h index 354b549..e4ed5cf 100644 --- a/src/hw/pcidevice.h +++ b/src/hw/pcidevice.h @@ -70,6 +70,7 @@ int pci_init_device(const struct pci_device_id *ids struct pci_device *pci_find_init_device(const struct pci_device_id *ids , void *arg); u8 pci_find_capability(struct pci_device *pci, u8 cap_id, u8 cap); +u8 pci_find_capability_bdf(int bdf, u8 cap_id, u8 cap); void 
pci_enable_busmaster(struct pci_device *pci); u16 pci_enable_iobar(struct pci_device *pci, u32 addr); void *pci_enable_membar(struct pci_device *pci, u32 addr);
Re: [SeaBIOS] [PATCH] pci: don't map virtio 1.0 storage devices above 4G
On 09/12/2016 06:46 AM, Michael S. Tsirkin wrote: On Sun, Sep 11, 2016 at 08:23:34PM +0300, Marcel Apfelbaum wrote: Otherwise SeaBIOS can't access virtio's modern BAR. Signed-off-by: Marcel Apfelbaum <mar...@redhat.com> --- Hi, If there is no room to map all MMIO BARs into the 32-bit PCI window, SeaBIOS will re-allocate all 64-bit MMIO BARs into over-4G space. Virtio 1.0 block devices (virtio-blk/virtio-scsi) use a 64-bit BAR that is unusable by SeaBIOS if mapped over 4G, preventing the system from booting. The simplest solution is to follow the xhci model and simply skip migrating the virtio 1.0 modern BAR into over-4G space. In order to reproduce the problem use: -device virtio-blk-pci, ...\ -object memory-backend-file,id=mem,size=4G,mem-path=/dev/hugepages,share=on \ -device ivshmem-plain,memdev=mem \ Thanks, Marcel src/fw/pciinit.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/src/fw/pciinit.c b/src/fw/pciinit.c index 35d9902..3b76e66 100644 --- a/src/fw/pciinit.c +++ b/src/fw/pciinit.c @@ -655,6 +655,13 @@ static void pci_region_migrate_64bit_entries(struct pci_region *from, continue; if (entry->dev->class == PCI_CLASS_SERIAL_USB) continue; +if (entry->dev->vendor == PCI_VENDOR_ID_REDHAT_QUMRANET && +(entry->dev->device == PCI_DEVICE_ID_VIRTIO_BLK_09 || + entry->dev->device == PCI_DEVICE_ID_VIRTIO_BLK_10 || + entry->dev->device == PCI_DEVICE_ID_VIRTIO_SCSI_09 || + entry->dev->device == PCI_DEVICE_ID_VIRTIO_SCSI_10)) +continue; + // Move from source list to destination list. hlist_del(&entry->node); hlist_add(&entry->node, last); -- 2.5.5 What if the guest is booting from the network? No harm done. And the Guest OS can re-map the hardware registers to 64-bit if desired. Rather than special-case virtio storage, It is not a special case. The system can boot from usb, storage or network. (I might be forgetting something here :)) Each hardware vendor needs to ensure their hardware can be used by the firmware and this is what we are doing here.
For example, commit (a247e678) pci: don't map usb host adapters above 4G does that for all USB controllers. It seems to me that the right thing to do is to only allocate resources for the boot devices, and rely on the OS to allocate resources for the rest. IIUC this is typically controlled by the plug and play OS flag in smbios. What about non-PnP Operating Systems? The firmware should address them all. Also, what about the first time we load the OS? This smbios OS flag would not be there yet and we couldn't install the system. (if I got this flag right) This seems to me like too many changes to the ecosystem (this firmware, all pxe drivers out there, specific issues with different OSes...) for this specific case. I am not saying we shouldn't do that in the long run. Thanks, Marcel ___ SeaBIOS mailing list SeaBIOS@seabios.org https://www.coreboot.org/mailman/listinfo/seabios
[SeaBIOS] [PATCH] pci: don't map virtio 1.0 storage devices above 4G
Otherwise SeaBIOS can't access virtio's modern BAR. Signed-off-by: Marcel Apfelbaum <mar...@redhat.com> --- Hi, If there is no room to map all MMIO BARs into the 32-bit PCI window, SeaBIOS will re-allocate all 64-bit MMIO BARs into over-4G space. Virtio 1.0 block devices (virtio-blk/virtio-scsi) use a 64-bit BAR that is unusable by SeaBIOS if mapped over 4G, preventing the system from booting. The simplest solution is to follow the xhci model and simply skip migrating the virtio 1.0 modern BAR into over-4G space. In order to reproduce the problem use: -device virtio-blk-pci, ...\ -object memory-backend-file,id=mem,size=4G,mem-path=/dev/hugepages,share=on \ -device ivshmem-plain,memdev=mem \ Thanks, Marcel src/fw/pciinit.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/src/fw/pciinit.c b/src/fw/pciinit.c index 35d9902..3b76e66 100644 --- a/src/fw/pciinit.c +++ b/src/fw/pciinit.c @@ -655,6 +655,13 @@ static void pci_region_migrate_64bit_entries(struct pci_region *from, continue; if (entry->dev->class == PCI_CLASS_SERIAL_USB) continue; +if (entry->dev->vendor == PCI_VENDOR_ID_REDHAT_QUMRANET && +(entry->dev->device == PCI_DEVICE_ID_VIRTIO_BLK_09 || + entry->dev->device == PCI_DEVICE_ID_VIRTIO_BLK_10 || + entry->dev->device == PCI_DEVICE_ID_VIRTIO_SCSI_09 || + entry->dev->device == PCI_DEVICE_ID_VIRTIO_SCSI_10)) +continue; + // Move from source list to destination list. hlist_del(&entry->node); hlist_add(&entry->node, last); -- 2.5.5 ___ SeaBIOS mailing list SeaBIOS@seabios.org https://www.coreboot.org/mailman/listinfo/seabios
[SeaBIOS] [PATCH V2] fw/pci: add Q35 S3 support
Following the i440fx example, save the LPC, SMBUS and PCIEXBAR bdfs between OS sleeps and use them to re-configure the corresponding registers. Tested-by: Gal Hammer <gham...@redhat.com> Signed-off-by: Marcel Apfelbaum <mar...@redhat.com> --- Hi, v1 -> v2: - made ICH9SmbusBDF static (Laszlo) - use "patience" diff for git-format (Laszlo) (no change this time, maybe will help next time, I added it to my script) The patch was tested with Win7 and Fedora 23 guests. Any comments are welcomed. Thanks, Marcel src/fw/pciinit.c | 73 +++- 1 file changed, 56 insertions(+), 17 deletions(-) diff --git a/src/fw/pciinit.c b/src/fw/pciinit.c index 5da6cf6..b2a546b 100644 --- a/src/fw/pciinit.c +++ b/src/fw/pciinit.c @@ -149,6 +149,22 @@ static void piix_isa_bridge_setup(struct pci_device *pci, void *arg) dprintf(1, "PIIX3/PIIX4 init: elcr=%02x %02x\n", elcr[0], elcr[1]); } +static void mch_isa_lpc_setup(u16 bdf) +{ +/* pm io base */ +pci_config_writel(bdf, ICH9_LPC_PMBASE, + acpi_pm_base | ICH9_LPC_PMBASE_RTE); + +/* acpi enable, SCI: IRQ9 000b = irq9*/ +pci_config_writeb(bdf, ICH9_LPC_ACPI_CTRL, ICH9_LPC_ACPI_CTRL_ACPI_EN); + +/* set root complex register block BAR */ +pci_config_writel(bdf, ICH9_LPC_RCBA, + ICH9_LPC_RCBA_ADDR | ICH9_LPC_RCBA_EN); +} + +static int ICH9LpcBDF = -1; + /* ICH9 LPC PCI to ISA bridge */ /* PCI_VENDOR_ID_INTEL && PCI_DEVICE_ID_INTEL_ICH9_LPC */ static void mch_isa_bridge_setup(struct pci_device *dev, void *arg) @@ -176,16 +192,10 @@ static void mch_isa_bridge_setup(struct pci_device *dev, void *arg) outb(elcr[1], ICH9_LPC_PORT_ELCR2); dprintf(1, "Q35 LPC init: elcr=%02x %02x\n", elcr[0], elcr[1]); -/* pm io base */ -pci_config_writel(bdf, ICH9_LPC_PMBASE, - acpi_pm_base | ICH9_LPC_PMBASE_RTE); +ICH9LpcBDF = bdf; -/* acpi enable, SCI: IRQ9 000b = irq9*/ -pci_config_writeb(bdf, ICH9_LPC_ACPI_CTRL, ICH9_LPC_ACPI_CTRL_ACPI_EN); +mch_isa_lpc_setup(bdf); -/* set root complex register block BAR */ -pci_config_writel(bdf, ICH9_LPC_RCBA, - ICH9_LPC_RCBA_ADDR | 
ICH9_LPC_RCBA_EN); e820_add(ICH9_LPC_RCBA_ADDR, 16*1024, E820_RESERVED); acpi_pm1a_cnt = acpi_pm_base + 0x04; @@ -244,11 +254,8 @@ static void piix4_pm_setup(struct pci_device *pci, void *arg) pmtimer_setup(acpi_pm_base + 0x08); } -/* ICH9 SMBUS */ -/* PCI_VENDOR_ID_INTEL && PCI_DEVICE_ID_INTEL_ICH9_SMBUS */ -static void ich9_smbus_setup(struct pci_device *dev, void *arg) +static void ich9_smbus_enable(u16 bdf) { -u16 bdf = dev->bdf; /* map smbus into io space */ pci_config_writel(bdf, ICH9_SMB_SMB_BASE, (acpi_pm_base + 0x100) | PCI_BASE_ADDRESS_SPACE_IO); @@ -257,6 +264,17 @@ static void ich9_smbus_setup(struct pci_device *dev, void *arg) pci_config_writeb(bdf, ICH9_SMB_HOSTC, ICH9_SMB_HOSTC_HST_EN); } +static int ICH9SmbusBDF = -1; + +/* ICH9 SMBUS */ +/* PCI_VENDOR_ID_INTEL && PCI_DEVICE_ID_INTEL_ICH9_SMBUS */ +static void ich9_smbus_setup(struct pci_device *dev, void *arg) +{ +ICH9SmbusBDF = dev->bdf; + +ich9_smbus_enable(dev->bdf); +} + static const struct pci_device_id pci_device_tbl[] = { /* PIIX3/PIIX4 PCI to ISA bridge */ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82371SB_0, @@ -293,6 +311,9 @@ static const struct pci_device_id pci_device_tbl[] = { PCI_DEVICE_END, }; +static int MCHMmcfgBDF = -1; +static void mch_mmconfig_setup(u16 bdf); + void pci_resume(void) { if (!CONFIG_QEMU) { @@ -302,6 +323,18 @@ void pci_resume(void) if (PiixPmBDF >= 0) { piix4_pm_config_setup(PiixPmBDF); } + +if (ICH9LpcBDF >= 0) { +mch_isa_lpc_setup(ICH9LpcBDF); +} + +if (ICH9SmbusBDF >= 0) { +ich9_smbus_enable(ICH9SmbusBDF); +} + +if(MCHMmcfgBDF >= 0) { +mch_mmconfig_setup(MCHMmcfgBDF); +} } static void pci_bios_init_device(struct pci_device *pci) @@ -388,18 +421,24 @@ static void i440fx_mem_addr_setup(struct pci_device *dev, void *arg) pci_slot_get_irq = piix_pci_slot_get_irq; } -static void mch_mem_addr_setup(struct pci_device *dev, void *arg) +static void mch_mmconfig_setup(u16 bdf) { u64 addr = Q35_HOST_BRIDGE_PCIEXBAR_ADDR; -u32 size = 
Q35_HOST_BRIDGE_PCIEXBAR_SIZE; - -/* setup mmconfig */ -u16 bdf = dev->bdf; u32 upper = addr >> 32; u32 lower = (addr & 0xffffffff) | Q35_HOST_BRIDGE_PCIEXBAREN; pci_config_writel(bdf, Q35_HOST_BRIDGE_PCIEXBAR, 0); pci_config_writel(bdf, Q35_HOST_BRIDGE_PCIEXBAR + 4, upper); pci_config_writel(bdf, Q35_HOST_BRIDGE_PCIEXBAR, lower); +} + +static void mch_mem_addr_setup(struct pci_device *dev, void *arg) +{ +u64 addr = Q35_HOST_BRIDGE_PCIEXBAR_ADDR;
Re: [SeaBIOS] [PATCH] fw/pci: add Q35 S3 support
On 03/01/2016 03:22 PM, Laszlo Ersek wrote: On 02/29/16 21:13, Marcel Apfelbaum wrote: Following the i440fx example, save the LPC, SMBUS and PCIEXBAR bdfs between OS sleeps and use them to re-configure the corresponding registers. Signed-off-by: Marcel Apfelbaum <mar...@redhat.com> --- Hi, The patch was tested with Win7 and Fedora 23 guests. Any comments are welcomed. Thanks, Marcel src/fw/pciinit.c | 73 +++- 1 file changed, 56 insertions(+), 17 deletions(-) Looks good to me. General recommendation: please consider using the "patience" diff algorithm with git. It has a lesser tendency to intersperse old and new code for "unrelated" changes (rewrites). Sure, I'll try it for v2. One point: diff --git a/src/fw/pciinit.c b/src/fw/pciinit.c index 5da6cf6..e44bab0 100644 --- a/src/fw/pciinit.c +++ b/src/fw/pciinit.c @@ -149,6 +149,22 @@ static void piix_isa_bridge_setup(struct pci_device *pci, void *arg) dprintf(1, "PIIX3/PIIX4 init: elcr=%02x %02x\n", elcr[0], elcr[1]); } +static void mch_isa_lpc_setup(u16 bdf) +{ +/* pm io base */ +pci_config_writel(bdf, ICH9_LPC_PMBASE, + acpi_pm_base | ICH9_LPC_PMBASE_RTE); + +/* acpi enable, SCI: IRQ9 000b = irq9*/ +pci_config_writeb(bdf, ICH9_LPC_ACPI_CTRL, ICH9_LPC_ACPI_CTRL_ACPI_EN); + +/* set root complex register block BAR */ +pci_config_writel(bdf, ICH9_LPC_RCBA, + ICH9_LPC_RCBA_ADDR | ICH9_LPC_RCBA_EN); +} + +static int ICH9LpcBDF = -1; + /* ICH9 LPC PCI to ISA bridge */ /* PCI_VENDOR_ID_INTEL && PCI_DEVICE_ID_INTEL_ICH9_LPC */ static void mch_isa_bridge_setup(struct pci_device *dev, void *arg) @@ -176,16 +192,10 @@ static void mch_isa_bridge_setup(struct pci_device *dev, void *arg) outb(elcr[1], ICH9_LPC_PORT_ELCR2); dprintf(1, "Q35 LPC init: elcr=%02x %02x\n", elcr[0], elcr[1]); -/* pm io base */ -pci_config_writel(bdf, ICH9_LPC_PMBASE, - acpi_pm_base | ICH9_LPC_PMBASE_RTE); +ICH9LpcBDF = bdf; -/* acpi enable, SCI: IRQ9 000b = irq9*/ -pci_config_writeb(bdf, ICH9_LPC_ACPI_CTRL, ICH9_LPC_ACPI_CTRL_ACPI_EN); 
+mch_isa_lpc_setup(bdf); -/* set root complex register block BAR */ -pci_config_writel(bdf, ICH9_LPC_RCBA, - ICH9_LPC_RCBA_ADDR | ICH9_LPC_RCBA_EN); e820_add(ICH9_LPC_RCBA_ADDR, 16*1024, E820_RESERVED); acpi_pm1a_cnt = acpi_pm_base + 0x04; @@ -244,11 +254,8 @@ static void piix4_pm_setup(struct pci_device *pci, void *arg) pmtimer_setup(acpi_pm_base + 0x08); } -/* ICH9 SMBUS */ -/* PCI_VENDOR_ID_INTEL && PCI_DEVICE_ID_INTEL_ICH9_SMBUS */ -static void ich9_smbus_setup(struct pci_device *dev, void *arg) +static void ich9_smbus_enable(u16 bdf) { -u16 bdf = dev->bdf; /* map smbus into io space */ pci_config_writel(bdf, ICH9_SMB_SMB_BASE, (acpi_pm_base + 0x100) | PCI_BASE_ADDRESS_SPACE_IO); @@ -257,6 +264,17 @@ static void ich9_smbus_setup(struct pci_device *dev, void *arg) pci_config_writeb(bdf, ICH9_SMB_HOSTC, ICH9_SMB_HOSTC_HST_EN); } +int ICH9SmbusBDF = -1; Can you make this static? Thanks! This is supposed to be static, I might have deleted the "static" by mistake during development. And is still working... 
Thanks for the review, Marcel Thanks Laszlo + +/* ICH9 SMBUS */ +/* PCI_VENDOR_ID_INTEL && PCI_DEVICE_ID_INTEL_ICH9_SMBUS */ +static void ich9_smbus_setup(struct pci_device *dev, void *arg) +{ +ICH9SmbusBDF = dev->bdf; + +ich9_smbus_enable(dev->bdf); +} + static const struct pci_device_id pci_device_tbl[] = { /* PIIX3/PIIX4 PCI to ISA bridge */ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82371SB_0, @@ -293,6 +311,9 @@ static const struct pci_device_id pci_device_tbl[] = { PCI_DEVICE_END, }; +static int MCHMmcfgBDF = -1; +static void mch_mmconfig_setup(u16 bdf); + void pci_resume(void) { if (!CONFIG_QEMU) { @@ -302,6 +323,18 @@ void pci_resume(void) if (PiixPmBDF >= 0) { piix4_pm_config_setup(PiixPmBDF); } + +if (ICH9LpcBDF >= 0) { +mch_isa_lpc_setup(ICH9LpcBDF); +} + +if (ICH9SmbusBDF >= 0) { +ich9_smbus_enable(ICH9SmbusBDF); +} + +if(MCHMmcfgBDF >= 0) { +mch_mmconfig_setup(MCHMmcfgBDF); +} } static void pci_bios_init_device(struct pci_device *pci) @@ -388,18 +421,24 @@ static void i440fx_mem_addr_setup(struct pci_device *dev, void *arg) pci_slot_get_irq = piix_pci_slot_get_irq; } -static void mch_mem_addr_setup(struct pci_device *dev, void *arg) +static void mch_mmconfig_setup(u16 bdf) { u64 addr = Q35_HOST_BRIDGE_PCIEXBAR_ADDR; -u32 size = Q35_HOST_BRIDGE_PCIEXBAR_SIZE; - -/* setup mmconfig */ -u16 bdf = dev->bdf; u32 upper = addr >> 32; u32 lowe
[SeaBIOS] [PATCH] fw/pci: add Q35 S3 support
Following the i440fx example, save the LPC, SMBUS and PCIEXBAR bdfs between OS sleeps and use them to re-configure the corresponding registers. Signed-off-by: Marcel Apfelbaum <mar...@redhat.com> --- Hi, The patch was tested with Win7 and Fedora 23 guests. Any comments are welcomed. Thanks, Marcel src/fw/pciinit.c | 73 +++- 1 file changed, 56 insertions(+), 17 deletions(-) diff --git a/src/fw/pciinit.c b/src/fw/pciinit.c index 5da6cf6..e44bab0 100644 --- a/src/fw/pciinit.c +++ b/src/fw/pciinit.c @@ -149,6 +149,22 @@ static void piix_isa_bridge_setup(struct pci_device *pci, void *arg) dprintf(1, "PIIX3/PIIX4 init: elcr=%02x %02x\n", elcr[0], elcr[1]); } +static void mch_isa_lpc_setup(u16 bdf) +{ +/* pm io base */ +pci_config_writel(bdf, ICH9_LPC_PMBASE, + acpi_pm_base | ICH9_LPC_PMBASE_RTE); + +/* acpi enable, SCI: IRQ9 000b = irq9*/ +pci_config_writeb(bdf, ICH9_LPC_ACPI_CTRL, ICH9_LPC_ACPI_CTRL_ACPI_EN); + +/* set root complex register block BAR */ +pci_config_writel(bdf, ICH9_LPC_RCBA, + ICH9_LPC_RCBA_ADDR | ICH9_LPC_RCBA_EN); +} + +static int ICH9LpcBDF = -1; + /* ICH9 LPC PCI to ISA bridge */ /* PCI_VENDOR_ID_INTEL && PCI_DEVICE_ID_INTEL_ICH9_LPC */ static void mch_isa_bridge_setup(struct pci_device *dev, void *arg) @@ -176,16 +192,10 @@ static void mch_isa_bridge_setup(struct pci_device *dev, void *arg) outb(elcr[1], ICH9_LPC_PORT_ELCR2); dprintf(1, "Q35 LPC init: elcr=%02x %02x\n", elcr[0], elcr[1]); -/* pm io base */ -pci_config_writel(bdf, ICH9_LPC_PMBASE, - acpi_pm_base | ICH9_LPC_PMBASE_RTE); +ICH9LpcBDF = bdf; -/* acpi enable, SCI: IRQ9 000b = irq9*/ -pci_config_writeb(bdf, ICH9_LPC_ACPI_CTRL, ICH9_LPC_ACPI_CTRL_ACPI_EN); +mch_isa_lpc_setup(bdf); -/* set root complex register block BAR */ -pci_config_writel(bdf, ICH9_LPC_RCBA, - ICH9_LPC_RCBA_ADDR | ICH9_LPC_RCBA_EN); e820_add(ICH9_LPC_RCBA_ADDR, 16*1024, E820_RESERVED); acpi_pm1a_cnt = acpi_pm_base + 0x04; @@ -244,11 +254,8 @@ static void piix4_pm_setup(struct pci_device *pci, void *arg) 
pmtimer_setup(acpi_pm_base + 0x08); } -/* ICH9 SMBUS */ -/* PCI_VENDOR_ID_INTEL && PCI_DEVICE_ID_INTEL_ICH9_SMBUS */ -static void ich9_smbus_setup(struct pci_device *dev, void *arg) +static void ich9_smbus_enable(u16 bdf) { -u16 bdf = dev->bdf; /* map smbus into io space */ pci_config_writel(bdf, ICH9_SMB_SMB_BASE, (acpi_pm_base + 0x100) | PCI_BASE_ADDRESS_SPACE_IO); @@ -257,6 +264,17 @@ static void ich9_smbus_setup(struct pci_device *dev, void *arg) pci_config_writeb(bdf, ICH9_SMB_HOSTC, ICH9_SMB_HOSTC_HST_EN); } +int ICH9SmbusBDF = -1; + +/* ICH9 SMBUS */ +/* PCI_VENDOR_ID_INTEL && PCI_DEVICE_ID_INTEL_ICH9_SMBUS */ +static void ich9_smbus_setup(struct pci_device *dev, void *arg) +{ +ICH9SmbusBDF = dev->bdf; + +ich9_smbus_enable(dev->bdf); +} + static const struct pci_device_id pci_device_tbl[] = { /* PIIX3/PIIX4 PCI to ISA bridge */ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82371SB_0, @@ -293,6 +311,9 @@ static const struct pci_device_id pci_device_tbl[] = { PCI_DEVICE_END, }; +static int MCHMmcfgBDF = -1; +static void mch_mmconfig_setup(u16 bdf); + void pci_resume(void) { if (!CONFIG_QEMU) { @@ -302,6 +323,18 @@ void pci_resume(void) if (PiixPmBDF >= 0) { piix4_pm_config_setup(PiixPmBDF); } + +if (ICH9LpcBDF >= 0) { +mch_isa_lpc_setup(ICH9LpcBDF); +} + +if (ICH9SmbusBDF >= 0) { +ich9_smbus_enable(ICH9SmbusBDF); +} + +if(MCHMmcfgBDF >= 0) { +mch_mmconfig_setup(MCHMmcfgBDF); +} } static void pci_bios_init_device(struct pci_device *pci) @@ -388,18 +421,24 @@ static void i440fx_mem_addr_setup(struct pci_device *dev, void *arg) pci_slot_get_irq = piix_pci_slot_get_irq; } -static void mch_mem_addr_setup(struct pci_device *dev, void *arg) +static void mch_mmconfig_setup(u16 bdf) { u64 addr = Q35_HOST_BRIDGE_PCIEXBAR_ADDR; -u32 size = Q35_HOST_BRIDGE_PCIEXBAR_SIZE; - -/* setup mmconfig */ -u16 bdf = dev->bdf; u32 upper = addr >> 32; u32 lower = (addr & 0xffffffff) | Q35_HOST_BRIDGE_PCIEXBAREN; pci_config_writel(bdf, Q35_HOST_BRIDGE_PCIEXBAR, 0); pci_config_writel(bdf,
Q35_HOST_BRIDGE_PCIEXBAR + 4, upper); pci_config_writel(bdf, Q35_HOST_BRIDGE_PCIEXBAR, lower); +} + +static void mch_mem_addr_setup(struct pci_device *dev, void *arg) +{ +u64 addr = Q35_HOST_BRIDGE_PCIEXBAR_ADDR; +u32 size = Q35_HOST_BRIDGE_PCIEXBAR_SIZE; + +/* setup mmconfig */ +MCHMmcfgBDF = dev->bdf; +mch_mmconfig_setup(dev->bdf); e820_add(addr, size, E820_RESERVED); /* setup pci i/o window (above mmconfig) */ -- 2.4.3
Re: [SeaBIOS] [PATCH] pci: panic when out of bus numbers
On 01/04/2016 01:43 PM, Marcel Apfelbaum wrote: Currently the bios goes into an endless loop if more than 255 PCI buses are requested. Hi, This is the reproducing script: cli=" -M q35 " while [ ${i:=0} -lt ${1:-0} ] do dstreamId=$((i)) ustreamId=$((dstreamId/32)) chassisId=$((dstreamId+1)) blkDiskId=$((dstreamId)) if [ $((dstreamId%32)) -eq 0 ] then cli="$cli -device ioh3420,bus=pcie.0,id=root.$ustreamId,slot=$ustreamId" cli="$cli -device x3130-upstream,bus=root.$ustreamId,id=upstream$ustreamId" fi cli="$cli -device xio3130-downstream,bus=upstream$ustreamId,id=downstream$dstreamId,chassis=$chassisId" i=$((i+1)) done $cli Run it with a numeric parameter > 240. Thanks, Marcel Change that to a panic, which is preferred in this case. Signed-off-by: Marcel Apfelbaum <mar...@redhat.com> --- Hi, I am well aware that this is not a common scenario, but the bug was found and maybe it deserves a little attention. I opted for using panic because I saw it used for other scenarios where the bios is out of resources. Another way to look at this would be to *stop* registering buses over 255 and lose some of the devices, but bring the system up. Thanks, Marcel src/fw/pciinit.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/fw/pciinit.c b/src/fw/pciinit.c index c31c2fa..5da6cf6 100644 --- a/src/fw/pciinit.c +++ b/src/fw/pciinit.c @@ -469,6 +469,9 @@ pci_bios_init_bus_rec(int bus, u8 *pci_bus) u8 secbus = pci_config_readb(bdf, PCI_SECONDARY_BUS); (*pci_bus)++; +if (!(*pci_bus)) { +panic("PCI: out of bus numbers!\n"); +} if (*pci_bus != secbus) { dprintf(1, "PCI: secondary bus = 0x%x -> 0x%x\n", secbus, *pci_bus); ___ SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
[SeaBIOS] [PATCH] pci: panic when out of bus numbers
Currently the bios goes into an endless loop if more than 255 PCI buses are requested. Change that to a panic, which is preferred in this case. Signed-off-by: Marcel Apfelbaum <mar...@redhat.com> --- Hi, I am well aware that this is not a common scenario, but the bug was found and maybe it deserves a little attention. I opted for using panic because I saw it used for other scenarios where the bios is out of resources. Another way to look at this would be to *stop* registering buses over 255 and lose some of the devices, but bring the system up. Thanks, Marcel src/fw/pciinit.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/fw/pciinit.c b/src/fw/pciinit.c index c31c2fa..5da6cf6 100644 --- a/src/fw/pciinit.c +++ b/src/fw/pciinit.c @@ -469,6 +469,9 @@ pci_bios_init_bus_rec(int bus, u8 *pci_bus) u8 secbus = pci_config_readb(bdf, PCI_SECONDARY_BUS); (*pci_bus)++; +if (!(*pci_bus)) { +panic("PCI: out of bus numbers!\n"); +} if (*pci_bus != secbus) { dprintf(1, "PCI: secondary bus = 0x%x -> 0x%x\n", secbus, *pci_bus); -- 2.4.3 ___ SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
Re: [SeaBIOS] [PATCH] pci: panic when out of bus numbers
On 01/04/2016 02:19 PM, Gerd Hoffmann wrote: On Mo, 2016-01-04 at 13:43 +0200, Marcel Apfelbaum wrote: Currently the bios goes into an endless loop if more than 255 PCI buses are requested. Hi Gerd, Thanks for looking into this. Given the bus number register is 8bit I'm wondering whether this is a valid hardware configuration in the first place? For sure not :), however we do have a possible endless loop and maybe it is cleaner to panic. (no matter who is "responsible") In case it isn't I think qemu should throw an error instead if you try to create a vm with more than 255 pci buses. I suppose it could go into QEMU too, I'll give it a try. Thanks, Marcel cheers, Gerd ___ SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
Re: [SeaBIOS] [Qemu-devel] [PATCH 6/6] q35: skip q35-acpi-dsdt.aml load if not needed
On 12/17/2015 12:40 PM, Gerd Hoffmann wrote: Only old machine types which don't use the acpi builder (qemu 1.7 + older) have to load that file for proper acpi support. Signed-off-by: Gerd Hoffmann <kra...@redhat.com> --- hw/i386/pc_q35.c | 5 +++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c index 133bc68..727269e 100644 --- a/hw/i386/pc_q35.c +++ b/hw/i386/pc_q35.c @@ -129,7 +129,10 @@ static void pc_q35_init(MachineState *machine) } pc_cpus_init(pcms); -pc_acpi_init("q35-acpi-dsdt.aml"); +if (!has_acpi_build) { +/* only machine types 1.7 & older need this */ Actually 1.6 and older, right? (I might be wrong) +pc_acpi_init("q35-acpi-dsdt.aml"); +} kvmclock_create(); It looks OK to me. Reviewed-by: Marcel Apfelbaum <mar...@redhat.com> Thanks, Marcel ___ SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
Re: [SeaBIOS] [PATCH] fw/pci: do not automatically allocate IO region for PCIe bridges
On 12/07/2015 11:46 AM, Gerd Hoffmann wrote: Hi, However, PCIe devices can work without IO, so there is no need to allocate IO space for hotplug. Makes sense. diff --git a/src/fw/pciinit.c b/src/fw/pciinit.c index 7b8aab7..4b37792 100644 --- a/src/fw/pciinit.c +++ b/src/fw/pciinit.c @@ -736,7 +736,9 @@ static int pci_bios_check_devices(struct pci_bus *busses) if (pci_region_align(&s->r[type]) > align) align = pci_region_align(&s->r[type]); u64 sum = pci_region_sum(&s->r[type]); -if (!sum && hotplug_support) +int res_opt = (type == PCI_REGION_TYPE_IO) && + pci_find_capability(s->bus_dev, PCI_CAP_ID_EXP, 0); I'd make the variable names longer and more descriptive. Also move the pcie check out of the loop. Note that pci_bus_hotplug_support() looks for the pcie capability too, so we probably should turn that into something like pci_bridge_get_props(), so we have to look at the bridge capabilities only once. Hi Gerd, Thanks for the review. I'll address it and post again. Thanks, Marcel cheers, Gerd ___ SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
[SeaBIOS] [PATCH V2] fw/pci: do not automatically allocate IO region for PCIe bridges
PCIe downstream ports (Root Ports and switch Downstream Ports) appear to firmware as PCI-PCI bridges, and a 4K IO space is allocated for them even if there is no device behind them requesting IO space; all that is for hotplug purposes. However, PCIe devices can work without IO, so there is no need to allocate IO space for hotplug. Signed-off-by: Marcel Apfelbaum <mar...@redhat.com> --- v1 -> v2: - Addressed Gerd's comments: - move pci_find_capability out of the loop and call it only once. - more descriptive names for variables Notes: - This patch fixes a 15-PCIe-Root-Port limitation when used with virtio pci-express devices having no IO space requirements: -M q35 `for i in {1..15}; do echo -device ioh3420,chassis=$i,id=b$i -device virtio-net-pci,bus=b$i,disable-legacy=on; done` - There is a patch on the QEMU devel list that tackles the problem from another angle by allowing PCIe downstream ports to not forward IO requests. https://lists.gnu.org/archive/html/qemu-devel/2015-11/msg04478.html - The mentioned patch is of course not enough and also requires management software intervention.
Thanks, Marcel src/fw/pciinit.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/src/fw/pciinit.c b/src/fw/pciinit.c index 7b8aab7..c31c2fa 100644 --- a/src/fw/pciinit.c +++ b/src/fw/pciinit.c @@ -645,9 +645,8 @@ pci_region_create_entry(struct pci_bus *bus, struct pci_device *dev, return entry; } -static int pci_bus_hotplug_support(struct pci_bus *bus) +static int pci_bus_hotplug_support(struct pci_bus *bus, u8 pcie_cap) { -u8 pcie_cap = pci_find_capability(bus->bus_dev, PCI_CAP_ID_EXP, 0); u8 shpc_cap; if (pcie_cap) { @@ -727,7 +726,8 @@ static int pci_bios_check_devices(struct pci_bus *busses) */ parent = &busses[0]; int type; -int hotplug_support = pci_bus_hotplug_support(s); +u8 pcie_cap = pci_find_capability(s->bus_dev, PCI_CAP_ID_EXP, 0); +int hotplug_support = pci_bus_hotplug_support(s, pcie_cap); for (type = 0; type < PCI_REGION_TYPE_COUNT; type++) { u64 align = (type == PCI_REGION_TYPE_IO) ? PCI_BRIDGE_IO_MIN : PCI_BRIDGE_MEM_MIN; @@ -736,7 +736,8 @@ static int pci_bios_check_devices(struct pci_bus *busses) if (pci_region_align(&s->r[type]) > align) align = pci_region_align(&s->r[type]); u64 sum = pci_region_sum(&s->r[type]); -if (!sum && hotplug_support) +int resource_optional = pcie_cap && (type == PCI_REGION_TYPE_IO); +if (!sum && hotplug_support && !resource_optional) sum = align; /* reserve min size for hot-plug */ u64 size = ALIGN(sum, align); int is64 = pci_bios_bridge_region_is64(&s->r[type], -- 2.1.0 ___ SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
Re: [SeaBIOS] [PATCH v7 10/10] hw/pci-bridge: format special OFW unit address for PXB host
On 06/24/2015 08:11 PM, Kevin O'Connor wrote: On Fri, Jun 19, 2015 at 04:40:17AM +0200, Laszlo Ersek wrote: We have agreed that OpenFirmware device paths in the bootorder fw_cfg file should follow the pattern /pci@i0cf8,%x/... for devices that live behind an extra root bus. The extra root bus in question is the %x'th among the extra root buses. (In other words, %x gives the position of the affected extra root bus relative to the other extra root buses, in bus_nr order.) %x starts at 1, and is formatted in hex. The portion of the unit address that comes before the comma is dynamically taken from the main host bridge, similarly to sysbus_get_fw_dev_path(). Cc: Kevin O'Connor <ke...@koconnor.net> Cc: Michael S. Tsirkin <m...@redhat.com> Cc: Marcel Apfelbaum <mar...@redhat.com> Signed-off-by: Laszlo Ersek <ler...@redhat.com> --- Notes: v7: - implement the format that both Kevin and Michael agreed with. Example: /pci@i0cf8,1/pci-bridge@0/scsi@0/channel@0/disk@0,0 - I updated the OVMF patchset accordingly, but I won't post it until this QEMU patch is applied - Someone please write the SeaBIOS patch The associated SeaBIOS patch is below. Does anyone have a qemu command line handy to test with the PXB bus? -device pxb,id=bridge1,bus_nr=10 -netdev user,id=u -device e1000,id=net2,bus=bridge1,netdev=u Let me know if you have any issues with it. Thanks, Marcel -Kevin --- a/src/boot.c +++ b/src/boot.c @@ -112,9 +112,9 @@ build_pci_path(char *buf, int max, const char *devname, struct pci_device *pci) if (pci->parent) { p = build_pci_path(p, max, "pci-bridge", pci->parent); } else { -if (pci->rootbus) -p += snprintf(p, max, "/pci-root@%x", pci->rootbus); p += snprintf(p, buf+max-p, "%s", FW_PCI_DOMAIN); +if (pci->rootbus) +p += snprintf(p, buf+max-p, ",%x", pci->rootbus); } int dev = pci_bdf_to_dev(pci->bdf), fn = pci_bdf_to_fn(pci->bdf); ___ SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
Re: [SeaBIOS] [Qemu-devel] [PATCH V2] pci: fixes to allow booting from extra root pci buses.
On 06/12/2015 09:00 AM, Gerd Hoffmann wrote: Hi, On each boot, coreboot might decide to assign a different bus id to the extra roots (for example, if a device with a PCI bridge is inserted and its bus allocation causes bus ids to shift). Technically, coreboot could even change the order extra buses are assigned bus ids, but doesn't today. This was seen on several AMD systems - I'm told at least some Intel systems have multiple root buses, but the bus numbers are just hard wired. This is how the qemu pxb works: root bus numbers are a config option for the root bridge device, i.e. from the guest point of view they are hard-wired. Exactly. In our case, the HW assigns the PXB bus number, and again, I saw this also on real HW with multiple buses; the bus nr comes from ACPI, meaning the vendor. Let's focus on the problem at hand: We need a way for QEMU to write some fw path in the bootorder fw_cfg file, and both Seabios/OVMF need to know how to correctly map this to the actual device. If the boot device is behind a PXB extra root bus, there is a need not only to differentiate the root bus but also to know *which one*. So we need the bus number; what other way is there? As Gerd mentioned, the PXB bus number is provided on the QEMU command line, meaning hard-wired. We can of course, as Laszlo suggested, add an extra condition to the use of this path: /pci-root@bus-br/ when running on QEMU, in order not to interfere with other HW. Less pretty but more robust. Thanks, Marcel cheers, Gerd ___ SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
Re: [SeaBIOS] [PATCH V2] pci: fixes to allow booting from extra root pci buses.
On 06/11/2015 05:24 PM, Kevin O'Connor wrote: On Thu, Jun 11, 2015 at 05:12:33PM +0300, Marcel Apfelbaum wrote: On 06/11/2015 04:58 PM, Kevin O'Connor wrote: On Thu, Jun 11, 2015 at 04:37:08PM +0300, Marcel Apfelbaum wrote: The fixes solve the following issue: The PXB device exposes a new pci root bridge with the fw path: /pci-root@4/..., in which 4 is the root bus number. Before this patch the fw path was wrongly computed: /pci-root@1/pci@i0cf8/... Fix the above issues: Correct the bus number and remove the extra host bridge description. Why is that wrong? The previous path looks correct to me. The prev path includes both the extra root bridge and *then* the usual host bridge. /pci-root@1/pci@i0cf8/ ... ^ new ^ regular ^ devices Since the new pci root bridge (and bus) is in parallel with the regular one, it is not correct to add it to the path. The architecture is: /host bridge/devices... /extra root bridge/devices... /extra root bridge/devices... And not /extra root bridge//host bridge/devices Your patch changed both the /extra root bridge/devices... part and the @1 part. The change of the @1 in /pci-root@1/ is not correct IMO. Why? @1 should be the unit address, which is the text representation of the physical address, in our case the slot. Since the bus number in our case is 4, I think /pci-root@4/ is the 'correct' address. Does open-firmware have any examples for PCI paths and in particular PCI paths when there are multiple root-buses? Maybe Laszlo can say more, but we both agreed that this would be the best representation of extra root buses on both OVMF and Seabios. It's possible to replace the pci@i0cf8 with pci-root@1 but that seems odd as the extra root bus is accessible via io accesses to 0x0cf8. While this is true, /pci-root@[...]/ may also represent other kinds of host bridges, not only PXBs. But we can change this of course, as long as OVMF can also work with it.
Another option would be to place the pci-root@1 behind the pci@i0cf8 as in /pci@i0cf8/pci-root@1/ Or, the root bus could be appended to the host bridge as in /pci@i0cf8,1/ The latter representation makes sense to me, but as /pci@i0cf8,4/..., with the bus number after the comma. Laszlo, will this work for OVMF? Thanks, Marcel -Kevin ___ SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
[SeaBIOS] [PATCH] pci: fixes to allow booting from extra root pci buses.
The PXB device exposes a new pci root bridge with the fw path: /pci-root@4/..., in which 4 is the root bus number. Before this patch the fw path was wrongly computed: /pci-root@1/pci@i0cf8/... Fix the above issues: Correct the bus number and remove the extra host bridge description. Signed-off-by: Marcel Apfelbaum <mar...@redhat.com> --- Laszlo worked on supporting pxb for OVMF and discovered that there is a problem when booting devices from a PXB. This is a link to the latest QEMU series: https://www.mail-archive.com/qemu-devel@nongnu.org/msg302493.html Thanks, Marcel src/boot.c | 1 - src/hw/pci.c | 2 +- 2 files changed, 1 insertion(+), 2 deletions(-) diff --git a/src/boot.c b/src/boot.c index ec59c37..a3bb13b 100644 --- a/src/boot.c +++ b/src/boot.c @@ -114,7 +114,6 @@ build_pci_path(char *buf, int max, const char *devname, struct pci_device *pci) } else { if (pci->rootbus) p += snprintf(p, max, "/pci-root@%x", pci->rootbus); -p += snprintf(p, buf+max-p, "%s", FW_PCI_DOMAIN); } int dev = pci_bdf_to_dev(pci->bdf), fn = pci_bdf_to_fn(pci->bdf); diff --git a/src/hw/pci.c b/src/hw/pci.c index 0379b55..9e77af4 100644 --- a/src/hw/pci.c +++ b/src/hw/pci.c @@ -133,7 +133,7 @@ pci_probe_devices(void) if (bus != lastbus) rootbuses++; lastbus = bus; -rootbus = rootbuses; +rootbus = bus; if (bus > MaxPCIBus) MaxPCIBus = bus; } else { -- 2.1.0 ___ SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
Re: [SeaBIOS] [PATCH] pci: fixes to allow booting from extra root pci buses.
On 06/11/2015 03:45 PM, Michael S. Tsirkin wrote: On Thu, Jun 11, 2015 at 03:41:35PM +0300, Marcel Apfelbaum wrote: The PXB device exposes a new pci root bridge with the fw path: /pci-root@4/..., in which 4 is the root bus number. Before this patch the fw path was wrongly computed: /pci-root@1/pci@i0cf8/... Fix the above issues: Correct the bus number and remove the extra host bridge description. Signed-off-by: Marcel Apfelbaum mar...@redhat.com I would like a unit test for various paths; they are part of the guest ABI so we can never change them. Could you add a unit test please? A QEMU unit-test, you mean? By the way, I found an issue with this patch, please do not merge it yet. Also, can you please quote the open firmware spec text that says this is the correct format? Laszlo has found something; I'll look up something to quote, sure. Thanks, Marcel
---
Laszlo worked on supporting pxb for OVMF and discovered that there is a problem when booting devices from a PXB. This is a link to the latest QEMU series: https://www.mail-archive.com/qemu-devel@nongnu.org/msg302493.html

Thanks,
Marcel

 src/boot.c   | 1 -
 src/hw/pci.c | 2 +-
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/boot.c b/src/boot.c
index ec59c37..a3bb13b 100644
--- a/src/boot.c
+++ b/src/boot.c
@@ -114,7 +114,6 @@ build_pci_path(char *buf, int max, const char *devname, struct pci_device *pci)
     } else {
         if (pci->rootbus)
             p += snprintf(p, max, "/pci-root@%x", pci->rootbus);
-        p += snprintf(p, buf+max-p, "%s", FW_PCI_DOMAIN);
     }
     int dev = pci_bdf_to_dev(pci->bdf), fn = pci_bdf_to_fn(pci->bdf);
diff --git a/src/hw/pci.c b/src/hw/pci.c
index 0379b55..9e77af4 100644
--- a/src/hw/pci.c
+++ b/src/hw/pci.c
@@ -133,7 +133,7 @@ pci_probe_devices(void)
             if (bus != lastbus)
                 rootbuses++;
             lastbus = bus;
-            rootbus = rootbuses;
+            rootbus = bus;
             if (bus > MaxPCIBus)
                 MaxPCIBus = bus;
         } else {
--
2.1.0
Re: [SeaBIOS] [Qemu-devel] [PATCH v5 for-2.3 28/28] docs: Add PXB documentation
On 03/10/2015 07:42 PM, Michael S. Tsirkin wrote: On Tue, Mar 10, 2015 at 06:21:14PM +0200, Marcel Apfelbaum wrote: On 03/10/2015 05:47 PM, Michael S. Tsirkin wrote: On Tue, Mar 10, 2015 at 05:32:14PM +0200, Marcel Apfelbaum wrote: Signed-off-by: Marcel Apfelbaum mar...@redhat.com
---
 docs/pci_expander_bridge.txt | 52
 1 file changed, 52 insertions(+)
 create mode 100644 docs/pci_expander_bridge.txt

diff --git a/docs/pci_expander_bridge.txt b/docs/pci_expander_bridge.txt
new file mode 100644
index 000..58bf7a8
--- /dev/null
+++ b/docs/pci_expander_bridge.txt
@@ -0,0 +1,52 @@
+PCI EXPANDER BRIDGE (PXB)
+=========================
+
+Description
+===========
+PXB is a light-weight host bridge in the same PCI domain
+as the main host bridge whose purpose is to enable
+the main host bridge to support multiple PCI root buses.
+It is implemented only for i440fx.

BTW what makes it i440fx specific? Also, what happens if you try to use it with a different machine type? It is i440fx specific, please look at patch 22/28. Also we have a specific check for i440fx, so CRS will not be emitted for other machine types. Thanks, Marcel In fact it won't work at all. Need to think about it, maybe we can make it work more generally. For CRS, should it be possible to emit it for q35 too? We can make it work, but not in the scope of this series. However, I'll add an IHostBridgeSnoop interface that will make the device work only with the associated bus, and this will make it less general. Thanks, Marcel

+
+As opposed to a PCI-2-PCI bridge's secondary bus, a PXB's bus
+is a primary bus and can be associated with a NUMA node
+(different from the main host bridge) allowing the guest OS
+to recognize the proximity of a pass-through device to
+other resources such as RAM and CPUs.
+
+Usage
+=====
+A detailed command line would be:
+
+[qemu-bin + storage options]
+-bios [seabios-dir]/out/bios.bin -L [seabios-dir]/out/
+-m 2G
+-object memory-backend-ram,size=1024M,policy=bind,host-nodes=0,id=ram-node0 -numa node,nodeid=0,cpus=0,memdev=ram-node0
+-object memory-backend-ram,size=1024M,policy=interleave,host-nodes=0,id=ram-node1 -numa node,nodeid=1,cpus=1,memdev=ram-node1
+-device pxb-device,id=bridge1,bus=pci.0,numa_node=1,bus_nr=4 -netdev user,id=nd -device e1000,bus=bridge1,addr=0x4,netdev=nd
+-device pxb-device,id=bridge2,bus=pci.0,numa_node=0,bus_nr=8 -device e1000,bus=bridge2,addr=0x3
+-device pxb-device,id=bridge3,bus=pci.0,bus_nr=40 -drive if=none,id=drive0,file=[img] -device virtio-blk-pci,drive=drive0,scsi=off,bus=bridge3,addr=1
+
+Here you have:
+ - 2 NUMA nodes for the guest, 0 and 1. (both mapped to the same NUMA node
+   in host, but you can and should put them in different host NUMA nodes)
+ - a pxb host bridge attached to NUMA 1 with an e1000 behind it
+ - a pxb host bridge attached to NUMA 0 with an e1000 behind it
+ - a pxb host bridge not attached to any NUMA node, with a hard drive behind it.
+
+Implementation
+==============
+The PXB is composed of:
+- HostBridge (TYPE_PXB_HOST)
+  The host bridge allows registering and querying the PXB's PCI root bus in QEMU.
+- PXBDev (TYPE_PXB_DEVICE)
+  It is a regular PCI Device that resides on the piix host-bridge bus, and its
+  bus uses the same PCI domain. However, the bus behind it is exposed through
+  ACPI as a primary PCI bus and starts a new PCI hierarchy. The interrupts from
+  devices behind the PXB are routed through this device the same as if it were
+  a PCI-2-PCI bridge. The _PRT follows the i440fx model.
+- PCIBridgeDev (TYPE_PCI_BRIDGE_DEV)
+  Created automatically as part of the init sequence.
+  When adding a device to a PXB it is attached to the bridge for two reasons:
+  - Using the bridge will enable hotplug support
+  - All the devices behind the bridge will use the bridge's IO/MEM windows,
+    compacting the PCI address space.
--
2.1.0
Re: [SeaBIOS] [Qemu-devel] [PATCH v5 for-2.3 28/28] docs: Add PXB documentation
On 03/16/2015 05:28 PM, Michael S. Tsirkin wrote: On Mon, Mar 16, 2015 at 02:16:40PM +0200, Marcel Apfelbaum wrote: On 03/10/2015 07:42 PM, Michael S. Tsirkin wrote: On Tue, Mar 10, 2015 at 06:21:14PM +0200, Marcel Apfelbaum wrote: On 03/10/2015 05:47 PM, Michael S. Tsirkin wrote: On Tue, Mar 10, 2015 at 05:32:14PM +0200, Marcel Apfelbaum wrote: Signed-off-by: Marcel Apfelbaum mar...@redhat.com
---
 docs/pci_expander_bridge.txt | 52
 1 file changed, 52 insertions(+)
 create mode 100644 docs/pci_expander_bridge.txt

diff --git a/docs/pci_expander_bridge.txt b/docs/pci_expander_bridge.txt
new file mode 100644
index 000..58bf7a8
--- /dev/null
+++ b/docs/pci_expander_bridge.txt
@@ -0,0 +1,52 @@
+PCI EXPANDER BRIDGE (PXB)
+=========================
+
+Description
+===========
+PXB is a light-weight host bridge in the same PCI domain
+as the main host bridge whose purpose is to enable
+the main host bridge to support multiple PCI root buses.
+It is implemented only for i440fx.

BTW what makes it i440fx specific? Also, what happens if you try to use it with a different machine type? It is i440fx specific, please look at patch 22/28. Also we have a specific check for i440fx, so CRS will not be emitted for other machine types. Thanks, Marcel In fact it won't work at all. Need to think about it, maybe we can make it work more generally. For CRS, should it be possible to emit it for q35 too? We can make it work, but not in the scope of this series. However, I'll add an IHostBridgeSnoop interface that will make the device work only with the associated bus, and this will make it less general. Thanks, Marcel OK, this works too. Do you mean PCIHostBridgeSnoop? Sure. Thanks, Marcel

+
+As opposed to a PCI-2-PCI bridge's secondary bus, a PXB's bus
+is a primary bus and can be associated with a NUMA node
+(different from the main host bridge) allowing the guest OS
+to recognize the proximity of a pass-through device to
+other resources such as RAM and CPUs.
+
+Usage
+=====
+A detailed command line would be:
+
+[qemu-bin + storage options]
+-bios [seabios-dir]/out/bios.bin -L [seabios-dir]/out/
+-m 2G
+-object memory-backend-ram,size=1024M,policy=bind,host-nodes=0,id=ram-node0 -numa node,nodeid=0,cpus=0,memdev=ram-node0
+-object memory-backend-ram,size=1024M,policy=interleave,host-nodes=0,id=ram-node1 -numa node,nodeid=1,cpus=1,memdev=ram-node1
+-device pxb-device,id=bridge1,bus=pci.0,numa_node=1,bus_nr=4 -netdev user,id=nd -device e1000,bus=bridge1,addr=0x4,netdev=nd
+-device pxb-device,id=bridge2,bus=pci.0,numa_node=0,bus_nr=8 -device e1000,bus=bridge2,addr=0x3
+-device pxb-device,id=bridge3,bus=pci.0,bus_nr=40 -drive if=none,id=drive0,file=[img] -device virtio-blk-pci,drive=drive0,scsi=off,bus=bridge3,addr=1
+
+Here you have:
+ - 2 NUMA nodes for the guest, 0 and 1. (both mapped to the same NUMA node
+   in host, but you can and should put them in different host NUMA nodes)
+ - a pxb host bridge attached to NUMA 1 with an e1000 behind it
+ - a pxb host bridge attached to NUMA 0 with an e1000 behind it
+ - a pxb host bridge not attached to any NUMA node, with a hard drive behind it.
+
+Implementation
+==============
+The PXB is composed of:
+- HostBridge (TYPE_PXB_HOST)
+  The host bridge allows registering and querying the PXB's PCI root bus in QEMU.
+- PXBDev (TYPE_PXB_DEVICE)
+  It is a regular PCI Device that resides on the piix host-bridge bus, and its
+  bus uses the same PCI domain. However, the bus behind it is exposed through
+  ACPI as a primary PCI bus and starts a new PCI hierarchy. The interrupts from
+  devices behind the PXB are routed through this device the same as if it were
+  a PCI-2-PCI bridge. The _PRT follows the i440fx model.
+- PCIBridgeDev (TYPE_PCI_BRIDGE_DEV)
+  Created automatically as part of the init sequence.
+  When adding a device to a PXB it is attached to the bridge for two reasons:
+  - Using the bridge will enable hotplug support
+  - All the devices behind the bridge will use the bridge's IO/MEM windows,
+    compacting the PCI address space.
--
2.1.0
Re: [SeaBIOS] [Qemu-devel] [PATCH v5 for-2.3 02/28] acpi: add aml_or() term
On 03/11/2015 03:17 AM, Shannon Zhao wrote: On 2015/3/10 23:31, Marcel Apfelbaum wrote: Add encoding for ACPI DefOr Opcode.

Reviewed-by: Shannon Zhao zhaoshengl...@huawei.com
Reviewed-by: Igor Mammedov imamm...@redhat.com
Signed-off-by: Marcel Apfelbaum mar...@redhat.com
---
 hw/acpi/aml-build.c         | 10 ++
 include/hw/acpi/aml-build.h |  1 +
 2 files changed, 11 insertions(+)

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index ace180b..db8d346 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -452,6 +452,16 @@ Aml *aml_and(Aml *arg1, Aml *arg2)
     return var;
 }

+/* ACPI 1.0b: 16.2.5.4 Type 2 Opcodes Encoding: DefOr */
+Aml *aml_or(Aml *arg1, Aml *arg2)
+{
+    Aml *var = aml_opcode(0x7D /* OrOp */);
+    aml_append(var, arg1);
+    aml_append(var, arg2);
+    build_append_int(var->buf, 0x00); /* NullNameOp */

Maybe you forgot to fix this. Same with patch 03, 05, 06, 07. Strange, I was sure I took care of it. Thanks for bringing this up again! Marcel Thanks, Shannon

+    return var;
+}
+
 /* ACPI 1.0b: 16.2.5.3 Type 1 Opcodes Encoding: DefNotify */
 Aml *aml_notify(Aml *arg1, Aml *arg2)
 {
diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index 17d3beb..c0eb691 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -137,6 +137,7 @@ Aml *aml_int(const uint64_t val);
 Aml *aml_arg(int pos);
 Aml *aml_store(Aml *val, Aml *target);
 Aml *aml_and(Aml *arg1, Aml *arg2);
+Aml *aml_or(Aml *arg1, Aml *arg2);
 Aml *aml_notify(Aml *arg1, Aml *arg2);
 Aml *aml_call1(const char *method, Aml *arg1);
 Aml *aml_call2(const char *method, Aml *arg1, Aml *arg2);
Re: [SeaBIOS] [Qemu-devel] [PATCH v5 for-2.3 00/28] hw/pc: implement multiple primary busses for pc machines
On 03/11/2015 03:32 PM, Gerd Hoffmann wrote: Hi, v4-v5: - Rebased on pci branch, tree: git://git.kernel.org/pub/scm/virt/kvm/mst/qemu.git Have again trouble applying this. Which commit hash? Or do you have a git tree somewhere with this? Yes, it is changing too fast :( This version works with commit: a3b66ab If you have problem with this commit, tell me and I'll push my branch to github. Thanks, Marcel cheers, Gerd
Re: [SeaBIOS] [Qemu-devel] [PATCH v5 for-2.3 00/28] hw/pc: implement multiple primary busses for pc machines
On 03/11/2015 03:51 PM, Gerd Hoffmann wrote: On Mi, 2015-03-11 at 15:44 +0200, Marcel Apfelbaum wrote: On 03/11/2015 03:32 PM, Gerd Hoffmann wrote: Hi, v4-v5: - Rebased on pci branch, tree: git://git.kernel.org/pub/scm/virt/kvm/mst/qemu.git Have again trouble applying this. Which commit hash? Or do you have a git tree somewhere with this? Yes, it is changing too fast :( This version works with commit: a3b66ab Applies fine, but throws a checkpatch warning: Line over 80 chars; I try not to exaggerate with these kinds of warnings, but I left a few. If you think it is worth it, I'll take care of it. Thanks, Marcel

Applying: hw/acpi: remove from root bus 0 the crs resources used by other busses.
=== checkpatch complains ===
WARNING: line over 80 characters
#106: FILE: hw/i386/acpi-build.c:988:
+aml_dword_memory(aml_pos_decode, aml_min_fixed, aml_max_fixed,
WARNING: line over 80 characters
#114: FILE: hw/i386/acpi-build.c:996:
+aml_dword_memory(aml_pos_decode, aml_min_fixed, aml_max_fixed,
total: 0 errors, 2 warnings, 124 lines checked

cheers, Gerd
Re: [SeaBIOS] [Qemu-devel] [PATCH v5 for-2.3 00/28] hw/pc: implement multiple primary busses for pc machines
On 03/11/2015 04:12 PM, Gerd Hoffmann wrote: On Di, 2015-03-10 at 17:31 +0200, Marcel Apfelbaum wrote:
v4-v5:
- Rebased on pci branch, tree: git://git.kernel.org/pub/scm/virt/kvm/mst/qemu.git
- Added PXB documentation (patch 28/28)
- Addressed Gerd Hoffmann's review:
  - fix PXB behaviour if used with unsupported BIOS (patch 27/28)
- Addressed Michael S. Tsirkin's review:
  - Removed assert in aml_index (patch 5/28)
  - Renamed pci_ functions to crs_ (patch 12/28)
  - used uint64_t variables instead of signed ones (patch 12/28)
  - Emit MEM/IO AML only for PXBs and i440fx (patch 26/28)
- Addressed Shannon Zhao's review:
  - Changed build_append_int to build_append_byte in aml_or (patch 2/25)
- Thanks to Igor and Kevin for reviews

Hi Gerd, Tested-by: Gerd Hoffmann kra...@gmail.com Appreciated! Possible improvement: When you figure out that the devices behind the pxb are not initialized by the firmware (i.e. unmapped), you can try to grab some address space not used by devices under root bus 0 and assign it to the pxb. Then the Linux kernel can initialize the devices even if the firmware did not. Thanks for the idea, it is on my todo list after I make hotplug work for PXB buses. Marcel [ Surely this should be done incrementally, like hotplug support, to not delay this series even more ] cheers, Gerd
Re: [SeaBIOS] [Qemu-ppc] [Qemu-devel] [PATCH v4 for-2.3 00/25] hw/pc: implement multiple primary busses for pc machines
On 03/10/2015 08:23 AM, Alexey Kardashevskiy wrote: On 03/10/2015 06:21 AM, Marcel Apfelbaum wrote: On 03/09/2015 06:55 PM, Gerd Hoffmann wrote: On Mo, 2015-03-09 at 18:26 +0200, Marcel Apfelbaum wrote: On 03/09/2015 04:19 PM, Gerd Hoffmann wrote: Hi, My series is based on commit 09d219a. Please try on top of this commit. Ok, that works. Going to play with that now ;) Good luck! ... and tell me what you think :) If you need any help with the command line of the pxb device, let me know. First thing I've noticed: You need to define a NUMA node so you can pass a valid numa node to the pxb-device. Guess that is ok as the whole point of this is to assign pci devices to numa nodes. More complete test instructions would be nice though. Exactly, this is by design. But you can also use it without specifying the NUMA node... A detailed command line would be:

[qemu-bin + storage options]
-bios [seabios-dir]/out/bios.bin -L [seabios-dir]/out/
-m 2G
-object memory-backend-ram,size=1024M,policy=bind,host-nodes=0,id=ram-node0 -numa node,nodeid=0,cpus=0,memdev=ram-node0
-object memory-backend-ram,size=1024M,policy=interleave,host-nodes=0,id=ram-node1 -numa node,nodeid=1,cpus=1,memdev=ram-node1
-device pxb-device,id=bridge1,bus=pci.0,numa_node=1,bus_nr=4 -netdev user,id=nd -device e1000,bus=bridge1,addr=0x4,netdev=nd
-device pxb-device,id=bridge2,bus=pci.0,numa_node=0,bus_nr=8 -device e1000,bus=bridge2,addr=0x3
-device pxb-device,id=bridge3,bus=pci.0,bus_nr=40 -drive if=none,id=drive0,file=[img] -device virtio-blk-pci,drive=drive0,scsi=off,bus=bridge3,addr=1

I replayed this patchset on top of 09d219a ("acpi: update generated files") and got this:

qemu-system-x86_64: -object memory-backend-ram,size=1024M,policy=bind,host-nodes=0,id=ram-node0: NUMA node binding are not supported by this QEMU
qemu-system-x86_64: -object memory-backend-ram,size=1024M,policy=interleave,host-nodes=0,id=ram-node1: NUMA node binding are not supported by this QEMU

Hi, Please check your configuration (after you
run the ./configure script). See if you have a line like this: - NUMA host support yes

This is my exact command line:

/scratch/alexey/p/qemu-build/x86_x86_64/x86_64-softmmu/qemu-system-x86_64 \
-L /home/alexey/p/qemu/pc-bios/ \
-hda x86/fc19_24GB_x86.qcow2 \
-enable-kvm \
-kernel x86/vmlinuz-3.12.11-201.fc19.x86_64 \
-initrd x86/initramfs-3.12.11-201.fc19.x86_64.img \
-append root=/dev/sda3 console=ttyS0 \
-nographic \
-nodefaults \
-chardev stdio,id=id2,signal=off,mux=on \
-device isa-serial,id=id3,chardev=id2 \
-mon id=id4,chardev=id2,mode=readline \
-m 2G \
-object memory-backend-ram,size=1024M,policy=bind,host-nodes=0,id=ram-node0 \
-numa node,nodeid=0,cpus=0,memdev=ram-node0 \
-object memory-backend-ram,size=1024M,policy=interleave,host-nodes=0,id=ram-node1 \
-numa node,nodeid=1,cpus=1,memdev=ram-node1 \
-device pxb-device,id=bridge1,bus=pci.0,numa_node=1,bus_nr=4 \
-netdev user,id=nd -device e1000,bus=bridge1,addr=0x4,netdev=nd \
-device pxb-device,id=bridge2,bus=pci.0,numa_node=0,bus_nr=8 \
-device e1000,bus=bridge2,addr=0x3 \
-device pxb-device,id=bridge3,bus=pci.0,bus_nr=40 \
-drive if=none,id=drive0,file=debian_lenny_powerpc_desktop.qcow2 \
-device virtio-blk-pci,drive=drive0,scsi=off,bus=bridge3,addr=1 \

What am I missing here? See above, check for NUMA host support. What I actually wanted to find out (instead of asking what I am doing now) is: is this PXB device a PCI device sitting on the same PCI host bus adapter (1), or is it a separate PHB (2) with its own PCI domain (new in :00:00.0 PCI address)? I would think it is (1), but then what exactly do you call "a primary PCI bus" here (that's my ignorance speaking, yes :) )? Thanks. You are right, the PXB is a device on the piix host-bridge bus and its bus uses the same PCI domain. However, the bus behind it is exposed through ACPI as a primary PCI bus and starts a new PCI hierarchy.
You have a similar approach on the Intel 450x chipset: http://www.intel.com/design/chipsets/datashts/243771.htm Look for 82454NX PCI Expander Bridge (PXB). Thanks, Marcel

Here you have:
 - 2 NUMA nodes for the guest, 0 and 1. (both mapped to the same NUMA node in host, but you can and should put them in different host NUMA nodes)
 - a pxb host bridge attached to NUMA 1 with an e1000 behind it
 - a pxb host bridge attached to NUMA 0 with an e1000 behind it
 - a pxb host bridge not attached to any NUMA node, with a hard drive behind it.

As you can see, since you already decide the NUMA mapping on the command line, it is natural also to attach the pxbs to the NUMA nodes. Second thing: Booting with an unpatched seabios has bad effects:

[root@localhost ~]# cat /proc/iomem
-000f : PCI Bus :10
-0fff : reserved
1000-0009fbff : System RAM
0009fc00-0009 : reserved
000c-000c91ff : Video ROM
000c9800-000ca1ff : Adapter ROM
000ca800-000ccbff : Adapter ROM
000f-000f : reserved
000f-000f
Re: [SeaBIOS] [Qemu-devel] [PATCH v5 for-2.3 28/28] docs: Add PXB documentation
On 03/10/2015 05:47 PM, Michael S. Tsirkin wrote: On Tue, Mar 10, 2015 at 05:32:14PM +0200, Marcel Apfelbaum wrote: Signed-off-by: Marcel Apfelbaum mar...@redhat.com
---
 docs/pci_expander_bridge.txt | 52
 1 file changed, 52 insertions(+)
 create mode 100644 docs/pci_expander_bridge.txt

diff --git a/docs/pci_expander_bridge.txt b/docs/pci_expander_bridge.txt
new file mode 100644
index 000..58bf7a8
--- /dev/null
+++ b/docs/pci_expander_bridge.txt
@@ -0,0 +1,52 @@
+PCI EXPANDER BRIDGE (PXB)
+=========================
+
+Description
+===========
+PXB is a light-weight host bridge in the same PCI domain
+as the main host bridge whose purpose is to enable
+the main host bridge to support multiple PCI root buses.
+It is implemented only for i440fx.

BTW what makes it i440fx specific? Also, what happens if you try to use it with a different machine type? It is i440fx specific, please look at patch 22/28. Also we have a specific check for i440fx, so CRS will not be emitted for other machine types. Thanks, Marcel

+
+As opposed to a PCI-2-PCI bridge's secondary bus, a PXB's bus
+is a primary bus and can be associated with a NUMA node
+(different from the main host bridge) allowing the guest OS
+to recognize the proximity of a pass-through device to
+other resources such as RAM and CPUs.
+
+Usage
+=====
+A detailed command line would be:
+
+[qemu-bin + storage options]
+-bios [seabios-dir]/out/bios.bin -L [seabios-dir]/out/
+-m 2G
+-object memory-backend-ram,size=1024M,policy=bind,host-nodes=0,id=ram-node0 -numa node,nodeid=0,cpus=0,memdev=ram-node0
+-object memory-backend-ram,size=1024M,policy=interleave,host-nodes=0,id=ram-node1 -numa node,nodeid=1,cpus=1,memdev=ram-node1
+-device pxb-device,id=bridge1,bus=pci.0,numa_node=1,bus_nr=4 -netdev user,id=nd -device e1000,bus=bridge1,addr=0x4,netdev=nd
+-device pxb-device,id=bridge2,bus=pci.0,numa_node=0,bus_nr=8 -device e1000,bus=bridge2,addr=0x3
+-device pxb-device,id=bridge3,bus=pci.0,bus_nr=40 -drive if=none,id=drive0,file=[img] -device virtio-blk-pci,drive=drive0,scsi=off,bus=bridge3,addr=1
+
+Here you have:
+ - 2 NUMA nodes for the guest, 0 and 1. (both mapped to the same NUMA node
+   in host, but you can and should put them in different host NUMA nodes)
+ - a pxb host bridge attached to NUMA 1 with an e1000 behind it
+ - a pxb host bridge attached to NUMA 0 with an e1000 behind it
+ - a pxb host bridge not attached to any NUMA node, with a hard drive behind it.
+
+Implementation
+==============
+The PXB is composed of:
+- HostBridge (TYPE_PXB_HOST)
+  The host bridge allows registering and querying the PXB's PCI root bus in QEMU.
+- PXBDev (TYPE_PXB_DEVICE)
+  It is a regular PCI Device that resides on the piix host-bridge bus, and its
+  bus uses the same PCI domain. However, the bus behind it is exposed through
+  ACPI as a primary PCI bus and starts a new PCI hierarchy. The interrupts from
+  devices behind the PXB are routed through this device the same as if it were
+  a PCI-2-PCI bridge. The _PRT follows the i440fx model.
+- PCIBridgeDev (TYPE_PCI_BRIDGE_DEV)
+  Created automatically as part of the init sequence.
+  When adding a device to a PXB it is attached to the bridge for two reasons:
+  - Using the bridge will enable hotplug support
+  - All the devices behind the bridge will use the bridge's IO/MEM windows,
+    compacting the PCI address space.
--
2.1.0