Re: [Qemu-devel] [Qemu-ppc] [RFC PATCH qemu] spapr_pci: Create PCI-express root bus by default
On Wed, 2016-12-14 at 20:26 +0200, Marcel Apfelbaum wrote: > > > > > > Maybe I just don't quite get the relationship between Root > > > > > > Complexes and Root Buses, but I guess my question is: what > > > > > > is preventing us from simply doing whatever a > > > > > > spapr-pci-host-bridge is doing in order to expose a legacy > > > > > > PCI Root Bus (pci.*) to the guest, and create a new > > > > > > spapr-pcie-host-bridge that exposes a PCIe Root Bus (pcie.*) > > > > > > instead? > > > > > > > > > > Hrm, the suggestion of providing both a vanilla-PCI and PCI-E host > > > > > bridge came up before. I think one of us spotted a problem with that, > > > > > but I don't recall what it was now. I guess one is how libvirt would > > > > > map its stupid-fake-domain-numbers to which root bus to use. > > > This would be a weird configuration, I never heard of something like that > > > on a bare metal machine, but I never worked on pseries, who knows... > > Which aspect? Having multiple independent host bridges is perfectly > > reasonable - x86 just doesn't do it well for rather stupid historical > > reasons. > I agree about the multiple host-bridges, it's actually what pxb/pxb-pcie > devices (kind of) do. > I was talking about having one PCI PHB and another PHB which is PCI Express. > I was referring to one system having both PCI and PCIe PHBs. Sorry if this confused you: what I was talking about was just having both a legacy PCI PHB and a PCI Express PHB available in QEMU, not using both for the same guest. The user would pick one or the other and stick with it. > > Again, one possible option here is to continue to treat pseries as > > having a vanilla-PCI bus, but with a special flag saying that it's > > magically able to connect PCI-E devices. > A PCIe bus supporting PCI devices is strange (QEMU allows it ...), > but a PCI bus supporting PCIe devices is hard to "swallow". I agree.
> I would say maybe make it a special case of a PCIe bus with different rules. > It can derive from the PCIe bus class and override the usual behavior > with PAPR-specific rules which happen to be similar to the PCI bus rules. It's pretty clear by now that libvirt will need to have some special handling of pseries guests. Having to choose between "weird legacy PCI" and "weird PCI Express", I think I'd rather go with the latter :) -- Andrea Bolognani / Red Hat / Virtualization
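For concreteness, the option being weighed here (a guest configured with either a legacy PCI PHB or a PCIe PHB, never both at once) might look roughly like this on the QEMU command line. Note that spapr-pcie-host-bridge is the device name proposed in this thread, not something QEMU provides; spapr-pci-host-bridge (with its index property) does exist.

```shell
# Sketch only: "spapr-pcie-host-bridge" is the hypothetical device proposed
# in this thread; spapr-pci-host-bridge is real. The user picks one kind of
# PHB and sticks with it.
build_cmdline() {
  local phb_kind=$1   # "pci" or "pcie"
  local args="qemu-system-ppc64 -machine pseries"
  case $phb_kind in
    pci)  args="$args -device spapr-pci-host-bridge,index=1" ;;
    pcie) args="$args -device spapr-pcie-host-bridge,index=1" ;;  # hypothetical
  esac
  echo "$args"
}
build_cmdline pcie
```

An extra PHB on pseries is just another -device with no dependency on a primary bus, unlike the pxb devices discussed later in the thread.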
Re: [Qemu-devel] [Qemu-ppc] [RFC PATCH qemu] spapr_pci: Create PCI-express root bus by default
On Wed, 2016-12-14 at 20:26 +0200, Marcel Apfelbaum wrote: > > > The Root complex includes the PCI bus, some configuration > > > registers if > > > needed, provides access to the configuration space, translates > > > relevant CPU > > > reads/writes to PCI(e) transactions... > > > > Do those configuration registers appear within PCI space, or > > outside > > it (e.g. raw MMIO or PIO registers)? > > > > Root Complexes use MMIO to expose the PCI configuration space, > they call it ECAM (enhanced configuration access mechanism) or > MMConfig. Not all of them do. There are plenty of PCIe RCs out there that do config space using different mechanisms, such as the good old indirect address/data method; ours is one of them. I.e., even in real "bare metal" powernv, where the root port is visible to Linux, we still go through firmware for config space to get through to those registers (among other things). My PHB3 implementation (not upstream yet and a bit bitrotted by now) exposed PCIe that way, including extended config space, and that was working fine. Cheers, Ben.
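As a point of reference, the ECAM/MMConfig mapping Marcel mentions is purely arithmetic: each function's 4 KiB of config space sits at a fixed offset from the ECAM base. A sketch of the standard offset computation (this is the generic PCIe layout, nothing PAPR-specific):

```shell
# ECAM: config-space offset below the MMConfig base is
#   bus<<20 | device<<15 | function<<12 | register
ecam_offset() {
  local bus=$1 dev=$2 fn=$3 reg=$4
  echo $(( (bus << 20) | (dev << 15) | (fn << 12) | reg ))
}
ecam_offset 0 2 0 4   # config register 4 of device 00:02.0
```

The indirect address/data method Ben describes instead writes a bus/device/function/register selector to one register and transfers data through another, which is part of why it can be tucked behind firmware calls.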
Re: [Qemu-devel] [Qemu-ppc] [RFC PATCH qemu] spapr_pci: Create PCI-express root bus by default
On 12/14/2016 04:46 AM, David Gibson wrote: On Tue, Dec 13, 2016 at 02:25:44PM +0200, Marcel Apfelbaum wrote: On 12/07/2016 06:42 PM, Andrea Bolognani wrote: [Added Marcel to CC] Hi, Sorry for the late reply. On Wed, 2016-12-07 at 15:11 +1100, David Gibson wrote: Is the difference between q35 and pseries guests with respect to PCIe only relevant when it comes to assigned devices, or in general? I'm asking this because you seem to focus entirely on assigned devices. Well, in a sense that's up to us. The only existing model we have is PowerVM, and PowerVM only does device passthrough, no emulated devices. PAPR doesn't really distinguish one way or the other, but it's written from the perspective of assuming that all PCI devices correspond to physical devices on the host. Okay, that makes sense. On q35, you'd generally expect physically separate (different slot) devices to appear under separate root complexes. This part I don't get at all, so please bear with me. The way I read it you're claiming that eg. a SCSI controller and a network adapter, being physically separate and assigned to separate PCI slots, should have a dedicated PCIe Root Complex each on a q35 guest. Not a PCIe Root Complex, but a PCIe Root port. Ah, sorry. As I said, I've been pretty confused by all the terminology. Right, my understanding was that if the devices were slotted, rather than integrated, each one would sit under a separate root complex, the root complex being a pseudo PCI to PCI bridge. I assume "slotted" means "plugged into a slot that's not one of those provided by pcie.0" or something along those lines. More on the root complex bit later. That doesn't match with my experience, where you would simply assign them to separate slots of the default PCIe Root Bus (pcie.0), eg. 00:01.0 and 00:02.0. The qemu default, or the libvirt default? I'm talking about the libvirt default, which is supposed to follow Marcel's PCIe Guidelines.
I think this represents treating the devices as though they were integrated devices in the host bridge. I believe on q35 they would not be hotpluggable. Correct. Please have a look at the new document regarding PCIe: docs/pcie.txt and the corresponding presentations. Yeah, that's indeed not quite what libvirt would do by default: in reality, there would be an ioh3420 between the pcie.0 slots and each device exactly to enable hotplug. but on pseries they would be (because we don't use the standard hot plug controller). We can account for that in libvirt and avoid adding the extra ioh3420s (or rather the upcoming generic PCIe Root Ports) for pseries guests. Maybe you're referring to the fact that you might want to create multiple PCIe Root Complexes in order to assign the host devices to separate guest NUMA nodes? How is creating multiple PCIe Root Complexes on q35 using pxb-pcie different than creating multiple PHBs using spapr-pci-host-bridge on pseries? Uh.. AIUI the root complex is the PCI to PCI bridge under which PCI-E slots appear. PXB is something different - essentially different host bridges as you say (though with some weird hacks to access config space, which make it dependent on the primary bus in a way which spapr PHBs are not). I'll admit I'm pretty confused myself about the exact distinction between root complex, root port and upstream and downstream ports. I think we both need to get our terminology straight :) I'm sure Marcel will be happy to point us in the right direction. My understanding is that the PCIe Root Complex is the piece of hardware that exposes a PCIe Root Bus (pcie.0 in QEMU); right Oh.. I wasn't as clear as I'd like to be on what the root complex is. But I thought the root complex did have some guest visible presence in the PCI tree. What you're describing here seems equivalent to what I'd call the PCI Host Bridge (== PHB).
Yes, a Root Complex is a type of Host Bridge in the sense it bridges between CPU/Memory Controller and the PCI subsystem. PXBs can be connected to slots in pcie.0 to create more buses that behave, for the most part, like pcie.0 and are mostly useful to bind devices to specific NUMA nodes. right Same applies to legacy PCI with the pxb (instead of pxb-pcie) device. pxb should not be used for PCIe machines, only for legacy PCI ones. Noted. And not for pseries at all. Note that because we have a para-virtualized platform (all PCI config access goes via hypercalls) the distinction between PCI and PCI-E is much blurrier than in the x86 case. OK In a similar fashion, PHBs are the hardware thingies that expose a PCI Root Bus (pci.0 and so on), the main difference being that they are truly independent: so a q35 guest will always have a "primary" PCIe Root Bus and (optionally) a bunch of "secondary" ones, but the same will not be the case for pseries guests. OK I don't think the difference is that important though, at least from libvirt's point of view: whether you're creating a pseries guest with two PHBs, or a q35 guest with its built-in PCIe Root Complex and an extra PCIe Expander Bus, you will end up with two "top level" buses that you can plug more devices into.
Re: [Qemu-devel] [Qemu-ppc] [RFC PATCH qemu] spapr_pci: Create PCI-express root bus by default
On 12/13/2016 05:15 PM, Benjamin Herrenschmidt wrote: On Tue, 2016-12-13 at 14:25 +0200, Marcel Apfelbaum wrote: Hrm, the suggestion of providing both a vanilla-PCI and PCI-E host bridge came up before. I think one of us spotted a problem with that, but I don't recall what it was now. I guess one is how libvirt would map its stupid-fake-domain-numbers to which root bus to use. This would be a weird configuration, I never heard of something like that on a bare metal machine, but I never worked on pseries, who knows... That's the thing though, it's *not* bare metal ;-) On bare metal, we use the "powernv" platform for which we haven't yet upstreamed the PCIE support, but there we have real PCIe root ports with all that's expected on them. They come on physically different PHB, one root complex per PHB, but they are "proper" PCIe. Hotplug is a bit weird still because it has to go through some FW interactions as the HW doesn't use the PCIe standard slot control bits (and our qemu model doesn't handle it yet) but otherwise it's standard. Thanks for the explanations, so bare metal is standard minus hotplug. "pseries" on the other hand is a paravirtualized platform. It's the environment presented to guests by our proprietary hypervisor (PowerVM, aka pHyp) and KVM/qemu. I think there's a public spec these days but to cut the story short, it doesn't expose PCIe "properly" for bad historical reasons. It tends to hide the parent, ie, the root port for slots connected directly to a PHB or the entire hierarchy from the root to (and including) the downstream switch port for slots under a switch. So you end up with PCIe devices devoid of a proper "parent" which is a bit weird. OK, now it's more "clear". I can get back to read the thread from the start :) Thanks, Marcel Hotplug is implemented using specific firmware interfaces that are identical with pre-PCIe (ie PCI/PCI-X) systems.
The FW has interfaces for supporting three kinds of hotplug:
- Entire PHBs
- P2P Bridges
- Individual devices on an existing PHB or P2P bridge
In practice these days mostly the first one is used for everything under pHyp, though I think we have a mix of 1 and 3 for KVM/qemu. Cheers, Ben.
Re: [Qemu-devel] [Qemu-ppc] [RFC PATCH qemu] spapr_pci: Create PCI-express root bus by default
On Tue, Dec 13, 2016 at 02:25:44PM +0200, Marcel Apfelbaum wrote: > On 12/07/2016 06:42 PM, Andrea Bolognani wrote: > > [Added Marcel to CC] > > > > > Hi, > > Sorry for the late reply. > > > On Wed, 2016-12-07 at 15:11 +1100, David Gibson wrote: > > > > Is the difference between q35 and pseries guests with > > > > respect to PCIe only relevant when it comes to assigned > > > > devices, or in general? I'm asking this because you seem to > > > > focus entirely on assigned devices. > > > > > > Well, in a sense that's up to us. The only existing model we have is > > > PowerVM, and PowerVM only does device passthrough, no emulated > > > devices. PAPR doesn't really distinguish one way or the other, but > > > it's written from the perspective of assuming that all PCI devices > > > correspond to physical devices on the host > > > > Okay, that makes sense. > > > > > > > On q35, > > > > > you'd generally expect physically separate (different slot) devices to > > > > > appear under separate root complexes. > > > > > > > > This part I don't get at all, so please bear with me. > > > > > > > > The way I read it you're claiming that eg. a SCSI controller > > > > and a network adapter, being physically separate and assigned > > > > to separate PCI slots, should have a dedicated PCIe Root > > > > Complex each on a q35 guest. > > > > > Not a PCIe Root Complex, but a PCIe Root port. Ah, sorry. As I said, I've been pretty confused by all the terminology. > > > Right, my understanding was that if the devices were slotted, rather > > > than integrated, each one would sit under a separate root complex, the > > > root complex being a pseudo PCI to PCI bridge. > > > > I assume "slotted" means "plugged into a slot that's not one > > of those provided by pcie.0" or something along those lines. > > > > More on the root complex bit later. 
> > > > > > That doesn't match with my experience, where you would simply > > > > assign them to separate slots of the default PCIe Root Bus > > > > (pcie.0), eg. 00:01.0 and 00:02.0. > > > > > > The qemu default, or the libvirt default? > > > > I'm talking about the libvirt default, which is supposed to > > follow Marcel's PCIe Guidelines. > > > > > I think this represents > > > treating the devices as though they were integrated devices in the > > > host bridge. I believe on q35 they would not be hotpluggable > > > > Correct. Please have a look at the new document > regarding pcie: docs/pcie.txt and the corresponding presentations. > > > > Yeah, that's indeed not quite what libvirt would do by > > default: in reality, there would be an ioh3420 between the > > pcie.0 slots and each device exactly to enable hotplug. > > > > > but on > > > pseries they would be (because we don't use the standard hot plug > > > controller). > > > > We can account for that in libvirt and avoid adding the > > extra ioh3420s (or rather the upcoming generic PCIe Root > > Ports) for pseries guests. > > > > > > Maybe you're referring to the fact that you might want to > > > > create multiple PCIe Root Complexes in order to assign the > > > > host devices to separate guest NUMA nodes? How is creating > > > > multiple PCIe Root Complexes on q35 using pxb-pcie different > > > > than creating multiple PHBs using spapr-pci-host-bridge on > > > > pseries? > > > > > > Uh.. AIUI the root complex is the PCI to PCI bridge under which PCI-E > > > slots appear. PXB is something different - essentially different host > > > bridges as you say (though with some weird hacks to access config > > > space, which make it dependent on the primary bus in a way which spapr > > > PHBs are not). > > > > > > I'll admit I'm pretty confused myself about the exact distinction > > > between root complex, root port and upstream and downstream ports.
> > > > I think we both need to get our terminology straight :) > > I'm sure Marcel will be happy to point us in the right > > direction. > > > > My understanding is that the PCIe Root Complex is the piece > > of hardware that exposes a PCIe Root Bus (pcie.0 in QEMU); > > right Oh.. I wasn't as clear as I'd like to be on what the root complex is. But I thought the root complex did have some guest visible presence in the PCI tree. What you're describing here seems equivalent to what I'd call the PCI Host Bridge (== PHB). > > PXBs can be connected to slots in pcie.0 to create more buses > > that behave, for the most part, like pcie.0 and are mostly > > useful to bind devices to specific NUMA nodes. > > right > > Same applies > > to legacy PCI with the pxb (instead of pxb-pcie) device. > > > > pxb should not be used for PCIe machines, only for legacy PCI ones. Noted. And not for pseries at all. Note that because we have a para-virtualized platform (all PCI config access goes via hypercalls) the distinction between PCI and PCI-E is much blurrier than in the x86 case. > > In a similar fashion, PHBs are the hardware thingies that > > expose a PCI Root Bus (pci.0 and so on), the main difference being that they are truly independent: so a q35 guest will always have a "primary" PCIe Root Bus and (optionally) a bunch of "secondary" ones, but the same will not be the case for pseries guests.
Re: [Qemu-devel] [Qemu-ppc] [RFC PATCH qemu] spapr_pci: Create PCI-express root bus by default
On Tue, Dec 13, 2016 at 09:15:37AM -0600, Benjamin Herrenschmidt wrote: > On Tue, 2016-12-13 at 14:25 +0200, Marcel Apfelbaum wrote: > > > > Hrm, the suggestion of providing both a vanilla-PCI and PCI-E host > > > > bridge came up before. I think one of us spotted a problem with that, > > > > but I don't recall what it was now. I guess one is how libvirt would > > > > map its stupid-fake-domain-numbers to which root bus to use. > > > > This would be a weird configuration, I never heard of something like that > > on a bare metal machine, but I never worked on pseries, who knows... > > That's the thing though, it's *not* bare metal ;-) > > On bare metal, we use the "powernv" platform for which we haven't yet > upstreamed > the PCIE support, but there we have real PCIe root ports with all that's > expected > on them. > > They come on physically different PHB, one root complex per PHB, but they are > "proper" PCIe. Hotplug is a bit weird still because it has to go through some > FW interactions as the HW doesn't use the PCIe standard slot control bits (and > our qemu model doesn't handle it yet) but otherwise it's standard. > > "pseries" on the other hand is a paravirtualized platform. It's the > environment > presented to guests by our proprietary hypervisor (PowerVM, aka pHyp) and > KVM/qemu. I think there's a public spec these days but to cut the story short, > it doesn't expose PCIe "properly" for bad historical reasons. > > It tends to hide the parent, ie, the root port for slots connected directly to > a PHB or the entire hierarchy from the root to (and including) the downstream > switch port for slots under a switch. > > So you end up with PCIe devices devoid of a proper "parent" which is a bit > weird. > > Hotplug is implemented using specific firmware interfaces that are identical > with pre-PCIe (ie PCI/PCI-X) systems.
The FW has interfaces for supporting three > kinds of hotplug: > > - Entire PHBs > - P2P Bridges > - Individual devices on an existing PHB or P2P bridge > > In practice these days mostly the first one is used for everything under pHyp, > though I think we have a mix of 1 and 3 for KVM/qemu. Alas, no, we only have (3) on KVM/qemu. Well, possibly you could do (2) by plugging a bridge device using (3) then a device onto the bridge. Whole PHB hotplug has never been implemented for qemu/KVM.. but we probably want to. --
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_!
http://www.ozlabs.org/~dgibson
Re: [Qemu-devel] [Qemu-ppc] [RFC PATCH qemu] spapr_pci: Create PCI-express root bus by default
On Tue, 2016-12-13 at 14:25 +0200, Marcel Apfelbaum wrote: > > > Hrm, the suggestion of providing both a vanilla-PCI and PCI-E host > > > bridge came up before. I think one of us spotted a problem with that, > > > but I don't recall what it was now. I guess one is how libvirt would > > > map its stupid-fake-domain-numbers to which root bus to use. > > This would be a weird configuration, I never heard of something like that > on a bare metal machine, but I never worked on pseries, who knows... That's the thing though, it's *not* bare metal ;-) On bare metal, we use the "powernv" platform for which we haven't yet upstreamed the PCIE support, but there we have real PCIe root ports with all that's expected on them. They come on physically different PHB, one root complex per PHB, but they are "proper" PCIe. Hotplug is a bit weird still because it has to go through some FW interactions as the HW doesn't use the PCIe standard slot control bits (and our qemu model doesn't handle it yet) but otherwise it's standard. "pseries" on the other hand is a paravirtualized platform. It's the environment presented to guests by our proprietary hypervisor (PowerVM, aka pHyp) and KVM/qemu. I think there's a public spec these days but to cut the story short, it doesn't expose PCIe "properly" for bad historical reasons. It tends to hide the parent, ie, the root port for slots connected directly to a PHB or the entire hierarchy from the root to (and including) the downstream switch port for slots under a switch. So you end up with PCIe devices devoid of a proper "parent" which is a bit weird. Hotplug is implemented using specific firmware interfaces that are identical with pre-PCIe (ie PCI/PCI-X) systems. The FW has interfaces for supporting three kinds of hotplug:
- Entire PHBs
- P2P Bridges
- Individual devices on an existing PHB or P2P bridge
In practice these days mostly the first one is used for everything under pHyp, though I think we have a mix of 1 and 3 for KVM/qemu. Cheers, Ben.
Re: [Qemu-devel] [Qemu-ppc] [RFC PATCH qemu] spapr_pci: Create PCI-express root bus by default
On Tue, 13 Dec 2016 14:25:44 +0200 Marcel Apfelbaum wrote: > >> Now... from what Laine was saying it sounds like more of the > >> differences between PCI-E placement and PCI placement may be > >> implemented by libvirt than qemu than I realized. So possibly we do > >> want to make the bus be PCI-E on the qemu side, but have libvirt use > >> the vanilla-PCI placement guidelines rather than PCI-E for pseries. > > > > Basically the special casing I was mentioning earlier. > > That looks complicated.. I wish I knew more about the pseries > PCIe stuff, does anyone know where I can get the info? (besides 'google > it'...) This is described in the LoPAPR specification, available at: https://openpowerfoundation.org/wp-content/uploads/2016/05/LoPAPR_DRAFT_v11_24March2016_cmt1.pdf Cheers. -- Greg
Re: [Qemu-devel] [Qemu-ppc] [RFC PATCH qemu] spapr_pci: Create PCI-express root bus by default
On 12/07/2016 06:42 PM, Andrea Bolognani wrote: [Added Marcel to CC] Hi, Sorry for the late reply. On Wed, 2016-12-07 at 15:11 +1100, David Gibson wrote: Is the difference between q35 and pseries guests with respect to PCIe only relevant when it comes to assigned devices, or in general? I'm asking this because you seem to focus entirely on assigned devices. Well, in a sense that's up to us. The only existing model we have is PowerVM, and PowerVM only does device passthrough, no emulated devices. PAPR doesn't really distinguish one way or the other, but it's written from the perspective of assuming that all PCI devices correspond to physical devices on the host. Okay, that makes sense. On q35, you'd generally expect physically separate (different slot) devices to appear under separate root complexes. This part I don't get at all, so please bear with me. The way I read it you're claiming that eg. a SCSI controller and a network adapter, being physically separate and assigned to separate PCI slots, should have a dedicated PCIe Root Complex each on a q35 guest. Not a PCIe Root Complex, but a PCIe Root port. Right, my understanding was that if the devices were slotted, rather than integrated, each one would sit under a separate root complex, the root complex being a pseudo PCI to PCI bridge. I assume "slotted" means "plugged into a slot that's not one of those provided by pcie.0" or something along those lines. More on the root complex bit later. That doesn't match with my experience, where you would simply assign them to separate slots of the default PCIe Root Bus (pcie.0), eg. 00:01.0 and 00:02.0. The qemu default, or the libvirt default? I'm talking about the libvirt default, which is supposed to follow Marcel's PCIe Guidelines. I think this represents treating the devices as though they were integrated devices in the host bridge. I believe on q35 they would not be hotpluggable. Correct.
Please have a look at the new document regarding PCIe: docs/pcie.txt and the corresponding presentations. Yeah, that's indeed not quite what libvirt would do by default: in reality, there would be an ioh3420 between the pcie.0 slots and each device exactly to enable hotplug. but on pseries they would be (because we don't use the standard hot plug controller). We can account for that in libvirt and avoid adding the extra ioh3420s (or rather the upcoming generic PCIe Root Ports) for pseries guests. Maybe you're referring to the fact that you might want to create multiple PCIe Root Complexes in order to assign the host devices to separate guest NUMA nodes? How is creating multiple PCIe Root Complexes on q35 using pxb-pcie different than creating multiple PHBs using spapr-pci-host-bridge on pseries? Uh.. AIUI the root complex is the PCI to PCI bridge under which PCI-E slots appear. PXB is something different - essentially different host bridges as you say (though with some weird hacks to access config space, which make it dependent on the primary bus in a way which spapr PHBs are not). I'll admit I'm pretty confused myself about the exact distinction between root complex, root port and upstream and downstream ports. I think we both need to get our terminology straight :) I'm sure Marcel will be happy to point us in the right direction. My understanding is that the PCIe Root Complex is the piece of hardware that exposes a PCIe Root Bus (pcie.0 in QEMU); right PXBs can be connected to slots in pcie.0 to create more buses that behave, for the most part, like pcie.0 and are mostly useful to bind devices to specific NUMA nodes. right Same applies to legacy PCI with the pxb (instead of pxb-pcie) device. pxb should not be used for PCIe machines, only for legacy PCI ones.
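A rough sketch of the pxb-pcie arrangement described above, as it would look on a q35 command line. The device and property names (pxb-pcie, bus_nr, numa_node, ioh3420) are real QEMU ones; the IDs, bus numbers and NUMA layout are purely illustrative.

```shell
# Sketch: q35 with one extra PCIe expander bus pinned to NUMA node 1.
# pxb-pcie plugs into a slot on pcie.0; a root port on the pxb's bus then
# accepts the actual endpoint. IDs, bus_nr and node numbers are illustrative.
args="qemu-system-x86_64 -machine q35 -m 4G -smp 2"
args="$args -numa node,nodeid=0 -numa node,nodeid=1"
args="$args -device pxb-pcie,id=pxb1,bus=pcie.0,bus_nr=0x40,numa_node=1"
args="$args -device ioh3420,id=rp1,bus=pxb1,chassis=1"
args="$args -device virtio-net-pci,bus=rp1"
echo "$args"
```

This is the "weird hacks to access config space" case David mentions: the pxb still depends on the primary bus, whereas an spapr PHB is fully independent.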
In a similar fashion, PHBs are the hardware thingies that expose a PCI Root Bus (pci.0 and so on), the main difference being that they are truly independent: so a q35 guest will always have a "primary" PCIe Root Bus and (optionally) a bunch of "secondary" ones, but the same will not be the case for pseries guests. OK I don't think the difference is that important though, at least from libvirt's point of view: whether you're creating a pseries guest with two PHBs, or a q35 guest with its built-in PCIe Root Complex and an extra PCIe Expander Bus, you will end up with two "top level" buses that you can plug more devices into. I agree If we had spapr-pcie-host-bridge, we could treat them mostly the same - with caveats such as the one described above, of course. Whereas on pseries they'll appear as siblings on a virtual bus (which makes no physical sense for point-to-point PCI-E). What is the virtual bus in question? Why would it matter that they're siblings? On pseries it won't. But my understanding is that libvirt won't create them that way on q35 - instead it will insert the RCs / P2P bridges to allow them to be hotplugged. Inserting that bridge may confuse pseries guests which aren't expecting it.
Re: [Qemu-devel] [Qemu-ppc] [RFC PATCH qemu] spapr_pci: Create PCI-express root bus by default
[Added Marcel to CC] On Wed, 2016-12-07 at 15:11 +1100, David Gibson wrote: > > Is the difference between q35 and pseries guests with > > respect to PCIe only relevant when it comes to assigned > > devices, or in general? I'm asking this because you seem to > > focus entirely on assigned devices. > > Well, in a sense that's up to us. The only existing model we have is > PowerVM, and PowerVM only does device passthrough, no emulated > devices. PAPR doesn't really distinguish one way or the other, but > it's written from the perspective of assuming that all PCI devices > correspond to physical devices on the host Okay, that makes sense. > > > On q35, > > > you'd generally expect physically separate (different slot) devices to > > > appear under separate root complexes. > > > > This part I don't get at all, so please bear with me. > > > > The way I read it you're claiming that eg. a SCSI controller > > and a network adapter, being physically separate and assigned > > to separate PCI slots, should have a dedicated PCIe Root > > Complex each on a q35 guest. > > Right, my understanding was that if the devices were slotted, rather > than integrated, each one would sit under a separate root complex, the > root complex being a pseudo PCI to PCI bridge. I assume "slotted" means "plugged into a slot that's not one of those provided by pcie.0" or something along those lines. More on the root complex bit later. > > That doesn't match with my experience, where you would simply > > assign them to separate slots of the default PCIe Root Bus > > (pcie.0), eg. 00:01.0 and 00:02.0. > > The qemu default, or the libvirt default? I'm talking about the libvirt default, which is supposed to follow Marcel's PCIe Guidelines. > I think this represents > treating the devices as though they were integrated devices in the > host bridge.
I believe on q35 they would not be hotpluggable. Yeah, that's indeed not quite what libvirt would do by default: in reality, there would be an ioh3420 between the pcie.0 slots and each device exactly to enable hotplug. > but on > pseries they would be (because we don't use the standard hot plug > controller). We can account for that in libvirt and avoid adding the extra ioh3420s (or rather the upcoming generic PCIe Root Ports) for pseries guests. > > Maybe you're referring to the fact that you might want to > > create multiple PCIe Root Complexes in order to assign the > > host devices to separate guest NUMA nodes? How is creating > > multiple PCIe Root Complexes on q35 using pxb-pcie different > > than creating multiple PHBs using spapr-pci-host-bridge on > > pseries? > > Uh.. AIUI the root complex is the PCI to PCI bridge under which PCI-E > slots appear. PXB is something different - essentially different host > bridges as you say (though with some weird hacks to access config > space, which make it dependent on the primary bus in a way which spapr > PHBs are not). > > I'll admit I'm pretty confused myself about the exact distinction > between root complex, root port and upstream and downstream ports. I think we both need to get our terminology straight :) I'm sure Marcel will be happy to point us in the right direction. My understanding is that the PCIe Root Complex is the piece of hardware that exposes a PCIe Root Bus (pcie.0 in QEMU); PXBs can be connected to slots in pcie.0 to create more buses that behave, for the most part, like pcie.0 and are mostly useful to bind devices to specific NUMA nodes. Same applies to legacy PCI with the pxb (instead of pxb-pcie) device.
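The libvirt behaviour described here, a root port between pcie.0 and each endpoint precisely so the endpoint stays hotpluggable, corresponds to something like the following. ioh3420 and its chassis property are real QEMU names; the IDs and slot addresses are illustrative.

```shell
# Sketch: each endpoint behind its own ioh3420 root port on pcie.0, which is
# what makes it hotpluggable on q35; plugged straight into pcie.0 it would
# instead behave as an integrated, non-hotpluggable endpoint.
# IDs and addr values are illustrative.
args="qemu-system-x86_64 -machine q35"
args="$args -device ioh3420,id=rp1,bus=pcie.0,addr=0x2,chassis=1"
args="$args -device virtio-scsi-pci,bus=rp1"
args="$args -device ioh3420,id=rp2,bus=pcie.0,addr=0x3,chassis=2"
args="$args -device e1000e,bus=rp2"
echo "$args"
```

On pseries the thread's point is that none of these root ports would be needed: PAPR hotplug works through firmware interfaces, so libvirt could skip them entirely.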
In a similar fashion, PHBs are the hardware thingies that expose a PCI Root Bus (pci.0 and so on), the main difference being that they are truly independent: so a q35 guest will always have a "primary" PCIe Root Bus and (optionally) a bunch of "secondary" ones, but the same will not be the case for pseries guests. I don't think the difference is that important though, at least from libvirt's point of view: whether you're creating a pseries guest with two PHBs, or a q35 guest with its built-in PCIe Root Complex and an extra PCIe Expander Bus, you will end up with two "top level" buses that you can plug more devices into. If we had spapr-pcie-host-bridge, we could treat them mostly the same - with caveats such as the one described above, of course. > > > Whereas on pseries they'll > > > appear as siblings on a virtual bus (which makes no physical sense for > > > point-to-point PCI-E). > > > > What is the virtual bus in question? Why would it matter > > that they're siblings? > > On pseries it won't. But my understanding is that libvirt won't > create them that way on q35 - instead it will insert the RCs / P2P > bridges to allow them to be hotplugged. Inserting that bridge may > confuse pseries guests which aren't expecting it. libvirt will automatically add PCIe Root Ports to make the devices hotpluggable on q35 guests, yes. But, as mentioned above, we can teach it not to. > > I'm possibly missing the point entirely, but so far it > > looks to me like there are different configurations you > > might want to use depending on your goal, and both q35 > > and pseries give you comparable tools to achieve such > > configurations.
Re: [Qemu-devel] [Qemu-ppc] [RFC PATCH qemu] spapr_pci: Create PCI-express root bus by default
On Tue, Dec 06, 2016 at 06:30:47PM +0100, Andrea Bolognani wrote: > On Fri, 2016-12-02 at 15:18 +1100, David Gibson wrote: > > > So, would the PCIe Root Bus in a pseries guest behave > > > differently than the one in a q35 or mach-virt guest? > > > > Yes. I had a long discussion with BenH and got a somewhat better idea > > about this. > > Sorry, but I'm afraid you're going to have to break this > down even further for me :( > > > If only a single host PE (== iommu group) is passed through and there > > are no emulated devices, the difference isn't too bad: basically on > > pseries you'll see the subtree that would be below the root complex on > > q35. > > > > But if you pass through multiple groups, things get weird. > > Is the difference between q35 and pseries guests with > respect to PCIe only relevant when it comes to assigned > devices, or in general? I'm asking this because you seem to > focus entirely on assigned devices. Well, in a sense that's up to us. The only existing model we have is PowerVM, and PowerVM only does device passthrough, no emulated devices. PAPR doesn't really distinguish one way or the other, but it's written from the perspective of assuming that all PCI devices correspond to physical devices on the host > > On q35, > > you'd generally expect physically separate (different slot) devices to > > appear under separate root complexes. > > This part I don't get at all, so please bear with me. > > The way I read it you're claiming that eg. a SCSI controller > and a network adapter, being physically separate and assigned > to separate PCI slots, should have a dedicated PCIe Root > Complex each on a q35 guest. Right, my understanding was that if the devices were slotted, rather than integrated, each one would sit under a separate root complex, the root complex being a pseudo PCI to PCI bridge. > That doesn't match with my experience, where you would simply > assign them to separate slots of the default PCIe Root Bus > (pcie.0), eg. 
00:01.0 and 00:02.0. The qemu default, or the libvirt default? I think this represents treating the devices as though they were integrated devices in the host bridge. I believe on q35 they would not be hotpluggable - but on pseries they would be (because we don't use the standard hot plug controller). > Maybe you're referring to the fact that you might want to > create multiple PCIe Root Complexes in order to assign the > host devices to separate guest NUMA nodes? How is creating > multiple PCIe Root Complexes on q35 using pxb-pcie different > than creating multiple PHBs using spapr-pci-host-bridge on > pseries? Uh.. AIUI the root complex is the PCI to PCI bridge under which PCI-E slots appear. PXB is something different - essentially different host bridges as you say (though with some weird hacks to access config space, which make it dependent on the primary bus in a way which spapr PHBs are not). I'll admit I'm pretty confused myself about the exact distinction between root complex, root port and upstream and downstream ports. > > Whereas on pseries they'll > > appear as siblings on a virtual bus (which makes no physical sense for > > point-to-point PCI-E). > > What is the virtual bus in question? Why would it matter > that they're siblings? On pseries it won't. But my understanding is that libvirt won't create them that way on q35 - instead it will insert the RCs / P2P bridges to allow them to be hotplugged. Inserting that bridge may confuse pseries guests which aren't expecting it. > I'm possibly missing the point entirely, but so far it > looks to me like there are different configurations you > might want to use depending on your goal, and both q35 > and pseries give you comparable tools to achieve such > configurations. > > > I suppose we could try treating all devices on pseries as though they > > were chipset builtin devices on q35, which will appear on the root > > PCI-E bus without root complex. 
But I suspect that's likely to cause > > trouble with hotplug, and it will certainly need different address > > allocation from libvirt. > > PCIe Integrated Endpoint Devices are not hotpluggable on > q35, that's why libvirt will follow QEMU's PCIe topology > recommendations and place a PCIe Root Port between them; > I assume the same could be done for pseries guests as > soon as QEMU grows support for generic PCIe Root Ports, > something Marcel has already posted patches for. Here you've hit on it. No, we should not do that for pseries, AFAICT. PAPR doesn't really have the concept of integrated endpoint devices, and all devices can be hotplugged via the PAPR mechanisms (and none can via the PCI-E standard hotplug mechanism). > Again, sorry for clearly misunderstanding your explanation, > but I'm still not seeing the issue here. I'm sure it's very > clear in your mind, but I'm afraid you're going to have to > walk me through it :( I wish it were entirely clear in my mind. Like I say I'm still pretty confused by exactly the
Re: [Qemu-devel] [Qemu-ppc] [RFC PATCH qemu] spapr_pci: Create PCI-express root bus by default
On Fri, 2016-12-02 at 15:18 +1100, David Gibson wrote: > > So, would the PCIe Root Bus in a pseries guest behave > > differently than the one in a q35 or mach-virt guest? > > Yes. I had a long discussion with BenH and got a somewhat better idea > about this. Sorry, but I'm afraid you're going to have to break this down even further for me :( > If only a single host PE (== iommu group) is passed through and there > are no emulated devices, the difference isn't too bad: basically on > pseries you'll see the subtree that would be below the root complex on > q35. > > But if you pass through multiple groups, things get weird. Is the difference between q35 and pseries guests with respect to PCIe only relevant when it comes to assigned devices, or in general? I'm asking this because you seem to focus entirely on assigned devices. > On q35, > you'd generally expect physically separate (different slot) devices to > appear under separate root complexes. This part I don't get at all, so please bear with me. The way I read it you're claiming that eg. a SCSI controller and a network adapter, being physically separate and assigned to separate PCI slots, should have a dedicated PCIe Root Complex each on a q35 guest. That doesn't match with my experience, where you would simply assign them to separate slots of the default PCIe Root Bus (pcie.0), eg. 00:01.0 and 00:02.0. Maybe you're referring to the fact that you might want to create multiple PCIe Root Complexes in order to assign the host devices to separate guest NUMA nodes? How is creating multiple PCIe Root Complexes on q35 using pxb-pcie different than creating multiple PHBs using spapr-pci-host-bridge on pseries? > Whereas on pseries they'll > appear as siblings on a virtual bus (which makes no physical sense for > point-to-point PCI-E). What is the virtual bus in question? Why would it matter that they're siblings? 
I'm possibly missing the point entirely, but so far it looks to me like there are different configurations you might want to use depending on your goal, and both q35 and pseries give you comparable tools to achieve such configurations. > I suppose we could try treating all devices on pseries as though they > were chipset builtin devices on q35, which will appear on the root > PCI-E bus without root complex. But I suspect that's likely to cause > trouble with hotplug, and it will certainly need different address > allocation from libvirt. PCIe Integrated Endpoint Devices are not hotpluggable on q35, that's why libvirt will follow QEMU's PCIe topology recommendations and place a PCIe Root Port between them; I assume the same could be done for pseries guests as soon as QEMU grows support for generic PCIe Root Ports, something Marcel has already posted patches for. Again, sorry for clearly misunderstanding your explanation, but I'm still not seeing the issue here. I'm sure it's very clear in your mind, but I'm afraid you're going to have to walk me through it :( > > Regardless of how we decide to move forward with the > > PCIe-enabled pseries machine type, libvirt will have to > > know about this so it can behave appropriately. > > So there are kind of two extremes of how to address this. There are a > variety of options in between, but I suspect they're going to be even > more muddled and hideous than the extremes. > > 1) Give up. You said there's already a flag that says a PCI-E bus is > able to accept vanilla-PCI devices. We add a hack flag that says a > vanilla-PCI bus is able to accept PCI-E devices. We keep address > allocation as it is now - the pseries topology really does resemble > vanilla-PCI much better than it does PCI-E. But, we allow PCI-E > devices, and PAPR has mechanisms for accessing the extended config > space. PCI-E standard hotplug and error reporting will never work, > but PAPR provides its own mechanisms for those, so that should be ok. 
We can definitely special-case pseries guests and take the "anything goes" approach to PCI vs PCIe, but it would certainly be nicer if we could avoid presenting our users the head-scratching situation of PCIe devices being plugged into legacy PCI slots and still showing up as PCIe in the guest. What about virtio devices, which present themselves either as legacy PCI or PCIe depending on the kind of slot they are plugged into? Would they show up as PCIe or legacy PCI on a PCIe-enabled pseries guest? > 2) Start exposing the PCI-E hierarchy for pseries guests much more > like q35, root complexes and all. It's not clear that PAPR actually > *forbids* exposing the root complex, it just doesn't require it and > that's not what PowerVM does. But.. there are big questions about > whether existing guests will cope with this or not. When you start > adding in multiple passed through devices and particularly virtual > functions as well, things could get very ugly - we might need to > construct multiple emulated virtual root complexes or other messes. > > In the short to medium term, I'm thinking option (1) seems pretty compelling.
Re: [Qemu-devel] [Qemu-ppc] [RFC PATCH qemu] spapr_pci: Create PCI-express root bus by default
On 03/12/16 08:41, Benjamin Herrenschmidt wrote: > On Fri, 2016-12-02 at 16:50 +1100, David Gibson wrote: >> >> Uh.. I don't entirely follow you. From the host point of view there >> are multiple iommu groups (PEs), but from the guest point of view >> there's only one. On the guest side iommu granularity is always >> per-vPHB. > > Ok so the H_PUT_TCE calls affect all the underlying tables ? Yes, it updates all hardware tables attached to the same guest's LIOBN (which in reality can only be seen on POWER7, as POWER8 shares tables between PEs). -- Alexey
Re: [Qemu-devel] [Qemu-ppc] [RFC PATCH qemu] spapr_pci: Create PCI-express root bus by default
On Fri, 2016-12-02 at 16:50 +1100, David Gibson wrote: > > Uh.. I don't entirely follow you. From the host point of view there > are multiple iommu groups (PEs), but from the guest point of view > there's only one. On the guest side iommu granularity is always > per-vPHB. Ok so the H_PUT_TCE calls affect all the underlying tables ? Cheers, Ben.
Re: [Qemu-devel] [Qemu-ppc] [RFC PATCH qemu] spapr_pci: Create PCI-express root bus by default
On Fri, Dec 02, 2016 at 04:17:50PM +1100, Benjamin Herrenschmidt wrote: > On Fri, 2016-12-02 at 15:18 +1100, David Gibson wrote: > > But if you pass through multiple groups, things get weird. On q35, > > you'd generally expect physically separate (different slot) devices to > > appear under separate root complexes. Whereas on pseries they'll > > appear as siblings on a virtual bus (which makes no physical sense for > > point-to-point PCI-E). > > It's also somewhat broken if they aren't in the same iommu domain > because the way we pass the iommu buid is via the parent node, so > a given iommu domain must reside below a common parent and not share > it. Uh.. I don't entirely follow you. From the host point of view there are multiple iommu groups (PEs), but from the guest point of view there's only one. On the guest side iommu granularity is always per-vPHB. > > I suppose we could try treating all devices on pseries as though they > > were chipset builtin devices on q35, which will appear on the root > > PCI-E bus without root complex. But I suspect that's likely to cause > > trouble with hotplug, and it will certainly need different address > > allocation from libvirt. -- David Gibson | david AT gibson.dropbear.id.au | http://www.ozlabs.org/~dgibson | "I'll have my music baroque, and my code minimalist, thank you. NOT _the_ _other_ _way_ _around_!"
Re: [Qemu-devel] [Qemu-ppc] [RFC PATCH qemu] spapr_pci: Create PCI-express root bus by default
On Fri, 2016-12-02 at 15:18 +1100, David Gibson wrote: > But if you pass through multiple groups, things get weird. On q35, > you'd generally expect physically separate (different slot) devices to > appear under separate root complexes. Whereas on pseries they'll > appear as siblings on a virtual bus (which makes no physical sense for > point-to-point PCI-E). It's also somewhat broken if they aren't in the same iommu domain because the way we pass the iommu buid is via the parent node, so a given iommu domain must reside below a common parent and not share it. > I suppose we could try treating all devices on pseries as though they > were chipset builtin devices on q35, which will appear on the root > PCI-E bus without root complex. But I suspect that's likely to cause > trouble with hotplug, and it will certainly need different address > allocation from libvirt.
Re: [Qemu-devel] [Qemu-ppc] [RFC PATCH qemu] spapr_pci: Create PCI-express root bus by default
On Fri, Nov 25, 2016 at 03:36:25PM +0100, Andrea Bolognani wrote: > On Wed, 2016-11-23 at 16:02 +1100, David Gibson wrote: > > > > The change from OHCI to XHCI only affected the *default* USB > > > > controller, which libvirt tries its best not to use anyway: > > > > instead, it will prefer to use '-M ...,usb=off' along with > > > > '-device ...' and set both the controller model and its PCI > > > > address explicitly, partially to shield its users from such > > > > changes in QEMU. > > > > > > Ok. Always liked this approach, really. We should start moving in this > > > direction with PHB - stop adding the default PHB at all when -nodefaults > > > is > > > passed (or -machine pseries,pci=off ?) and let libvirt manage PHBs itself > > > (and provide another spapr-phb type like spapr-pcie-host-bridge or add a > > > "pcie_root_bus_type" property to the existing PHB type). > > > > > > What would be wrong with this approach? > > > > Hm, that's a good point. If we were removing the default PHB entirely, > > that I would consider a possible case for a new machine type. Putting > > construction of the PHBs into libvirt's hands could make life simpler > > there. Although it would make it a bit of a pain for people starting > > qemu by hand. > > > > I think this option is worth some thought. > > Note that libvirt always runs QEMU with -nodefaults, so we > could just remove the default PHB if that flag is present, > and leave it in if that's not the case. > > The idea itself sounds good, but of course it will require > more work from the libvirt side than simply making the PCIe > machine type behave like q35 and mach-virt. Yeah, but the latter basically just won't work. > Moreover, we already have an RFE for supporting additional > PHBs, we could solve both issues in one fell swoop and have > the libvirt guest XML be a more faithful representation of > the actual virtual hardware, which is always a win in my > book. Right.
And the general recommendation for PAPR guests is that each passed through device (or rather, each passed through iommu group) have a separate virtual PHB on the guest. With this rework libvirt could switch over to implementing that recommendation. > That will be even more important if it turns out that the > rules for PCIe device assignment (eg. what device/controller > can be plugged into which slot) are different for PCIe PHBs > than they are for q35/mach-virt PCIe Root Bus. I've asked > for clarifications about this elsewhere in the thread. > > -- > Andrea Bolognani / Red Hat / Virtualization
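The recommendation above — one virtual PHB per passed-through iommu group — would look roughly like this on the QEMU command line. The `spapr-pci-host-bridge` device and its `index` property are real, but the IDs, host BDFs and guest bus names below are invented for illustration, not taken from a real host:

```shell
# Hypothetical sketch: one vPHB per host iommu group, each vfio-pci
# device placed on its own PHB (names and addresses are examples only).
qemu-system-ppc64 -machine pseries \
    -device spapr-pci-host-bridge,index=1,id=phb1 \
    -device vfio-pci,host=0001:03:00.0,bus=phb1.0 \
    -device spapr-pci-host-bridge,index=2,id=phb2 \
    -device vfio-pci,host=0002:01:00.0,bus=phb2.0
```

This mirrors what PowerVM does on real hardware, where each passed-through PE sits under its own PHB from the guest's point of view.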
Re: [Qemu-devel] [Qemu-ppc] [RFC PATCH qemu] spapr_pci: Create PCI-express root bus by default
On Fri, Nov 25, 2016 at 02:46:21PM +0100, Andrea Bolognani wrote: > On Wed, 2016-11-23 at 16:00 +1100, David Gibson wrote: > > > Existing libvirt versions assume that pseries guests have > > > a legacy PCI root bus, and will base their PCI address > > > allocation / PCI topology decisions on that fact: they > > > will, for example, use legacy PCI bridges. > > > > Um.. yeah.. trouble is libvirt's PCI-E address allocation probably > > won't work for spapr PCI-E either, because of the weird PCI-E without > > root complex presentation we get in PAPR. > > So, would the PCIe Root Bus in a pseries guest behave > differently than the one in a q35 or mach-virt guest? Yes. I had a long discussion with BenH and got a somewhat better idea about this. If only a single host PE (== iommu group) is passed through and there are no emulated devices, the difference isn't too bad: basically on pseries you'll see the subtree that would be below the root complex on q35. But if you pass through multiple groups, things get weird. On q35, you'd generally expect physically separate (different slot) devices to appear under separate root complexes. Whereas on pseries they'll appear as siblings on a virtual bus (which makes no physical sense for point-to-point PCI-E). I suppose we could try treating all devices on pseries as though they were chipset builtin devices on q35, which will appear on the root PCI-E bus without root complex. But I suspect that's likely to cause trouble with hotplug, and it will certainly need different address allocation from libvirt. > Does it have a different number of slots, do we have to > plug different controllers into them, ...? > > Regardless of how we decide to move forward with the > PCIe-enabled pseries machine type, libvirt will have to > know about this so it can behave appropriately. So there are kind of two extremes of how to address this. 
There are a variety of options in between, but I suspect they're going to be even more muddled and hideous than the extremes. 1) Give up. You said there's already a flag that says a PCI-E bus is able to accept vanilla-PCI devices. We add a hack flag that says a vanilla-PCI bus is able to accept PCI-E devices. We keep address allocation as it is now - the pseries topology really does resemble vanilla-PCI much better than it does PCI-E. But, we allow PCI-E devices, and PAPR has mechanisms for accessing the extended config space. PCI-E standard hotplug and error reporting will never work, but PAPR provides its own mechanisms for those, so that should be ok. 2) Start exposing the PCI-E hierarchy for pseries guests much more like q35, root complexes and all. It's not clear that PAPR actually *forbids* exposing the root complex, it just doesn't require it and that's not what PowerVM does. But.. there are big questions about whether existing guests will cope with this or not. When you start adding in multiple passed through devices and particularly virtual functions as well, things could get very ugly - we might need to construct multiple emulated virtual root complexes or other messes. In the short to medium term, I'm thinking option (1) seems pretty compelling. > > > > I believe after we introduced the very first > > > > pseries-pcie-X.Y, we will just stop adding new pseries-X.Y. > > > > > > Isn't i440fx still being updated despite the fact that q35 > > > exists? Granted, there are a lot more differences between > > > those two machine types than just the root bus type. > > > > Right, there are heaps of differences between i440fx and q35, and > > reasons to keep both updated. For pseries we have neither the impetus > > nor the resources to maintain two different machine type variants, > > where the only difference is between legacy PCI and weirdly presented > > PCI-E.
> > Calling the PCIe machine type either pseries-2.8 or > pseries-pcie-2.8 would result in the very same amount of > work, and in both cases it would be understood that the > legacy PCI machine type is no longer going to be updated, > but can still be used to run existing guests. So, I'm not sure if the idea of a new machine type has legs or not, but let's think it through a bit further. Suppose we have a new machine type, let's call it 'papr'. I'm thinking it would be (at least with -nodefaults) basically a super-minimal version of pseries: so each PHB would have to be explicitly created, the VIO bridge would have to be explicitly created, likewise the NVRAM. Not sure about the "devices" which really represent firmware features - the RTC, RNG, hypervisor event source and so forth. Might have some advantages. Then again, it doesn't really solve the specific problem here. It means libvirt (or the user) has to explicitly choose a PCI or PCI-E PHB to put things on, but libvirt's PCI-E address allocation will still be wrong in all probability. Guh. As an aside, here's a RANT. libvirt address allocation. Seriously, W. T. F! libvirt insists on owning address allocation
Re: [Qemu-devel] [Qemu-ppc] [RFC PATCH qemu] spapr_pci: Create PCI-express root bus by default
On Wed, 2016-11-23 at 16:02 +1100, David Gibson wrote: > > > The change from OHCI to XHCI only affected the *default* USB > > > controller, which libvirt tries its best not to use anyway: > > > instead, it will prefer to use '-M ...,usb=off' along with > > > '-device ...' and set both the controller model and its PCI > > > address explicitly, partially to shield its users from such > > > changes in QEMU. > > > > Ok. Always liked this approach, really. We should start moving in this > > direction with PHB - stop adding the default PHB at all when -nodefaults is > > passed (or -machine pseries,pci=off ?) and let libvirt manage PHBs itself > > (and provide another spapr-phb type like spapr-pcie-host-bridge or add a > > "pcie_root_bus_type" property to the existing PHB type). > > > > What would be wrong with this approach? > > Hm, that's a good point. If we were removing the default PHB entirely, > that I would consider a possible case for a new machine type. Putting > construction of the PHBs into libvirt's hands could make life simpler > there. Although it would make it a bit of a pain for people starting > qemu by hand. > > I think this option is worth some thought. Note that libvirt always runs QEMU with -nodefaults, so we could just remove the default PHB if that flag is present, and leave it in if that's not the case. The idea itself sounds good, but of course it will require more work from the libvirt side than simply making the PCIe machine type behave like q35 and mach-virt. Moreover, we already have an RFE for supporting additional PHBs, we could solve both issues in one fell swoop and have the libvirt guest XML be a more faithful representation of the actual virtual hardware, which is always a win in my book. That will be even more important if it turns out that the rules for PCIe device assignment (eg. what device/controller can be plugged into which slot) are different for PCIe PHBs than they are for q35/mach-virt PCIe Root Bus.
I've asked for clarifications about this elsewhere in the thread. -- Andrea Bolognani / Red Hat / Virtualization
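The RFE mentioned above — having the guest XML model each PHB explicitly — might produce a fragment along the lines of the sketch below. The element and attribute names are assumptions for illustration, not the libvirt schema of the time:

```shell
# Write a hypothetical libvirt XML fragment describing two spapr PHBs,
# each modeled as its own pci-root controller (schema is assumed).
cat > phbs.xml <<'EOF'
<controller type='pci' index='0' model='pci-root'>
  <target index='0'/>
</controller>
<controller type='pci' index='1' model='pci-root'>
  <target index='1'/>
</controller>
EOF
```

With something like this, the XML would be a faithful map of the virtual hardware: one controller element per PHB, instead of a single implicit root bus.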
Re: [Qemu-devel] [Qemu-ppc] [RFC PATCH qemu] spapr_pci: Create PCI-express root bus by default
On Wed, 2016-11-23 at 16:00 +1100, David Gibson wrote: > > Existing libvirt versions assume that pseries guests have > > a legacy PCI root bus, and will base their PCI address > > allocation / PCI topology decisions on that fact: they > > will, for example, use legacy PCI bridges. > > Um.. yeah.. trouble is libvirt's PCI-E address allocation probably > won't work for spapr PCI-E either, because of the weird PCI-E without > root complex presentation we get in PAPR. So, would the PCIe Root Bus in a pseries guest behave differently than the one in a q35 or mach-virt guest? Does it have a different number of slots, do we have to plug different controllers into them, ...? Regardless of how we decide to move forward with the PCIe-enabled pseries machine type, libvirt will have to know about this so it can behave appropriately. > > > I believe after we introduced the very first > > > pseries-pcie-X.Y, we will just stop adding new pseries-X.Y. > > > > Isn't i440fx still being updated despite the fact that q35 > > exists? Granted, there are a lot more differences between > > those two machine types than just the root bus type. > > Right, there are heaps of differences between i440fx and q35, and > reasons to keep both updated. For pseries we have neither the impetus > nor the resources to maintain two different machine type variants, > where the only difference is between legacy PCI and weirdly presented > PCI-E. Calling the PCIe machine type either pseries-2.8 or pseries-pcie-2.8 would result in the very same amount of work, and in both cases it would be understood that the legacy PCI machine type is no longer going to be updated, but can still be used to run existing guests. -- Andrea Bolognani / Red Hat / Virtualization
Re: [Qemu-devel] [Qemu-ppc] [RFC PATCH qemu] spapr_pci: Create PCI-express root bus by default
On Tue, Nov 22, 2016 at 01:26:49PM +1100, Alexey Kardashevskiy wrote: > On 22/11/16 00:08, Andrea Bolognani wrote: > > On Mon, 2016-11-21 at 13:12 +1100, Alexey Kardashevskiy wrote: > > 1) switch to PCI Express on newer machine types, and > >expose some sort of capability through QMP so that > >libvirt can know about the switch > > > > [...] > > Option 1) would break horribly with existing libvirt > > versions, and so would Option 2) if we default to using > > How exactly 1) will break libvirt? Migrating from pseries-2.7 to > pseries-2.8 does not work anyway, and machines are allowed to behave > different from version to version, what distinct difference will using > "pseries-pcie-X.Y" make? > >>> > >>> Existing libvirt versions assume that pseries guests have > >> > >> If libvirt is using just "pseries" (without a version), then having a > >> virtual PCIe-PCI bridge (and "pci.0" always available by default) will do > >> it. > > > > Please don't. Any device that is included in the guest > > by default and can't be turned off makes libvirt's life > > significantly more difficult (see below). > > > >> If libvirt is using a specific version of pseries, then it already knows > >> that <=2.7 has pci.0 as a root, pcie.0 otherwise. libvirt has a knowledge > >> what QEMU version has what, right? > > > > It doesn't yet, that's the point :) > > > > We *could* add such knowledge to libvirt[1], but *existing* > > libvirt versions would still not know about it, which means > > that upgrading QEMU without upgrading libvirt will result > > in failure to create new guests. > > > > > >> In what scenario will an additional machine type help? > > > > Because then libvirt could learn that > > > > pseries-x.y <-> pci.0 > > pseries-pcie-x.y <-> pcie.0 > > > > the same way it already knows that > > > > pc-i440fx-x.y <-> pci.0 > > pc-q35-x.y <-> pcie.0 > > > > and choosing between one or the other would be, I think, > > much easier for the user as well.
> > > >>> a legacy PCI root bus, and will base their PCI address > >>> allocation / PCI topology decisions on that fact: they > >>> will, for example, use legacy PCI bridges. > >>> > >>> So if you used a new QEMU binary with a libvirt version > >>> that doesn't know about the change, new guests would end up > >>> using the wrong controllers. Existing guests would not be > >>> affected as they would stick with the older machine types, > >>> of course. > >>> > I believe after we introduced the very first > pseries-pcie-X.Y, we will just stop adding new pseries-X.Y. > >>> > >>> Isn't i440fx still being updated despite the fact that q35 > >>> exists? Granted, there are a lot more differences between > >>> those two machine types than just the root bus type. > >> > >> I do not know about i440<->q35 but in pseries the difference is going to be > >> very simple. > >> > >> For example, we did not change the machine type when we switched from > >> default OHCI to XHCI, switching from PCI to PCIe does not sound like we > >> need a whole new machine type for this either. > > > > The change from OHCI to XHCI only affected the *default* USB > > controller, which libvirt tries its best not to use anyway: > > instead, it will prefer to use '-M ...,usb=off' along with > > '-device ...' and set both the controller model and its PCI > > address explicitly, partially to shield its users from such > > changes in QEMU. > > > Ok. Always liked this approach, really. We should start moving in this > direction with PHB - stop adding the default PHB at all when -nodefaults is > passed (or -machine pseries,pci=off ?) and let libvirt manage PHBs itself > (and provide another spapr-phb type like spapr-pcie-host-bridge or add a > "pcie_root_bus_type" property to the existing PHB type). > > What would be wrong with this approach? Hm, that's a good point. If we were removing the default PHB entirely, that I would consider a possible case for a new machine type.
Putting construction of the PHBs into libvirt's hands could make life simpler there. Although it would make it a bit of a pain for people starting qemu by hand. I think this option is worth some thought. -- David Gibson
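If the default PHB really were suppressed under -nodefaults as proposed, a hand-started guest might need an invocation like the sketch below. This is speculative: QEMU at the time of this thread still creates the default PHB regardless, and the spapr-pcie-host-bridge type is only being proposed here, not an existing device:

```shell
# Hypothetical sketch: no default PHB is created, so every PHB (and the
# devices on it) must be spelled out explicitly.  spapr-pcie-host-bridge
# is the proposed PCIe PHB type; ids and bus names are invented.
qemu-system-ppc64 -machine pseries -nodefaults \
    -device spapr-pcie-host-bridge,index=0,id=phb0 \
    -device virtio-scsi-pci,bus=phb0.0,id=scsi0
```

This is the "bit of a pain for people starting qemu by hand" trade-off: management layers gain full control, but bare-hands users lose the convenience of the implicit PHB.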
Re: [Qemu-devel] [Qemu-ppc] [RFC PATCH qemu] spapr_pci: Create PCI-express root bus by default
On Fri, Nov 18, 2016 at 09:17:22AM +0100, Andrea Bolognani wrote: > On Thu, 2016-11-17 at 13:02 +1100, Alexey Kardashevskiy wrote: > > > That said, considering that a big part of the PCI address > > > allocation logic is based off whether the specific machine > > > type exposes a legay PCI Root Bus or a PCI Express Root Bus, > > > libvirt will need a way to be able to tell which one is which. > > > > > > Version checks are pretty much out of the question, as they > > > fail as soon as downstream releases enter the picture. A > > > few ways we could deal with the situation: > > > > > > 1) switch to PCI Express on newer machine types, and > > > expose some sort of capability through QMP so that > > > libvirt can know about the switch > > [...] > > > Option 1) would break horribly with existing libvirt > > > versions, and so would Option 2) if we default to using > > > > How exactly 1) will break libvirt? Migrating from pseries-2.7 to > > pseries-2.8 does not work anyway, and machines are allowed to behave > > different from version to version, what distinct difference will using > > "pseries-pcie-X.Y" make? > > Existing libvirt versions assume that pseries guests have > a legacy PCI root bus, and will base their PCI address > allocation / PCI topology decisions on that fact: they > will, for example, use legacy PCI bridges. Um.. yeah.. trouble is libvirt's PCI-E address allocation probably won't work for spapr PCI-E either, because of the weird PCI-E without root complex presentation we get in PAPR. > So if you used a new QEMU binary with a libvirt version > that doesn't know about the change, new guests would end up > using the wrong controllers. Existing guests would not be > affected as they would stick with the older machine types, > of course. > > > I believe after we introduced the very first > > pseries-pcie-X.Y, we will just stop adding new pseries-X.Y. > > Isn't i440fx still being updated despite the fact that q35 > exists? 
Granted, there are a lot more differences between > those two machine types than just the root bus type. Right, there are heaps of differences between i440fx and q35, and reasons to keep both updated. For pseries we have neither the impetus nor the resources to maintain two different machine type variants, where the only difference is between legacy PCI and weirdly presented PCI-E. > Even if no newer pseries-x.y were to be added after > introducing pseries-pcie, you could still easily create > guests that use either root bus, so no loss in functionality. > > -- > Andrea Bolognani / Red Hat / Virtualization
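The machine-type-to-root-bus mapping discussed in this thread (pc-i440fx-* and pseries-* imply pci.0; pc-q35-* and the proposed pseries-pcie-* would imply pcie.0) can be sketched as a simple lookup. The pseries-pcie-* names are the proposal under discussion, not a machine type that exists:

```shell
# Sketch of the mapping a management layer like libvirt would learn.
# Note the order: the more specific pseries-pcie-* pattern must be
# matched before the generic pseries-* one.
root_bus_for_machine() {
    case "$1" in
        pc-q35-*|pseries-pcie-*) echo pcie.0 ;;
        pc-i440fx-*|pseries-*)   echo pci.0 ;;
        *)                       echo unknown ;;
    esac
}

root_bus_for_machine pseries-2.7
root_bus_for_machine pseries-pcie-2.8
```

With such a table, choosing between legacy PCI and PCIe is just a matter of picking the machine type, which is the main argument for a separate pseries-pcie-X.Y name.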
Re: [Qemu-devel] [Qemu-ppc] [RFC PATCH qemu] spapr_pci: Create PCI-express root bus by default
On 22/11/16 00:08, Andrea Bolognani wrote: > On Mon, 2016-11-21 at 13:12 +1100, Alexey Kardashevskiy wrote: > 1) switch to PCI Express on newer machine types, and >expose some sort of capability through QMP so that >libvirt can know about the switch > > [...] > Option 1) would break horribly with existing libvirt > versions, and so would Option 2) if we default to using How exactly 1) will break libvirt? Migrating from pseries-2.7 to pseries-2.8 does not work anyway, and machines are allowed to behave different from version to version, what distinct difference will using "pseries-pcie-X.Y" make? >>> >>> Existing libvirt versions assume that pseries guests have >> >> If libvirt is using just "pseries" (without a version), then having a >> virtual PCIe-PCI bridge (and "pci.0" always available by default) will do it. > > Please don't. Any device that is included in the guest > by default and can't be turned off makes libvirt's life > significantly more difficult (see below). > >> If libvirt is using a specific version of pseries, then it already knows >> that <=2.7 has pci.0 as a root, pcie.0 otherwise. libvirt has a knowledge >> what QEMU version has what, right? > > It doesn't yet, that's the point :) > > We *could* add such knowledge to libvirt[1], but *existing* > libvirt versions would still not know about it, which means > that upgrading QEMU without upgrading libvirt will result > in failure to create new guests. > > >> In what scenario will an additional machine type help? > > Because then libvirt could learn that > > pseries-x.y <-> pci.0 > pseries-pcie-x.y <-> pcie.0 > > the same way it already knows that > > pc-i440fx-x.y <-> pci.0 > pc-q35-x.y <-> pcie.0 > > and choosing between one or the other would be, I think, > much easier for the user as well. > >>> a legacy PCI root bus, and will base their PCI address >>> allocation / PCI topology decisions on that fact: they >>> will, for example, use legacy PCI bridges.
>>> >>> So if you used a new QEMU binary with a libvirt version >>> that doesn't know about the change, new guests would end up >>> using the wrong controllers. Existing guests would not be >>> affected as they would stick with the older machine types, >>> of course. >>> I believe that after we introduce the very first pseries-pcie-X.Y, we will just stop adding new pseries-X.Y. >>> >>> Isn't i440fx still being updated despite the fact that q35 >>> exists? Granted, there are a lot more differences between >>> those two machine types than just the root bus type. >> >> I do not know about i440<->q35 but in pseries the difference is going to be >> very simple. >> >> For example, we did not change the machine type when we switched from >> default OHCI to XHCI, switching from PCI to PCIe does not sound like we >> need a whole new machine type for this either. > > The change from OHCI to XHCI only affected the *default* USB > controller, which libvirt tries its best not to use anyway: > instead, it will prefer to use '-M ...,usb=off' along with > '-device ...' and set both the controller model and its PCI > address explicitly, partially to shield its users from such > changes in QEMU. Ok. I have always liked this approach, really. We should start moving in this direction with PHBs - stop adding the default PHB at all when -nodefaults is passed (or -machine pseries,pci=off ?) and let libvirt manage PHBs itself (and provide another spapr-phb type like spapr-pcie-host-bridge or add a "pcie_root_bus_type" property to the existing PHB type). What would be wrong with this approach? > > Let's not forget that libvirt is a management layer, and as > such it needs to have as much control as possible over the > virtual hardware presented to the guest, mostly for guest ABI > compatibility purposes. 
Defaulting to XHCI instead of OHCI, > same as pretty much all other defaults, is good for people who > run QEMU directly, but libvirt needs to be able to control > such knobs itself in order to effectively perform the task it > was designed for. > > Moreover, we're talking about a more fundamental change here: > the PCI Root Bus is not just any random device, it's the one > foundation upon which the entire PCI hierarchy is built. Based > on whether the machine exposes a PCI Express Root Bus or a > legacy PCI Root Bus, libvirt will create very > different PCI hierarchies, e.g. using several ioh3420 devices > instead of a single pci-bridge device. > > I'm still not sure if that's enough to warrant an entirely new > machine type, but it definitely has a more far-reaching impact > than simply flipping the default USB controller from OHCI to > XHCI. > >>> Even if no newer pseries-x.y were to be added after >>> introducing pseries-pcie, you could still easily create >>> guests that use either root bus, so no loss in functionality. >> >> I could do this with the existing pseries if the machine had a "root bus type" property.
Re: [Qemu-devel] [Qemu-ppc] [RFC PATCH qemu] spapr_pci: Create PCI-express root bus by default
On Mon, 2016-11-21 at 13:12 +1100, Alexey Kardashevskiy wrote: > > > > 1) switch to PCI Express on newer machine types, and > > > > expose some sort of capability through QMP so that > > > > libvirt can know about the switch > > > > > > > > [...] > > > > Option 1) would break horribly with existing libvirt > > > > versions, and so would Option 2) if we default to using > > > > > > How exactly will 1) break libvirt? Migrating from pseries-2.7 to > > > pseries-2.8 does not work anyway, and machines are allowed to behave > > > differently from version to version, so what distinct difference will using > > > "pseries-pcie-X.Y" make? > > > > Existing libvirt versions assume that pseries guests have > > If libvirt is using just "pseries" (without a version), then having a > virtual PCIe-PCI bridge (and "pci.0" always available by default) will do it. Please don't. Any device that is included in the guest by default and can't be turned off makes libvirt's life significantly more difficult (see below). > If libvirt is using a specific version of pseries, then it already knows > that <=2.7 has pci.0 as a root, pcie.0 otherwise. libvirt has knowledge of > what QEMU version has what, right? It doesn't yet, that's the point :) We *could* add such knowledge to libvirt[1], but *existing* libvirt versions would still not know about it, which means that upgrading QEMU without upgrading libvirt will result in failure to create new guests. > In what scenario will an additional machine type help? Because then libvirt could learn that pseries-x.y <-> pci.0 pseries-pcie-x.y <-> pcie.0 the same way it already knows that pc-i440fx-x.y <-> pci.0 pc-q35-x.y <-> pcie.0 and choosing between one or the other would be, I think, much easier for the user as well. > > a legacy PCI root bus, and will base their PCI address > > allocation / PCI topology decisions on that fact: they > > will, for example, use legacy PCI bridges. 
> > > > So if you used a new QEMU binary with a libvirt version > > that doesn't know about the change, new guests would end up > > using the wrong controllers. Existing guests would not be > > affected as they would stick with the older machine types, > > of course. > > > > > I believe that after we introduce the very first > > > pseries-pcie-X.Y, we will just stop adding new pseries-X.Y. > > > > Isn't i440fx still being updated despite the fact that q35 > > exists? Granted, there are a lot more differences between > > those two machine types than just the root bus type. > > I do not know about i440<->q35 but in pseries the difference is going to be > very simple. > > For example, we did not change the machine type when we switched from > default OHCI to XHCI, switching from PCI to PCIe does not sound like we > need a whole new machine type for this either. The change from OHCI to XHCI only affected the *default* USB controller, which libvirt tries its best not to use anyway: instead, it will prefer to use '-M ...,usb=off' along with '-device ...' and set both the controller model and its PCI address explicitly, partially to shield its users from such changes in QEMU. Let's not forget that libvirt is a management layer, and as such it needs to have as much control as possible over the virtual hardware presented to the guest, mostly for guest ABI compatibility purposes. Defaulting to XHCI instead of OHCI, same as pretty much all other defaults, is good for people who run QEMU directly, but libvirt needs to be able to control such knobs itself in order to effectively perform the task it was designed for. Moreover, we're talking about a more fundamental change here: the PCI Root Bus is not just any random device, it's the one foundation upon which the entire PCI hierarchy is built. Based on whether the machine exposes a PCI Express Root Bus or a legacy PCI Root Bus, libvirt will create very different PCI hierarchies, e.g. 
using several ioh3420 devices instead of a single pci-bridge device. I'm still not sure if that's enough to warrant an entirely new machine type, but it definitely has a more far-reaching impact than simply flipping the default USB controller from OHCI to XHCI. > > Even if no newer pseries-x.y were to be added after > > introducing pseries-pcie, you could still easily create > > guests that use either root bus, so no loss in functionality. > > I could do this with the existing pseries if the machine had a "root bus > type" property. That was indeed one of my original proposals ;) Just note, once again, that the default for this property would have to be "off" or we would run into the same issues described above. [1] Even though, as I mentioned earlier in the thread, performing version checks on machine types is frowned upon as it turns into a minefield as soon as backports and downstream machine types enter the picture... It's much better to expose some flag through QMP for libvirt to introspect. -- Andrea Bolognani / Red Hat / Virtualization
Re: [Qemu-devel] [Qemu-ppc] [RFC PATCH qemu] spapr_pci: Create PCI-express root bus by default
On 18/11/16 19:17, Andrea Bolognani wrote: > On Thu, 2016-11-17 at 13:02 +1100, Alexey Kardashevskiy wrote: >>> That said, considering that a big part of the PCI address >>> allocation logic is based off whether the specific machine >>> type exposes a legacy PCI Root Bus or a PCI Express Root Bus, >>> libvirt will need a way to be able to tell which one is which. >>> >>> Version checks are pretty much out of the question, as they >>> fail as soon as downstream releases enter the picture. A >>> few ways we could deal with the situation: >>> >>> 1) switch to PCI Express on newer machine types, and >>> expose some sort of capability through QMP so that >>> libvirt can know about the switch > > [...] >>> Option 1) would break horribly with existing libvirt >>> versions, and so would Option 2) if we default to using >> >> How exactly will 1) break libvirt? Migrating from pseries-2.7 to >> pseries-2.8 does not work anyway, and machines are allowed to behave >> differently from version to version, so what distinct difference will using >> "pseries-pcie-X.Y" make? > > Existing libvirt versions assume that pseries guests have If libvirt is using just "pseries" (without a version), then having a virtual PCIe-PCI bridge (and "pci.0" always available by default) will do it. If libvirt is using a specific version of pseries, then it already knows that <=2.7 has pci.0 as a root, pcie.0 otherwise. libvirt has knowledge of what QEMU version has what, right? In what scenario will an additional machine type help? > a legacy PCI root bus, and will base their PCI address > allocation / PCI topology decisions on that fact: they > will, for example, use legacy PCI bridges. > > So if you used a new QEMU binary with a libvirt version > that doesn't know about the change, new guests would end up > using the wrong controllers. Existing guests would not be > affected as they would stick with the older machine types, > of course. 
> >> I believe that after we introduce the very first >> pseries-pcie-X.Y, we will just stop adding new pseries-X.Y. > > Isn't i440fx still being updated despite the fact that q35 > exists? Granted, there are a lot more differences between > those two machine types than just the root bus type. I do not know about i440<->q35 but in pseries the difference is going to be very simple. For example, we did not change the machine type when we switched from default OHCI to XHCI, switching from PCI to PCIe does not sound like we need a whole new machine type for this either. > Even if no newer pseries-x.y were to be added after > introducing pseries-pcie, you could still easily create > guests that use either root bus, so no loss in functionality. I could do this with the existing pseries if the machine had a "root bus type" property. -- Alexey
Re: [Qemu-devel] [Qemu-ppc] [RFC PATCH qemu] spapr_pci: Create PCI-express root bus by default
On Thu, Nov 17, 2016 at 01:02:57PM +1100, Alexey Kardashevskiy wrote: > On 16/11/16 01:02, Andrea Bolognani wrote: > > On Tue, 2016-11-01 at 13:46 +1100, David Gibson wrote: > >> On Mon, Oct 31, 2016 at 03:10:23PM +1100, Alexey Kardashevskiy wrote: > >>> > >>> On 31/10/16 13:53, David Gibson wrote: > > On Fri, Oct 28, 2016 at 12:07:12PM +0200, Greg Kurz wrote: > > > > On Fri, 28 Oct 2016 18:56:40 +1100 > > Alexey Kardashevskiy wrote: > > > >> > >> At the moment sPAPR PHB creates a root bus of TYPE_PCI_BUS type. > >> This means that vfio-pci devices attached to it (and this is > >> the default behaviour) hide PCIe extended capabilities, as > >> the bus does not pass a pci_bus_is_express(pdev->bus) check. > >> > >> This change adds a default PCI bus type property to sPAPR PHB > >> and uses TYPE_PCIE_BUS if none is passed; older machines get TYPE_PCI_BUS > >> for backward compatibility. As the bus type is used in the bus name, > >> the root bus name becomes "pcie.0" instead of "pci.0". > >> > >> Signed-off-by: Alexey Kardashevskiy > >> --- > >> > >> What can possibly go wrong with such a change of name? > >> From the devices' perspective, I cannot see any. > >> > >> libvirt might get upset as "pci.0" will not be available; > >> would it make sense to create pcie.0 as a root bus and always > >> add a PCIe->PCI bridge and name its bus "pci.0"? > >> > >> Or create root bus from TYPE_PCIE_BUS and force name to "pci.0"? > >> pci_register_bus() can do this. 
> >> > >> > >> --- > >> hw/ppc/spapr.c | 5 + > >> hw/ppc/spapr_pci.c | 5 - > >> include/hw/pci-host/spapr.h | 1 + > >> 3 files changed, 10 insertions(+), 1 deletion(-) > >> > >> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c > >> index 0b3820b..a268511 100644 > >> --- a/hw/ppc/spapr.c > >> +++ b/hw/ppc/spapr.c > >> @@ -2541,6 +2541,11 @@ DEFINE_SPAPR_MACHINE(2_8, "2.8", true); > >> .driver = TYPE_SPAPR_PCI_HOST_BRIDGE, \ > >> .property = "mem64_win_size", \ > >> .value = "0", \ > >> +}, \ > >> +{ \ > >> +.driver = TYPE_SPAPR_PCI_HOST_BRIDGE, \ > >> +.property = "root_bus_type", \ > >> +.value = TYPE_PCI_BUS, \ > >> }, > >> > >> static void phb_placement_2_7(sPAPRMachineState *spapr, uint32_t > >> index, > >> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c > >> index 7cde30e..2fa1f22 100644 > >> --- a/hw/ppc/spapr_pci.c > >> +++ b/hw/ppc/spapr_pci.c > >> @@ -1434,7 +1434,9 @@ static void spapr_phb_realize(DeviceState *dev, > >> Error **errp) > >> bus = pci_register_bus(dev, NULL, > >> pci_spapr_set_irq, pci_spapr_map_irq, > >> sphb, > >> &sphb->memspace, &sphb->iospace, > >> - PCI_DEVFN(0, 0), PCI_NUM_PINS, > >> TYPE_PCI_BUS); > >> + PCI_DEVFN(0, 0), PCI_NUM_PINS, > >> + sphb->root_bus_type ? sphb->root_bus_type : > >> + TYPE_PCIE_BUS); > > > > Shouldn't we ensure that sphb->root_bus_type is either TYPE_PCIE_BUS or > > TYPE_PCI_BUS ? > > Yes, I think so. In fact, I think it would be better to make the > property a boolean that just selects PCI-E, rather than this which > exposes qemu (semi-)internal type names on the command line. > >>> > >>> Sure, a "pcie-root" boolean property should do. > >>> > >>> However this is not my main concern, I rather wonder if we have to have > >>> pci.0 when we pick PCIe for the root. > >> > >> Right. > >> > >> I've added Andrea Bolognani to the CC list to get a libvirt perspective. 
> > > > Thanks for doing so: changes such as this one can have quite > > an impact on the upper layers of the stack, so the earlier > > libvirt is involved in the discussion, the better. > > > > I'm going to go a step further and cross-post to libvir-list > > in order to give other libvirt contributors a chance to chime > > in too. > > > >> Andrea, > >> > >> To summarise the issue here: > >> * As I've said before the PAPR spec kinda-sorta abstracts the > >> difference between vanilla PCI and PCI-E > >> * However, because within qemu we're declaring the bus as PCI that > >> means some PCI-E devices aren't working right > >> * In particular it means that PCI-E extended config space isn't > >> available > >> > >> The proposal is to change (on newer machine types) the spapr PHB code > >> to declare a PCI-E bus instead. AIUI this still won't make the root >> complex guest visible (which it's not supposed to be under PAPR), and >> the guest shouldn't see a difference in most cases - it will still see >> the PAPR abstracted PCIish bus, but will now be able to get extended >> config space. 
Re: [Qemu-devel] [Qemu-ppc] [RFC PATCH qemu] spapr_pci: Create PCI-express root bus by default
On Thu, 2016-11-17 at 13:02 +1100, Alexey Kardashevskiy wrote: > > That said, considering that a big part of the PCI address > > allocation logic is based off whether the specific machine > > type exposes a legacy PCI Root Bus or a PCI Express Root Bus, > > libvirt will need a way to be able to tell which one is which. > > > > Version checks are pretty much out of the question, as they > > fail as soon as downstream releases enter the picture. A > > few ways we could deal with the situation: > > > > 1) switch to PCI Express on newer machine types, and > > expose some sort of capability through QMP so that > > libvirt can know about the switch [...] > > Option 1) would break horribly with existing libvirt > > versions, and so would Option 2) if we default to using > > How exactly will 1) break libvirt? Migrating from pseries-2.7 to > pseries-2.8 does not work anyway, and machines are allowed to behave > differently from version to version, so what distinct difference will using > "pseries-pcie-X.Y" make? Existing libvirt versions assume that pseries guests have a legacy PCI root bus, and will base their PCI address allocation / PCI topology decisions on that fact: they will, for example, use legacy PCI bridges. So if you used a new QEMU binary with a libvirt version that doesn't know about the change, new guests would end up using the wrong controllers. Existing guests would not be affected as they would stick with the older machine types, of course. > I believe that after we introduce the very first > pseries-pcie-X.Y, we will just stop adding new pseries-X.Y. Isn't i440fx still being updated despite the fact that q35 exists? Granted, there are a lot more differences between those two machine types than just the root bus type. Even if no newer pseries-x.y were to be added after introducing pseries-pcie, you could still easily create guests that use either root bus, so no loss in functionality. -- Andrea Bolognani / Red Hat / Virtualization
Re: [Qemu-devel] [Qemu-ppc] [RFC PATCH qemu] spapr_pci: Create PCI-express root bus by default
On 16/11/16 01:02, Andrea Bolognani wrote: > On Tue, 2016-11-01 at 13:46 +1100, David Gibson wrote: >> On Mon, Oct 31, 2016 at 03:10:23PM +1100, Alexey Kardashevskiy wrote: >>> >>> On 31/10/16 13:53, David Gibson wrote: On Fri, Oct 28, 2016 at 12:07:12PM +0200, Greg Kurz wrote: > > On Fri, 28 Oct 2016 18:56:40 +1100 > Alexey Kardashevskiy wrote: > >> >> At the moment sPAPR PHB creates a root bus of TYPE_PCI_BUS type. >> This means that vfio-pci devices attached to it (and this is >> the default behaviour) hide PCIe extended capabilities, as >> the bus does not pass a pci_bus_is_express(pdev->bus) check. >> >> This change adds a default PCI bus type property to sPAPR PHB >> and uses TYPE_PCIE_BUS if none is passed; older machines get TYPE_PCI_BUS >> for backward compatibility. As the bus type is used in the bus name, >> the root bus name becomes "pcie.0" instead of "pci.0". >> >> Signed-off-by: Alexey Kardashevskiy >> --- >> >> What can possibly go wrong with such a change of name? >> From the devices' perspective, I cannot see any. >> >> libvirt might get upset as "pci.0" will not be available; >> would it make sense to create pcie.0 as a root bus and always >> add a PCIe->PCI bridge and name its bus "pci.0"? >> >> Or create root bus from TYPE_PCIE_BUS and force name to "pci.0"? >> pci_register_bus() can do this. 
>> >> >> --- >> hw/ppc/spapr.c | 5 + >> hw/ppc/spapr_pci.c | 5 - >> include/hw/pci-host/spapr.h | 1 + >> 3 files changed, 10 insertions(+), 1 deletion(-) >> >> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c >> index 0b3820b..a268511 100644 >> --- a/hw/ppc/spapr.c >> +++ b/hw/ppc/spapr.c >> @@ -2541,6 +2541,11 @@ DEFINE_SPAPR_MACHINE(2_8, "2.8", true); >> .driver = TYPE_SPAPR_PCI_HOST_BRIDGE, \ >> .property = "mem64_win_size", \ >> .value = "0", \ >> +}, \ >> +{ \ >> +.driver = TYPE_SPAPR_PCI_HOST_BRIDGE, \ >> +.property = "root_bus_type", \ >> +.value = TYPE_PCI_BUS, \ >> }, >> >> static void phb_placement_2_7(sPAPRMachineState *spapr, uint32_t index, >> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c >> index 7cde30e..2fa1f22 100644 >> --- a/hw/ppc/spapr_pci.c >> +++ b/hw/ppc/spapr_pci.c >> @@ -1434,7 +1434,9 @@ static void spapr_phb_realize(DeviceState *dev, >> Error **errp) >> bus = pci_register_bus(dev, NULL, >> pci_spapr_set_irq, pci_spapr_map_irq, sphb, >> &sphb->memspace, &sphb->iospace, >> - PCI_DEVFN(0, 0), PCI_NUM_PINS, TYPE_PCI_BUS); >> + PCI_DEVFN(0, 0), PCI_NUM_PINS, >> + sphb->root_bus_type ? sphb->root_bus_type : >> + TYPE_PCIE_BUS); > > Shouldn't we ensure that sphb->root_bus_type is either TYPE_PCIE_BUS or > TYPE_PCI_BUS ? Yes, I think so. In fact, I think it would be better to make the property a boolean that just selects PCI-E, rather than this which exposes qemu (semi-)internal type names on the command line. >>> >>> Sure, a "pcie-root" boolean property should do. >>> >>> However this is not my main concern, I rather wonder if we have to have >>> pci.0 when we pick PCIe for the root. >> >> Right. >> >> I've added Andrea Bolognani to the CC list to get a libvirt perspective. > > Thanks for doing so: changes such as this one can have quite > an impact on the upper layers of the stack, so the earlier > libvirt is involved in the discussion, the better. 
> > I'm going to go a step further and cross-post to libvir-list > in order to give other libvirt contributors a chance to chime > in too. > >> Andrea, >> >> To summarise the issue here: >> * As I've said before the PAPR spec kinda-sorta abstracts the >> difference between vanilla PCI and PCI-E >> * However, because within qemu we're declaring the bus as PCI that >> means some PCI-E devices aren't working right >> * In particular it means that PCI-E extended config space isn't >> available >> >> The proposal is to change (on newer machine types) the spapr PHB code >> to declare a PCI-E bus instead. AIUI this still won't make the root >> complex guest visible (which it's not supposed to be under PAPR), and >> the guest shouldn't see a difference in most cases - it will still see >> the PAPR abstracted PCIish bus, but will now be able to get extended >> config space. >> >> The possible problem from a libvirt perspective is that doing this in >> the simplest way in qemu would change the name of the default bus from >> pci.0 to pcie.0.
Re: [Qemu-devel] [Qemu-ppc] [RFC PATCH qemu] spapr_pci: Create PCI-express root bus by default
On Tue, 2016-11-01 at 13:46 +1100, David Gibson wrote: > On Mon, Oct 31, 2016 at 03:10:23PM +1100, Alexey Kardashevskiy wrote: > > > > On 31/10/16 13:53, David Gibson wrote: > > > > > > On Fri, Oct 28, 2016 at 12:07:12PM +0200, Greg Kurz wrote: > > > > > > > > On Fri, 28 Oct 2016 18:56:40 +1100 > > > > Alexey Kardashevskiy wrote: > > > > > > > > > > > > > > At the moment sPAPR PHB creates a root bus of TYPE_PCI_BUS type. > > > > > This means that vfio-pci devices attached to it (and this is > > > > > the default behaviour) hide PCIe extended capabilities, as > > > > > the bus does not pass a pci_bus_is_express(pdev->bus) check. > > > > > > > > > > This change adds a default PCI bus type property to sPAPR PHB > > > > > and uses TYPE_PCIE_BUS if none is passed; older machines get TYPE_PCI_BUS > > > > > for backward compatibility. As the bus type is used in the bus name, > > > > > the root bus name becomes "pcie.0" instead of "pci.0". > > > > > > > > > > Signed-off-by: Alexey Kardashevskiy > > > > > --- > > > > > > > > > > What can possibly go wrong with such a change of name? > > > > > From the devices' perspective, I cannot see any. > > > > > > > > > > libvirt might get upset as "pci.0" will not be available; > > > > > would it make sense to create pcie.0 as a root bus and always > > > > > add a PCIe->PCI bridge and name its bus "pci.0"? > > > > > > > > > > Or create root bus from TYPE_PCIE_BUS and force name to "pci.0"? > > > > > pci_register_bus() can do this. 
> > > > > > > > > > > > --- > > > > > hw/ppc/spapr.c | 5 + > > > > > hw/ppc/spapr_pci.c | 5 - > > > > > include/hw/pci-host/spapr.h | 1 + > > > > > 3 files changed, 10 insertions(+), 1 deletion(-) > > > > > > > > > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c > > > > > index 0b3820b..a268511 100644 > > > > > --- a/hw/ppc/spapr.c > > > > > +++ b/hw/ppc/spapr.c > > > > > @@ -2541,6 +2541,11 @@ DEFINE_SPAPR_MACHINE(2_8, "2.8", true); > > > > > .driver = TYPE_SPAPR_PCI_HOST_BRIDGE, \ > > > > > .property = "mem64_win_size", \ > > > > > .value = "0", \ > > > > > +}, \ > > > > > +{ \ > > > > > +.driver = TYPE_SPAPR_PCI_HOST_BRIDGE, \ > > > > > +.property = "root_bus_type", \ > > > > > +.value = TYPE_PCI_BUS, \ > > > > > }, > > > > > > > > > > static void phb_placement_2_7(sPAPRMachineState *spapr, uint32_t > > > > > index, > > > > > diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c > > > > > index 7cde30e..2fa1f22 100644 > > > > > --- a/hw/ppc/spapr_pci.c > > > > > +++ b/hw/ppc/spapr_pci.c > > > > > @@ -1434,7 +1434,9 @@ static void spapr_phb_realize(DeviceState *dev, > > > > > Error **errp) > > > > > bus = pci_register_bus(dev, NULL, > > > > > pci_spapr_set_irq, pci_spapr_map_irq, > > > > > sphb, > > > > > &sphb->memspace, &sphb->iospace, > > > > > - PCI_DEVFN(0, 0), PCI_NUM_PINS, > > > > > TYPE_PCI_BUS); > > > > > + PCI_DEVFN(0, 0), PCI_NUM_PINS, > > > > > + sphb->root_bus_type ? sphb->root_bus_type > > > > > : > > > > > + TYPE_PCIE_BUS); > > > > > > > > Shouldn't we ensure that sphb->root_bus_type is either TYPE_PCIE_BUS or > > > > TYPE_PCI_BUS ? > > > > > > Yes, I think so. In fact, I think it would be better to make the > > > property a boolean that just selects PCI-E, rather than this which > > > exposes qemu (semi-)internal type names on the command line. > > > > Sure, a "pcie-root" boolean property should do. > > > > However this is not my main concern, I rather wonder if we have to have > > pci.0 when we pick PCIe for the root. > > Right. 
> > I've added Andrea Bolognani to the CC list to get a libvirt perspective. Thanks for doing so: changes such as this one can have quite an impact on the upper layers of the stack, so the earlier libvirt is involved in the discussion, the better. I'm going to go a step further and cross-post to libvir-list in order to give other libvirt contributors a chance to chime in too. > Andrea, > > To summarise the issue here: > * As I've said before the PAPR spec kinda-sorta abstracts the > difference between vanilla PCI and PCI-E > * However, because within qemu we're declaring the bus as PCI that > means some PCI-E devices aren't working right > * In particular it means that PCI-E extended config space isn't > available > > The proposal is to change (on newer machine types) the spapr PHB code > to declare a PCI-E bus instead. AIUI this still won't make the root > complex guest visible (which it's not supposed to be under PAPR), and > the guest shouldn't see a difference in most cases - it will still see > the PAPR abstracted PCIish bus, but will now be able to get extended > config space.
Re: [Qemu-devel] [Qemu-ppc] [RFC PATCH qemu] spapr_pci: Create PCI-express root bus by default
On Mon, Oct 31, 2016 at 03:10:23PM +1100, Alexey Kardashevskiy wrote: > On 31/10/16 13:53, David Gibson wrote: > > On Fri, Oct 28, 2016 at 12:07:12PM +0200, Greg Kurz wrote: > >> On Fri, 28 Oct 2016 18:56:40 +1100 > >> Alexey Kardashevskiy wrote: > >> > >>> At the moment sPAPR PHB creates a root bus of TYPE_PCI_BUS type. > >>> This means that vfio-pci devices attached to it (and this is > >>> the default behaviour) hide PCIe extended capabilities, as > >>> the bus does not pass a pci_bus_is_express(pdev->bus) check. > >>> > >>> This change adds a default PCI bus type property to sPAPR PHB > >>> and uses TYPE_PCIE_BUS if none is passed; older machines get TYPE_PCI_BUS > >>> for backward compatibility. As the bus type is used in the bus name, > >>> the root bus name becomes "pcie.0" instead of "pci.0". > >>> > >>> Signed-off-by: Alexey Kardashevskiy > >>> --- > >>> > >>> What can possibly go wrong with such a change of name? > >>> From the devices' perspective, I cannot see any. > >>> > >>> libvirt might get upset as "pci.0" will not be available; > >>> would it make sense to create pcie.0 as a root bus and always > >>> add a PCIe->PCI bridge and name its bus "pci.0"? > >>> > >>> Or create root bus from TYPE_PCIE_BUS and force name to "pci.0"? > >>> pci_register_bus() can do this. 
> >>> > >>> > >>> --- > >>> hw/ppc/spapr.c | 5 + > >>> hw/ppc/spapr_pci.c | 5 - > >>> include/hw/pci-host/spapr.h | 1 + > >>> 3 files changed, 10 insertions(+), 1 deletion(-) > >>> > >>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c > >>> index 0b3820b..a268511 100644 > >>> --- a/hw/ppc/spapr.c > >>> +++ b/hw/ppc/spapr.c > >>> @@ -2541,6 +2541,11 @@ DEFINE_SPAPR_MACHINE(2_8, "2.8", true); > >>> .driver = TYPE_SPAPR_PCI_HOST_BRIDGE, \ > >>> .property = "mem64_win_size", \ > >>> .value = "0", \ > >>> +}, \ > >>> +{ \ > >>> +.driver = TYPE_SPAPR_PCI_HOST_BRIDGE, \ > >>> +.property = "root_bus_type", \ > >>> +.value = TYPE_PCI_BUS, \ > >>> }, > >>> > >>> static void phb_placement_2_7(sPAPRMachineState *spapr, uint32_t index, > >>> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c > >>> index 7cde30e..2fa1f22 100644 > >>> --- a/hw/ppc/spapr_pci.c > >>> +++ b/hw/ppc/spapr_pci.c > >>> @@ -1434,7 +1434,9 @@ static void spapr_phb_realize(DeviceState *dev, > >>> Error **errp) > >>> bus = pci_register_bus(dev, NULL, > >>> pci_spapr_set_irq, pci_spapr_map_irq, sphb, > >>> &sphb->memspace, &sphb->iospace, > >>> - PCI_DEVFN(0, 0), PCI_NUM_PINS, TYPE_PCI_BUS); > >>> + PCI_DEVFN(0, 0), PCI_NUM_PINS, > >>> + sphb->root_bus_type ? sphb->root_bus_type : > >>> + TYPE_PCIE_BUS); > >> > >> Shouldn't we ensure that sphb->root_bus_type is either TYPE_PCIE_BUS or > >> TYPE_PCI_BUS ? > > > > Yes, I think so. In fact, I think it would be better to make the > > property a boolean that just selects PCI-E, rather than this which > > exposes qemu (semi-)internal type names on the command line. > > > Sure, a "pcie-root" boolean property should do. > > However this is not my main concern, I rather wonder if we have to have > pci.0 when we pick PCIe for the root. Right. I've added Andrea Bolognani to the CC list to get a libvirt perspective. 
Andrea, To summarise the issue here: * As I've said before the PAPR spec kinda-sorta abstracts the difference between vanilla PCI and PCI-E * However, because within qemu we're declaring the bus as PCI that means some PCI-E devices aren't working right * In particular it means that PCI-E extended config space isn't available The proposal is to change (on newer machine types) the spapr PHB code to declare a PCI-E bus instead. AIUI this still won't make the root complex guest visible (which it's not supposed to be under PAPR), and the guest shouldn't see a difference in most cases - it will still see the PAPR abstracted PCIish bus, but will now be able to get extended config space. The possible problem from a libvirt perspective is that doing this in the simplest way in qemu would change the name of the default bus from pci.0 to pcie.0. We have two suggested ways to mitigate this: 1) Automatically create a PCI-E to PCI bridge, so that new machine types will have both a pcie.0 and pci.0 bus 2) Force the name of the bus to be pci.0, even though it's treated as PCI-E in other ways. We're trying to work out exactly what will and won't cause trouble for libvirt. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_!
Re: [Qemu-devel] [Qemu-ppc] [RFC PATCH qemu] spapr_pci: Create PCI-express root bus by default
On 31/10/16 13:53, David Gibson wrote: > On Fri, Oct 28, 2016 at 12:07:12PM +0200, Greg Kurz wrote: >> On Fri, 28 Oct 2016 18:56:40 +1100 >> Alexey Kardashevskiy wrote: >> >>> At the moment sPAPR PHB creates a root bus of TYPE_PCI_BUS type. >>> This means that vfio-pci devices attached to it (and this is >>> the default behaviour) hide PCIe extended capabilities, as >>> the bus does not pass a pci_bus_is_express(pdev->bus) check. >>> >>> This change adds a default PCI bus type property to sPAPR PHB >>> and uses TYPE_PCIE_BUS if none is passed; older machines get TYPE_PCI_BUS >>> for backward compatibility. As the bus type is used in the bus name, >>> the root bus name becomes "pcie.0" instead of "pci.0". >>> >>> Signed-off-by: Alexey Kardashevskiy >>> --- >>> >>> What can possibly go wrong with such a change of name? >>> From the devices' perspective, I cannot see any. >>> >>> libvirt might get upset as "pci.0" will not be available; >>> would it make sense to create pcie.0 as a root bus and always >>> add a PCIe->PCI bridge and name its bus "pci.0"? >>> >>> Or create root bus from TYPE_PCIE_BUS and force name to "pci.0"? >>> pci_register_bus() can do this. 
>>> >>> >>> --- >>> hw/ppc/spapr.c | 5 + >>> hw/ppc/spapr_pci.c | 5 - >>> include/hw/pci-host/spapr.h | 1 + >>> 3 files changed, 10 insertions(+), 1 deletion(-) >>> >>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c >>> index 0b3820b..a268511 100644 >>> --- a/hw/ppc/spapr.c >>> +++ b/hw/ppc/spapr.c >>> @@ -2541,6 +2541,11 @@ DEFINE_SPAPR_MACHINE(2_8, "2.8", true); >>> .driver = TYPE_SPAPR_PCI_HOST_BRIDGE, \ >>> .property = "mem64_win_size", \ >>> .value = "0", \ >>> +}, \ >>> +{ \ >>> +.driver = TYPE_SPAPR_PCI_HOST_BRIDGE, \ >>> +.property = "root_bus_type", \ >>> +.value = TYPE_PCI_BUS, \ >>> }, >>> >>> static void phb_placement_2_7(sPAPRMachineState *spapr, uint32_t index, >>> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c >>> index 7cde30e..2fa1f22 100644 >>> --- a/hw/ppc/spapr_pci.c >>> +++ b/hw/ppc/spapr_pci.c >>> @@ -1434,7 +1434,9 @@ static void spapr_phb_realize(DeviceState *dev, Error >>> **errp) >>> bus = pci_register_bus(dev, NULL, >>> pci_spapr_set_irq, pci_spapr_map_irq, sphb, >>> &sphb->memspace, &sphb->iospace, >>> - PCI_DEVFN(0, 0), PCI_NUM_PINS, TYPE_PCI_BUS); >>> + PCI_DEVFN(0, 0), PCI_NUM_PINS, >>> + sphb->root_bus_type ? sphb->root_bus_type : >>> + TYPE_PCIE_BUS); >> >> Shouldn't we ensure that sphb->root_bus_type is either TYPE_PCIE_BUS or >> TYPE_PCI_BUS ? > > Yes, I think so. In fact, I think it would be better to make the > property a boolean that just selects PCI-E, rather than this which > exposes qemu (semi-)internal type names on the command line. Sure, a "pcie-root" boolean property should do. However this is not my main concern, I rather wonder if we have to have pci.0 when we pick PCIe for the root. 
>> Otherwise, if we pass something like:
>>
>>   -global spapr-pci-host-bridge.root_bus_type=pcie
>>
>> we get the following not really explicit error:
>>
>> ERROR:/home/greg/Work/qemu/qemu-spapr/qom/object.c:474:object_new_with_type:
>> assertion failed: (type != NULL)
>>
>>>      phb->bus = bus;
>>>      qbus_set_hotplug_handler(BUS(phb->bus), DEVICE(sphb), NULL);
>>>
>>> @@ -1590,6 +1592,7 @@ static Property spapr_phb_properties[] = {
>>>      DEFINE_PROP_UINT64("pgsz", sPAPRPHBState, page_size_mask,
>>>                         (1ULL << 12) | (1ULL << 16)),
>>>      DEFINE_PROP_UINT32("numa_node", sPAPRPHBState, numa_node, -1),
>>> +    DEFINE_PROP_STRING("root_bus_type", sPAPRPHBState, root_bus_type),
>>>      DEFINE_PROP_END_OF_LIST(),
>>>  };
>>>
>>> diff --git a/include/hw/pci-host/spapr.h b/include/hw/pci-host/spapr.h
>>> index b92c1b5..a4ee873 100644
>>> --- a/include/hw/pci-host/spapr.h
>>> +++ b/include/hw/pci-host/spapr.h
>>> @@ -79,6 +79,7 @@ struct sPAPRPHBState {
>>>      uint64_t dma64_win_addr;
>>>
>>>      uint32_t numa_node;
>>> +    char *root_bus_type;
>>>  };
>>>
>>>  #define SPAPR_PCI_MEM_WIN_BUS_OFFSET 0x80000000ULL

--
Alexey