On Sun, Nov 21, 2010 at 06:01:11PM +0200, Gleb Natapov wrote:
> On Sun, Nov 21, 2010 at 04:48:44PM +0200, Michael S. Tsirkin wrote:
> > On Sun, Nov 21, 2010 at 02:50:14PM +0200, Gleb Natapov wrote:
> > > On Sun, Nov 21, 2010 at 01:53:26PM +0200, Michael S. Tsirkin wrote:
> > > > > > The guests.
> > > > > Which one? There are many guests. Your favorite?
> > > > > > For CLI, we need an easy way to map a device in the guest to the device in qemu and back.
> > > > > Then use eth0, /dev/sdb, or even C:. Your way is no less broken, since what you are saying is "let's use the name that the guest assigned to a device".
> > > > No, I am saying let's use the name that our ACPI tables assigned.
> > > ACPI does not assign any name. At best, ACPI tables describe resources used by a device.
> > Not only that. Bus number and segment aren't resources as such. They describe addressing.
> > > And not all guests qemu supports have support for ACPI. Qemu even supports machine types that do not support ACPI.
> > So? Different machines -> different names.
> You want to have a different CLI for each type of machine qemu supports?
Different device names.

> > > > > > > It looks like you identify yourself with most of qemu's users, but if most qemu users are like you then qemu does not have enough users :) Most users that consider themselves to be "advanced" may know what eth1 or /dev/sdb means. This doesn't mean we should provide a "device_del eth1" or "device_add /dev/sdb" command though.
> > > > > > >
> > > > > > > More important is that "domain" (encoded as a number, like you used to) and "bus number" have no meaning inside qemu. So while I have said many times that I don't care too much about the exact CLI syntax, it should at least make sense. It can use an id to specify the PCI bus in the CLI like this: device_del pci.0:1.1. Or it can even use a device id too, like this: device_del pci.0:ide.0. Or it can use HW topology like in an OF device path. But doing ad-hoc device enumeration inside qemu and then using it for the CLI is not it.
> > > > > > > > functionality in the guests. Qemu is buggy at the moment in that it uses the bus addresses assigned by the guest and not the ones in ACPI, but that can be fixed.
> > > > > > > It looks like you confused ACPI _SEG for something it isn't.
> > > > > > Maybe I did. This is what linux does:
> > > > > >
> > > > > > struct pci_bus * __devinit pci_acpi_scan_root(struct acpi_pci_root *root)
> > > > > > {
> > > > > > 	struct acpi_device *device = root->device;
> > > > > > 	int domain = root->segment;
> > > > > > 	int busnum = root->secondary.start;
> > > > > >
> > > > > > And I think this is consistent with the spec.
> > > > > It means that one domain may include several host bridges. At that level a domain is defined as something that has a unique name for each device inside it, thus no two buses in one segment/domain can have the same bus number. This is what the PCI spec tells you.
> > > > And that really is enough for the CLI because all we need is to locate the specific slot in a unique way.
> > > At the qemu level we do not have bus numbers. They are assigned by a guest. So inside a guest domain:bus:slot.func points you to a device, but qemu does not enumerate buses.
> > > > > And this further shows that using "domain" as defined by the guest is a very bad idea.
> > > > As defined by ACPI, really.
> > > ACPI is part of the guest software and may not even be present in the guest. How is it relevant?
> > It's relevant because this is what guests use. To access the root device with cf8/cfc you need to know the bus number assigned to it by firmware. How that was assigned is of interest to BIOS/ACPI but not really interesting to the user or, I suspect, the guest OS.
> Of course this is incorrect. The OS can re-enumerate PCI if it wishes. Linux has a command line option just for that.

I haven't looked, but I suspect linux will simply assume cf8/cfc and start doing it from there. If that doesn't get you the root device you wanted, tough.

> And saying that ACPI is relevant because this is what guest software uses, in reply to a sentence stating that not all guests even use ACPI, is, well, strange.
>
> And ACPI describes only HW that is present at boot time.
> What if you hot-plugged a root pci bridge? How does non-existent PCI naming help you?

that's described by ACPI as well.

> > > > > > > The ACPI spec says that a PCI segment group is purely a software concept managed by system firmware. In fact one segment may include multiple PCI host bridges.
> > > > > > It can't I think:
> > > > > Read the _BBN definition:
> > > > > The _BBN object is located under a PCI host bridge and must be unique for every host bridge within a segment since it is the PCI bus number.
> > > > >
> > > > > Clearly the above speaks about multiple host bridges within a segment.
> > > > Yes, it looks like the firmware spec allows that.
> > > It even has an explicit example that shows it.
> > > > > > Multiple Host Bridges
> > > > > >
> > > > > > A platform may have multiple PCI Express or PCI-X host bridges. The base address for the MMCONFIG space for these host bridges may need to be allocated at different locations. In such cases, using the MCFG table and _CBA method as defined in this section means that each of these host bridges must be in its own PCI Segment Group.
> > > > > This is not from the ACPI spec,
> > > > PCI Firmware Specification 3.0
> > > > > but without going too deep into it, the above paragraph talks about a particular case in which each host bridge must be in its own PCI Segment Group, which is definite proof that in other cases multiple host bridges can be in one segment group.
> > > > I stand corrected. I think you are right. But note that if they are, they must have distinct bus numbers assigned by ACPI.
> > > ACPI does not assign any numbers.
> > For all root pci devices firmware must supply a _BBN number. This is the bus number, isn't it? For nested buses, this is optional.
> Nonsense. _BBN is optional and is not present in the Seabios DSDT.

The spec says it's not optional for host bridges:

Firmware must report Host Bridges in the ACPI name space. Each Host Bridge object must contain the following objects:
● _HID and _CID
● _CRS to determine all resources consumed and produced (passed through to the secondary bus) by the host bridge. Firmware allocates resources (Memory Addresses, I/O Port, etc.) to Host Bridges. The _CRS descriptor informs the operating system of the resources it may use for configuring devices below the Host Bridge.
● _TRA, _TTP, and _TRS translation offsets to inform the operating system of the mapping between the primary bus and the secondary bus.
● _PRT and the interrupt descriptor to determine interrupt routing.
● _BBN to obtain a bus number.

so seabios seems to be out of spec.

> As far as I can tell it is only needed if a PCI segment group has more than one pci host bridge.

No. Because cfc/cf8 are not aware of _SEG.

> > > Bios enumerates buses and assigns numbers.
> > There's no standard way to enumerate pci root devices in the guest AFAIK. The spec says:
> >
> > Firmware must configure all Host Bridges in the systems, even if they are not connected to a console or boot device. Firmware must configure Host Bridges in order to allow operating systems to use the devices below the Host Bridges. This is because the Host Bridges programming model is not defined by the PCI Specifications.
> >
> Guest should be aware of the HW to use it.
> Be it through bios or a driver.

Why should it? You get a bus number and stick it in cf8/cfc, you get a config cycle. No magic HW awareness needed.

> > > ACPI, in the base case, describes to OSPM what the BIOS did. Qemu sits one layer below all this and does not enumerate PCI buses. Even if we make it do so, there is no way to guarantee that the guest will enumerate them in the same order, since there is more than one way to do enumeration. I have repeated this to you numerous times already.
> > ACPI is really part of the motherboard. Calling it the guest just confuses things. The guest OS can override bus numbering for nested buses but not for root buses.
> If calling ACPI part of the guest confuses you then you are already confused. The guest OS can do whatever it wishes with any enumeration the FW did, if it knows better.
> > > > > > > _SEG is not what OSPM uses to tie a HW resource to an ACPI resource. It uses _CRS (Current Resource Settings) for that, just like OF. No surprise there.
> > > > > > OSPM uses both I think.
> > > > > >
> > > > > > All I see linux do with _CRS is get the bus number range.
> > > > > So let's assume that the HW has two PCI host bridges and ACPI has:
> > > > > Device(PCI0) {
> > > > >     Name (_HID, EisaId ("PNP0A03"))
> > > > >     Name (_SEG, 0x00)
> > > > > }
> > > > > Device(PCI1) {
> > > > >     Name (_HID, EisaId ("PNP0A03"))
> > > > >     Name (_SEG, 0x01)
> > > > > }
> > > > > I.e. no _CRS to describe resources. How do you think OSPM knows which of the two pci host bridges is PCI0 and which one is PCI1?
> > > > You must be able to uniquely address any bridge using the combination of _SEG and _BBN.
> > > Not at all. And saying "you must be able" without actually showing how doesn't prove anything. _SEG is relevant only for those host bridges that support MMCONFIG (not all of them do, and none that qemu supports does yet). _SEG points to a specific entry in the MCFG table, and the MCFG entry holds the base address of the MMCONFIG space for the bridge (this address is configured by a guest). This is all _SEG does really, no magic at all. _BBN returns the bus number assigned by the BIOS to the host bridge. Nothing qemu-visible again. So _SEG and _BBN give you two numbers assigned by the guest FW. Nothing qemu can use to identify a device.
> > This FW is given to the guest by qemu. It only assigns bus numbers because qemu told it to do so.
> Seabios is just a guest qemu ships. There are other FWs for qemu: Bochs bios, openfirmware, efi. All of them were developed outside of the qemu project and all of them are usable without qemu. You can't consider them to be part of qemu any more than Linux/Windows with virtio drivers.
> > > > > > And the spec says, e.g.:
> > > > > > the memory mapped configuration base address (always corresponds to bus number 0) for the PCI Segment Group of the host bridge is provided by _CBA and the bus range covered by the base address is indicated by the corresponding bus range specified in _CRS.
> > > > > I don't see how it is relevant. And _CBA is defined only for PCI Express. Let's solve the problem for PCI first and then move to PCI Express. Jumping from one to the other distracts us from the main discussion.
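To make the cf8/cfc point above concrete, here is a minimal sketch of a config mechanism #1 read; the helper names are illustrative, not qemu or seabios code, and the port-I/O primitives are assumed to exist (inline asm in firmware, or <sys/io.h> after iopl(3) on Linux):

#include <stdint.h>

/* Illustrative port-I/O primitives, assumed to be provided elsewhere. */
extern void outl(uint32_t value, uint16_t port);
extern uint32_t inl(uint16_t port);

/* Config mechanism #1: the only coordinates it knows are bus/dev/fn/reg.
 * There is no notion of a segment/domain at this level. */
static uint32_t pci_conf1_read(uint8_t bus, uint8_t dev, uint8_t fn, uint8_t reg)
{
    uint32_t addr = (1u << 31)                    /* enable bit */
                  | ((uint32_t)bus << 16)
                  | ((uint32_t)(dev & 0x1f) << 11)
                  | ((uint32_t)(fn & 0x07) << 8)
                  | (reg & 0xfc);                 /* dword-aligned register */
    outl(addr, 0xcf8);
    return inl(0xcfc);
}

The bus number plugged into bits 23:16 is simply whatever firmware (or the OS, on re-enumeration) assigned; nothing in the mechanism itself identifies a host bridge.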
> > > > I think this is what confuses us. As long as you are using cf8/cfc there's no concept of a domain really.
> > > > Thus:
> > > > /pci@i0cf8
> > > > is probably enough for BIOS boot, because we'll need to make root bus numbers unique for legacy guests/option ROMs. But this is not a hardware requirement and might become easier to ignore with EFI.
> > > You do not need MMCONFIG to have multiple PCI domains. You can have one configured via standard cf8/cfc and another one on ef8/efc and one more at mmio fce00000, and you can address all of them:
> > > /pci@i0cf8
> > > /pci@i0ef8
> > > /pci@fce00000
> > >
> > > And each one of those PCI domains can have 256 subbridges.
> > Will common guests such as windows or linux be able to use them? This
> With proper drivers, yes. There is HW with more than one PCI bus and I think qemu emulates some of it (PPC Mac for instance).
> > seems to be outside the scope of the PCI Firmware specification, which says that bus numbers must be unique.
> They must be unique per PCI segment group.
> > > > > > > > That should be enough for e.g. device_del. We do have the need to describe the topology when we interface with firmware, e.g. to describe the ACPI tables themselves to qemu (this is what Gleb's patches deal with), but that's probably the only case.
> > > > > > > Describing HW topology is the only way to unambiguously describe a device to something or someone outside qemu and have persistent device naming between different HW configurations.
> > > > > > Not really, since ACPI is a binary blob programmed by qemu.
> > > > > ACPI is part of the guest, not qemu.
> > > > Yes, it runs in the guest but it's generated by qemu. On real hardware, it's supplied by the motherboard.
> > > It is not generated by qemu. Parts of it depend on the HW and other parts depend on how the BIOS configures the HW. _BBN, for instance, is clearly defined to return an address assigned by the BIOS.
> > BIOS is supplied on the motherboard, and in our case by qemu as well.
> You can replace the MB bios by coreboot+seabios on some of them. Manufacturers don't want you to do it and make it hard to do, but otherwise this is just software, not some magic dust.
> > There's no standard way for the BIOS to assign a bus number to the pci root, so it does it in a device-specific way. Why should a management tool or a CLI user care about these? As far as they are concerned we could use some PV scheme to find root devices and assign bus numbers, and it would be exactly the same.
> Go write a KVM userspace that does that. AFAIK there is a project out there that tries to do that. No luck so far. Your world view is very x86/Linux centric. You need to broaden it a little bit. Next time you propose something, ask yourself whether it will work with qemu-sparc, qemu-ppc, qemu-amd.
> > > > > Just saying "not really" doesn't prove much. I still haven't seen any proposition from you that actually solves the problem. No, "let's use guest naming" is not it. There is no such thing as "The Guest".
> > > > >
> > > > > --
> > > > > 	Gleb.
> > > > I am sorry if I didn't make this clear.
> > > > I think we should use the domain:bus pair to name the root device. As these are unique and
> > > You forgot to complete the sentence :) But you made it clear enough, and it is incorrect. The domain:bus pair is not only not unique, it does not exist in qemu at all
> > Sure they do. domain maps to the mcfg address for express. bus is used for
> mcfg is optional as far as I can see. You can compile out MMCONFIG support on Linux.
> > cf8/cfc addressing. They are assigned by the BIOS, but since the BIOS is supplied with the hardware the point is moot.
> Most PC hardware is supplied with Windows, so what? BIOS is code that runs in a guest. It is part of the guest. Every line of code executed by the vcpu belongs to the guest. No need to redefine things to prove your point.
> > > and as such can't be used to address a device. They are the product of HW enumeration done by a guest OS, just like eth0 or C:.
> > >
> > > --
> > > 	Gleb.
> > There's a huge difference between BIOS and guest OS,
> Not true.
> > and between bus numbers of the pci root and of nested bridges.
> Really? What is it?
> > Describing hardware io ports makes sense if you are trying to communicate data from qemu to the BIOS. But the rest of the world might not care.
> The part of the world that manages HW cares. You may need to add a device from the monitor before the first line of BIOS is even executed. How can you rely on BIOS enumeration of devices in this case?
>
> --
> 	Gleb.
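For completeness, the "domain maps to the mcfg address for express" point above can be made concrete with a minimal ECAM/MMCONFIG sketch. This is illustrative only: it assumes the MCFG window for the segment has already been located (via the MCFG table entry matched by _SEG) and mapped; the helper name is hypothetical, not qemu code:

#include <stdint.h>

/* ECAM: once a segment's MCFG base is known, bus/device/function/register
 * select an offset within that window (bus<<20 | dev<<15 | fn<<12 | reg).
 * How mcfg_base was mapped (e.g. mmap of the physical window) is omitted. */
static volatile uint32_t *ecam_reg(volatile uint8_t *mcfg_base,
                                   uint8_t bus, uint8_t dev, uint8_t fn,
                                   uint16_t reg)
{
    uintptr_t off = ((uintptr_t)bus << 20)
                  | ((uintptr_t)(dev & 0x1f) << 15)
                  | ((uintptr_t)(fn & 0x07) << 12)
                  | (reg & 0xffc);                /* dword-aligned register */
    return (volatile uint32_t *)(mcfg_base + off);
}

In other words, the "domain" here is just a choice of MCFG window, and the bus number is again whatever firmware handed out for that segment.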