On 2014-08-16 10:45, Jan Kiszka wrote:
> On 2014-08-16 09:54, Knut Omang wrote:
>> On Fri, 2014-08-15 at 19:37 +0800, Le Tan wrote:
>>> Hi Knut,
>>>
>>> 2014-08-15 19:15 GMT+08:00 Knut Omang <knut.om...@oracle.com>:
>>>> On Fri, 2014-08-15 at 06:42 +0200, Knut Omang wrote:
>>>>> On Thu, 2014-08-14 at 14:10 +0200, Jan Kiszka wrote:
>>>>>> On 2014-08-14 13:15, Michael S. Tsirkin wrote:
>>>>>>> On Mon, Aug 11, 2014 at 03:04:57PM +0800, Le Tan wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> These patches are intended to introduce Intel IOMMU (VT-d)
>>>>>>>> emulation to q35 chipset. The major job in these patches is to
>>>>>>>> add support for emulating Intel IOMMU according to the VT-d
>>>>>>>> specification, including basic responses to CSRs accesses, the
>>>>>>>> logics of DMAR (DMA remapping) and DMA memory address
>>>>>>>> translations.
>>>>>>>
>>>>>>> Thanks!
>>>>>>> Looks very good overall, I noted some coding style issues - I didn't
>>>>>>> bother reporting each issue in every place where it appears - reported
>>>>>>> each issue once only, so please find and fix all instances of each
>>>>>>> issue.
>>>>>>
>>>>>> BTW, because I was in urgent need for virtual test environment for
>>>>>> Jailhouse, I hacked interrupt remapping on top of Le's patches:
>>>>>>
>>>>>> http://git.kiszka.org/?p=qemu.git;a=shortlog;h=refs/heads/queues/vtd-intremap
>>>>>>
>>>>>> The approach likely needs further discussions and refinements but it
>>>>>> already allows me to work on top with our hypervisor, and also Linux.
>>>>>> You can see from the last commit that Le's work made it pretty easy to
>>>>>> build this on top.
>>>>>
>>>>> Le,
>>>>>
>>>>> I have tried Jan's branch with my device setup which consists of a
>>>>> minimal q35 setup, an ioh3420 root port (specified as -device
>>>>> ioh3420,slot=0 ) and a pcie device plugged into that root port, which
>>>>> gives the following lspci -t:
>>>>>
>>>>> -[0000:00]-+-00.0
>>>>>            +-01.0
>>>>>            +-02.0
>>>>>            +-03.0-[01]----00.0
>>>>>            +-04.0
>>>>>            +-1f.0
>>>>>            +-1f.2
>>>>>            \-1f.3
>>>>>
>>>>> All seems to work beautifully (I see the ISA bridge happily receive
>>>>> translations) until the first DMA from my device model (at 1:00.0)
>>>>> arrives, at which point I get:
>>>>>
>>>>> [ 1663.732413] dmar: DMAR:[DMA Write] Request device [00:03.0] fault addr fffa0000
>>>>> [ 1663.732413] DMAR:[fault reason 02] Present bit in context entry is clear
>>>>>
>>>>> I would have expected request device 01:00.0 for this.
>>>>> It is not clear to me yet if this is a weakness of the implementation of
>>>>> ioh3420 or the iommu. Just wanted to let you know right away in case you
>>>>> can shed some light on it or it is an easy fix,
>>>>>
>>>>> The device uses pci_dma_rw with itself as device pointer.
>>>>
>>>> To verify my hypothesis: with this rude hack my device now works much
>>>> better:
>>>>
>>>> @@ -774,6 +780,8 @@ static void iommu_translate(VTDAddressSpace *vtd_as, int bus_num, int devfn,
>>>>          is_fpd_set = ce.lo & VTD_CONTEXT_ENTRY_FPD;
>>>>      } else {
>>>>          ret_fr = dev_to_context_entry(s, bus_num, devfn, &ce);
>>>> +        if (ret_fr)
>>>> +            ret_fr = dev_to_context_entry(s, 1, 0, &ce);
>>>>          is_fpd_set = ce.lo & VTD_CONTEXT_ENTRY_FPD;
>>>>          if (ret_fr) {
>>>>              ret_fr = -ret_fr;
>>>>
>>>> Looking at how things look on hardware, multiple devices often receive
>>>> overlapping DMA address ranges for different physical addresses.
>>>>
>>>> So if I understand the way this works, every requester ID would also
>>>> need to have its own unique VTDAddressSpace, as each pci
>>>> device/function sees a unique DMA address space.
>>>
>>> ioh3420 is a pcie-to-pcie bridge, right?
>>
>> Yes.
>>
>>> In my opinion, each pci-e
>>> device behind the pcie-to-pcie bridge can be assigned individually.
>>> For now I added the VT-d to q35 by just adding it to the root pci bus.
>>> You can see here in q35.c:
>>> pci_setup_iommu(pci_bus, q35_host_dma_iommu, mch->iommu);
>>> So if we add a pcie-to-pcie bridge, we may have to call the
>>> pci_setup_iommu() for that new bus. I don't know where to hook into
>>> this now. :) If you know the mechanism behind that, you can try to add
>>> that for the new bus. (I will dive into this after the clean up.)
>>> What do you think?
>>
>> Thanks for the quick answer, that helped a lot!
>>
>> Looking into the details here I realize it is slightly more complicated:
>> secondary buses are enumerated after device instantiation, as part of
>> the host PCI enumeration, so if I add a similar setup call in the bridge
>> setup, it will be called for a new device long before it has received
>> its bus number from the OS (via config[PCI_SECONDARY_BUS] )
>>
>> I agree that the lookup function for contexts needs to be as efficient
>> as possible so the simple <busno,devfn> lookup key may be the best
>> solution but then the address_spaces table cannot be populated with the
>> secondary bus entries before it receives a nonzero != 255 bus number,
>> e.g. along the lines of this:
>>
>> diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c
>> index 4becdc1..d9a8c23 100644
>> --- a/hw/pci/pci_bridge.c
>> +++ b/hw/pci/pci_bridge.c
>> @@ -265,6 +265,12 @@ void pci_bridge_write_config(PCIDevice *d,
>>          pci_bridge_update_mappings(s);
>>      }
>>
>> +    if (ranges_overlap(address, len, PCI_SECONDARY_BUS, 1)) {
>> +        int bus_num = pci_bus_num(&s->sec_bus);
>> +        if (bus_num != 0xff && bus_num != 0x00)
>> +            <handle bus number change>
>> +    }
>> +
>>      newctl = pci_get_word(d->config + PCI_BRIDGE_CONTROL);
>>      if (~oldctl & newctl & PCI_BRIDGE_CTL_BUS_RESET) {
>>          /* Trigger hot reset on 0->1 transition. */
>>
>> but it is getting complicated...
>> Thoughts?
>
> Point to the PCI bus from VTDAddressSpace instead of storing the bus_num
> there?

Also, each PCIe bus should hold an array of VTDAddressSpaces, instead of
keeping them all in the IntelIOMMUState.

Jan
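
To make the two suggestions above concrete, here is a minimal standalone sketch. It is not QEMU code: SketchBus, SketchAddressSpace, sketch_get_as() and sketch_requester_id() are hypothetical stand-ins for the real bus and VTDAddressSpace types, and the layout is only an assumption about how this could look. The point it illustrates is that an address space which points at its bus, rather than caching bus_num, picks up the bus number the guest programs later during enumeration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_DEVFN_MAX 256   /* 32 devices x 8 functions per bus */

typedef struct SketchBus SketchBus;

/* Hypothetical stand-in for VTDAddressSpace: it keeps a pointer to its
 * bus and the devfn, but no cached bus number. */
typedef struct SketchAddressSpace {
    SketchBus *bus;
    uint8_t devfn;
} SketchAddressSpace;

/* Hypothetical stand-in for a PCI(e) bus that owns the address spaces of
 * the devices behind it, instead of one table in the IOMMU state. */
struct SketchBus {
    uint8_t bus_num;   /* rewritten by the guest during enumeration */
    SketchAddressSpace *as[SKETCH_DEVFN_MAX];
};

/* Return the address space for devfn on this bus, creating it on first
 * use. Returns NULL only if allocation fails. */
static SketchAddressSpace *sketch_get_as(SketchBus *bus, uint8_t devfn)
{
    if (!bus->as[devfn]) {
        SketchAddressSpace *as = calloc(1, sizeof(*as));
        if (!as) {
            return NULL;
        }
        as->bus = bus;
        as->devfn = devfn;
        bus->as[devfn] = as;
    }
    return bus->as[devfn];
}

/* Build the requester ID at translation time, so the current bus number
 * is always the one used for the context-entry lookup. */
static uint16_t sketch_requester_id(const SketchAddressSpace *as)
{
    assert(as && as->bus);
    return (uint16_t)((as->bus->bus_num << 8) | as->devfn);
}
```

With a secondary bus that starts out with bus_num 0, the address space for devfn 0 yields requester ID 0x0000; once the bus number is set to 1, the same address space yields 0x0100 without anything having to be re-registered on a config write.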