On Wed, Feb 17, 2016 at 8:58 PM, Alex Williamson
<alex.william...@redhat.com> wrote:
> On Wed, 17 Feb 2016 17:15:09 +0530
> Jayachandran Chandrashekaran Nair
> <jayachandran.chandrasheka...@broadcom.com> wrote:
>
>> On Wed, Feb 17, 2016 at 3:55 AM, Alex Williamson
>> <alex.william...@redhat.com> wrote:
>> > On Wed, 17 Feb 2016 02:38:24 +0530
>> > Jayachandran Chandrashekaran Nair
>> > <jayachandran.chandrasheka...@broadcom.com> wrote:
>> >
>> >> On Tue, Feb 16, 2016 at 1:09 AM, Alex Williamson
>> >> <alex.william...@redhat.com> wrote:
>> >> > On Mon, 15 Feb 2016 12:20:23 -0600
>> >> > Bjorn Helgaas <helg...@kernel.org> wrote:
>> >> >
>> >> >> [+cc Alex, iommu list]
>> >> >>
>> >> >> On Mon, Feb 15, 2016 at 03:35:00AM +0530, Jayachandran C wrote:
>> >> >> > Add a new flag PCI_DEV_FLAGS_BRIDGE_SKIP_ALIAS to indicate bridges
>> >> >> > that should not be considered during DMA alias search. This is
>> >> >> > to support hardware (in this case Broadcom Vulcan PCIe subsystem)
>> >> >> > that has internal bridges which have either missing or wrong PCIe
>> >> >> > capabilities.
>> >> >
>> >> > I figured this would come at some point, the right answer is of course
>> >> > to follow the PCIe spec and implement the required PCIe capability in
>> >> > the hardware.
>> >>
>> >> There are  PCIe controllers on the chip that follows the spec, the issue 
>> >> is
>> >> how it is integrated to the northbridge (equivalent) on the chip, I have
>> >> tried to explain it below.
>> >>
>> >> >> This needs more explanation, like what exactly is wrong with this
>> >> >> device?  A missing PCIe capability might cause other problems.
>> >> >>
>> >> >> What problem does this fix?  Without these patches, do we merely add
>> >> >> aliases that are unnecessary?  Do we crash because something goes
>> >> >> wrong in the pci_pcie_type() switch because of the incorrect
>> >> >> capability?
>> >>
>> >> Here's how (for example) how the PCI enumeration of a 2 node Vulcan
>> >> processor will look like:
>> >>
>> >>
>> >> [0] +-0.0.0--[1]---+--1.a.0----[2]-----2.0.0---[3]----3.0.0
>> >>     |              +--1.a.1----[4]-----4.0.0---[5]----5.0.0
>> >>     |              .
>> >>     |              ... etc...
>> >>     |
>> >>     +-0.0.1--[10]--+-10.a.0----[11]---11.0.0---[12]---12.0.0
>> >>                    +-10.a.1----[13]---13.0.0---[14]---14.0.0
>> >>                    .
>> >>                    ... etc...
>> >
>> > So we have:
>> >
>> >     "Glue Bridge"---"Glue Bridges"---Root Ports---Endpoints
>> >       (no pcie)      (pci/x-pcie)
>>
>> The top level is one glue bridge per chip in a multi-chip board,
>> but otherwise this is accurate.
>>
>> >>
>> >> The devices 0.0.x and x.a.x are glue bridges that are there to
>> >> bring the real PCIe controllers (pcie cap type 4) 2.0.0, 4.0.0,
>> >> 11.0.0, 13.0.0 under a single PCI hierarchy. The x.a.x bridges
>> >> have a PCIe capability (type 8) and 0.0.x does not have any pcie
>> >> capability.
>> >>
>> >> In case of Vulcan, both the GICv3 ITS driver code and the SMMUv3
>> >> driver code does dma alias walk to find the device id to use
>> >> in ITS and SMMU. In both cases it will ignore the x.0.0 bridges
>> >> since they are type root port, but will continue on up and end
>> >> up with incorrect device id.
>> >>
>> >> The flag I have added is to make the pci_for_each_dma_alias()
>> >> ignore the last 2 levels of glue/internal bridges.
>> >
>> > Per the PCIe spec, I'm not even sure what you have is a valid
>> > hierarchy, root ports sit on root complexes, not behind a PCI-to-PCIe
>> > bridge.  So really you're pretending that the downstream "glue bridge"
>> > is your host bridge and therefore root complex, but the PCI topology
>> > would clearly dictate that the top-most bus is conventional PCI and
>> > therefore everything is an alias of everything else and the DMA alias
>> > code is doing exactly what it should.
>>
>> Yes, I am not arguing that there is any issue in the current code. I
>> am trying to figure out the correct way to implement the quirk. We
>> have to support the hardware we have, not the hardware we want to
>> have :)
>>
>> > Also note that the caller of pci_for_each_dma_alias() owns the callback
>> > function triggered for each alias, that function could choose to prune
>> > out these "glue bridges" itself if that's the appropriate thing to do.
>>
>> I had implemented this first, but moved to the current approach because
>> it is needed in multiple places. The problem is: "On vulcan, while
>> going up the pcie bus heirarchy for finding aliases, skip the glue bridges",
>> So the logical place to put the fix is in pci_for_each_dma_alias()
>>
>> > Where do the SMMU and ITS actually reside in the above diagram?  You
>> > can use the callout function to stop the traversal anywhere you wish,
>> > it's free to return a -errno to stop or positive number, which the
>> > caller can interpret as error or non-failure stop condition.
>>
>> The SMMU (for translation requests) and ITS (for MSI writes for
>> interrupts) are connected directly to the proper PCIe controller
>> (2/4/11/13.0.0 above)
>
>
> If the translation unit is on the root port then DMA aliases upstream of
> the translation unit are irrelevant and the callback function should
> stop the traversal at that point.
>
>> > You could even think about changing what Linux considers to be the host
>> > bridge.  Maybe these glue bridges shouldn't even be visible to Linux,
>> > would that also solve the issue with the broken BAR?
>>
>> The glue bridges have to seen by Linux for assigning resources like
>> bus ranges and memory windows. So these are required bridges in
>> the topology and will work with the standard linux code
>> (provided we skip them for aliases, and and ignore the BAR).
>>
>> >> > The change takes the same code path as it would for a real PCIe bridge
>> >> > port (downstream/upstream/root), which means they want to skip adding
>> >> > this bridge as an alias of the device.  So we're adding in aliases that
>> >> > don't exist, the bridge itself.
>> >> >
>> >> > If anything I'd suggest a flag that actually tries to address the
>> >> > problem rather than a symptom of the problem.  For example, maybe the
>> >> > flag should be PCI_DEV_FLAGS_IS_PCIE.  Maybe pci_is_pcie() should even
>> >> > take that into account.  That has some trickle through for
>> >> > pci_pcie_type() and all the accessor functions, but maybe it's a
>> >> > cleaner solution overall (or maybe it explodes further).  Thanks,
>> >>
>> >> I didn't really want to mark the glue bridges as PCIe or have fake
>> >> PCIe capability there, the obvious simple solution was to add
>> >> the flag PCI_DEV_FLAGS_BRIDGE_SKIP_ALIAS
>> >>
>> >> Any suggestions/comments on how to do this better would be welcome.
>> >
>> > I definitely don't think either flag idea is the right solution, I
>> > think you've actually already got the tools you need to solve it by
>> > putting the intelligence in the callback function or by going further
>> > down the path of pretending the glue bridge is really a host bridge.
>>
>> If I understand your suggestion, I need to fake the PCIe capability
>> for the glue bridges, which may be reasonable for the second level.
>> But for the first level, there is no pcie capability and faking that
>> becomes more difficult.
>
> That's not what I'm suggesting.
>
>> If the concern is adding a flag for just one platform, I can check
>> for the vendor ID/device ID in the pci_for_each_dma_alias()
>> function, which would be slightly uglier, but still a simple and
>> maintainable solution.
>>
>> Another option is to break the pci_for_each_dma_alias when it
>> reaches to root port (for vulcan), this too can be done without
>> a flag.
>
> Exactly, this can be done by the callback function provided on vulcan
> systems and requires no core changes.  The IOMMU needs to provide a
> callback function that makes sense for collecting aliases on the
> specific platform.  If the translation unit is not on the root bus, the
> callback function should stop the topology walk earlier.  Thanks,

Fixing every callback function is going to be painful and ugly.
I will have to change drivers/irqchip/irq-gic-v3-its-pci-msi.c,
drivers/pci/msi.c, and drivers/iommu/arm-smmu-v3.c with the
same check for vendor/device ids and return early.

I can update the patch to check the vendor/device id in the
case of ROOT_PORT in pci_for_each_dma_alias(). If you
think that will not be acceptable, I don't have any reasonable
options left.

Thanks,
JC.
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Reply via email to