Hello guys, On Tue, Jul 9, 2019 at 7:52 PM Jerin Jacob Kollanukkaran <[email protected]> wrote:
> > -----Original Message----- > > From: Burakov, Anatoly <[email protected]> > > Sent: Tuesday, July 9, 2019 8:07 PM > > To: Jerin Jacob Kollanukkaran <[email protected]>; David Marchand > > <[email protected]> > > Cc: dev <[email protected]>; Thomas Monjalon <[email protected]>; Ben > > Walker <[email protected]> > > Subject: Re: [EXT] Re: [dpdk-dev] [PATCH] bus/pci: fix IOVA as VA mode > > selection > issue. > > >> > > >> I wouldn't classify this as "needing" IOVA. "Need" implies it cannot > > >> work without it, whereas in this case it's more of a "highly > > >> recommended" rather than "need". > > > > > > It is "need" as performance is horrible without it as is per packet SW > > translation. > > > A "need" for DPDK performance perspective. > > > > Would the driver fail to initialize if it detects running as IOVA as PA? > > Yes. > https://git.dpdk.org/dpdk/tree/drivers/net/octeontx2/otx2_ethdev.c#n1191 > > > Also, some other use cases will also require IOVA as PA while having full > > IOMMU support. An example of this would be systems with limited IOMMU > > width (such as VM's) - even though the IOMMU is technically supported, we > > may not have the necessary address width to run all devices in IOVA as VA > > mode, and would need to fall back to IOVA as PA. > > Since we cannot *require* IOVA as VA in current codebase, any driver that > > expects IOVA as VA to always be enabled will presumably not work. > > > > > > > > Again, it is not device attribute, it is system attribute. > > > > If it's a system attribute, why is it a device driver flag then? The > system may > > or may not support IOMMU, the device itself probably doesn't care since > bus > > address looks the same in both cases, *but the driver > > might* (such as would be in your case - requiring IOVA as VA and > disallowing > > IOVA as PA for performance reasons). > > Agree. > > > > > Currently (again, disregarding your interpretation of how IOVA as VA > works > > and looking at the actual commit history), we always seem to imply that > IOVA > > as PA works for all devices, and we use IOVA_AS_VA flag to indicate that > the > > device *also* supports IOVA as VA mode. > > > > But we don't have any way to express a *requirement* for IOVA as VA mode > > - only for IOVA as PA mode. That is the purpose of the new flag. You are > > stating that the IOVA_AS_VA drv flag is an expression of that > requirement, > > but that is not reflected in the codebase - our commit history indicates > that > > we don't treat IOVA as VA as hard requirement whenever this flag is > > specified (and i would argue that we shouldn't). > > No objection to further classify it. > I propose to introduce: * RTE_PCI_DRV_IOVA_AS_PA which means "the combination of the pmd+kmod+hw supports usage of Physical Addresses" * RTE_PCI_DRV_IOVA_AS_VA which means "the combination of the pmd+kmod+hw supports usage of Virtual Addresses" - For the pci bus, the algorigthm would be: devices_want_pa = false devices_want_va = false Foreach pci device Skip blacklisted devices Skip unbound devices (i.e. we only consider devices bound to a known kernel driver) Skip unsupported devices (i.e. we only consider devices that have a pmd that supports them) If the combination pmd+kmod only supports VA (RTE_PCI_DRV_IOVA_AS_VA capability in driver flags), then devices_want_va = true Else if the combination pmd+kmod only supports PA (RTE_PCI_DRV_IOVA_AS_PA capability in driver flags), then devices_want_pa = true If devices_want_va and !devices_want_pa return RTE_IOVA_VA If devices_want_pa and !devices_want_va return RTE_IOVA_PA return RTE_IOVA_DC Notes: * the IOMMU limitations are considered as a per device/driver thing, since the kmod is the one that configures the system IOMMU, * the case "devices_want_pa and devices_want_va" is considered as DC, we leave EAL decide based on the physical addresses availability because we can't comply with all present devices/drivers in the system. This means that at bus probe time for a device, we must add a check that the combination is fulfilled (and avoid this check in the drivers themselves). - For the global bus code, that aggregates the different buses preferences, we need to do the same, while I suspect a bug at the moment. The algorigthm: buses_want_pa = false buses_want_va = false Foreach bus If the bus reports RTE_IOVA_VA, then buses_want_va = true Else if the bus reports RTE_IOVA_PA, then buses_want_pa = true If buses_want_va and !buses_want_pa return RTE_IOVA_VA If buses_want_pa and !buses_want_va return RTE_IOVA_PA return RTE_IOVA_DC - Finally at EAL level, we keep the current code. Hope I did not miss anything. If we agree on this, I will send the changes and an update in the documentation. -- David Marchand

