RE: [PATCH v1 2/2] vfio/pci: Emulate PASID/PRI capability for VFs
> From: Alex Williamson > Sent: Wednesday, April 15, 2020 8:36 AM > > On Tue, 14 Apr 2020 23:57:33 + > "Tian, Kevin" wrote: > > > > From: Alex Williamson > > > Sent: Tuesday, April 14, 2020 11:24 PM > > > > > > On Tue, 14 Apr 2020 03:42:42 + > > > "Tian, Kevin" wrote: > > > > > > > > From: Alex Williamson > > > > > Sent: Tuesday, April 14, 2020 11:29 AM > > > > > > > > > > On Tue, 14 Apr 2020 02:40:58 + > > > > > "Tian, Kevin" wrote: > > > > > > > > > > > > From: Alex Williamson > > > > > > > Sent: Tuesday, April 14, 2020 3:21 AM > > > > > > > > > > > > > > On Mon, 13 Apr 2020 08:05:33 + > > > > > > > "Tian, Kevin" wrote: > > > > > > > > > > > > > > > > From: Tian, Kevin > > > > > > > > > Sent: Monday, April 13, 2020 3:55 PM > > > > > > > > > > > > > > > > > > > From: Raj, Ashok > > > > > > > > > > Sent: Monday, April 13, 2020 11:11 AM > > > > > > > > > > > > > > > > > > > > On Wed, Apr 08, 2020 at 10:19:40AM -0600, Alex Williamson > > > wrote: > > > > > > > > > > > On Tue, 7 Apr 2020 21:00:21 -0700 > > > > > > > > > > > "Raj, Ashok" wrote: > > > > > > > > > > > > > > > > > > > > > > > Hi Alex > > > > > > > > > > > > > > > > > > > > > > > > + Bjorn > > > > > > > > > > > > > > > > > > > > > > + Don > > > > > > > > > > > > > > > > > > > > > > > FWIW I can't understand why PCI SIG went different ways > > > with > > > > > ATS, > > > > > > > > > > > > where its enumerated on PF and VF. But for PASID and > PRI its > > > > > only > > > > > > > > > > > > in PF. > > > > > > > > > > > > > > > > > > > > > > > > I'm checking with our internal SIG reps to followup on > that. > > > > > > > > > > > > > > > > > > > > > > > > On Tue, Apr 07, 2020 at 09:58:01AM -0600, Alex > Williamson > > > > > wrote: > > > > > > > > > > > > > > Is there vendor guarantee that hidden registers will > locate > > > at > > > > > the > > > > > > > > > > > > > > same offset between PF and VF config space? > > > > > > > > > > > > > > > > > > > > > > > > > > I'm not sure if the spec really precludes hidden > > > > > > > > > > > > > registers, > > > but > > > > > the > > > > > > > > > > > > > fact that these registers are explicitly outside of > > > > > > > > > > > > > the > > > capability > > > > > > > > > > > > > chain implies they're only intended for device > > > > > > > > > > > > > specific > use, > > > so > > > > > I'd > > > > > > > say > > > > > > > > > > > > > there are no guarantees about anything related to > > > > > > > > > > > > > these > > > > > registers. > > > > > > > > > > > > > > > > > > > > > > > > As you had suggested in the other thread, we could > consider > > > > > > > > > > > > using the same offset as in PF, but even that's a better > guess > > > > > > > > > > > > still not reliable. > > > > > > > > > > > > > > > > > > > > > > > > The other option is to maybe extend driver ops in the PF > to > > > > > expose > > > > > > > > > > > > where the offsets should be. Sort of adding the quirk in > the > > > > > > > > > > > > implementation. > > > > > > > > > > > > > > > > > > > > > > > > I'm not sure how prevalent are PASID and PRI in VF > devices. If > > > > > SIG is > > > > > > > > > > resisting > > > > > > > > > > > > making VF's first class citizen, we might ask them to > > > > > > > > > > > > add > > > some > > > > > > > verbiage > > > > > > > > > > > > to suggest leave the same offsets as PF open to help > > > emulation > > > > > > > software. > > > > > > > > > > > > > > > > > > > > > > Even if we know where to expose these capabilities on the > VF, > > > it's > > > > > not > > > > > > > > > > > clear to me how we can actually virtualize the capability > itself. > > > If > > > > > > > > > > > the spec defines, for example, an enable bit as r/w then > > > software > > > > > that > > > > > > > > > > > interacts with that register expects the bit is settable. > There's > > > no > > > > > > > > > > > protocol for "try to set the bit and re-read it to see if > > > > > > > > > > > the > > > hardware > > > > > > > > > > > accepted it". Therefore a capability with a fixed enable > > > > > > > > > > > bit > > > > > > > > > > > representing the state of the PF, not settable by the VF, > > > > > > > > > > > is > > > > > > > > > > > disingenuous to the spec. > > > > > > > > > > > > > > > > > > > > I think we are all in violent agreement. A lot of times the > > > > > > > > > > pci > spec > > > > > gets > > > > > > > > > > defined several years ahead of real products and no one > > > > > remembers > > > > > > > > > > the justification on why they restricted things the way they > did. > > > > > > > > > > > > > > > > > > > > Maybe someone early product wasn't quite exposing these > > > features > > > > > to > > > > > > > the > > > > > > > > > > VF > > > > > > > > > > and hence the spec is bug compatible :-) > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > If what we're trying to do is expose that PASID and PRI > > > > > > > > > > > are > > > enabled > > > > > on > > > > > > > > > > > the
Re: [PATCH v1 2/2] vfio/pci: Emulate PASID/PRI capability for VFs
On Tue, 14 Apr 2020 23:57:33 + "Tian, Kevin" wrote: > > From: Alex Williamson > > Sent: Tuesday, April 14, 2020 11:24 PM > > > > On Tue, 14 Apr 2020 03:42:42 + > > "Tian, Kevin" wrote: > > > > > > From: Alex Williamson > > > > Sent: Tuesday, April 14, 2020 11:29 AM > > > > > > > > On Tue, 14 Apr 2020 02:40:58 + > > > > "Tian, Kevin" wrote: > > > > > > > > > > From: Alex Williamson > > > > > > Sent: Tuesday, April 14, 2020 3:21 AM > > > > > > > > > > > > On Mon, 13 Apr 2020 08:05:33 + > > > > > > "Tian, Kevin" wrote: > > > > > > > > > > > > > > From: Tian, Kevin > > > > > > > > Sent: Monday, April 13, 2020 3:55 PM > > > > > > > > > > > > > > > > > From: Raj, Ashok > > > > > > > > > Sent: Monday, April 13, 2020 11:11 AM > > > > > > > > > > > > > > > > > > On Wed, Apr 08, 2020 at 10:19:40AM -0600, Alex Williamson > > wrote: > > > > > > > > > > On Tue, 7 Apr 2020 21:00:21 -0700 > > > > > > > > > > "Raj, Ashok" wrote: > > > > > > > > > > > > > > > > > > > > > Hi Alex > > > > > > > > > > > > > > > > > > > > > > + Bjorn > > > > > > > > > > > > > > > > > > > > + Don > > > > > > > > > > > > > > > > > > > > > FWIW I can't understand why PCI SIG went different ways > > with > > > > ATS, > > > > > > > > > > > where its enumerated on PF and VF. But for PASID and PRI > > > > > > > > > > > its > > > > only > > > > > > > > > > > in PF. > > > > > > > > > > > > > > > > > > > > > > I'm checking with our internal SIG reps to followup on > > > > > > > > > > > that. > > > > > > > > > > > > > > > > > > > > > > On Tue, Apr 07, 2020 at 09:58:01AM -0600, Alex Williamson > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > Is there vendor guarantee that hidden registers will > > > > > > > > > > > > > locate > > at > > > > the > > > > > > > > > > > > > same offset between PF and VF config space? > > > > > > > > > > > > > > > > > > > > > > > > I'm not sure if the spec really precludes hidden > > > > > > > > > > > > registers, > > but > > > > the > > > > > > > > > > > > fact that these registers are explicitly outside of the > > > > > > > > > > > > > > capability > > > > > > > > > > > > chain implies they're only intended for device specific > > > > > > > > > > > > use, > > so > > > > I'd > > > > > > say > > > > > > > > > > > > there are no guarantees about anything related to these > > > > > > > > > > > > > > > > registers. > > > > > > > > > > > > > > > > > > > > > > As you had suggested in the other thread, we could > > > > > > > > > > > consider > > > > > > > > > > > using the same offset as in PF, but even that's a better > > > > > > > > > > > guess > > > > > > > > > > > still not reliable. > > > > > > > > > > > > > > > > > > > > > > The other option is to maybe extend driver ops in the PF > > > > > > > > > > > to > > > > expose > > > > > > > > > > > where the offsets should be. Sort of adding the quirk in > > > > > > > > > > > the > > > > > > > > > > > implementation. > > > > > > > > > > > > > > > > > > > > > > I'm not sure how prevalent are PASID and PRI in VF > > > > > > > > > > > devices. If > > > > SIG is > > > > > > > > > resisting > > > > > > > > > > > making VF's first class citizen, we might ask them to add > > > > > > > > > > > > > some > > > > > > verbiage > > > > > > > > > > > to suggest leave the same offsets as PF open to help > > emulation > > > > > > software. > > > > > > > > > > > > > > > > > > > > Even if we know where to expose these capabilities on the > > > > > > > > > > VF, > > it's > > > > not > > > > > > > > > > clear to me how we can actually virtualize the capability > > > > > > > > > > itself. > > If > > > > > > > > > > the spec defines, for example, an enable bit as r/w then > > software > > > > that > > > > > > > > > > interacts with that register expects the bit is settable. > > > > > > > > > > There's > > no > > > > > > > > > > protocol for "try to set the bit and re-read it to see if > > > > > > > > > > the > > hardware > > > > > > > > > > accepted it". Therefore a capability with a fixed enable > > > > > > > > > > bit > > > > > > > > > > representing the state of the PF, not settable by the VF, is > > > > > > > > > > disingenuous to the spec. > > > > > > > > > > > > > > > > > > I think we are all in violent agreement. A lot of times the > > > > > > > > > pci spec > > > > gets > > > > > > > > > defined several years ahead of real products and no one > > > > remembers > > > > > > > > > the justification on why they restricted things the way they > > > > > > > > > did. > > > > > > > > > > > > > > > > > > Maybe someone early product wasn't quite exposing these > > features > > > > to > > > > > > the > > > > > > > > > VF > > > > > > > > > and hence the spec is bug compatible :-) > > > > > > > > > > > > > > > > > > > > > > > > > > > > > If what we're trying to do is expose that PASID and PRI are > > > > > > > > > >
RE: [PATCH v1 2/2] vfio/pci: Emulate PASID/PRI capability for VFs
> From: Alex Williamson > Sent: Tuesday, April 14, 2020 11:24 PM > > On Tue, 14 Apr 2020 03:42:42 + > "Tian, Kevin" wrote: > > > > From: Alex Williamson > > > Sent: Tuesday, April 14, 2020 11:29 AM > > > > > > On Tue, 14 Apr 2020 02:40:58 + > > > "Tian, Kevin" wrote: > > > > > > > > From: Alex Williamson > > > > > Sent: Tuesday, April 14, 2020 3:21 AM > > > > > > > > > > On Mon, 13 Apr 2020 08:05:33 + > > > > > "Tian, Kevin" wrote: > > > > > > > > > > > > From: Tian, Kevin > > > > > > > Sent: Monday, April 13, 2020 3:55 PM > > > > > > > > > > > > > > > From: Raj, Ashok > > > > > > > > Sent: Monday, April 13, 2020 11:11 AM > > > > > > > > > > > > > > > > On Wed, Apr 08, 2020 at 10:19:40AM -0600, Alex Williamson > wrote: > > > > > > > > > On Tue, 7 Apr 2020 21:00:21 -0700 > > > > > > > > > "Raj, Ashok" wrote: > > > > > > > > > > > > > > > > > > > Hi Alex > > > > > > > > > > > > > > > > > > > > + Bjorn > > > > > > > > > > > > > > > > > > + Don > > > > > > > > > > > > > > > > > > > FWIW I can't understand why PCI SIG went different ways > with > > > ATS, > > > > > > > > > > where its enumerated on PF and VF. But for PASID and PRI its > > > only > > > > > > > > > > in PF. > > > > > > > > > > > > > > > > > > > > I'm checking with our internal SIG reps to followup on that. > > > > > > > > > > > > > > > > > > > > On Tue, Apr 07, 2020 at 09:58:01AM -0600, Alex Williamson > > > wrote: > > > > > > > > > > > > Is there vendor guarantee that hidden registers will > > > > > > > > > > > > locate > at > > > the > > > > > > > > > > > > same offset between PF and VF config space? > > > > > > > > > > > > > > > > > > > > > > I'm not sure if the spec really precludes hidden > > > > > > > > > > > registers, > but > > > the > > > > > > > > > > > fact that these registers are explicitly outside of the > capability > > > > > > > > > > > chain implies they're only intended for device specific > > > > > > > > > > > use, > so > > > I'd > > > > > say > > > > > > > > > > > there are no guarantees about anything related to these > > > registers. > > > > > > > > > > > > > > > > > > > > As you had suggested in the other thread, we could consider > > > > > > > > > > using the same offset as in PF, but even that's a better > > > > > > > > > > guess > > > > > > > > > > still not reliable. > > > > > > > > > > > > > > > > > > > > The other option is to maybe extend driver ops in the PF to > > > expose > > > > > > > > > > where the offsets should be. Sort of adding the quirk in the > > > > > > > > > > implementation. > > > > > > > > > > > > > > > > > > > > I'm not sure how prevalent are PASID and PRI in VF devices. > > > > > > > > > > If > > > SIG is > > > > > > > > resisting > > > > > > > > > > making VF's first class citizen, we might ask them to add > some > > > > > verbiage > > > > > > > > > > to suggest leave the same offsets as PF open to help > emulation > > > > > software. > > > > > > > > > > > > > > > > > > Even if we know where to expose these capabilities on the VF, > it's > > > not > > > > > > > > > clear to me how we can actually virtualize the capability > > > > > > > > > itself. > If > > > > > > > > > the spec defines, for example, an enable bit as r/w then > software > > > that > > > > > > > > > interacts with that register expects the bit is settable. > > > > > > > > > There's > no > > > > > > > > > protocol for "try to set the bit and re-read it to see if the > hardware > > > > > > > > > accepted it". Therefore a capability with a fixed enable bit > > > > > > > > > representing the state of the PF, not settable by the VF, is > > > > > > > > > disingenuous to the spec. > > > > > > > > > > > > > > > > I think we are all in violent agreement. A lot of times the pci > > > > > > > > spec > > > gets > > > > > > > > defined several years ahead of real products and no one > > > remembers > > > > > > > > the justification on why they restricted things the way they > > > > > > > > did. > > > > > > > > > > > > > > > > Maybe someone early product wasn't quite exposing these > features > > > to > > > > > the > > > > > > > > VF > > > > > > > > and hence the spec is bug compatible :-) > > > > > > > > > > > > > > > > > > > > > > > > > > If what we're trying to do is expose that PASID and PRI are > enabled > > > on > > > > > > > > > the PF to a VF driver, maybe duplicating the PF capabilities > > > > > > > > > on > the > > > VF > > > > > > > > > without the ability to control it is not the right approach. > Maybe > > > we > > > > > > > > > > > > > > > > As long as the capability enable is only provided when the PF > > > > > > > > has > > > > > enabled > > > > > > > > the feature. Then it seems the hardware seems to do the right > thing. > > > > > > > > > > > > > > > > Assume we expose PASID/PRI only when PF has enabled it. It will > be > > > the > > > > > > > > case since the PF driver needs to exist, and IOMMU would have > set > > > the > > > > > > > > PASID/PRI/ATS on PF. > > > > > > > > > > > > > > > >
Re: [PATCH v1 2/2] vfio/pci: Emulate PASID/PRI capability for VFs
On Tue, 14 Apr 2020 03:42:42 + "Tian, Kevin" wrote: > > From: Alex Williamson > > Sent: Tuesday, April 14, 2020 11:29 AM > > > > On Tue, 14 Apr 2020 02:40:58 + > > "Tian, Kevin" wrote: > > > > > > From: Alex Williamson > > > > Sent: Tuesday, April 14, 2020 3:21 AM > > > > > > > > On Mon, 13 Apr 2020 08:05:33 + > > > > "Tian, Kevin" wrote: > > > > > > > > > > From: Tian, Kevin > > > > > > Sent: Monday, April 13, 2020 3:55 PM > > > > > > > > > > > > > From: Raj, Ashok > > > > > > > Sent: Monday, April 13, 2020 11:11 AM > > > > > > > > > > > > > > On Wed, Apr 08, 2020 at 10:19:40AM -0600, Alex Williamson wrote: > > > > > > > > On Tue, 7 Apr 2020 21:00:21 -0700 > > > > > > > > "Raj, Ashok" wrote: > > > > > > > > > > > > > > > > > Hi Alex > > > > > > > > > > > > > > > > > > + Bjorn > > > > > > > > > > > > > > > > + Don > > > > > > > > > > > > > > > > > FWIW I can't understand why PCI SIG went different ways with > > ATS, > > > > > > > > > where its enumerated on PF and VF. But for PASID and PRI its > > only > > > > > > > > > in PF. > > > > > > > > > > > > > > > > > > I'm checking with our internal SIG reps to followup on that. > > > > > > > > > > > > > > > > > > On Tue, Apr 07, 2020 at 09:58:01AM -0600, Alex Williamson > > wrote: > > > > > > > > > > > Is there vendor guarantee that hidden registers will > > > > > > > > > > > locate at > > the > > > > > > > > > > > same offset between PF and VF config space? > > > > > > > > > > > > > > > > > > > > I'm not sure if the spec really precludes hidden registers, > > > > > > > > > > but > > the > > > > > > > > > > fact that these registers are explicitly outside of the > > > > > > > > > > capability > > > > > > > > > > chain implies they're only intended for device specific > > > > > > > > > > use, so > > I'd > > > > say > > > > > > > > > > there are no guarantees about anything related to these > > registers. > > > > > > > > > > > > > > > > > > As you had suggested in the other thread, we could consider > > > > > > > > > using the same offset as in PF, but even that's a better guess > > > > > > > > > still not reliable. > > > > > > > > > > > > > > > > > > The other option is to maybe extend driver ops in the PF to > > expose > > > > > > > > > where the offsets should be. Sort of adding the quirk in the > > > > > > > > > implementation. > > > > > > > > > > > > > > > > > > I'm not sure how prevalent are PASID and PRI in VF devices. > > > > > > > > > If > > SIG is > > > > > > > resisting > > > > > > > > > making VF's first class citizen, we might ask them to add > > > > > > > > > some > > > > verbiage > > > > > > > > > to suggest leave the same offsets as PF open to help > > > > > > > > > emulation > > > > software. > > > > > > > > > > > > > > > > Even if we know where to expose these capabilities on the VF, > > > > > > > > it's > > not > > > > > > > > clear to me how we can actually virtualize the capability > > > > > > > > itself. If > > > > > > > > the spec defines, for example, an enable bit as r/w then > > > > > > > > software > > that > > > > > > > > interacts with that register expects the bit is settable. > > > > > > > > There's no > > > > > > > > protocol for "try to set the bit and re-read it to see if the > > > > > > > > hardware > > > > > > > > accepted it". Therefore a capability with a fixed enable bit > > > > > > > > representing the state of the PF, not settable by the VF, is > > > > > > > > disingenuous to the spec. > > > > > > > > > > > > > > I think we are all in violent agreement. A lot of times the pci > > > > > > > spec > > gets > > > > > > > defined several years ahead of real products and no one > > remembers > > > > > > > the justification on why they restricted things the way they did. > > > > > > > > > > > > > > Maybe someone early product wasn't quite exposing these features > > to > > > > the > > > > > > > VF > > > > > > > and hence the spec is bug compatible :-) > > > > > > > > > > > > > > > > > > > > > > > If what we're trying to do is expose that PASID and PRI are > > > > > > > > enabled > > on > > > > > > > > the PF to a VF driver, maybe duplicating the PF capabilities on > > > > > > > > the > > VF > > > > > > > > without the ability to control it is not the right approach. > > > > > > > > Maybe > > we > > > > > > > > > > > > > > As long as the capability enable is only provided when the PF has > > > > > > > > > > > enabled > > > > > > > the feature. Then it seems the hardware seems to do the right > > > > > > > thing. > > > > > > > > > > > > > > Assume we expose PASID/PRI only when PF has enabled it. It will > > > > > > > be > > the > > > > > > > case since the PF driver needs to exist, and IOMMU would have set > > > > > > > > > the > > > > > > > PASID/PRI/ATS on PF. > > > > > > > > > > > > > > If the emulation is purely spoofing the capability. Once vIOMMU > > driver > > >
RE: [PATCH v1 2/2] vfio/pci: Emulate PASID/PRI capability for VFs
> From: Alex Williamson > Sent: Tuesday, April 14, 2020 11:29 AM > > On Tue, 14 Apr 2020 02:40:58 + > "Tian, Kevin" wrote: > > > > From: Alex Williamson > > > Sent: Tuesday, April 14, 2020 3:21 AM > > > > > > On Mon, 13 Apr 2020 08:05:33 + > > > "Tian, Kevin" wrote: > > > > > > > > From: Tian, Kevin > > > > > Sent: Monday, April 13, 2020 3:55 PM > > > > > > > > > > > From: Raj, Ashok > > > > > > Sent: Monday, April 13, 2020 11:11 AM > > > > > > > > > > > > On Wed, Apr 08, 2020 at 10:19:40AM -0600, Alex Williamson wrote: > > > > > > > On Tue, 7 Apr 2020 21:00:21 -0700 > > > > > > > "Raj, Ashok" wrote: > > > > > > > > > > > > > > > Hi Alex > > > > > > > > > > > > > > > > + Bjorn > > > > > > > > > > > > > > + Don > > > > > > > > > > > > > > > FWIW I can't understand why PCI SIG went different ways with > ATS, > > > > > > > > where its enumerated on PF and VF. But for PASID and PRI its > only > > > > > > > > in PF. > > > > > > > > > > > > > > > > I'm checking with our internal SIG reps to followup on that. > > > > > > > > > > > > > > > > On Tue, Apr 07, 2020 at 09:58:01AM -0600, Alex Williamson > wrote: > > > > > > > > > > Is there vendor guarantee that hidden registers will locate > > > > > > > > > > at > the > > > > > > > > > > same offset between PF and VF config space? > > > > > > > > > > > > > > > > > > I'm not sure if the spec really precludes hidden registers, > > > > > > > > > but > the > > > > > > > > > fact that these registers are explicitly outside of the > > > > > > > > > capability > > > > > > > > > chain implies they're only intended for device specific use, > > > > > > > > > so > I'd > > > say > > > > > > > > > there are no guarantees about anything related to these > registers. > > > > > > > > > > > > > > > > As you had suggested in the other thread, we could consider > > > > > > > > using the same offset as in PF, but even that's a better guess > > > > > > > > still not reliable. > > > > > > > > > > > > > > > > The other option is to maybe extend driver ops in the PF to > expose > > > > > > > > where the offsets should be. Sort of adding the quirk in the > > > > > > > > implementation. > > > > > > > > > > > > > > > > I'm not sure how prevalent are PASID and PRI in VF devices. If > SIG is > > > > > > resisting > > > > > > > > making VF's first class citizen, we might ask them to add some > > > verbiage > > > > > > > > to suggest leave the same offsets as PF open to help emulation > > > software. > > > > > > > > > > > > > > Even if we know where to expose these capabilities on the VF, it's > not > > > > > > > clear to me how we can actually virtualize the capability itself. > > > > > > > If > > > > > > > the spec defines, for example, an enable bit as r/w then software > that > > > > > > > interacts with that register expects the bit is settable. > > > > > > > There's no > > > > > > > protocol for "try to set the bit and re-read it to see if the > > > > > > > hardware > > > > > > > accepted it". Therefore a capability with a fixed enable bit > > > > > > > representing the state of the PF, not settable by the VF, is > > > > > > > disingenuous to the spec. > > > > > > > > > > > > I think we are all in violent agreement. A lot of times the pci spec > gets > > > > > > defined several years ahead of real products and no one > remembers > > > > > > the justification on why they restricted things the way they did. > > > > > > > > > > > > Maybe someone early product wasn't quite exposing these features > to > > > the > > > > > > VF > > > > > > and hence the spec is bug compatible :-) > > > > > > > > > > > > > > > > > > > > If what we're trying to do is expose that PASID and PRI are > > > > > > > enabled > on > > > > > > > the PF to a VF driver, maybe duplicating the PF capabilities on > > > > > > > the > VF > > > > > > > without the ability to control it is not the right approach. > > > > > > > Maybe > we > > > > > > > > > > > > As long as the capability enable is only provided when the PF has > > > enabled > > > > > > the feature. Then it seems the hardware seems to do the right thing. > > > > > > > > > > > > Assume we expose PASID/PRI only when PF has enabled it. It will be > the > > > > > > case since the PF driver needs to exist, and IOMMU would have set > the > > > > > > PASID/PRI/ATS on PF. > > > > > > > > > > > > If the emulation is purely spoofing the capability. Once vIOMMU > driver > > > > > > enables PASID, the context entries for the VF are completely > > > independent > > > > > > from the PF context entries. > > > > > > > > > > > > vIOMMU would enable PASID, and we just spoof the PASID > capability. > > > > > > > > > > > > If vIOMMU or guest for some reason does disable_pasid(), then the > > > > > > vIOMMU driver can disaable PASID on the VF context entries. So the > VF > > > > > > although the capability is blanket enabled on PF, IOMMU gaurantees > > > the > > > > > > transactions are blocked. > > > > > > > > > > > > > > > > > > In the interim, it seems
Re: [PATCH v1 2/2] vfio/pci: Emulate PASID/PRI capability for VFs
On Tue, 14 Apr 2020 02:40:58 + "Tian, Kevin" wrote: > > From: Alex Williamson > > Sent: Tuesday, April 14, 2020 3:21 AM > > > > On Mon, 13 Apr 2020 08:05:33 + > > "Tian, Kevin" wrote: > > > > > > From: Tian, Kevin > > > > Sent: Monday, April 13, 2020 3:55 PM > > > > > > > > > From: Raj, Ashok > > > > > Sent: Monday, April 13, 2020 11:11 AM > > > > > > > > > > On Wed, Apr 08, 2020 at 10:19:40AM -0600, Alex Williamson wrote: > > > > > > On Tue, 7 Apr 2020 21:00:21 -0700 > > > > > > "Raj, Ashok" wrote: > > > > > > > > > > > > > Hi Alex > > > > > > > > > > > > > > + Bjorn > > > > > > > > > > > > + Don > > > > > > > > > > > > > FWIW I can't understand why PCI SIG went different ways with ATS, > > > > > > > where its enumerated on PF and VF. But for PASID and PRI its only > > > > > > > in PF. > > > > > > > > > > > > > > I'm checking with our internal SIG reps to followup on that. > > > > > > > > > > > > > > On Tue, Apr 07, 2020 at 09:58:01AM -0600, Alex Williamson wrote: > > > > > > > > > Is there vendor guarantee that hidden registers will locate > > > > > > > > > at the > > > > > > > > > same offset between PF and VF config space? > > > > > > > > > > > > > > > > I'm not sure if the spec really precludes hidden registers, but > > > > > > > > the > > > > > > > > fact that these registers are explicitly outside of the > > > > > > > > capability > > > > > > > > chain implies they're only intended for device specific use, so > > > > > > > > I'd > > say > > > > > > > > there are no guarantees about anything related to these > > > > > > > > registers. > > > > > > > > > > > > > > As you had suggested in the other thread, we could consider > > > > > > > using the same offset as in PF, but even that's a better guess > > > > > > > still not reliable. > > > > > > > > > > > > > > The other option is to maybe extend driver ops in the PF to expose > > > > > > > where the offsets should be. Sort of adding the quirk in the > > > > > > > implementation. > > > > > > > > > > > > > > I'm not sure how prevalent are PASID and PRI in VF devices. If > > > > > > > SIG is > > > > > resisting > > > > > > > making VF's first class citizen, we might ask them to add some > > verbiage > > > > > > > to suggest leave the same offsets as PF open to help emulation > > software. > > > > > > > > > > > > Even if we know where to expose these capabilities on the VF, it's > > > > > > not > > > > > > clear to me how we can actually virtualize the capability itself. > > > > > > If > > > > > > the spec defines, for example, an enable bit as r/w then software > > > > > > that > > > > > > interacts with that register expects the bit is settable. There's > > > > > > no > > > > > > protocol for "try to set the bit and re-read it to see if the > > > > > > hardware > > > > > > accepted it". Therefore a capability with a fixed enable bit > > > > > > representing the state of the PF, not settable by the VF, is > > > > > > disingenuous to the spec. > > > > > > > > > > I think we are all in violent agreement. A lot of times the pci spec > > > > > gets > > > > > defined several years ahead of real products and no one remembers > > > > > the justification on why they restricted things the way they did. > > > > > > > > > > Maybe someone early product wasn't quite exposing these features to > > the > > > > > VF > > > > > and hence the spec is bug compatible :-) > > > > > > > > > > > > > > > > > If what we're trying to do is expose that PASID and PRI are enabled > > > > > > on > > > > > > the PF to a VF driver, maybe duplicating the PF capabilities on the > > > > > > VF > > > > > > without the ability to control it is not the right approach. Maybe > > > > > > we > > > > > > > > > > As long as the capability enable is only provided when the PF has > > enabled > > > > > the feature. Then it seems the hardware seems to do the right thing. > > > > > > > > > > Assume we expose PASID/PRI only when PF has enabled it. It will be the > > > > > case since the PF driver needs to exist, and IOMMU would have set the > > > > > PASID/PRI/ATS on PF. > > > > > > > > > > If the emulation is purely spoofing the capability. Once vIOMMU driver > > > > > enables PASID, the context entries for the VF are completely > > independent > > > > > from the PF context entries. > > > > > > > > > > vIOMMU would enable PASID, and we just spoof the PASID capability. > > > > > > > > > > If vIOMMU or guest for some reason does disable_pasid(), then the > > > > > vIOMMU driver can disaable PASID on the VF context entries. So the VF > > > > > although the capability is blanket enabled on PF, IOMMU gaurantees > > the > > > > > transactions are blocked. > > > > > > > > > > > > > > > In the interim, it seems like the intent of the virtual capability > > > > > can be honored via help from the IOMMU for the controlling aspect.. > > > > > > > > > > Did i miss anything? > > > > > > > > Above works for emulating the
RE: [PATCH v1 2/2] vfio/pci: Emulate PASID/PRI capability for VFs
> From: Alex Williamson > Sent: Tuesday, April 14, 2020 3:21 AM > > On Mon, 13 Apr 2020 08:05:33 + > "Tian, Kevin" wrote: > > > > From: Tian, Kevin > > > Sent: Monday, April 13, 2020 3:55 PM > > > > > > > From: Raj, Ashok > > > > Sent: Monday, April 13, 2020 11:11 AM > > > > > > > > On Wed, Apr 08, 2020 at 10:19:40AM -0600, Alex Williamson wrote: > > > > > On Tue, 7 Apr 2020 21:00:21 -0700 > > > > > "Raj, Ashok" wrote: > > > > > > > > > > > Hi Alex > > > > > > > > > > > > + Bjorn > > > > > > > > > > + Don > > > > > > > > > > > FWIW I can't understand why PCI SIG went different ways with ATS, > > > > > > where its enumerated on PF and VF. But for PASID and PRI its only > > > > > > in PF. > > > > > > > > > > > > I'm checking with our internal SIG reps to followup on that. > > > > > > > > > > > > On Tue, Apr 07, 2020 at 09:58:01AM -0600, Alex Williamson wrote: > > > > > > > > Is there vendor guarantee that hidden registers will locate at > > > > > > > > the > > > > > > > > same offset between PF and VF config space? > > > > > > > > > > > > > > I'm not sure if the spec really precludes hidden registers, but > > > > > > > the > > > > > > > fact that these registers are explicitly outside of the capability > > > > > > > chain implies they're only intended for device specific use, so > > > > > > > I'd > say > > > > > > > there are no guarantees about anything related to these registers. > > > > > > > > > > > > As you had suggested in the other thread, we could consider > > > > > > using the same offset as in PF, but even that's a better guess > > > > > > still not reliable. > > > > > > > > > > > > The other option is to maybe extend driver ops in the PF to expose > > > > > > where the offsets should be. Sort of adding the quirk in the > > > > > > implementation. > > > > > > > > > > > > I'm not sure how prevalent are PASID and PRI in VF devices. If SIG > > > > > > is > > > > resisting > > > > > > making VF's first class citizen, we might ask them to add some > verbiage > > > > > > to suggest leave the same offsets as PF open to help emulation > software. > > > > > > > > > > Even if we know where to expose these capabilities on the VF, it's not > > > > > clear to me how we can actually virtualize the capability itself. If > > > > > the spec defines, for example, an enable bit as r/w then software that > > > > > interacts with that register expects the bit is settable. There's no > > > > > protocol for "try to set the bit and re-read it to see if the hardware > > > > > accepted it". Therefore a capability with a fixed enable bit > > > > > representing the state of the PF, not settable by the VF, is > > > > > disingenuous to the spec. > > > > > > > > I think we are all in violent agreement. A lot of times the pci spec > > > > gets > > > > defined several years ahead of real products and no one remembers > > > > the justification on why they restricted things the way they did. > > > > > > > > Maybe someone early product wasn't quite exposing these features to > the > > > > VF > > > > and hence the spec is bug compatible :-) > > > > > > > > > > > > > > If what we're trying to do is expose that PASID and PRI are enabled on > > > > > the PF to a VF driver, maybe duplicating the PF capabilities on the VF > > > > > without the ability to control it is not the right approach. Maybe we > > > > > > > > As long as the capability enable is only provided when the PF has > enabled > > > > the feature. Then it seems the hardware seems to do the right thing. > > > > > > > > Assume we expose PASID/PRI only when PF has enabled it. It will be the > > > > case since the PF driver needs to exist, and IOMMU would have set the > > > > PASID/PRI/ATS on PF. > > > > > > > > If the emulation is purely spoofing the capability. Once vIOMMU driver > > > > enables PASID, the context entries for the VF are completely > independent > > > > from the PF context entries. > > > > > > > > vIOMMU would enable PASID, and we just spoof the PASID capability. > > > > > > > > If vIOMMU or guest for some reason does disable_pasid(), then the > > > > vIOMMU driver can disaable PASID on the VF context entries. So the VF > > > > although the capability is blanket enabled on PF, IOMMU gaurantees > the > > > > transactions are blocked. > > > > > > > > > > > > In the interim, it seems like the intent of the virtual capability > > > > can be honored via help from the IOMMU for the controlling aspect.. > > > > > > > > Did i miss anything? > > > > > > Above works for emulating the enable bit (under the assumption that > > > PF driver won't disable pasid when vf is assigned). However, there are > > > also "Execute permission enable" and "Privileged mode enable" bits in > > > PASID control registers. I don't know how those bits could be cleanly > > > emulated when the guest writes a value different from PF's... > > > > sent too quick. the IOMMU also includes control bits for allowing/ > > blocking execute requests and supervisor requests. We can rely
Re: [PATCH v1 2/2] vfio/pci: Emulate PASID/PRI capability for VFs
On Thu, 9 Apr 2020 09:35:33 +0200 Jean-Philippe Brucker wrote: > On Wed, Apr 08, 2020 at 10:19:40AM -0600, Alex Williamson wrote: > > On Tue, 7 Apr 2020 21:00:21 -0700 > > "Raj, Ashok" wrote: > > > > > Hi Alex > > > > > > + Bjorn > > > > + Don > > > > > FWIW I can't understand why PCI SIG went different ways with ATS, > > > where its enumerated on PF and VF. But for PASID and PRI its only > > > in PF. > > > > > > I'm checking with our internal SIG reps to followup on that. > > > > > > On Tue, Apr 07, 2020 at 09:58:01AM -0600, Alex Williamson wrote: > > > > > Is there vendor guarantee that hidden registers will locate at the > > > > > same offset between PF and VF config space? > > > > > > > > I'm not sure if the spec really precludes hidden registers, but the > > > > fact that these registers are explicitly outside of the capability > > > > chain implies they're only intended for device specific use, so I'd say > > > > there are no guarantees about anything related to these registers. > > > > > > As you had suggested in the other thread, we could consider > > > using the same offset as in PF, but even that's a better guess > > > still not reliable. > > > > > > The other option is to maybe extend driver ops in the PF to expose > > > where the offsets should be. Sort of adding the quirk in the > > > implementation. > > > > > > I'm not sure how prevalent are PASID and PRI in VF devices. If SIG is > > > resisting > > > making VF's first class citizen, we might ask them to add some verbiage > > > to suggest leave the same offsets as PF open to help emulation software. > > > > Even if we know where to expose these capabilities on the VF, it's not > > clear to me how we can actually virtualize the capability itself. If > > the spec defines, for example, an enable bit as r/w then software that > > interacts with that register expects the bit is settable. There's no > > protocol for "try to set the bit and re-read it to see if the hardware > > accepted it". Therefore a capability with a fixed enable bit > > representing the state of the PF, not settable by the VF, is > > disingenuous to the spec. > > Would it be OK to implement a lock down mechanism for the PF PASID > capability, preventing changes to the PF cap when the VF is in use by > VFIO? The emulation would still break the spec: since the PF cap would > always be enabled the VF configuration bits would have no effect, but it > seems preferable to having the Enable bit not enable anything. I think we absolutely need some mechanism to make sure the PF driver doesn't change the PASID enable bit while we're using it. A PASID user registration perhaps. And yes, that doesn't necessarily map to being able to actually disable PASID, but it sounds like Kevin and Ashok have some ideas how the emulation could use the IOMMU settings to achieve that more precisely. > > If what we're trying to do is expose that PASID and PRI are enabled on > > the PF to a VF driver, maybe duplicating the PF capabilities on the VF > > without the ability to control it is not the right approach. Maybe we > > need new capabilities exposing these as slave features that cannot be > > controlled? We could define our own vendor capability for this, but of > > course we have both the where to put it in config space issue, as well > > as the issue of trying to push an ad-hoc standard. vfio could expose > > these as device features rather than emulating capabilities, but that > > still leaves a big gap between vfio in the hypervisor and the driver in > > the guest VM. That might still help push the responsibility and policy > > for how to expose it to the VM as a userspace problem though. > > Userspace might have more difficulty working around the issues mentioned > in this thread. They would still need a guarantee that the PF PASID > configuration doesn't change at runtime, and they wouldn't have the > ability to talk to a vendor driver to figure out where to place the fake > PASID capability. Couldn't we do this with the DEVICE_FEATURE ioctl we just added? A user could set a PASID user or get a capability offset, where vfio-pci would plumb these through to whatever PF/vendor driver provides that interface, just as if it was implemented in the vfio-pci driver. But that still lets us leave the policy of inserting a capability and using the IOMMU to implement the bits to the user/QEMU. Thanks, Alex ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v1 2/2] vfio/pci: Emulate PASID/PRI capability for VFs
On Mon, 13 Apr 2020 08:05:33 + "Tian, Kevin" wrote: > > From: Tian, Kevin > > Sent: Monday, April 13, 2020 3:55 PM > > > > > From: Raj, Ashok > > > Sent: Monday, April 13, 2020 11:11 AM > > > > > > On Wed, Apr 08, 2020 at 10:19:40AM -0600, Alex Williamson wrote: > > > > On Tue, 7 Apr 2020 21:00:21 -0700 > > > > "Raj, Ashok" wrote: > > > > > > > > > Hi Alex > > > > > > > > > > + Bjorn > > > > > > > > + Don > > > > > > > > > FWIW I can't understand why PCI SIG went different ways with ATS, > > > > > where its enumerated on PF and VF. But for PASID and PRI its only > > > > > in PF. > > > > > > > > > > I'm checking with our internal SIG reps to followup on that. > > > > > > > > > > On Tue, Apr 07, 2020 at 09:58:01AM -0600, Alex Williamson wrote: > > > > > > > Is there vendor guarantee that hidden registers will locate at the > > > > > > > same offset between PF and VF config space? > > > > > > > > > > > > I'm not sure if the spec really precludes hidden registers, but the > > > > > > fact that these registers are explicitly outside of the capability > > > > > > chain implies they're only intended for device specific use, so I'd > > > > > > say > > > > > > there are no guarantees about anything related to these registers. > > > > > > > > > > As you had suggested in the other thread, we could consider > > > > > using the same offset as in PF, but even that's a better guess > > > > > still not reliable. > > > > > > > > > > The other option is to maybe extend driver ops in the PF to expose > > > > > where the offsets should be. Sort of adding the quirk in the > > > > > implementation. > > > > > > > > > > I'm not sure how prevalent are PASID and PRI in VF devices. If SIG is > > > > > > > > resisting > > > > > making VF's first class citizen, we might ask them to add some > > > > > verbiage > > > > > to suggest leave the same offsets as PF open to help emulation > > > > > software. > > > > > > > > Even if we know where to expose these capabilities on the VF, it's not > > > > clear to me how we can actually virtualize the capability itself. If > > > > the spec defines, for example, an enable bit as r/w then software that > > > > interacts with that register expects the bit is settable. There's no > > > > protocol for "try to set the bit and re-read it to see if the hardware > > > > accepted it". Therefore a capability with a fixed enable bit > > > > representing the state of the PF, not settable by the VF, is > > > > disingenuous to the spec. > > > > > > I think we are all in violent agreement. A lot of times the pci spec gets > > > defined several years ahead of real products and no one remembers > > > the justification on why they restricted things the way they did. > > > > > > Maybe someone early product wasn't quite exposing these features to the > > > VF > > > and hence the spec is bug compatible :-) > > > > > > > > > > > If what we're trying to do is expose that PASID and PRI are enabled on > > > > the PF to a VF driver, maybe duplicating the PF capabilities on the VF > > > > without the ability to control it is not the right approach. Maybe we > > > > > > As long as the capability enable is only provided when the PF has enabled > > > the feature. Then it seems the hardware seems to do the right thing. > > > > > > Assume we expose PASID/PRI only when PF has enabled it. It will be the > > > case since the PF driver needs to exist, and IOMMU would have set the > > > PASID/PRI/ATS on PF. > > > > > > If the emulation is purely spoofing the capability. Once vIOMMU driver > > > enables PASID, the context entries for the VF are completely independent > > > from the PF context entries. > > > > > > vIOMMU would enable PASID, and we just spoof the PASID capability. > > > > > > If vIOMMU or guest for some reason does disable_pasid(), then the > > > vIOMMU driver can disaable PASID on the VF context entries. So the VF > > > although the capability is blanket enabled on PF, IOMMU gaurantees the > > > transactions are blocked. > > > > > > > > > In the interim, it seems like the intent of the virtual capability > > > can be honored via help from the IOMMU for the controlling aspect.. > > > > > > Did i miss anything? > > > > Above works for emulating the enable bit (under the assumption that > > PF driver won't disable pasid when vf is assigned). However, there are > > also "Execute permission enable" and "Privileged mode enable" bits in > > PASID control registers. I don't know how those bits could be cleanly > > emulated when the guest writes a value different from PF's... > > sent too quick. the IOMMU also includes control bits for allowing/ > blocking execute requests and supervisor requests. We can rely on > IOMMU to block those requests to emulate the disabled cases of > all three control bits in the pasid cap. So if the emulation of the PASID capability takes into account the IOMMU configuration to back that emulation, shouldn't we do that emulation in the
Re: [PATCH v1 2/2] vfio/pci: Emulate PASID/PRI capability for VFs
On Sun, 12 Apr 2020 20:29:31 -0700 "Raj, Ashok" wrote: > Hi Alex > > Going through the PCIe Spec, there seems a lot of such capabilities > that are different between PF and VF. Some that make sense > and some don't. > > > On Sun, Apr 12, 2020 at 08:10:43PM -0700, Raj, Ashok wrote: > > > > > > > > I agree though, I don't know why the SIG would preclude implementing > > > per VF control of these features. Thanks, > > > > > For e.g. > > VF doesn't have I/O and Mem space enables, but has BME VFs don't have I/O, so I/O enable is irrelevant. The memory enable bit is emulated, so it doesn't really do anything from the VM perspective. The hypervisor could provide more emulation around this, but it hasn't proven necessary. > Interrupt Status VFs don't have INTx, so this is irrelevant. > Correctable Error Reporting > Almost all of Device Control Register. Are we doing anything to virtualize these for VFs? I think we've addressed access control to these for PFs, but I don't see that we try to virtualize them for the VF. > So it seems like there is a ton of them we have to deal with today for > VF's. How do we manage to emulate them without any support for them > in VF's? The memory enable bit is just access to the MMIO space of the device, the hypervisor could choose to do more, but currently emulating the bit itself is sufficient. This doesn't really affect the device, just access to the device. The device control registers, I don't think we've had a need to virtualize them yet and I think we'd run into many of the same questions. If your point is that there exists gaps in the spec that make things difficult to virtualize, I won't argue with you there. MPS is a nearby one that's difficult to virtualize on the PF since its setting needs to take entire communication channels into account. So far though we aren't inventing new capabilities to add to VF config space and pretending they work, we're just stumbling on what the VF exposes whether on bare metal or in a VM. Thanks, Alex ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [PATCH v1 2/2] vfio/pci: Emulate PASID/PRI capability for VFs
> From: Tian, Kevin > Sent: Monday, April 13, 2020 3:55 PM > > > From: Raj, Ashok > > Sent: Monday, April 13, 2020 11:11 AM > > > > On Wed, Apr 08, 2020 at 10:19:40AM -0600, Alex Williamson wrote: > > > On Tue, 7 Apr 2020 21:00:21 -0700 > > > "Raj, Ashok" wrote: > > > > > > > Hi Alex > > > > > > > > + Bjorn > > > > > > + Don > > > > > > > FWIW I can't understand why PCI SIG went different ways with ATS, > > > > where its enumerated on PF and VF. But for PASID and PRI its only > > > > in PF. > > > > > > > > I'm checking with our internal SIG reps to followup on that. > > > > > > > > On Tue, Apr 07, 2020 at 09:58:01AM -0600, Alex Williamson wrote: > > > > > > Is there vendor guarantee that hidden registers will locate at the > > > > > > same offset between PF and VF config space? > > > > > > > > > > I'm not sure if the spec really precludes hidden registers, but the > > > > > fact that these registers are explicitly outside of the capability > > > > > chain implies they're only intended for device specific use, so I'd > > > > > say > > > > > there are no guarantees about anything related to these registers. > > > > > > > > As you had suggested in the other thread, we could consider > > > > using the same offset as in PF, but even that's a better guess > > > > still not reliable. > > > > > > > > The other option is to maybe extend driver ops in the PF to expose > > > > where the offsets should be. Sort of adding the quirk in the > > > > implementation. > > > > > > > > I'm not sure how prevalent are PASID and PRI in VF devices. If SIG is > > resisting > > > > making VF's first class citizen, we might ask them to add some verbiage > > > > to suggest leave the same offsets as PF open to help emulation software. > > > > > > Even if we know where to expose these capabilities on the VF, it's not > > > clear to me how we can actually virtualize the capability itself. If > > > the spec defines, for example, an enable bit as r/w then software that > > > interacts with that register expects the bit is settable. There's no > > > protocol for "try to set the bit and re-read it to see if the hardware > > > accepted it". Therefore a capability with a fixed enable bit > > > representing the state of the PF, not settable by the VF, is > > > disingenuous to the spec. > > > > I think we are all in violent agreement. A lot of times the pci spec gets > > defined several years ahead of real products and no one remembers > > the justification on why they restricted things the way they did. > > > > Maybe someone early product wasn't quite exposing these features to the > > VF > > and hence the spec is bug compatible :-) > > > > > > > > If what we're trying to do is expose that PASID and PRI are enabled on > > > the PF to a VF driver, maybe duplicating the PF capabilities on the VF > > > without the ability to control it is not the right approach. Maybe we > > > > As long as the capability enable is only provided when the PF has enabled > > the feature. Then it seems the hardware seems to do the right thing. > > > > Assume we expose PASID/PRI only when PF has enabled it. It will be the > > case since the PF driver needs to exist, and IOMMU would have set the > > PASID/PRI/ATS on PF. > > > > If the emulation is purely spoofing the capability. Once vIOMMU driver > > enables PASID, the context entries for the VF are completely independent > > from the PF context entries. > > > > vIOMMU would enable PASID, and we just spoof the PASID capability. > > > > If vIOMMU or guest for some reason does disable_pasid(), then the > > vIOMMU driver can disaable PASID on the VF context entries. So the VF > > although the capability is blanket enabled on PF, IOMMU gaurantees the > > transactions are blocked. > > > > > > In the interim, it seems like the intent of the virtual capability > > can be honored via help from the IOMMU for the controlling aspect.. > > > > Did i miss anything? > > Above works for emulating the enable bit (under the assumption that > PF driver won't disable pasid when vf is assigned). However, there are > also "Execute permission enable" and "Privileged mode enable" bits in > PASID control registers. I don't know how those bits could be cleanly > emulated when the guest writes a value different from PF's... sent too quick. the IOMMU also includes control bits for allowing/ blocking execute requests and supervisor requests. We can rely on IOMMU to block those requests to emulate the disabled cases of all three control bits in the pasid cap. Thanks Kevin > > Similar problem also exists when talking about PRI emulation, e.g. > to enable PRI the software usually waits until the 'stopped' bit > is set (indicating all previously issued requests have completed). How > to emulate this bit accurately when one guest toggles the enable bit > while the PF and other VFs are actively issuing page requests through > the shared page request interface? from pcie spec I didn't find a way > to catch when all previously-issued
RE: [PATCH v1 2/2] vfio/pci: Emulate PASID/PRI capability for VFs
> From: Raj, Ashok > Sent: Monday, April 13, 2020 11:11 AM > > On Wed, Apr 08, 2020 at 10:19:40AM -0600, Alex Williamson wrote: > > On Tue, 7 Apr 2020 21:00:21 -0700 > > "Raj, Ashok" wrote: > > > > > Hi Alex > > > > > > + Bjorn > > > > + Don > > > > > FWIW I can't understand why PCI SIG went different ways with ATS, > > > where its enumerated on PF and VF. But for PASID and PRI its only > > > in PF. > > > > > > I'm checking with our internal SIG reps to followup on that. > > > > > > On Tue, Apr 07, 2020 at 09:58:01AM -0600, Alex Williamson wrote: > > > > > Is there vendor guarantee that hidden registers will locate at the > > > > > same offset between PF and VF config space? > > > > > > > > I'm not sure if the spec really precludes hidden registers, but the > > > > fact that these registers are explicitly outside of the capability > > > > chain implies they're only intended for device specific use, so I'd say > > > > there are no guarantees about anything related to these registers. > > > > > > As you had suggested in the other thread, we could consider > > > using the same offset as in PF, but even that's a better guess > > > still not reliable. > > > > > > The other option is to maybe extend driver ops in the PF to expose > > > where the offsets should be. Sort of adding the quirk in the > > > implementation. > > > > > > I'm not sure how prevalent are PASID and PRI in VF devices. If SIG is > resisting > > > making VF's first class citizen, we might ask them to add some verbiage > > > to suggest leave the same offsets as PF open to help emulation software. > > > > Even if we know where to expose these capabilities on the VF, it's not > > clear to me how we can actually virtualize the capability itself. If > > the spec defines, for example, an enable bit as r/w then software that > > interacts with that register expects the bit is settable. There's no > > protocol for "try to set the bit and re-read it to see if the hardware > > accepted it". Therefore a capability with a fixed enable bit > > representing the state of the PF, not settable by the VF, is > > disingenuous to the spec. > > I think we are all in violent agreement. A lot of times the pci spec gets > defined several years ahead of real products and no one remembers > the justification on why they restricted things the way they did. > > Maybe someone early product wasn't quite exposing these features to the > VF > and hence the spec is bug compatible :-) > > > > > If what we're trying to do is expose that PASID and PRI are enabled on > > the PF to a VF driver, maybe duplicating the PF capabilities on the VF > > without the ability to control it is not the right approach. Maybe we > > As long as the capability enable is only provided when the PF has enabled > the feature. Then it seems the hardware seems to do the right thing. > > Assume we expose PASID/PRI only when PF has enabled it. It will be the > case since the PF driver needs to exist, and IOMMU would have set the > PASID/PRI/ATS on PF. > > If the emulation is purely spoofing the capability. Once vIOMMU driver > enables PASID, the context entries for the VF are completely independent > from the PF context entries. > > vIOMMU would enable PASID, and we just spoof the PASID capability. > > If vIOMMU or guest for some reason does disable_pasid(), then the > vIOMMU driver can disaable PASID on the VF context entries. So the VF > although the capability is blanket enabled on PF, IOMMU gaurantees the > transactions are blocked. > > > In the interim, it seems like the intent of the virtual capability > can be honored via help from the IOMMU for the controlling aspect.. > > Did i miss anything? Above works for emulating the enable bit (under the assumption that PF driver won't disable pasid when vf is assigned). However, there are also "Execute permission enable" and "Privileged mode enable" bits in PASID control registers. I don't know how those bits could be cleanly emulated when the guest writes a value different from PF's... Similar problem also exists when talking about PRI emulation, e.g. to enable PRI the software usually waits until the 'stopped' bit is set (indicating all previously issued requests have completed). How to emulate this bit accurately when one guest toggles the enable bit while the PF and other VFs are actively issuing page requests through the shared page request interface? from pcie spec I didn't find a way to catch when all previously-issued requests from a specific VF have completed. Can a conservative big-enough timeout value help here? I don't know... similar puzzle also exists for emulating the 'reset' control bit which is supposed to clear the pending request state for the whole page request interface. I feel the main problem in pcie spec is that, while they invent SR-IOV to address I/O virtualization requirement (where strict isolation is required), they blurred the boundary by leaving some shared resource cross PF and VFs which imply
Re: [PATCH v1 2/2] vfio/pci: Emulate PASID/PRI capability for VFs
Hi Alex Going through the PCIe Spec, there seems a lot of such capabilities that are different between PF and VF. Some that make sense and some don't. On Sun, Apr 12, 2020 at 08:10:43PM -0700, Raj, Ashok wrote: > > > > > I agree though, I don't know why the SIG would preclude implementing > > per VF control of these features. Thanks, > > For e.g. VF doesn't have I/O and Mem space enables, but has BME Interrupt Status Correctable Error Reporting Almost all of Device Control Register. So it seems like there is a ton of them we have to deal with today for VF's. How do we manage to emulate them without any support for them in VF's? > > Cheers, > Ashok ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v1 2/2] vfio/pci: Emulate PASID/PRI capability for VFs
On Wed, Apr 08, 2020 at 10:19:40AM -0600, Alex Williamson wrote: > On Tue, 7 Apr 2020 21:00:21 -0700 > "Raj, Ashok" wrote: > > > Hi Alex > > > > + Bjorn > > + Don > > > FWIW I can't understand why PCI SIG went different ways with ATS, > > where its enumerated on PF and VF. But for PASID and PRI its only > > in PF. > > > > I'm checking with our internal SIG reps to followup on that. > > > > On Tue, Apr 07, 2020 at 09:58:01AM -0600, Alex Williamson wrote: > > > > Is there vendor guarantee that hidden registers will locate at the > > > > same offset between PF and VF config space? > > > > > > I'm not sure if the spec really precludes hidden registers, but the > > > fact that these registers are explicitly outside of the capability > > > chain implies they're only intended for device specific use, so I'd say > > > there are no guarantees about anything related to these registers. > > > > As you had suggested in the other thread, we could consider > > using the same offset as in PF, but even that's a better guess > > still not reliable. > > > > The other option is to maybe extend driver ops in the PF to expose > > where the offsets should be. Sort of adding the quirk in the > > implementation. > > > > I'm not sure how prevalent are PASID and PRI in VF devices. If SIG is > > resisting > > making VF's first class citizen, we might ask them to add some verbiage > > to suggest leave the same offsets as PF open to help emulation software. > > Even if we know where to expose these capabilities on the VF, it's not > clear to me how we can actually virtualize the capability itself. If > the spec defines, for example, an enable bit as r/w then software that > interacts with that register expects the bit is settable. There's no > protocol for "try to set the bit and re-read it to see if the hardware > accepted it". Therefore a capability with a fixed enable bit > representing the state of the PF, not settable by the VF, is > disingenuous to the spec. I think we are all in violent agreement. A lot of times the pci spec gets defined several years ahead of real products and no one remembers the justification on why they restricted things the way they did. Maybe someone early product wasn't quite exposing these features to the VF and hence the spec is bug compatible :-) > > If what we're trying to do is expose that PASID and PRI are enabled on > the PF to a VF driver, maybe duplicating the PF capabilities on the VF > without the ability to control it is not the right approach. Maybe we As long as the capability enable is only provided when the PF has enabled the feature. Then it seems the hardware seems to do the right thing. Assume we expose PASID/PRI only when PF has enabled it. It will be the case since the PF driver needs to exist, and IOMMU would have set the PASID/PRI/ATS on PF. If the emulation is purely spoofing the capability. Once vIOMMU driver enables PASID, the context entries for the VF are completely independent from the PF context entries. vIOMMU would enable PASID, and we just spoof the PASID capability. If vIOMMU or guest for some reason does disable_pasid(), then the vIOMMU driver can disaable PASID on the VF context entries. So the VF although the capability is blanket enabled on PF, IOMMU gaurantees the transactions are blocked. In the interim, it seems like the intent of the virtual capability can be honored via help from the IOMMU for the controlling aspect.. Did i miss anything? > need new capabilities exposing these as slave features that cannot be > controlled? We could define our own vendor capability for this, but of > course we have both the where to put it in config space issue, as well > as the issue of trying to push an ad-hoc standard. vfio could expose > these as device features rather than emulating capabilities, but that > still leaves a big gap between vfio in the hypervisor and the driver in > the guest VM. That might still help push the responsibility and policy > for how to expose it to the VM as a userspace problem though. I think this is a good long term solution, but if the vIOMMU implenentations can carry us for the time being, we can probably defer them unless we are stuck. > > I agree though, I don't know why the SIG would preclude implementing > per VF control of these features. Thanks, > Cheers, Ashok ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v1 2/2] vfio/pci: Emulate PASID/PRI capability for VFs
On Wed, Apr 08, 2020 at 10:19:40AM -0600, Alex Williamson wrote: > On Tue, 7 Apr 2020 21:00:21 -0700 > "Raj, Ashok" wrote: > > > Hi Alex > > > > + Bjorn > > + Don > > > FWIW I can't understand why PCI SIG went different ways with ATS, > > where its enumerated on PF and VF. But for PASID and PRI its only > > in PF. > > > > I'm checking with our internal SIG reps to followup on that. > > > > On Tue, Apr 07, 2020 at 09:58:01AM -0600, Alex Williamson wrote: > > > > Is there vendor guarantee that hidden registers will locate at the > > > > same offset between PF and VF config space? > > > > > > I'm not sure if the spec really precludes hidden registers, but the > > > fact that these registers are explicitly outside of the capability > > > chain implies they're only intended for device specific use, so I'd say > > > there are no guarantees about anything related to these registers. > > > > As you had suggested in the other thread, we could consider > > using the same offset as in PF, but even that's a better guess > > still not reliable. > > > > The other option is to maybe extend driver ops in the PF to expose > > where the offsets should be. Sort of adding the quirk in the > > implementation. > > > > I'm not sure how prevalent are PASID and PRI in VF devices. If SIG is > > resisting > > making VF's first class citizen, we might ask them to add some verbiage > > to suggest leave the same offsets as PF open to help emulation software. > > Even if we know where to expose these capabilities on the VF, it's not > clear to me how we can actually virtualize the capability itself. If > the spec defines, for example, an enable bit as r/w then software that > interacts with that register expects the bit is settable. There's no > protocol for "try to set the bit and re-read it to see if the hardware > accepted it". Therefore a capability with a fixed enable bit > representing the state of the PF, not settable by the VF, is > disingenuous to the spec. Would it be OK to implement a lock down mechanism for the PF PASID capability, preventing changes to the PF cap when the VF is in use by VFIO? The emulation would still break the spec: since the PF cap would always be enabled the VF configuration bits would have no effect, but it seems preferable to having the Enable bit not enable anything. > > If what we're trying to do is expose that PASID and PRI are enabled on > the PF to a VF driver, maybe duplicating the PF capabilities on the VF > without the ability to control it is not the right approach. Maybe we > need new capabilities exposing these as slave features that cannot be > controlled? We could define our own vendor capability for this, but of > course we have both the where to put it in config space issue, as well > as the issue of trying to push an ad-hoc standard. vfio could expose > these as device features rather than emulating capabilities, but that > still leaves a big gap between vfio in the hypervisor and the driver in > the guest VM. That might still help push the responsibility and policy > for how to expose it to the VM as a userspace problem though. Userspace might have more difficulty working around the issues mentioned in this thread. They would still need a guarantee that the PF PASID configuration doesn't change at runtime, and they wouldn't have the ability to talk to a vendor driver to figure out where to place the fake PASID capability. Thanks, Jean > > I agree though, I don't know why the SIG would preclude implementing > per VF control of these features. Thanks, > > Alex > > > > FWIW, vfio started out being more strict about restricting config space > > > access to defined capabilities, until... > > > > > > commit a7d1ea1c11b33bda2691f3294b4d735ed635535a > > > Author: Alex Williamson > > > Date: Mon Apr 1 09:04:12 2013 -0600 > > > > > > > Cheers, > > Ashok > > > ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v1 2/2] vfio/pci: Emulate PASID/PRI capability for VFs
Hi Alex On Wed, Apr 08, 2020 at 10:19:40AM -0600, Alex Williamson wrote: > On Tue, 7 Apr 2020 21:00:21 -0700 > "Raj, Ashok" wrote: > > > Hi Alex > > > > + Bjorn > > + Don > > > FWIW I can't understand why PCI SIG went different ways with ATS, > > where its enumerated on PF and VF. But for PASID and PRI its only > > in PF. > > > > I'm checking with our internal SIG reps to followup on that. > > > > On Tue, Apr 07, 2020 at 09:58:01AM -0600, Alex Williamson wrote: > > > > Is there vendor guarantee that hidden registers will locate at the > > > > same offset between PF and VF config space? > > > > > > I'm not sure if the spec really precludes hidden registers, but the > > > fact that these registers are explicitly outside of the capability > > > chain implies they're only intended for device specific use, so I'd say > > > there are no guarantees about anything related to these registers. > > > > As you had suggested in the other thread, we could consider > > using the same offset as in PF, but even that's a better guess > > still not reliable. > > > > The other option is to maybe extend driver ops in the PF to expose > > where the offsets should be. Sort of adding the quirk in the > > implementation. > > > > I'm not sure how prevalent are PASID and PRI in VF devices. If SIG is > > resisting > > making VF's first class citizen, we might ask them to add some verbiage > > to suggest leave the same offsets as PF open to help emulation software. > > Even if we know where to expose these capabilities on the VF, it's not > clear to me how we can actually virtualize the capability itself. If > the spec defines, for example, an enable bit as r/w then software that > interacts with that register expects the bit is settable. There's no > protocol for "try to set the bit and re-read it to see if the hardware > accepted it". Therefore a capability with a fixed enable bit > representing the state of the PF, not settable by the VF, is > disingenuous to the spec. > > If what we're trying to do is expose that PASID and PRI are enabled on > the PF to a VF driver, maybe duplicating the PF capabilities on the VF > without the ability to control it is not the right approach. Maybe we > need new capabilities exposing these as slave features that cannot be > controlled? We could define our own vendor capability for this, but of The other option is to say, VFIO would never emulate these fake capablities. If exposing a VF with PASID/PRI is required the PF driver would simply wrap it into a VDCM like model which we do today for Scalable IOV devices. So PF handles all aspects of this interface. I also like the suggestion you propose, maybe an offset where these capabilities are exposed to VF's. Maybe have an architected DEVCAPx which exposes these RO capabilities. No control, and the offset should be preserved by the SIG, so VMM can have a safe place to stash them. > course we have both the where to put it in config space issue, as well > as the issue of trying to push an ad-hoc standard. vfio could expose > these as device features rather than emulating capabilities, but that > still leaves a big gap between vfio in the hypervisor and the driver in > the guest VM. That might still help push the responsibility and policy > for how to expose it to the VM as a userspace problem though. > > I agree though, I don't know why the SIG would preclude implementing > per VF control of these features. Thanks, Even if we ask SIG for clarification, it might affect today's devices So might not be useful to solve our current situation. Ashok ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v1 2/2] vfio/pci: Emulate PASID/PRI capability for VFs
On Tue, 7 Apr 2020 21:00:21 -0700 "Raj, Ashok" wrote: > Hi Alex > > + Bjorn + Don > FWIW I can't understand why PCI SIG went different ways with ATS, > where its enumerated on PF and VF. But for PASID and PRI its only > in PF. > > I'm checking with our internal SIG reps to followup on that. > > On Tue, Apr 07, 2020 at 09:58:01AM -0600, Alex Williamson wrote: > > > Is there vendor guarantee that hidden registers will locate at the > > > same offset between PF and VF config space? > > > > I'm not sure if the spec really precludes hidden registers, but the > > fact that these registers are explicitly outside of the capability > > chain implies they're only intended for device specific use, so I'd say > > there are no guarantees about anything related to these registers. > > As you had suggested in the other thread, we could consider > using the same offset as in PF, but even that's a better guess > still not reliable. > > The other option is to maybe extend driver ops in the PF to expose > where the offsets should be. Sort of adding the quirk in the > implementation. > > I'm not sure how prevalent are PASID and PRI in VF devices. If SIG is > resisting > making VF's first class citizen, we might ask them to add some verbiage > to suggest leave the same offsets as PF open to help emulation software. Even if we know where to expose these capabilities on the VF, it's not clear to me how we can actually virtualize the capability itself. If the spec defines, for example, an enable bit as r/w then software that interacts with that register expects the bit is settable. There's no protocol for "try to set the bit and re-read it to see if the hardware accepted it". Therefore a capability with a fixed enable bit representing the state of the PF, not settable by the VF, is disingenuous to the spec. If what we're trying to do is expose that PASID and PRI are enabled on the PF to a VF driver, maybe duplicating the PF capabilities on the VF without the ability to control it is not the right approach. Maybe we need new capabilities exposing these as slave features that cannot be controlled? We could define our own vendor capability for this, but of course we have both the where to put it in config space issue, as well as the issue of trying to push an ad-hoc standard. vfio could expose these as device features rather than emulating capabilities, but that still leaves a big gap between vfio in the hypervisor and the driver in the guest VM. That might still help push the responsibility and policy for how to expose it to the VM as a userspace problem though. I agree though, I don't know why the SIG would preclude implementing per VF control of these features. Thanks, Alex > > FWIW, vfio started out being more strict about restricting config space > > access to defined capabilities, until... > > > > commit a7d1ea1c11b33bda2691f3294b4d735ed635535a > > Author: Alex Williamson > > Date: Mon Apr 1 09:04:12 2013 -0600 > > > > Cheers, > Ashok > ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v1 2/2] vfio/pci: Emulate PASID/PRI capability for VFs
Hi Alex + Bjorn FWIW I can't understand why PCI SIG went different ways with ATS, where its enumerated on PF and VF. But for PASID and PRI its only in PF. I'm checking with our internal SIG reps to followup on that. On Tue, Apr 07, 2020 at 09:58:01AM -0600, Alex Williamson wrote: > > Is there vendor guarantee that hidden registers will locate at the > > same offset between PF and VF config space? > > I'm not sure if the spec really precludes hidden registers, but the > fact that these registers are explicitly outside of the capability > chain implies they're only intended for device specific use, so I'd say > there are no guarantees about anything related to these registers. As you had suggested in the other thread, we could consider using the same offset as in PF, but even that's a better guess still not reliable. The other option is to maybe extend driver ops in the PF to expose where the offsets should be. Sort of adding the quirk in the implementation. I'm not sure how prevalent are PASID and PRI in VF devices. If SIG is resisting making VF's first class citizen, we might ask them to add some verbiage to suggest leave the same offsets as PF open to help emulation software. > > FWIW, vfio started out being more strict about restricting config space > access to defined capabilities, until... > > commit a7d1ea1c11b33bda2691f3294b4d735ed635535a > Author: Alex Williamson > Date: Mon Apr 1 09:04:12 2013 -0600 > Cheers, Ashok ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [PATCH v1 2/2] vfio/pci: Emulate PASID/PRI capability for VFs
> From: Alex Williamson > Sent: Tuesday, April 7, 2020 11:58 PM > > On Tue, 7 Apr 2020 04:26:23 + > "Tian, Kevin" wrote: > > > > From: Alex Williamson > > > Sent: Saturday, April 4, 2020 1:26 AM > > [...] > > > > > > + if (!pasid_cap.control_reg.paside) { > > > > > > + pr_debug("%s: its PF's PASID capability is not > enabled\n", > > > > > > + dev_name(>pdev->dev)); > > > > > > + ret = 0; > > > > > > + goto out; > > > > > > + } > > > > > > > > > > What happens if the PF's PASID gets disabled while we're using it?? > > > > > > > > This is actually the open I highlighted in cover letter. Per the reply > > > > from Baolu, this seems to be an open for bare-metal all the same. > > > > https://lkml.org/lkml/2020/3/31/95 > > > > > > Seems that needs to get sorted out before we can expose this. Maybe > > > some sort of registration with the PF driver that PASID is being used > > > by a VF so it cannot be disabled? > > > > I guess we may do vSVA for PF first, and then adding VF vSVA later > > given above additional need. It's not necessarily to enable both > > in one step. > > > > [...] > > > > > > @@ -1604,6 +1901,18 @@ static int vfio_ecap_init(struct > > > vfio_pci_device *vdev) > > > > > > if (!ecaps) > > > > > > *(u32 *)>vconfig[PCI_CFG_SPACE_SIZE] = 0; > > > > > > > > > > > > +#ifdef CONFIG_PCI_ATS > > > > > > + if (pdev->is_virtfn) { > > > > > > + struct pci_dev *physfn = pdev->physfn; > > > > > > + > > > > > > + ret = vfio_pci_add_emulated_cap_for_vf(vdev, > > > > > > + physfn, epos_max, prev); > > > > > > + if (ret) > > > > > > + pr_info("%s, failed to add special caps for > VF %s\n", > > > > > > + __func__, dev_name(>pdev- > >dev)); > > > > > > + } > > > > > > +#endif > > > > > > > > > > I can only imagine that we should place the caps at the same location > > > > > they exist on the PF, we don't know what hidden registers might be > > > > > hiding in config space. > > > > Is there vendor guarantee that hidden registers will locate at the > > same offset between PF and VF config space? > > I'm not sure if the spec really precludes hidden registers, but the > fact that these registers are explicitly outside of the capability > chain implies they're only intended for device specific use, so I'd say > there are no guarantees about anything related to these registers. > > FWIW, vfio started out being more strict about restricting config space > access to defined capabilities, until... > > commit a7d1ea1c11b33bda2691f3294b4d735ed635535a > Author: Alex Williamson > Date: Mon Apr 1 09:04:12 2013 -0600 > > vfio-pci: Enable raw access to unassigned config space > > Devices like be2net hide registers between the gaps in capabilities > and architected regions of PCI config space. Our choices to support > such devices is to either build an ever growing and unmanageable white > list or rely on hardware isolation to protect us. These registers are > really no different than MMIO or I/O port space registers, which we > don't attempt to regulate, so treat PCI config space in the same way. > > > > > but we are not sure whether the same location is available on VF. In > > > > this patch, it actually places the emulated cap physically behind the > > > > cap which lays farthest (its offset is largest) within VF's config space > > > > as the PCIe caps are linked in a chain. > > > > > > But, as we've found on Broadcom NICs (iirc), hardware developers have a > > > nasty habit of hiding random registers in PCI config space, outside of > > > defined capabilities. I feel like IGD might even do this too, is that > > > true? So I don't think we can guarantee that just because a section of > > > config space isn't part of a defined capability that its unused. It > > > only means that it's unused by common code, but it might have device > > > specific purposes. So of the PCIe spec indicates that VFs cannot > > > include these capabilities and virtialization software needs to > > > emulate them, we need somewhere safe to place them in config space, > and > > > simply placing them off the end of known capabilities doesn't give me > > > any confidence. Also, hardware has no requirement to make compact > use > > > of extended config space. The first capability must be at 0x100, the > > > very next capability could consume all the way to the last byte of the > > > 4K extended range, and the next link in the chain could be somewhere in > > > the middle. Thanks, > > > > > > > Then what would be a viable option? Vendor nasty habit implies > > no standard, thus I don't see how VFIO can find a safe location > > by itself. Also curious how those hidden registers are identified > > by VFIO and employed with proper r/w policy today. If sort of quirks > > are used, then could such quirk way be extended to also carry
Re: [PATCH v1 2/2] vfio/pci: Emulate PASID/PRI capability for VFs
On Tue, 7 Apr 2020 04:26:23 + "Tian, Kevin" wrote: > > From: Alex Williamson > > Sent: Saturday, April 4, 2020 1:26 AM > [...] > > > > > + if (!pasid_cap.control_reg.paside) { > > > > > + pr_debug("%s: its PF's PASID capability is not > > > > > enabled\n", > > > > > + dev_name(>pdev->dev)); > > > > > + ret = 0; > > > > > + goto out; > > > > > + } > > > > > > > > What happens if the PF's PASID gets disabled while we're using it?? > > > > > > This is actually the open I highlighted in cover letter. Per the reply > > > from Baolu, this seems to be an open for bare-metal all the same. > > > https://lkml.org/lkml/2020/3/31/95 > > > > Seems that needs to get sorted out before we can expose this. Maybe > > some sort of registration with the PF driver that PASID is being used > > by a VF so it cannot be disabled? > > I guess we may do vSVA for PF first, and then adding VF vSVA later > given above additional need. It's not necessarily to enable both > in one step. > > [...] > > > > > @@ -1604,6 +1901,18 @@ static int vfio_ecap_init(struct > > vfio_pci_device *vdev) > > > > > if (!ecaps) > > > > > *(u32 *)>vconfig[PCI_CFG_SPACE_SIZE] = 0; > > > > > > > > > > +#ifdef CONFIG_PCI_ATS > > > > > + if (pdev->is_virtfn) { > > > > > + struct pci_dev *physfn = pdev->physfn; > > > > > + > > > > > + ret = vfio_pci_add_emulated_cap_for_vf(vdev, > > > > > + physfn, epos_max, prev); > > > > > + if (ret) > > > > > + pr_info("%s, failed to add special caps for VF > > > > > %s\n", > > > > > + __func__, dev_name(>pdev->dev)); > > > > > + } > > > > > +#endif > > > > > > > > I can only imagine that we should place the caps at the same location > > > > they exist on the PF, we don't know what hidden registers might be > > > > hiding in config space. > > Is there vendor guarantee that hidden registers will locate at the > same offset between PF and VF config space? I'm not sure if the spec really precludes hidden registers, but the fact that these registers are explicitly outside of the capability chain implies they're only intended for device specific use, so I'd say there are no guarantees about anything related to these registers. FWIW, vfio started out being more strict about restricting config space access to defined capabilities, until... commit a7d1ea1c11b33bda2691f3294b4d735ed635535a Author: Alex Williamson Date: Mon Apr 1 09:04:12 2013 -0600 vfio-pci: Enable raw access to unassigned config space Devices like be2net hide registers between the gaps in capabilities and architected regions of PCI config space. Our choices to support such devices is to either build an ever growing and unmanageable white list or rely on hardware isolation to protect us. These registers are really no different than MMIO or I/O port space registers, which we don't attempt to regulate, so treat PCI config space in the same way. > > > but we are not sure whether the same location is available on VF. In > > > this patch, it actually places the emulated cap physically behind the > > > cap which lays farthest (its offset is largest) within VF's config space > > > as the PCIe caps are linked in a chain. > > > > But, as we've found on Broadcom NICs (iirc), hardware developers have a > > nasty habit of hiding random registers in PCI config space, outside of > > defined capabilities. I feel like IGD might even do this too, is that > > true? So I don't think we can guarantee that just because a section of > > config space isn't part of a defined capability that its unused. It > > only means that it's unused by common code, but it might have device > > specific purposes. So of the PCIe spec indicates that VFs cannot > > include these capabilities and virtialization software needs to > > emulate them, we need somewhere safe to place them in config space, and > > simply placing them off the end of known capabilities doesn't give me > > any confidence. Also, hardware has no requirement to make compact use > > of extended config space. The first capability must be at 0x100, the > > very next capability could consume all the way to the last byte of the > > 4K extended range, and the next link in the chain could be somewhere in > > the middle. Thanks, > > > > Then what would be a viable option? Vendor nasty habit implies > no standard, thus I don't see how VFIO can find a safe location > by itself. Also curious how those hidden registers are identified > by VFIO and employed with proper r/w policy today. If sort of quirks > are used, then could such quirk way be extended to also carry > the information about vendor specific safe location? When no > such quirk info is provided (the majority case), VFIO then finds > out a free location to carry the new cap. See above
RE: [PATCH v1 2/2] vfio/pci: Emulate PASID/PRI capability for VFs
> From: Alex Williamson > Sent: Saturday, April 4, 2020 1:26 AM [...] > > > > + if (!pasid_cap.control_reg.paside) { > > > > + pr_debug("%s: its PF's PASID capability is not > > > > enabled\n", > > > > + dev_name(>pdev->dev)); > > > > + ret = 0; > > > > + goto out; > > > > + } > > > > > > What happens if the PF's PASID gets disabled while we're using it?? > > > > This is actually the open I highlighted in cover letter. Per the reply > > from Baolu, this seems to be an open for bare-metal all the same. > > https://lkml.org/lkml/2020/3/31/95 > > Seems that needs to get sorted out before we can expose this. Maybe > some sort of registration with the PF driver that PASID is being used > by a VF so it cannot be disabled? I guess we may do vSVA for PF first, and then adding VF vSVA later given above additional need. It's not necessarily to enable both in one step. [...] > > > > @@ -1604,6 +1901,18 @@ static int vfio_ecap_init(struct > vfio_pci_device *vdev) > > > > if (!ecaps) > > > > *(u32 *)>vconfig[PCI_CFG_SPACE_SIZE] = 0; > > > > > > > > +#ifdef CONFIG_PCI_ATS > > > > + if (pdev->is_virtfn) { > > > > + struct pci_dev *physfn = pdev->physfn; > > > > + > > > > + ret = vfio_pci_add_emulated_cap_for_vf(vdev, > > > > + physfn, epos_max, prev); > > > > + if (ret) > > > > + pr_info("%s, failed to add special caps for VF > > > > %s\n", > > > > + __func__, dev_name(>pdev->dev)); > > > > + } > > > > +#endif > > > > > > I can only imagine that we should place the caps at the same location > > > they exist on the PF, we don't know what hidden registers might be > > > hiding in config space. Is there vendor guarantee that hidden registers will locate at the same offset between PF and VF config space? > > > > but we are not sure whether the same location is available on VF. In > > this patch, it actually places the emulated cap physically behind the > > cap which lays farthest (its offset is largest) within VF's config space > > as the PCIe caps are linked in a chain. > > But, as we've found on Broadcom NICs (iirc), hardware developers have a > nasty habit of hiding random registers in PCI config space, outside of > defined capabilities. I feel like IGD might even do this too, is that > true? So I don't think we can guarantee that just because a section of > config space isn't part of a defined capability that its unused. It > only means that it's unused by common code, but it might have device > specific purposes. So of the PCIe spec indicates that VFs cannot > include these capabilities and virtialization software needs to > emulate them, we need somewhere safe to place them in config space, and > simply placing them off the end of known capabilities doesn't give me > any confidence. Also, hardware has no requirement to make compact use > of extended config space. The first capability must be at 0x100, the > very next capability could consume all the way to the last byte of the > 4K extended range, and the next link in the chain could be somewhere in > the middle. Thanks, > Then what would be a viable option? Vendor nasty habit implies no standard, thus I don't see how VFIO can find a safe location by itself. Also curious how those hidden registers are identified by VFIO and employed with proper r/w policy today. If sort of quirks are used, then could such quirk way be extended to also carry the information about vendor specific safe location? When no such quirk info is provided (the majority case), VFIO then finds out a free location to carry the new cap. Thanks Kevin ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v1 2/2] vfio/pci: Emulate PASID/PRI capability for VFs
On Fri, 3 Apr 2020 07:53:55 + "Liu, Yi L" wrote: > Hi Alex, > > > From: Alex Williamson > > Sent: Friday, April 3, 2020 7:00 AM > > To: Liu, Yi L > > Subject: Re: [PATCH v1 2/2] vfio/pci: Emulate PASID/PRI capability for VFs > > > > On Sun, 22 Mar 2020 05:33:14 -0700 > > "Liu, Yi L" wrote: > > > > > From: Liu Yi L > > > > > > Per PCIe r5.0, sec 9.3.7.14, if a PF implements the PASID Capability, the > > > PF PASID configuration is shared by its VFs. VFs must not implement their > > > own PASID Capability. > > > > > > Per PCIe r5.0, sec 9.3.7.11, VFs must not implement the PRI Capability. If > > > the PF implements PRI, it is shared by the VFs. > > > > > > On bare metal, it has been fixed by below efforts. > > > to PASID/PRI are > > > https://lkml.org/lkml/2019/9/5/996 > > > https://lkml.org/lkml/2019/9/5/995 > > > > > > This patch adds emulated PASID/PRI capabilities for VFs when assigned to > > > VMs via vfio-pci driver. This is required for enabling vSVA on > > > pass-through > > > VFs as VFs have no PASID/PRI capability structure in its configure space. > > > > > > Cc: Kevin Tian > > > CC: Jacob Pan > > > Cc: Alex Williamson > > > Cc: Eric Auger > > > Cc: Jean-Philippe Brucker > > > Signed-off-by: Liu Yi L > > > --- > > > drivers/vfio/pci/vfio_pci_config.c | 325 > > - > > > 1 file changed, 323 insertions(+), 2 deletions(-) > > > > > > diff --git a/drivers/vfio/pci/vfio_pci_config.c > > > b/drivers/vfio/pci/vfio_pci_config.c > > > index 4b9af99..84b4ea0 100644 > > > --- a/drivers/vfio/pci/vfio_pci_config.c > > > +++ b/drivers/vfio/pci/vfio_pci_config.c > > > @@ -1509,11 +1509,304 @@ static int vfio_cap_init(struct vfio_pci_device > > > *vdev) > > > return 0; > > > } > > > > > > +static int vfio_fill_custom_vconfig_bytes(struct vfio_pci_device *vdev, > > > + int offset, uint8_t *data, int size) > > > +{ > > > + int ret = 0, data_offset = 0; > > > + > > > + while (size) { > > > + int filled; > > > + > > > + if (size >= 4 && !(offset % 4)) { > > > + __le32 *dwordp = (__le32 *)>vconfig[offset]; > > > + u32 dword; > > > + > > > + memcpy(, data + data_offset, 4); > > > + *dwordp = cpu_to_le32(dword); > > > > Why wouldn't we do: > > > > *dwordp = cpu_to_le32(*(u32 *)(data + data_offset)); > > > > or better yet, increment data on each iteration for: > > > > *dwordp = cpu_to_le32(*(u32 *)data); > > I'll refine it. > > > vfio_fill_vconfig_bytes() does almost this same thing, getting > > the data > > from config space rather than a buffer, so please figure out how to > > avoid duplicating the logic. > > This patch is to emulate the PASID/PRI capability for VF. And per > my understanding, the cap data from PF's config space cannot be > filled to VF's vconfig directly. Take the control register in PASID > capability structure as an example. If PASID cap is enabled in PF, > the PASID enable bit in the control register will be set. If the cap > data is filled to VF's vconfig directly, then guest will see a default > enabled PASID capability in the VF. I guess it is not good. But, I may > be wrong. If no such requirement, I can use vfio_fill_vconfig_bytes() > directly, no need to do copy and modify and then copy. > > Also, if still needs to do the copy and modification, I may modify > the vfio_fill_vconfig_bytes() to pass in the buffer and an indicator > to tell vfio_fill_vconfig_bytes() whether fetch data from buffer or > from config space. Why is vconfig not your buffer? We're building config space emulation before the user has access to the device, so it's really not an issue to copy the PF config into the VF vconfig, then modify the enable bit in the vconfig space. > > > + filled = 4; > > > + } else if (size >= 2 && !(offset % 2)) { > > > + __le16 *wordp = (__le16 *)>vconfig[offset]; > > > + u16 word; > > > + > > > + memcpy(, data + data_offset, 2); > > > + *wordp = cpu_to_le16(word); > > > + filled = 2; > > > + } else {
RE: [PATCH v1 2/2] vfio/pci: Emulate PASID/PRI capability for VFs
Hi Alex, > From: Alex Williamson > Sent: Friday, April 3, 2020 7:00 AM > To: Liu, Yi L > Subject: Re: [PATCH v1 2/2] vfio/pci: Emulate PASID/PRI capability for VFs > > On Sun, 22 Mar 2020 05:33:14 -0700 > "Liu, Yi L" wrote: > > > From: Liu Yi L > > > > Per PCIe r5.0, sec 9.3.7.14, if a PF implements the PASID Capability, the > > PF PASID configuration is shared by its VFs. VFs must not implement their > > own PASID Capability. > > > > Per PCIe r5.0, sec 9.3.7.11, VFs must not implement the PRI Capability. If > > the PF implements PRI, it is shared by the VFs. > > > > On bare metal, it has been fixed by below efforts. > > to PASID/PRI are > > https://lkml.org/lkml/2019/9/5/996 > > https://lkml.org/lkml/2019/9/5/995 > > > > This patch adds emulated PASID/PRI capabilities for VFs when assigned to > > VMs via vfio-pci driver. This is required for enabling vSVA on pass-through > > VFs as VFs have no PASID/PRI capability structure in its configure space. > > > > Cc: Kevin Tian > > CC: Jacob Pan > > Cc: Alex Williamson > > Cc: Eric Auger > > Cc: Jean-Philippe Brucker > > Signed-off-by: Liu Yi L > > --- > > drivers/vfio/pci/vfio_pci_config.c | 325 > - > > 1 file changed, 323 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/vfio/pci/vfio_pci_config.c > > b/drivers/vfio/pci/vfio_pci_config.c > > index 4b9af99..84b4ea0 100644 > > --- a/drivers/vfio/pci/vfio_pci_config.c > > +++ b/drivers/vfio/pci/vfio_pci_config.c > > @@ -1509,11 +1509,304 @@ static int vfio_cap_init(struct vfio_pci_device > > *vdev) > > return 0; > > } > > > > +static int vfio_fill_custom_vconfig_bytes(struct vfio_pci_device *vdev, > > + int offset, uint8_t *data, int size) > > +{ > > + int ret = 0, data_offset = 0; > > + > > + while (size) { > > + int filled; > > + > > + if (size >= 4 && !(offset % 4)) { > > + __le32 *dwordp = (__le32 *)>vconfig[offset]; > > + u32 dword; > > + > > + memcpy(, data + data_offset, 4); > > + *dwordp = cpu_to_le32(dword); > > Why wouldn't we do: > > *dwordp = cpu_to_le32(*(u32 *)(data + data_offset)); > > or better yet, increment data on each iteration for: > > *dwordp = cpu_to_le32(*(u32 *)data); > > vfio_fill_vconfig_bytes() does almost this same thing, getting the data > from config space rather than a buffer, so please figure out how to > avoid duplicating the logic. Got another alternative. I may use the vfio_fill_vconfig_bytes() to fill the cap data from PF's config space into VF's vconfig. And after that, I can further modify the data in vconfig. e.g. zero the control reg of pasid cap. would it make more sense? > > + filled = 4; > > + } else if (size >= 2 && !(offset % 2)) { > > + __le16 *wordp = (__le16 *)>vconfig[offset]; > > + u16 word; > > + > > + memcpy(, data + data_offset, 2); > > + *wordp = cpu_to_le16(word); > > + filled = 2; > > + } else { > > + u8 *byte = >vconfig[offset]; > > + > > + memcpy(byte, data + data_offset, 1); [...] > > > + > > + memset(map + epos, vpasid_cap.id, len); > > See below. > > > + ret = vfio_fill_custom_vconfig_bytes(vdev, epos, > > + (u8 *)_cap, len); > > + if (!ret) { > > + /* > > +* Successfully filled in PASID cap, update > > +* the next offset in previous cap header, > > +* and also update caller about the offset > > +* of next cap if any. > > +*/ > > + u32 val = epos; > > + **prevp &= cpu_to_le32(~(0xffcU << 20)); > > + **prevp |= cpu_to_le32(val << 20); > > + *prevp = (__le32 *)>vconfig[epos]; > > + *next = epos + len; > > Could we make this any more complicated? I'm not sure if adding comments addressed this comment. After adding new cap in vconfig, it needs to update the cap.next field of prior cap. And in case of further add other cap, this function needs to update the prevp pointer to the address of the newly added cap. Regards, Yi Liu ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [PATCH v1 2/2] vfio/pci: Emulate PASID/PRI capability for VFs
Hi Alex, > From: Alex Williamson > Sent: Friday, April 3, 2020 7:00 AM > To: Liu, Yi L > Subject: Re: [PATCH v1 2/2] vfio/pci: Emulate PASID/PRI capability for VFs > > On Sun, 22 Mar 2020 05:33:14 -0700 > "Liu, Yi L" wrote: > > > From: Liu Yi L > > > > Per PCIe r5.0, sec 9.3.7.14, if a PF implements the PASID Capability, the > > PF PASID configuration is shared by its VFs. VFs must not implement their > > own PASID Capability. > > > > Per PCIe r5.0, sec 9.3.7.11, VFs must not implement the PRI Capability. If > > the PF implements PRI, it is shared by the VFs. > > > > On bare metal, it has been fixed by below efforts. > > to PASID/PRI are > > https://lkml.org/lkml/2019/9/5/996 > > https://lkml.org/lkml/2019/9/5/995 > > > > This patch adds emulated PASID/PRI capabilities for VFs when assigned to > > VMs via vfio-pci driver. This is required for enabling vSVA on pass-through > > VFs as VFs have no PASID/PRI capability structure in its configure space. > > > > Cc: Kevin Tian > > CC: Jacob Pan > > Cc: Alex Williamson > > Cc: Eric Auger > > Cc: Jean-Philippe Brucker > > Signed-off-by: Liu Yi L > > --- > > drivers/vfio/pci/vfio_pci_config.c | 325 > - > > 1 file changed, 323 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/vfio/pci/vfio_pci_config.c > > b/drivers/vfio/pci/vfio_pci_config.c > > index 4b9af99..84b4ea0 100644 > > --- a/drivers/vfio/pci/vfio_pci_config.c > > +++ b/drivers/vfio/pci/vfio_pci_config.c > > @@ -1509,11 +1509,304 @@ static int vfio_cap_init(struct vfio_pci_device > > *vdev) > > return 0; > > } > > > > +static int vfio_fill_custom_vconfig_bytes(struct vfio_pci_device *vdev, > > + int offset, uint8_t *data, int size) > > +{ > > + int ret = 0, data_offset = 0; > > + > > + while (size) { > > + int filled; > > + > > + if (size >= 4 && !(offset % 4)) { > > + __le32 *dwordp = (__le32 *)>vconfig[offset]; > > + u32 dword; > > + > > + memcpy(, data + data_offset, 4); > > + *dwordp = cpu_to_le32(dword); > > Why wouldn't we do: > > *dwordp = cpu_to_le32(*(u32 *)(data + data_offset)); > > or better yet, increment data on each iteration for: > > *dwordp = cpu_to_le32(*(u32 *)data); I'll refine it. > vfio_fill_vconfig_bytes() does almost this same thing, getting > the data > from config space rather than a buffer, so please figure out how to > avoid duplicating the logic. This patch is to emulate the PASID/PRI capability for VF. And per my understanding, the cap data from PF's config space cannot be filled to VF's vconfig directly. Take the control register in PASID capability structure as an example. If PASID cap is enabled in PF, the PASID enable bit in the control register will be set. If the cap data is filled to VF's vconfig directly, then guest will see a default enabled PASID capability in the VF. I guess it is not good. But, I may be wrong. If no such requirement, I can use vfio_fill_vconfig_bytes() directly, no need to do copy and modify and then copy. Also, if still needs to do the copy and modification, I may modify the vfio_fill_vconfig_bytes() to pass in the buffer and an indicator to tell vfio_fill_vconfig_bytes() whether fetch data from buffer or from config space. > > + filled = 4; > > + } else if (size >= 2 && !(offset % 2)) { > > + __le16 *wordp = (__le16 *)>vconfig[offset]; > > + u16 word; > > + > > + memcpy(, data + data_offset, 2); > > + *wordp = cpu_to_le16(word); > > + filled = 2; > > + } else { > > + u8 *byte = >vconfig[offset]; > > + > > + memcpy(byte, data + data_offset, 1); > > This one is really silly. > > vdev->vconfig[offset] = *data; got it. > > + filled = 1; > > + } > > + > > + offset += filled; > > + data_offset += filled; > > + size -= filled; > > + } > > + > > + return ret; > > +} > > + > > +static int vfio_pci_get_ecap_content(struct pci_dev *pdev, > > + int cap, int cap_len, u8 *content) > > +{ > > + int pos, offset, len = cap_len, ret = 0; > > + > > + pos = pci_find_ext_capability(pdev, cap); > > +
Re: [PATCH v1 2/2] vfio/pci: Emulate PASID/PRI capability for VFs
On Sun, 22 Mar 2020 05:33:14 -0700 "Liu, Yi L" wrote: > From: Liu Yi L > > Per PCIe r5.0, sec 9.3.7.14, if a PF implements the PASID Capability, the > PF PASID configuration is shared by its VFs. VFs must not implement their > own PASID Capability. > > Per PCIe r5.0, sec 9.3.7.11, VFs must not implement the PRI Capability. If > the PF implements PRI, it is shared by the VFs. > > On bare metal, it has been fixed by below efforts. > to PASID/PRI are > https://lkml.org/lkml/2019/9/5/996 > https://lkml.org/lkml/2019/9/5/995 > > This patch adds emulated PASID/PRI capabilities for VFs when assigned to > VMs via vfio-pci driver. This is required for enabling vSVA on pass-through > VFs as VFs have no PASID/PRI capability structure in its configure space. > > Cc: Kevin Tian > CC: Jacob Pan > Cc: Alex Williamson > Cc: Eric Auger > Cc: Jean-Philippe Brucker > Signed-off-by: Liu Yi L > --- > drivers/vfio/pci/vfio_pci_config.c | 325 > - > 1 file changed, 323 insertions(+), 2 deletions(-) > > diff --git a/drivers/vfio/pci/vfio_pci_config.c > b/drivers/vfio/pci/vfio_pci_config.c > index 4b9af99..84b4ea0 100644 > --- a/drivers/vfio/pci/vfio_pci_config.c > +++ b/drivers/vfio/pci/vfio_pci_config.c > @@ -1509,11 +1509,304 @@ static int vfio_cap_init(struct vfio_pci_device > *vdev) > return 0; > } > > +static int vfio_fill_custom_vconfig_bytes(struct vfio_pci_device *vdev, > + int offset, uint8_t *data, int size) > +{ > + int ret = 0, data_offset = 0; > + > + while (size) { > + int filled; > + > + if (size >= 4 && !(offset % 4)) { > + __le32 *dwordp = (__le32 *)>vconfig[offset]; > + u32 dword; > + > + memcpy(, data + data_offset, 4); > + *dwordp = cpu_to_le32(dword); Why wouldn't we do: *dwordp = cpu_to_le32(*(u32 *)(data + data_offset)); or better yet, increment data on each iteration for: *dwordp = cpu_to_le32(*(u32 *)data); vfio_fill_vconfig_bytes() does almost this same thing, getting the data from config space rather than a buffer, so please figure out how to avoid duplicating the logic. > + filled = 4; > + } else if (size >= 2 && !(offset % 2)) { > + __le16 *wordp = (__le16 *)>vconfig[offset]; > + u16 word; > + > + memcpy(, data + data_offset, 2); > + *wordp = cpu_to_le16(word); > + filled = 2; > + } else { > + u8 *byte = >vconfig[offset]; > + > + memcpy(byte, data + data_offset, 1); This one is really silly. vdev->vconfig[offset] = *data; > + filled = 1; > + } > + > + offset += filled; > + data_offset += filled; > + size -= filled; > + } > + > + return ret; > +} > + > +static int vfio_pci_get_ecap_content(struct pci_dev *pdev, > + int cap, int cap_len, u8 *content) > +{ > + int pos, offset, len = cap_len, ret = 0; > + > + pos = pci_find_ext_capability(pdev, cap); > + if (!pos) > + return -EINVAL; > + > + offset = 0; > + while (len) { > + int fetched; > + > + if (len >= 4 && !(pos % 4)) { > + u32 *dwordp = (u32 *) (content + offset); > + u32 dword; > + __le32 *dwptr = (__le32 *) > + > + ret = pci_read_config_dword(pdev, pos, ); > + if (ret) > + return ret; > + *dwordp = le32_to_cpu(*dwptr); WHAT??? pci_read_config_dword() returns cpu endian data! > + fetched = 4; > + } else if (len >= 2 && !(pos % 2)) { > + u16 *wordp = (u16 *) (content + offset); > + u16 word; > + __le16 *wptr = (__le16 *) > + > + ret = pci_read_config_word(pdev, pos, ); > + if (ret) > + return ret; > + *wordp = le16_to_cpu(*wptr); > + fetched = 2; > + } else { > + u8 *byte = (u8 *) (content + offset); > + > + ret = pci_read_config_byte(pdev, pos, byte); > + if (ret) > + return ret; > + fetched = 1; > + } > + > + pos += fetched; > + offset += fetched; > + len -= fetched; > + } > + > + return ret; > +} > + > +struct vfio_pci_pasid_cap_data { > + u32 id:16; > + u32 version:4; > + u32 next:12; > + union { > + u16 cap_reg_val; > + struct { > + u16 rsv1:1; > + u16 execs:1; > +