-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Zhao, Yu
Sent: Thursday, November 06, 2008 11:06 PM
To: Chris Wright
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED];
Matthew Wilcox; Greg KH; [EMAIL PROTECTED]; [EMAIL PROTECTED];
[EMAIL PROTECTED]; [EMAIL PROTECTED]; kvm@vger.kernel.org;
[EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: Re: [PATCH 0/16 v6] PCI: Linux kernel SR-IOV support
Chris Wright wrote:
* Greg KH ([EMAIL PROTECTED]) wrote:
On Thu, Nov 06, 2008 at 10:47:41AM -0700, Matthew Wilcox wrote:
On Thu, Nov 06, 2008 at 08:49:19AM -0800, Greg KH wrote:
On Thu, Nov 06, 2008 at 08:41:53AM -0800, H L wrote:
I have not modified any existing drivers, but instead I threw together
a bare-bones module enabling me to make a call to pci_iov_register()
and then poke at an SR-IOV adapter's /sys entries for which no driver
was loaded.
It appears from my perusal thus far that drivers using these new
SR-IOV patches will require modification; i.e. the driver associated
with the Physical Function (PF) will be required to make the
pci_iov_register() call along with the requisite notify() function.
Essentially this suggests to me a model for the PF driver to perform
any global actions or setup on behalf of VFs before enabling them,
after which VF drivers could be associated.
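The ordering described above (the PF driver registers a callback, does its global setup when notified, and only then are VFs enabled) can be sketched as a small user-space model. This is an illustration only: mock_pf, mock_iov_register(), mock_iov_enable() and the MOCK_EV_* events are made-up names, not the patch set's API, and the real pci_iov_register() signature is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical events a PF driver might be told about. */
enum mock_iov_event { MOCK_EV_ENABLE, MOCK_EV_DISABLE };

struct mock_pf;
typedef int (*mock_notify_fn)(struct mock_pf *pf, enum mock_iov_event ev,
                              unsigned int nr_vfs);

/* Hypothetical stand-in for the per-PF state; not a kernel structure. */
struct mock_pf {
    mock_notify_fn notify;   /* callback supplied by the PF driver */
    unsigned int max_vfs;    /* VFs the device could expose */
    unsigned int active_vfs; /* VFs currently enabled */
    int setup_done;          /* set by the PF driver's global setup */
};

/* Registration step: remember the PF driver's callback. */
static int mock_iov_register(struct mock_pf *pf, mock_notify_fn notify,
                             unsigned int max_vfs)
{
    assert(pf != NULL);
    if (notify == NULL)
        return -1;
    pf->notify = notify;
    pf->max_vfs = max_vfs;
    pf->active_vfs = 0;
    pf->setup_done = 0;
    return 0;
}

/* Enable step: ask the PF driver first, expose VFs only if it agrees. */
static int mock_iov_enable(struct mock_pf *pf, unsigned int nr_vfs)
{
    assert(pf != NULL);
    if (pf->notify == NULL || nr_vfs > pf->max_vfs)
        return -1;
    if (pf->notify(pf, MOCK_EV_ENABLE, nr_vfs) != 0)
        return -1;
    pf->active_vfs = nr_vfs;
    return 0;
}

/* Example PF-side callback: do global setup before VFs appear. */
static int example_pf_notify(struct mock_pf *pf, enum mock_iov_event ev,
                             unsigned int nr_vfs)
{
    (void)nr_vfs;
    if (ev == MOCK_EV_ENABLE)
        pf->setup_done = 1;
    return 0;
}
```

In this model a caller would register example_pf_notify once and then call mock_iov_enable(); a request for more VFs than max_vfs is refused before the callback runs, so no PF setup happens for it.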
Where would the VF drivers have to be associated? On the pci_dev
level or on a higher one?
Will all drivers that want to bind to a VF device need to be rewritten?
The current model being implemented by my colleagues has separate
drivers for the PF (aka native) and VF devices. I don't personally
believe this is the correct path, but I'm reserving judgement until I
see some code.
Hm, I would like to see that code before we can properly evaluate this
interface. Especially as they are all tightly tied together.
I don't think we really know what the One True Usage model is for VF
devices. Chris Wright has some ideas, I have some ideas and Yu Zhao has
some ideas. I bet there's other people who have other ideas too.
I'd love to hear those ideas.
First there's the question of how to represent the VF on the host.
Ideally (IMO) this would show up as a normal interface so that normal
tools can configure the interface. This is not exactly how the first
round of patches were designed.
Whether the VF can show up as a normal interface is decided by the VF
driver. A VF is represented by a 'pci_dev' at the PCI level, so a VF
driver can be loaded as a normal PCI device driver.
The software representation (eth, framebuffer, etc.) created by the VF
driver is not controlled by the SR-IOV framework.
So you definitely can use normal tools to configure the VF if its
driver supports that :-)
Second there's the question of reserving the BDF on the host such that
we don't have two drivers (one in the host and one in a guest) trying to
drive the same device (an issue that shows up for device assignment as
well as VF assignment).
If we don't reserve a BDF for the device, it can't work in either the
host or the guest.
Without a BDF, we can't access the config space of the device, and the
device can't do DMA either.
Did I miss your point?
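For readers unfamiliar with why the BDF matters for config space access: the bus/device/function triple is a 16-bit routing ID, and with PCIe's memory-mapped config mechanism (ECAM) it directly selects the 4 KiB config region of a function. The sketch below shows only that arithmetic; the helper names are invented for illustration and ECAM itself is not something discussed in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Pack bus/device/function into the 16-bit BDF: bus[15:8] dev[7:3] fn[2:0]. */
static uint16_t bdf_pack(unsigned int bus, unsigned int dev, unsigned int fn)
{
    assert(bus < 256 && dev < 32 && fn < 8);
    return (uint16_t)((bus << 8) | (dev << 3) | fn);
}

/* Field extractors for the packed form. */
static unsigned int bdf_bus(uint16_t bdf) { return bdf >> 8; }
static unsigned int bdf_dev(uint16_t bdf) { return (bdf >> 3) & 0x1f; }
static unsigned int bdf_fn(uint16_t bdf)  { return bdf & 0x7; }

/*
 * Byte offset of a config register inside an ECAM window:
 * each function owns 4 KiB, so the BDF sits above the low 12 bits.
 */
static uint32_t ecam_offset(uint16_t bdf, unsigned int reg)
{
    assert(reg < 4096);
    return ((uint32_t)bdf << 12) | reg;
}
```

A function with no BDF has no slot in this address map, which is the sense in which its config space cannot be reached.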
Third there's the question of whether the VF can be used in the host at
all.
Why can't it? My VFs work well in the host as normal PCI devices :-)
Fourth there's the question of whether the VF and PF drivers are the
same or separate.
As I mentioned in another email in this thread, we can't predict how
hardware vendors will design their SR-IOV devices. The PCI SIG doesn't
define device-specific logic.
So I think the answer to this question is up to the device driver
developers. If the PF and VF in an SR-IOV device have similar logic,
then they can share a driver. Otherwise, e.g., if the PF doesn't have
real functionality at all -- it only has registers to control internal
resource allocation for VFs -- then the drivers should be separate,
right?
Right, this really depends upon the functionality behind a VF. If a VF is
done as a subset of a netdev interface (for example, a queue pair), then a
split VF/PF driver model and a proprietary communication channel are in
order.
If each VF is done as a complete netdev interface (like in our 10GbE IOV
controllers), then the PF and VF drivers could be the same. Each VF can be
independently driven by such a native netdev driver; this includes the
ability to run a native driver in a guest in passthru mode.
A PF driver in a privileged domain doesn't even have to be present.
The typical usecase is assigning the VF to the guest directly, so
there's only enough functionality in the host side to allocate a VF,
configure it, and assign it (and propagate AER). This is with separate
PF and VF driver.
As Anthony mentioned, we are interested in allowing the host to use the
VF. This could be useful for containers as well as