Re: [PATCH 0/8] Virtio-over-PCIe on non-MIC
On Wed, Jan 16, 2019 at 05:32:45PM +0100, Vincent Whitchurch wrote:
> The Virtio-over-PCIe framework living under drivers/misc/mic/vop implements a
> generic framework to use virtio between two Linux systems, given shared memory
> and a couple of interrupts. It does not actually require the Intel MIC
> hardware, x86-64, or even PCIe for that matter. This patch series makes it
> buildable on more systems and adds a loopback driver to test it without
> special hardware.
>
> Note that I don't have access to Intel MIC hardware so some testing of the
> patchset (especially the patch "vop: Use consistent DMA") on that platform
> would be appreciated, to ensure that the series does not break anything there.
>
> Vincent Whitchurch (8):
>   vop: Use %z for size_t
>   vop: Cast pointers to uintptr_t
>   vop: Add definition of readq/writeq if missing
>   vop: Allow building on more systems
>   vop: vringh: Do not crash if no DMA channel
>   vop: Fix handling of >32 feature bits
>   vop: Use consistent DMA
>   vop: Add loopback

I applied a few of these to my tree.  Feel free to rebase and fix up
patch 2 and resend.

thanks,

greg k-h
Re: [PATCH 0/8] Virtio-over-PCIe on non-MIC
On Fri, Jan 18, 2019 at 04:49:16PM -0700, Stephen Warren wrote:
> On 1/16/19 9:32 AM, Vincent Whitchurch wrote:
> > The Virtio-over-PCIe framework living under drivers/misc/mic/vop implements a
> > generic framework to use virtio between two Linux systems, given shared memory
> > and a couple of interrupts. It does not actually require the Intel MIC
> > hardware, x86-64, or even PCIe for that matter. This patch series makes it
> > buildable on more systems and adds a loopback driver to test it without
> > special hardware.
> >
> > Note that I don't have access to Intel MIC hardware so some testing of the
> > patchset (especially the patch "vop: Use consistent DMA") on that platform
> > would be appreciated, to ensure that the series does not break anything there.
>
> So a while ago I took a look at running virtio over PCIe. I found virtio
> basically had two parts:
>
> 1) The protocol used to enumerate which virtio devices exist, and perhaps
>    configure them.
>
> 2) The ring buffer protocol that actually transfers the data.
>
> I recall that data transfer was purely based on simple shared memory and
> interrupts, and hence could run over PCIe (e.g. via the PCIe endpoint
> subsystem in the kernel) without issue.
>
> However, the enumeration/configuration protocol requires the host to be able
> to do all kinds of strange things that can't possibly be emulated over PCIe;
> IIRC the configuration data contains "registers" that when written select
> the data other "registers" access. When the virtio device is exposed by a
> hypervisor, and all the accesses are emulated synchronously through a trap,
> this is easy enough to implement. However, if the two ends of this
> configuration parsing are on different ends of a PCIe bus, there's no way
> this can work.

Correct, and that's why the MIC "Virtio-over-PCIe framework" does not try
to implement the standard "Virtio Over PCI Bus". (Yes, it's confusing.)
> Are you thinking of doing something different for enumeration/configuration,
> and just using the virtio ring buffer protocol over PCIe?

The mic/vop code already does this.  See Documentation/mic/mic_overview.txt
for some information.

> I did post asking about this quite a while back, but IIRC I didn't receive
> much of a response.  Yes, here it is:
>
> https://lists.linuxfoundation.org/pipermail/virtualization/2018-March/037276.html
> "virtio over SW-defined/CPU-driven PCIe endpoint"

I came to essentially the same conclusions before I found the MIC code.

(Your "aside" in that email about virtio doing PCIe reads instead of writes
is not solved by the MIC code, since that is how the standard virtio
devices/drivers work.)
Re: [PATCH 0/8] Virtio-over-PCIe on non-MIC
On 1/16/19 9:32 AM, Vincent Whitchurch wrote:
> The Virtio-over-PCIe framework living under drivers/misc/mic/vop implements a
> generic framework to use virtio between two Linux systems, given shared memory
> and a couple of interrupts. It does not actually require the Intel MIC
> hardware, x86-64, or even PCIe for that matter. This patch series makes it
> buildable on more systems and adds a loopback driver to test it without
> special hardware.
>
> Note that I don't have access to Intel MIC hardware so some testing of the
> patchset (especially the patch "vop: Use consistent DMA") on that platform
> would be appreciated, to ensure that the series does not break anything there.

So a while ago I took a look at running virtio over PCIe. I found virtio
basically had two parts:

1) The protocol used to enumerate which virtio devices exist, and perhaps
   configure them.

2) The ring buffer protocol that actually transfers the data.

I recall that data transfer was purely based on simple shared memory and
interrupts, and hence could run over PCIe (e.g. via the PCIe endpoint
subsystem in the kernel) without issue.

However, the enumeration/configuration protocol requires the host to be able
to do all kinds of strange things that can't possibly be emulated over PCIe;
IIRC the configuration data contains "registers" that when written select
the data other "registers" access. When the virtio device is exposed by a
hypervisor, and all the accesses are emulated synchronously through a trap,
this is easy enough to implement. However, if the two ends of this
configuration parsing are on different ends of a PCIe bus, there's no way
this can work.

Are you thinking of doing something different for enumeration/configuration,
and just using the virtio ring buffer protocol over PCIe?

I did post asking about this quite a while back, but IIRC I didn't receive
much of a response.
Yes, here it is:

https://lists.linuxfoundation.org/pipermail/virtualization/2018-March/037276.html
"virtio over SW-defined/CPU-driven PCIe endpoint"
Re: [PATCH 0/8] Virtio-over-PCIe on non-MIC
On 2019-01-17 8:19 a.m., Vincent Whitchurch wrote:
> On the endpoint, the PCIe endpoint driver sets up (hardcoded) BARs and
> memory regions as required to allow the endpoint and the root complex to
> access each other's memory.

This statement describes NTB hardware pretty well. In essence that's what
an NTB device is: a BAR that maps to a window in the other host's memory.

Right now the entire NTB upstream software stack (ntb_transport and
ntb_netdev) is specific to that ecosystem and only exposes a network
device so the hosts can communicate. This code works but has some issues
and was never able to perform at full PCIe line speed (which everyone
expects). So it's not clear to me if anyone is doing anything real with
it. The companies that are working on NTB, that I'm aware of, have mostly
done their own out-of-tree stuff.

It would be interesting to unify ntb_transport with the virtio stack
because I suspect they do very similar things right now, and there are a
lot more devices above virtio than just a network device. However, the
main problem people working on NTB face (besides performance) is trying
to get multi-host working in a general and sensible way, given that the
hardware typically has limited BAR resources (among other limitations).

Logan
Re: [PATCH 0/8] Virtio-over-PCIe on non-MIC
On Thu, Jan 17, 2019 at 5:26 PM Vincent Whitchurch wrote:
> On Thu, Jan 17, 2019 at 04:53:25PM +0100, Arnd Bergmann wrote:
> > On Thu, Jan 17, 2019 at 4:19 PM Vincent Whitchurch wrote:
> >
> > Ok, this seems fine so far. So the vop-host-backend is a regular PCI
> > driver that implements the VOP protocol from the host side, and it
> > can talk to either a MIC, or another guest-backend written for the PCI-EP
> > framework to implement the same protocol, right?
>
> Yes, but just to clarify: the placement of the device page and the way
> to communicate the location of the device page address and any other
> information needed by the guest-backend are hardware-specific so there
> is no generic vop-host-backend implementation which can talk to both a
> MIC and to something else.

I'm not sure I understand what is hardware-specific about it. Shouldn't
it be possible to define at least a vop-host-backend that could work with
any guest-backend running on the PCI-EP framework? This may have to be
different from the interface used on MIC, but generally speaking that is
what I expect from a PCI device.

Arnd
Re: [PATCH 0/8] Virtio-over-PCIe on non-MIC
On Thu, Jan 17, 2019 at 04:53:25PM +0100, Arnd Bergmann wrote:
> On Thu, Jan 17, 2019 at 4:19 PM Vincent Whitchurch wrote:
> > On Thu, Jan 17, 2019 at 01:39:27PM +0100, Arnd Bergmann wrote:
> > > Can you describe how you expect a VOP device over NTB or
> > > PCIe-endpoint would get created, configured and used?
> >
> > Assuming PCIe-endpoint:
> >
> > On the RC, a vop-host-backend driver (PCI driver) sets up some shared
> > memory area which the RC and the endpoint can use to communicate the
> > location of the MIC device descriptors and other information such as the
> > MSI address. It implements vop callbacks to allow the vop framework to
> > obtain the address of the MIC descriptors and send/receive interrupts
> > to/from the guest.
> >
> > On the endpoint, the PCIe endpoint driver sets up (hardcoded) BARs and
> > memory regions as required to allow the endpoint and the root complex to
> > access each other's memory.
> >
> > On the endpoint, the vop-guest-backend, via the shared memory set up by
> > the vop-host-backend, obtains the address of the MIC device page and the
> > MSI address, and a method to receive vop interrupts from the host. This
> > information is used to implement the vop callbacks allowing the vop
> > framework to access the MIC device page and send/receive interrupts
> > from/to the host.
>
> Ok, this seems fine so far. So the vop-host-backend is a regular PCI
> driver that implements the VOP protocol from the host side, and it
> can talk to either a MIC, or another guest-backend written for the PCI-EP
> framework to implement the same protocol, right?

Yes, but just to clarify: the placement of the device page and the way
to communicate the location of the device page address and any other
information needed by the guest-backend are hardware-specific so there
is no generic vop-host-backend implementation which can talk to both a
MIC and to something else.

> > vop (despite its name) doesn't care about PCIe.
> > The vop-guest-backend doesn't actually need to talk to the PCIe
> > endpoint driver. The vop-guest-backend can be probed via any means,
> > such as via a device tree on the endpoint.
> >
> > On the RC, userspace opens the vop device and adds the virtio devices,
> > which end up in the MIC device page set up by the vop-host-backend.
> >
> > On the endpoint, when the vop framework (via the vop-guest-backend) sees
> > these devices, it registers devices on the virtio bus and the virtio
> > drivers are probed.
>
> Ah, so the direction is fixed, and it's the opposite of what Christoph
> and I were expecting. This is probably something we need to discuss
> a bit. From what I understand, there is no technical requirement why
> it has to be this direction, right?

I don't think the vop framework itself has any such requirement. The MIC
uses it in this way (see Documentation/mic/mic_overview.txt) and it also
makes sense (to me, at least) if one wants to treat the endpoint like one
would treat a virtualized guest.

> What I mean is that the same vop framework could work with
> a PCI-EP driver implementing the vop-host-backend and
> a PCI driver implementing the vop-guest-backend? In order
> to do this, the PCI-EP configuration would need to pick whether
> it wants the EP to be the vop host or guest, but having more
> flexibility in it (letting each side add virtio devices) would be
> harder to do.

Correct, this is my understanding also.

> > On the RC, userspace implements the device end of the virtio
> > communication in userspace, using the MIC_VIRTIO_COPY_DESC ioctl. I
> > also have patches to support vhost.
>
> This is a part I don't understand yet. Does this mean that the
> normal operation is between a user space process on the vop-host
> talking to the kernel on the vop-guest?

Yes. For example, the guest mounts a 9p filesystem with virtio-9p and the
9p server is implemented in a userspace process on the host. This is
again similar to virtualization.
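For concreteness, the guest side of the 9p flow described above would look
much like it does under a hypervisor; a sketch, where "hostshare" is a
hypothetical mount tag that the host's userspace 9p server would have
chosen:

```shell
# On the endpoint ("guest"), once the virtio-9p device has been probed
# through vop, mount the share exported by the host's 9p server.
# "hostshare" is an assumed mount tag, not something vop defines.
mount -t 9p -o trans=virtio,version=9p2000.L hostshare /mnt
```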
> I'm a bit worried about the ioctl interface here, as this combines the
> configuration side with the actual data transfer, and that seems
> a bit inflexible.
>
> > > Is there always one master side that is responsible for creating
> > > virtio devices on it, with the slave side automatically attaching to
> > > them, or can either side create virtio devices?
> >
> > Only the master can create virtio devices. The virtio drivers run on
> > the slave.
>
> Ok.
>
> > > Is there any limit on
> > > the number of virtio devices or queues within a VOP device?
> >
> > The virtio device information (mic_device_desc) is put into the MIC
> > device page whose size is limited by the ABI header in
> > include/uapi/linux/mic_ioctl.h (MIC_DP_SIZE, 4096 bytes). So the number
> > of devices is limited by the number of device descriptors that can fit
> > in that size. There is also a per-device limit on the number of vrings
> > (MIC_MAX_VRINGS) and vring entries (MIC_VRING_ENTRIES) in the ABI
> > header.
>
> Ok
Re: [PATCH 0/8] Virtio-over-PCIe on non-MIC
On Thu, Jan 17, 2019 at 4:46 PM Christoph Hellwig wrote:
> On Thu, Jan 17, 2019 at 04:32:06PM +0100, Vincent Whitchurch wrote:
> > If I understand you correctly, I think you're talking about the RC
> > running the virtio drivers and the endpoint implementing the virtio
> > device? This vop stuff is used for the other way around: the virtio
> > device is implemented on the RC and the endpoint runs the virtio drivers.
>
> Oh.  That is really weird and not the way I'd implement it..

It does make sense to me for the very special requirements of the MIC
device, which has a regular PC-style server that provides the environment
for a special embedded device inside of a PCIe card, so the PCI-EP stuff
is just used as a transport here going one way, and then the configuration
of the devices implemented through it goes the other way, providing
network connectivity and a file system to the embedded machine on the
PCI-EP.

This is actually very similar to a setup that I considered implementing
over USB, where one might have an embedded machine (or a bunch of them on
a USB hub) connected to a USB host port, and then use it in the opposite
way of a regular gadget driver, by providing a virtfs over USB to the
gadget, with files residing on a disk on the USB host.

Apparently Vincent has the same use case that both the Intel MIC folks
and I had here, so doing it like this is clearly useful. On the other
hand, I agree that there are lots of other use cases that need the
opposite, so we should try to come up with a design that can cover both.
An example of this might be a PCIe-endpoint device providing network
connectivity to the host using a vhost-net device, which ideally just
shows up on the host as a virtio-net device without requiring any
configuration.

So for configuring this, I think I'd like to see a way to have either the
PCI-EP or the PCI-host side be the one that can create virtio devices
that show up on the other end.
This configuration is currently done using an ioctl interface, which was
probably the easiest to do for the MIC case, but for consistency with the
PCI-EP framework, using configfs is probably better.

A different matter is the question of what a virtio device talks to. A
lot of virtio devices are fundamentally asymmetric (9pfs, rng, block,
...), so you'd have to have the virtio device on one side, and a user
space or vhost driver on the other. The VOP driver seems to assume that
it's always the slave that uses virtio, while the master side (which
could be on the PCI EP or PCI host for the sake of this argument)
implements it in user space or otherwise. Is this a safe assumption, or
can we imagine cases where this would be reversed as well?

Arnd
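To make the configfs suggestion concrete, here is a rough sketch of how a
hypothetical "pci_epf_vop" endpoint function might be set up, modeled on
the existing pci_epf_test flow from the PCI endpoint documentation. The
function name, IDs, and any vop-specific attributes are assumptions for
illustration, not existing ABI:

```shell
# Hedged sketch, assuming a hypothetical pci_epf_vop function driver
# exists.  Only the configfs mechanics mirror the real PCI-EP framework.
modprobe pci_epf_vop                        # hypothetical module

cd /sys/kernel/config/pci_ep
mkdir functions/pci_epf_vop/func1           # instantiate the function
echo 0x104c > functions/pci_epf_vop/func1/vendorid   # example IDs
echo 0xb500 > functions/pci_epf_vop/func1/deviceid

# Bind the function to an endpoint controller and start the link;
# <epc-name> stands for the platform's EPC device.
ln -s functions/pci_epf_vop/func1 controllers/<epc-name>/
echo 1 > controllers/<epc-name>/start
```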
Re: [PATCH 0/8] Virtio-over-PCIe on non-MIC
On Thu, Jan 17, 2019 at 4:19 PM Vincent Whitchurch wrote:
> On Thu, Jan 17, 2019 at 01:39:27PM +0100, Arnd Bergmann wrote:
> > Correct, and again we have to see if this is a good interface. The NTB
> > and PCIe-endpoint interfaces have a number of differences and a
> > number of similarities. In particular they should both be usable with
> > virtio-style drivers, but the underlying hardware differs mainly in how
> > it is probed by the system: an NTB is seen as a PCI device attached
> > to two host bridges, while an endpoint is typically a platform_device
> > on one side, but a pci_dev on the other side.
> >
> > Can you describe how you expect a VOP device over NTB or
> > PCIe-endpoint would get created, configured and used?
>
> Assuming PCIe-endpoint:
>
> On the RC, a vop-host-backend driver (PCI driver) sets up some shared
> memory area which the RC and the endpoint can use to communicate the
> location of the MIC device descriptors and other information such as the
> MSI address. It implements vop callbacks to allow the vop framework to
> obtain the address of the MIC descriptors and send/receive interrupts
> to/from the guest.
>
> On the endpoint, the PCIe endpoint driver sets up (hardcoded) BARs and
> memory regions as required to allow the endpoint and the root complex to
> access each other's memory.
>
> On the endpoint, the vop-guest-backend, via the shared memory set up by
> the vop-host-backend, obtains the address of the MIC device page and the
> MSI address, and a method to receive vop interrupts from the host. This
> information is used to implement the vop callbacks allowing the vop
> framework to access the MIC device page and send/receive interrupts
> from/to the host.

Ok, this seems fine so far. So the vop-host-backend is a regular PCI
driver that implements the VOP protocol from the host side, and it
can talk to either a MIC, or another guest-backend written for the PCI-EP
framework to implement the same protocol, right?
> vop (despite its name) doesn't care about PCIe. The vop-guest-backend
> doesn't actually need to talk to the PCIe endpoint driver. The
> vop-guest-backend can be probed via any means, such as via a device tree
> on the endpoint.
>
> On the RC, userspace opens the vop device and adds the virtio devices,
> which end up in the MIC device page set up by the vop-host-backend.
>
> On the endpoint, when the vop framework (via the vop-guest-backend) sees
> these devices, it registers devices on the virtio bus and the virtio
> drivers are probed.

Ah, so the direction is fixed, and it's the opposite of what Christoph
and I were expecting. This is probably something we need to discuss
a bit. From what I understand, there is no technical requirement why
it has to be this direction, right?

What I mean is that the same vop framework could work with
a PCI-EP driver implementing the vop-host-backend and
a PCI driver implementing the vop-guest-backend? In order
to do this, the PCI-EP configuration would need to pick whether
it wants the EP to be the vop host or guest, but having more
flexibility in it (letting each side add virtio devices) would be
harder to do.

> On the RC, userspace implements the device end of the virtio
> communication in userspace, using the MIC_VIRTIO_COPY_DESC ioctl. I
> also have patches to support vhost.

This is a part I don't understand yet. Does this mean that the
normal operation is between a user space process on the vop-host
talking to the kernel on the vop-guest?

I'm a bit worried about the ioctl interface here, as this combines the
configuration side with the actual data transfer, and that seems
a bit inflexible.

> > Is there always one master side that is responsible for creating
> > virtio devices on it, with the slave side automatically attaching to
> > them, or can either side create virtio devices?
>
> Only the master can create virtio devices. The virtio drivers run on
> the slave.

Ok.
> > Is there any limit on
> > the number of virtio devices or queues within a VOP device?
>
> The virtio device information (mic_device_desc) is put into the MIC
> device page whose size is limited by the ABI header in
> include/uapi/linux/mic_ioctl.h (MIC_DP_SIZE, 4096 bytes). So the number
> of devices is limited by the number of device descriptors that can fit
> in that size. There is also a per-device limit on the number of vrings
> (MIC_MAX_VRINGS) and vring entries (MIC_VRING_ENTRIES) in the ABI
> header.

Ok, so you can have multiple virtio devices (e.g. a virtio-net and
virtio-console) but not an arbitrary number? I suppose we can always
extend it later if that becomes a problem.

Arnd
Re: [PATCH 0/8] Virtio-over-PCIe on non-MIC
On Thu, Jan 17, 2019 at 04:32:06PM +0100, Vincent Whitchurch wrote:
> If I understand you correctly, I think you're talking about the RC
> running the virtio drivers and the endpoint implementing the virtio
> device? This vop stuff is used for the other way around: the virtio
> device is implemented on the RC and the endpoint runs the virtio drivers.

Oh.  That is really weird and not the way I'd implement it..
Re: [PATCH 0/8] Virtio-over-PCIe on non-MIC
On Thu, Jan 17, 2019 at 07:21:42AM -0800, Christoph Hellwig wrote:
> On Thu, Jan 17, 2019 at 04:19:06PM +0100, Vincent Whitchurch wrote:
> > On the RC, a vop-host-backend driver (PCI driver) sets up some shared
> > memory area which the RC and the endpoint can use to communicate the
> > location of the MIC device descriptors and other information such as the
> > MSI address. It implements vop callbacks to allow the vop framework to
> > obtain the address of the MIC descriptors and send/receive interrupts
> > to/from the guest.
>
> Why would we require any work on the RC / host side?  A properly
> setup software controlled virtio device should just show up as a
> normal PCIe device, and the virtio-pci driver should bind to it.

If I understand you correctly, I think you're talking about the RC
running the virtio drivers and the endpoint implementing the virtio
device? This vop stuff is used for the other way around: the virtio
device is implemented on the RC and the endpoint runs the virtio drivers.
Re: [PATCH 0/8] Virtio-over-PCIe on non-MIC
On Thu, Jan 17, 2019 at 4:19 PM Christoph Hellwig wrote:
> On Thu, Jan 17, 2019 at 07:15:29AM -0800, Christoph Hellwig wrote:
> > On Thu, Jan 17, 2019 at 01:39:27PM +0100, Arnd Bergmann wrote:
> > > Can you describe how you expect a VOP device over NTB or
> > > PCIe-endpoint would get created, configured and used?
> > > Is there always one master side that is responsible for creating
> > > virtio devices on it, with the slave side automatically attaching to
> > > them, or can either side create virtio devices? Is there any limit on
> > > the number of virtio devices or queues within a VOP device?
> >
> > For a VOP device over NTB you configure your device using configfs
> > on one side, and for the other side it will just show up like any
> > other PCIe device, because it is.
>
> Sorry, I mean over the PCI-EP infrastructure of course.  NTB actually
> is rather hairy and complicated.

My understanding was that with virtio, we would be able to have multiple
virtio devices on a single PCI-EP port, so you need a multi-step
configuration: You first set up the PCI-EP to instantiate a VOP device,
which is then seen on both ends of the connection. The question is how
to create a particular virtio device instance (or a set of those) inside
of it.

Arnd
Re: [PATCH 0/8] Virtio-over-PCIe on non-MIC
On Thu, Jan 17, 2019 at 04:19:06PM +0100, Vincent Whitchurch wrote:
> On the RC, a vop-host-backend driver (PCI driver) sets up some shared
> memory area which the RC and the endpoint can use to communicate the
> location of the MIC device descriptors and other information such as the
> MSI address. It implements vop callbacks to allow the vop framework to
> obtain the address of the MIC descriptors and send/receive interrupts
> to/from the guest.

Why would we require any work on the RC / host side?  A properly
setup software controlled virtio device should just show up as a
normal PCIe device, and the virtio-pci driver should bind to it.
Re: [PATCH 0/8] Virtio-over-PCIe on non-MIC
On Thu, Jan 17, 2019 at 07:15:29AM -0800, Christoph Hellwig wrote:
> On Thu, Jan 17, 2019 at 01:39:27PM +0100, Arnd Bergmann wrote:
> > Can you describe how you expect a VOP device over NTB or
> > PCIe-endpoint would get created, configured and used?
> > Is there always one master side that is responsible for creating
> > virtio devices on it, with the slave side automatically attaching to
> > them, or can either side create virtio devices? Is there any limit on
> > the number of virtio devices or queues within a VOP device?
>
> For a VOP device over NTB you configure your device using configfs
> on one side, and for the other side it will just show up like any
> other PCIe device, because it is.

Sorry, I mean over the PCI-EP infrastructure of course.  NTB actually
is rather hairy and complicated.
Re: [PATCH 0/8] Virtio-over-PCIe on non-MIC
On Thu, Jan 17, 2019 at 01:39:27PM +0100, Arnd Bergmann wrote:
> Correct, and again we have to see if this is a good interface. The NTB
> and PCIe-endpoint interfaces have a number of differences and a
> number of similarities. In particular they should both be usable with
> virtio-style drivers, but the underlying hardware differs mainly in how
> it is probed by the system: an NTB is seen as a PCI device attached
> to two host bridges, while an endpoint is typically a platform_device
> on one side, but a pci_dev on the other side.
>
> Can you describe how you expect a VOP device over NTB or
> PCIe-endpoint would get created, configured and used?

Assuming PCIe-endpoint:

On the RC, a vop-host-backend driver (PCI driver) sets up some shared
memory area which the RC and the endpoint can use to communicate the
location of the MIC device descriptors and other information such as the
MSI address. It implements vop callbacks to allow the vop framework to
obtain the address of the MIC descriptors and send/receive interrupts
to/from the guest.

On the endpoint, the PCIe endpoint driver sets up (hardcoded) BARs and
memory regions as required to allow the endpoint and the root complex to
access each other's memory.

On the endpoint, the vop-guest-backend, via the shared memory set up by
the vop-host-backend, obtains the address of the MIC device page and the
MSI address, and a method to receive vop interrupts from the host. This
information is used to implement the vop callbacks allowing the vop
framework to access the MIC device page and send/receive interrupts
from/to the host.

vop (despite its name) doesn't care about PCIe. The vop-guest-backend
doesn't actually need to talk to the PCIe endpoint driver. The
vop-guest-backend can be probed via any means, such as via a device tree
on the endpoint.

On the RC, userspace opens the vop device and adds the virtio devices,
which end up in the MIC device page set up by the vop-host-backend.
On the endpoint, when the vop framework (via the vop-guest-backend) sees
these devices, it registers devices on the virtio bus and the virtio
drivers are probed.

On the RC, userspace implements the device end of the virtio
communication in userspace, using the MIC_VIRTIO_COPY_DESC ioctl. I also
have patches to support vhost.

> Is there always one master side that is responsible for creating
> virtio devices on it, with the slave side automatically attaching to
> them, or can either side create virtio devices?

Only the master can create virtio devices. The virtio drivers run on
the slave.

> Is there any limit on
> the number of virtio devices or queues within a VOP device?

The virtio device information (mic_device_desc) is put into the MIC
device page whose size is limited by the ABI header in
include/uapi/linux/mic_ioctl.h (MIC_DP_SIZE, 4096 bytes). So the number
of devices is limited by the number of device descriptors that can fit
in that size. There is also a per-device limit on the number of vrings
(MIC_MAX_VRINGS) and vring entries (MIC_VRING_ENTRIES) in the ABI header.
Re: [PATCH 0/8] Virtio-over-PCIe on non-MIC
On Thu, Jan 17, 2019 at 01:39:27PM +0100, Arnd Bergmann wrote:
> Can you describe how you expect a VOP device over NTB or
> PCIe-endpoint would get created, configured and used?
> Is there always one master side that is responsible for creating
> virtio devices on it, with the slave side automatically attaching to
> them, or can either side create virtio devices? Is there any limit on
> the number of virtio devices or queues within a VOP device?

For a VOP device over NTB you configure your device using configfs
on one side, and for the other side it will just show up like any
other PCIe device, because it is.
Re: [PATCH 0/8] Virtio-over-PCIe on non-MIC
On Thu, Jan 17, 2019 at 11:54 AM Vincent Whitchurch wrote:
> On Wed, Jan 16, 2019 at 06:07:53PM +0100, Arnd Bergmann wrote:
> > On Wed, Jan 16, 2019 at 5:33 PM Vincent Whitchurch wrote:
> > > The Virtio-over-PCIe framework living under drivers/misc/mic/vop
> > > implements a generic framework to use virtio between two Linux
> > > systems, given shared memory and a couple of interrupts. It does not
> > > actually require the Intel MIC hardware, x86-64, or even PCIe for
> > > that matter. This patch series makes it buildable on more systems
> > > and adds a loopback driver to test it without special hardware.
> > >
> > > Note that I don't have access to Intel MIC hardware so some testing
> > > of the patchset (especially the patch "vop: Use consistent DMA") on
> > > that platform would be appreciated, to ensure that the series does
> > > not break anything there.
> >
> > I think we need to take a step back though and discuss what combinations
> > we actually do want to support. I have not actually read the whole
> > mic/vop driver, so I don't know if this would be a good fit as a generic
> > interface -- it may or may not be, and any other input would be helpful.
>
> The MIC driver as a whole is uninteresting as a generic interface since
> it is quite tied to the Intel hardware. The VOP parts though are
> logically separated and have no relation to that hardware, even if the
> ioctls are called MIC_VIRTIO_*.
>
> The samples/mic/mpssd/mpssd.c code handles both the boot of the MIC
> (sysfs) and the VOP parts (ioctls).

Right, I wasn't talking about the MIC driver here, just the VOP stuff.
Since that comes with an ioctl interface that you want to keep using on
other hardware, this still means we have to review if it is a good fit
as a general-purpose API.
> > Aside from that, I should note that we have two related subsystems
> > in the kernel: the PCIe endpoint subsystem maintained by Kishon and
> > Lorenzo, and the NTB subsystem maintained by Jon, Dave and Allen.
> >
> > In order to properly support virtio over PCIe, I would hope we can come
> > up with a user space interface that looks the same way for configuring
> > virtio drivers in mic, pcie-endpoint and ntb, if at all possible. Have
> > you looked at those two subsystems?
>
> pcie-endpoint is a generic framework that allows Linux to act as an
> endpoint and set up the BARs, etc. mic appears to have Intel
> MIC-specific code for this (pre-dating pcie-endpoint) but this is
> separate from the vop code. pcie-endpoint and vop do not have
> overlapping functionality and can be used together.

What we need to find out though is whether the combination of vop with
pcie-endpoint provides a good abstraction for what users actually need
when they want to use e.g. a virtio-net connection on top of PCIe
endpoint hardware.

> I'm not familiar with NTB, but from a quick look it seems to be tied to
> special hardware, and I don't see any virtio-related code there. A vop
> backend for NTB would presumably work to allow virtio functionality
> there.

Correct, and again we have to see if this is a good interface. The NTB
and PCIe-endpoint interfaces have a number of differences and a
number of similarities. In particular they should both be usable with
virtio-style drivers, but the underlying hardware differs mainly in how
it is probed by the system: an NTB is seen as a PCI device attached
to two host bridges, while an endpoint is typically a platform_device
on one side, but a pci_dev on the other side.

Can you describe how you expect a VOP device over NTB or
PCIe-endpoint would get created, configured and used?
Is there always one master side that is responsible for creating virtio
devices on it, with the slave side automatically attaching to them, or
can either side create virtio devices?

Is there any limit on the number of virtio devices or queues within a
VOP device?

       Arnd
Re: [PATCH 0/8] Virtio-over-PCIe on non-MIC
On Wed, Jan 16, 2019 at 06:07:53PM +0100, Arnd Bergmann wrote:
> On Wed, Jan 16, 2019 at 5:33 PM Vincent Whitchurch wrote:
> > The Virtio-over-PCIe framework living under drivers/misc/mic/vop
> > implements a generic framework to use virtio between two Linux systems,
> > given shared memory and a couple of interrupts. It does not actually
> > require the Intel MIC hardware, x86-64, or even PCIe for that matter.
> > This patch series makes it buildable on more systems and adds a loopback
> > driver to test it without special hardware.
> >
> > Note that I don't have access to Intel MIC hardware so some testing of
> > the patchset (especially the patch "vop: Use consistent DMA") on that
> > platform would be appreciated, to ensure that the series does not break
> > anything there.
>
> I think we need to take a step back though and discuss what combinations
> we actually do want to support. I have not actually read the whole mic/vop
> driver, so I don't know if this would be a good fit as a generic interface
> -- it may or may not be, and any other input would be helpful.

The MIC driver as a whole is uninteresting as a generic interface since
it is quite tied to the Intel hardware. The VOP parts though are
logically separated and have no relation to that hardware, even if the
ioctls are called MIC_VIRTIO_*.

The samples/mic/mpssd/mpssd.c code handles both the boot of the MIC
(sysfs) and the VOP parts (ioctls).

> Aside from that, I should note that we have two related subsystems
> in the kernel: the PCIe endpoint subsystem maintained by Kishon and
> Lorenzo, and the NTB subsystem maintained by Jon, Dave and Allen.
>
> In order to properly support virtio over PCIe, I would hope we can come
> up with a user space interface that looks the same way for configuring
> virtio drivers in mic, pcie-endpoint and ntb, if at all possible. Have
> you looked at those two subsystems?
pcie-endpoint is a generic framework that allows Linux to act as an
endpoint and set up the BARs, etc. mic appears to have Intel
MIC-specific code for this (pre-dating pcie-endpoint) but this is
separate from the vop code. pcie-endpoint and vop do not have
overlapping functionality and can be used together.

I'm not familiar with NTB, but from a quick look it seems to be tied to
special hardware, and I don't see any virtio-related code there. A vop
backend for NTB would presumably work to allow virtio functionality
there.
Re: [PATCH 0/8] Virtio-over-PCIe on non-MIC
On Wed, Jan 16, 2019 at 5:33 PM Vincent Whitchurch wrote:
>
> The Virtio-over-PCIe framework living under drivers/misc/mic/vop
> implements a generic framework to use virtio between two Linux systems,
> given shared memory and a couple of interrupts. It does not actually
> require the Intel MIC hardware, x86-64, or even PCIe for that matter.
> This patch series makes it buildable on more systems and adds a loopback
> driver to test it without special hardware.
>
> Note that I don't have access to Intel MIC hardware so some testing of
> the patchset (especially the patch "vop: Use consistent DMA") on that
> platform would be appreciated, to ensure that the series does not break
> anything there.

Hi Vincent,

First of all, I think it is a very good idea to make virtio over PCIe
available more generally. Your patches also make sense here, they mostly
fix portability bugs, so no objection there.

I think we need to take a step back though and discuss what combinations
we actually do want to support. I have not actually read the whole
mic/vop driver, so I don't know if this would be a good fit as a generic
interface -- it may or may not be, and any other input would be helpful.

Aside from that, I should note that we have two related subsystems in
the kernel: the PCIe endpoint subsystem maintained by Kishon and
Lorenzo, and the NTB subsystem maintained by Jon, Dave and Allen.

In order to properly support virtio over PCIe, I would hope we can come
up with a user space interface that looks the same way for configuring
virtio drivers in mic, pcie-endpoint and ntb, if at all possible. Have
you looked at those two subsystems?

       Arnd
[PATCH 0/8] Virtio-over-PCIe on non-MIC
The Virtio-over-PCIe framework living under drivers/misc/mic/vop
implements a generic framework to use virtio between two Linux systems,
given shared memory and a couple of interrupts. It does not actually
require the Intel MIC hardware, x86-64, or even PCIe for that matter.
This patch series makes it buildable on more systems and adds a loopback
driver to test it without special hardware.

Note that I don't have access to Intel MIC hardware so some testing of
the patchset (especially the patch "vop: Use consistent DMA") on that
platform would be appreciated, to ensure that the series does not break
anything there.

Vincent Whitchurch (8):
  vop: Use %z for size_t
  vop: Cast pointers to uintptr_t
  vop: Add definition of readq/writeq if missing
  vop: Allow building on more systems
  vop: vringh: Do not crash if no DMA channel
  vop: Fix handling of >32 feature bits
  vop: Use consistent DMA
  vop: Add loopback

 drivers/misc/mic/Kconfig            |  14 +-
 drivers/misc/mic/bus/vop_bus.h      |   2 +
 drivers/misc/mic/host/mic_boot.c    |  46
 drivers/misc/mic/vop/Makefile       |   2 +
 drivers/misc/mic/vop/vop_loopback.c | 390
 drivers/misc/mic/vop/vop_main.c     |  36 +--
 drivers/misc/mic/vop/vop_vringh.c   | 151 ++-
 7 files changed, 549 insertions(+), 92 deletions(-)
 create mode 100644 drivers/misc/mic/vop/vop_loopback.c

-- 
2.20.0