On Wed, Apr 18, 2018 at 07:20:10PM +0300, Michael S. Tsirkin wrote:
> On Wed, Apr 18, 2018 at 08:47:10AM +0530, Anshuman Khandual wrote:
> > On 04/15/2018 05:41 PM, Christoph Hellwig wrote:
> > > On Fri, Apr 06, 2018 at 06:37:18PM +1000, Benjamin Herrenschmidt wrote:
> > >>>> implemented as DMA API which the virtio core understands. There is no
> > >>>> need for an IOMMU to be involved for the device representation in this
> > >>>> case IMHO.
> > >>>
> > >>> This whole virtio translation issue is a mess. I think we need to
> > >>> switch it to the dma API, and then quirk the legacy case to always
> > >>> use the direct mapping inside the dma API.
> > >>
> > >> Fine with using a dma API always on the Linux side, but we do want to
> > >> special case virtio still at the arch and qemu side to have a "direct
> > >> mapping" mode. Not sure how (special flags on PCI devices) to avoid
> > >> actually going through an emulated IOMMU on the qemu side, because that
> > >> slows things down, esp. with vhost.
> > >>
> > >> IE, we can't I think just treat it the same as a physical device.
> > >
> > > We should have treated it like a physical device from the start, but
> > > that device has unfortunately sailed.
> > >
> > > But yes, we'll need a per-device quirk that says 'don't attach an
> > > iommu'.
> >
> > How about doing it per platform basis as suggested in this RFC through
> > an arch specific callback. Because all the virtio devices in the given
> > platform would require and exercise this option (to avail bounce buffer
> > mechanism for secure guests as an example). So the flag basically is a
> > platform specific one not a device specific one.
>
> That's not the case. A single platform can have a mix of virtio and
> non-virtio devices. Same applies even within virtio, e.g. the balloon
> device always bypasses an iommu. Further, QEMU supports out of process
> devices some of which might bypass the IOMMU.

Given that each virtio device has to behave differently depending on
  (a) what it does (balloon, block, net, etc.),
  (b) what platform it is on (pseries, x86, ...),
  (c) what environment it is in (secure, insecure, ...),

I think we should let the virtio device decide what it wants, instead of
forcing it to NOT use dma_ops when VIRTIO_F_IOMMU_PLATFORM is NOT
enabled.

Currently, the generic virtio code assumes that a device must NOT use
dma operations if the hypervisor has NOT enabled
VIRTIO_F_IOMMU_PLATFORM. This assumption is baked into
vring_use_dma_api(), though there is a special exception for
xen_domain().

This assumption prevents us from using the dma_ops abstraction for
virtio devices on a secure VM. BTW: VIRTIO_F_IOMMU_PLATFORM may or may
not be set on this platform.

On our secure VM, virtio devices by default do not share pages with the
hypervisor. In other words, the hypervisor cannot access the secure VM's
pages. The secure VM, with the help of the hardware, enables some pages
to be shared with the hypervisor, and then uses these pages to bounce
virtio data to and from the hypervisor.

One elegant way to implement this functionality is to abstract it under
our special dma_ops and wire it to the virtio devices.

However, the restriction imposed by the generic virtio code constrains
us from doing so.

If we can enrich vring_use_dma_api() to take multiple factors into
consideration and not just VIRTIO_F_IOMMU_PLATFORM, preferably by
consulting an arch-dependent function, we could seamlessly integrate
into the existing virtio infrastructure.

RP

>
> -- 
> MST

-- 
Ram Pai
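
To make the proposal about vring_use_dma_api() concrete, here is a rough
sketch of the decision logic. It is a standalone userspace model and not
kernel code: struct virtio_device_model, the model_* variables, and the
hook name arch_virtio_wants_dma_ops() are all hypothetical and invented
for illustration. Only VIRTIO_F_IOMMU_PLATFORM (feature bit 33) and the
existing xen_domain() exception are taken from the current code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Feature bit number as defined by the virtio specification. */
#define VIRTIO_F_IOMMU_PLATFORM 33

/* Simplified stand-in for the kernel's struct virtio_device (hypothetical). */
struct virtio_device_model {
	uint64_t features;	/* negotiated feature bits */
};

/* Stand-ins for platform state (hypothetical; the kernel has xen_domain()). */
static bool model_xen_domain;
static bool model_secure_vm;

static bool model_has_feature(const struct virtio_device_model *vdev,
			      unsigned int bit)
{
	return (vdev->features >> bit) & 1;
}

/*
 * Hypothetical arch hook: lets the platform request dma_ops for a virtio
 * device even when VIRTIO_F_IOMMU_PLATFORM was not negotiated, e.g. so a
 * secure VM can bounce data through pages shared with the hypervisor.
 * A real hook could also look at the device type; this model does not.
 */
static bool arch_virtio_wants_dma_ops(const struct virtio_device_model *vdev)
{
	(void)vdev;
	return model_secure_vm;
}

/* Model of an enriched vring_use_dma_api(). */
static bool vring_use_dma_api_model(const struct virtio_device_model *vdev)
{
	assert(vdev != NULL);

	/* Hypervisor asked for platform DMA translation: use the DMA API. */
	if (model_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM))
		return true;

	/* Existing special case. */
	if (model_xen_domain)
		return true;

	/* New: let the architecture decide instead of returning false. */
	return arch_virtio_wants_dma_ops(vdev);
}
```

With a default arch hook that returns false, behaviour on existing
platforms would stay as it is today; only an architecture that overrides
the hook would see virtio devices routed through its dma_ops.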