On Wed, Apr 22, 2015 at 07:38:27PM +1000, Alexey Kardashevskiy wrote: > On 04/22/2015 03:53 PM, David Gibson wrote: > >On Sat, Apr 11, 2015 at 01:24:36AM +1000, Alexey Kardashevskiy wrote: > >>This makes use of the new "memory registering" feature. The idea is > >>to provide the userspace ability to notify the host kernel about pages > >>which are going to be used for DMA. Having this information, the host > >>kernel can pin them all once per user process, do locked pages > >>accounting (once) and not spent time on doing that in real time with > >>possible failures which cannot be handled nicely in some cases. > >> > >>This adds a guest RAM memory listener which notifies a VFIO container > >>about memory which needs to be pinned/unpinned. VFIO MMIO regions > >>(i.e. "skip dump" regions) are skipped. > >> > >>The feature is only enabled for SPAPR IOMMU v2. The host kernel changes > >>are required. Since v2 does not need/support VFIO_IOMMU_ENABLE, this does > >>not call it when v2 is detected and enabled. > >> > >>This does not change the guest visible interface. > >> > >>Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru> > > > >Reviewed-by: David Gibson <da...@gibson.dropbear.id.au> > > > >Albeit with some nits described below.
[snip] > >>+ vfio_ram_do_region(container, section, > >>VFIO_IOMMU_SPAPR_UNREGISTER_MEMORY); > >>+} > >>+ > >>+static const MemoryListener vfio_spapr_ram_memory_listener = { > >>+ .region_add = vfio_spapr_ram_listener_region_add, > >>+ .region_del = vfio_spapr_ram_listener_region_del, > >>+}; > >>+ > >> static void vfio_spapr_listener_release(VFIOContainer *container) > >> { > >> memory_listener_unregister(&container->iommu_data.spapr.listener); > >> } > >> > >>-void spapr_memory_listener_register(VFIOContainer *container) > >>+static void vfio_spapr_listener_release_v2(VFIOContainer *container) > >>+{ > >>+ memory_listener_unregister(&container->iommu_data.spapr.listener); > >>+ vfio_spapr_listener_release(container); > >>+} > >>+ > >>+int spapr_memory_listener_register(VFIOContainer *container, int ver) > >> { > >> container->iommu_data.spapr.listener = vfio_spapr_memory_listener; > >> container->iommu_data.release = vfio_spapr_listener_release; > >>- > >> memory_listener_register(&container->iommu_data.spapr.listener, > >> container->space->as); > >>+ if (ver < 2) { > >>+ return 0; > >>+ } > > > >I wonder if it would make sense to store the IOMMU type value into the > >VFIOContainer (from non-arch specific code). It would avoid the > >ad-hoc passing of version here and also allow for some sanity checks. > > > What kind of checks? Cannot think of any which would make sense to do here > and not in spapr_(pci|rtas)*.c Well, all I was thinking of was something like assert(container->iommu_type == SPAPR_TCE_TYPE); In spapr (or x86) callbacks from the core vfio code to make sure we haven't somehow gotten here with a Type1 container. > >>+ > >>+ container->iommu_data.spapr.ramlistener = > >>vfio_spapr_ram_memory_listener; > >>+ container->iommu_data.release = vfio_spapr_listener_release_v2; > >>+ memory_listener_register(&container->iommu_data.spapr.ramlistener, > >>+ &address_space_memory); > >>+ > >>+ container->iommu_data.spapr.ram_reg_initialized = true; > >>+ > >>+ return container->iommu_data.spapr.ram_reg_error; > >> } > >>diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h > >>index d0b831c..b5ef446 100644 > >>--- a/include/hw/vfio/vfio-common.h > >>+++ b/include/hw/vfio/vfio-common.h > >>@@ -71,6 +71,9 @@ typedef struct VFIOType1 { > >> > >> typedef struct VFIOSPAPR { > >> MemoryListener listener; > >>+ MemoryListener ramlistener; > > > >The names "listener" and "ramlistener" don't exactly give a clear hint > >as to what the difference between them is. Maybe "register_listener" > >for the new one since its purpose is to register dma memory. > > > It is listening for RAM changes. If/when we decide to add something else to > this listener, we won't have to change its name :) Still change? Ah, I see. I was forgetting that the other listener wasn't really about RAM (I think an understandable error for a *Memory*Listener). But actually I'm having trouble thinking of names that are clearly better than the current ones. So, I guess I don't really care. > > > >I forget > >what the purpose of the old one is, and so what a better name for it > >might be. > > It is listening on PHB address space for IOMMU table updates and translates > those to MAP/UNMAP ioctls to VFIO containers. > > > > > > >>+ int ram_reg_error; > >>+ bool ram_reg_initialized; > >> } VFIOSPAPR; > >> > >> typedef struct VFIOContainer { > >>@@ -156,6 +159,6 @@ extern int vfio_dma_unmap(VFIOContainer *container, > >> hwaddr iova, ram_addr_t size); > >> bool vfio_listener_skipped_section(MemoryRegionSection *section); > >> > >>-extern void spapr_memory_listener_register(VFIOContainer *container); > >>+extern int spapr_memory_listener_register(VFIOContainer *container, int > >>ver); > >> > >> #endif /* !HW_VFIO_VFIO_COMMON_H */ > >>diff --git a/trace-events b/trace-events > >>index 1231ba4..2739140 100644 > >>--- a/trace-events > >>+++ b/trace-events > >>@@ -1563,6 +1563,7 @@ vfio_disconnect_container(int fd) "close > >>container->fd=%d" > >> vfio_put_group(int fd) "close group->fd=%d" > >> vfio_get_device(const char * name, unsigned int flags, unsigned int > >> num_regions, unsigned int num_irqs) "Device %s flags: %u, regions: %u, > >> irqs: %u" > >> vfio_put_base_device(int fd) "close vdev->fd=%d" > >>+vfio_ram_register(int req, uint64_t va, uint64_t size, int ret) "req=%d > >>va=%"PRIx64" size=%"PRIx64" ret=%d" > >> > >> #hw/acpi/memory_hotplug.c > >> mhp_acpi_invalid_slot_selected(uint32_t slot) "0x%"PRIx32 > > > > -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson
pgpljIQDGMFOI.pgp
Description: PGP signature