Re: [PATCH 4/9] xen: Add xen_create_contiguous_region
On Wed, Nov 13, 2013 at 10:50:58AM +, Liu, Jijiang wrote:
> Hi,
> I have a question about xen_create_contiguous_region function, in which kernel version was it introduced?

You can use 'git annotate' to figure out which commit introduced it. Then you can use 'git tag --contains <commit>' to find the tags that have it.

Thanks.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] iommu/fsl_pamu: use physical cpu index to find the matched cpu nodes
On Thu, 2013-11-14 at 21:16 -0600, Sethi Varun-B16395 wrote:
> Haiying/Scott,
> Forgot to mention this, the PAMU driver has to handle stash destination settings both for power and dsp cores (on B4 platform). For the dsp cores we would expect the physical core id (not controlled by Linux). To make the interface consistent, I would expect the caller (for iommu_set_attr) to pass the physical core id.

That sounds like you need two different interfaces.

-Scott
RE: [RFC PATCH] Crashdump Accepting Active IOMMU
Thank you for testing this RFC patch. It is great to have confirmation that the code works in a different test environment.

You asked: What is the status of this patch?

I have made a few changes since the RFC version of this patch:

1. Consolidated all of the operational code into the copy... functions. The process... functions were primarily used for diagnostics and exploration; however, there was a small amount of operational code that used the process... functions. This operational code has been moved into the copy... functions.

2. Removed the Process ... functions and the diagnostic code that ran on that function set. This removed about 1/4 of the code -- which this operational patch no longer needs. These portions of the RFC patch could be formatted as a separate patch and submitted independently at a later date.

3. Re-formatted the code to the Linux Coding Standards. The checkpatch script still finds some lines to complain about; however these lines are either (1) lines that I did not change, or (2) lines that only changed by adding a level of indent which pushed them over 80-characters, or (3) new lines whose intent is far clearer when longer than 80-characters (allowed by the Linux Coding Standards.)

4. Updated the remaining debug print to be significantly more flexible. This allows control over the amount of debug print to the console -- which can vary widely.

5. Fixed a couple of minor bugs found by testing on a machine with a very large IO configuration.

You asked: Do you have a plan to post new version?

Yes. I am in the process of dividing the code into a set of 6 or 7 patches, and completing the due-diligence on these patches before submitting them.

Bill

-----Original Message-----
From: Takao Indoh [mailto:indou.ta...@jp.fujitsu.com]
Sent: Tuesday, November 12, 2013 12:45 AM
To: Sumner, William; bhelg...@google.com; alex.william...@redhat.com; ddut...@redhat.com
Cc: linux-...@vger.kernel.org; ke...@lists.infradead.org; linux-ker...@vger.kernel.org; iommu@lists.linux-foundation.org; ishii.hiron...@jp.fujitsu.com; dw...@infradead.org
Subject: Re: [RFC PATCH] Crashdump Accepting Active IOMMU

Hi Bill,

What is the status of this patch? It works and DMA problems on kdump are solved as far as I tested. Do you have a plan to post new version?

Thanks,
Takao Indoh

(2013/09/27 8:25), Sumner, William wrote:
> This Request For Comment submission is primarily to solicit comments on a concept for how kdump can handle legacy DMA IO leftover from the panicked kernel and comments on early prototype code to implement it. Some level of interest was noted when I proposed this concept in June; however, for generating serious discussion there is no substitute for a working prototype.
>
> This concept modifies the behavior of the iommu in the (new) crashdump kernel:
> 1. to accept the iommu hardware in an active state,
> 2. to leave the current translations in-place so that legacy DMA will continue using its current buffers until the device drivers in the crashdump kernel initialize and initialize their devices,
> 3. to use different portions of the iova address ranges for the device drivers in the crashdump kernel than the iova ranges that were in-use at the time of the panic.
>
> Advantages of this concept:
> 1. All manipulation of the IO-device is done by the Linux device-driver for that device.
> 2. This concept behaves in a very similar manner to operation without an active iommu.
> 3. Any activity between the IO-device and its RMRR areas is handled by the device-driver in the same manner as during a non-kdump boot.
> 4. If an IO-device has no driver in the kdump kernel, it is simply left alone. This supports the practice of creating a special kdump kernel without drivers for any devices that are not required for taking a crashdump.
>
> About the early-prototype code in the patch below:
> --
> 1. It works on one machine that reproduced the original problem -- still need to test it on a lot of other machines with various IO configurations.
> 2. Currently implemented for intel-iommu architecture only,
> 3. It is based near TOT from kernel.org. The TOT version of 'crash' reads the dump that is produced.
> 4. It is definitely prototype-only and not yet ready to propose as a patch for inclusion into Linux proper.
> 5. Although this patch is not yet intended for incorporation into mainstream Linux, it should install and operate for anyone who wants to experiment with it. Because this patch changes the low-level IO-operation, and because of its very-limited testing, I strongly advise against installing this patch on any system that contains production data.
> 6. For this RFC, I decided to leave-in all of the debugging, diagnostic, temporary, and test code so that it would be readily available. In a (future) patch submission, much of this would need to be either eliminated, separated into a
RE: [PATCH] iommu/fsl_pamu: use physical cpu index to find the matched cpu nodes
For the DSP case again we have to set up the stash attribute. Are you saying that this should be a separate attribute?

-Varun

-----Original Message-----
From: Wood Scott-B07421
Sent: Tuesday, November 19, 2013 1:07 AM
To: Sethi Varun-B16395
Cc: Wang Haiying-R54964; j...@8bytes.org; iommu@lists.linux-foundation.org; linuxppc-...@lists.ozlabs.org
Subject: Re: [PATCH] iommu/fsl_pamu: use physical cpu index to find the matched cpu nodes

On Thu, 2013-11-14 at 21:16 -0600, Sethi Varun-B16395 wrote:
> Haiying/Scott,
> Forgot to mention this, the PAMU driver has to handle stash destination settings both for power and dsp cores (on B4 platform). For the dsp cores we would expect the physical core id (not controlled by Linux). To make the interface consistent, I would expect the caller (for iommu_set_attr) to pass the physical core id.

That sounds like you need two different interfaces.

-Scott
Re: [PATCH] iommu/fsl_pamu: use physical cpu index to find the matched cpu nodes
On Mon, 2013-11-18 at 20:42 -0600, Varun Sethi wrote:
> For the DSP case again we have to set up the stash attribute. Are you saying that this should be a separate attribute?

Not necessarily a separate attribute, but there should be some way to distinguish whether you're providing a Linux cpu number or some external stash destination.

-Scott
RE: [PATCH] iommu/fsl_pamu: use physical cpu index to find the matched cpu nodes
-----Original Message-----
From: Wood Scott-B07421
Sent: Tuesday, November 19, 2013 8:34 AM
To: Sethi Varun-B16395
Cc: Wang Haiying-R54964; j...@8bytes.org; iommu@lists.linux-foundation.org; linuxppc-...@lists.ozlabs.org
Subject: Re: [PATCH] iommu/fsl_pamu: use physical cpu index to find the matched cpu nodes

> On Mon, 2013-11-18 at 20:42 -0600, Varun Sethi wrote:
> > For the DSP case again we have to set up the stash attribute. Are you saying that this should be a separate attribute?
>
> Not necessarily a separate attribute, but there should be some way to distinguish whether you're providing a Linux cpu number or some external stash destination.

Yes, the current idea is to use a separate L2 cache type for the DSP cores (PAMU_DSP_L2_CACHE). DSP cores can stash only to the L2 cache.

-Varun
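One way to read the proposal in this thread is that the cache type carried in the stash attribute tells the driver how to interpret the core number. Below is a minimal user-space sketch of that idea, not the PAMU driver's code. Only PAMU_DSP_L2_CACHE is named in the thread; the other enumerators, struct stash_attr_sketch, the logical_to_phys table and resolve_stash_core() are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cache targets; only PAMU_DSP_L2_CACHE is named in the thread. */
enum stash_cache_sketch {
	SKETCH_L1_CACHE = 1,
	SKETCH_L2_CACHE,
	SKETCH_L3_CACHE,
	PAMU_DSP_L2_CACHE,
};

/* Hypothetical attribute: a core number plus the cache to stash to. */
struct stash_attr_sketch {
	uint32_t cpu;
	uint32_t cache;
};

/* Made-up logical-to-physical map standing in for what Linux would know. */
static const uint32_t logical_to_phys[] = { 0, 2, 4, 6 };
#define NR_LOGICAL_CPUS (sizeof(logical_to_phys) / sizeof(logical_to_phys[0]))

/*
 * Resolve the physical core to stash to.
 * DSP cores are outside Linux's control, so for PAMU_DSP_L2_CACHE the
 * caller's number is taken as a physical core id as-is. For every other
 * cache type the number is treated as a Linux (logical) cpu number.
 * Returns 0 on success, -1 if the logical cpu number is out of range.
 */
int resolve_stash_core(const struct stash_attr_sketch *attr, uint32_t *phys_core)
{
	assert(attr != NULL && phys_core != NULL);

	if (attr->cache == PAMU_DSP_L2_CACHE) {
		*phys_core = attr->cpu;
		return 0;
	}
	if (attr->cpu >= NR_LOGICAL_CPUS)
		return -1;
	*phys_core = logical_to_phys[attr->cpu];
	return 0;
}
```

With this shape the caller keeps a single attribute, and the "which kind of number is this" question raised by Scott is answered by the cache field rather than by a second interface.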
[PATCH 2/9 v2] pci: msi: expose msi region information functions
Now that all the interfaces for getting the MSI region are defined, this patch exposes them to the rest of the kernel. They will be used by the vfio subsystem to set up the IOMMU for MSI interrupts of directly assigned devices.

Signed-off-by: Bharat Bhushan <bharat.bhus...@freescale.com>
---
v1->v2
 - None

 include/linux/pci.h | 13 +
 1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/include/linux/pci.h b/include/linux/pci.h
index da172f9..c587034 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1142,6 +1142,7 @@ struct msix_entry {
 	u16	entry;	/* driver uses to specify entry, OS writes */
 };
 
+struct msi_region;
 
 #ifndef CONFIG_PCI_MSI
 static inline int pci_enable_msi_block(struct pci_dev *dev, unsigned int nvec)
@@ -1184,6 +1185,16 @@ static inline int pci_msi_enabled(void)
 {
 	return 0;
 }
+
+static inline int msi_get_region_count(void)
+{
+	return 0;
+}
+
+static inline int msi_get_region(int region_num, struct msi_region *region)
+{
+	return 0;
+}
 #else
 int pci_enable_msi_block(struct pci_dev *dev, unsigned int nvec);
 int pci_enable_msi_block_auto(struct pci_dev *dev, unsigned int *maxvec);
@@ -1196,6 +1207,8 @@ void pci_disable_msix(struct pci_dev *dev);
 void msi_remove_pci_irq_vectors(struct pci_dev *dev);
 void pci_restore_msi_state(struct pci_dev *dev);
 int pci_msi_enabled(void);
+int msi_get_region_count(void);
+int msi_get_region(int region_num, struct msi_region *region);
 #endif
 
 #ifdef CONFIG_PCIEPORTBUS
-- 
1.7.0.4
[PATCH 1/9 v2] pci:msi: add weak function for returning msi region info
In aperture-type IOMMUs (like FSL PAMU), the VFIO IOMMU code needs to know the MSI region in order to map its window in hardware. This patch only defines the required weak functions; follow-up patches will use them.

Signed-off-by: Bharat Bhushan <bharat.bhus...@freescale.com>
---
v1->v2
 - Added description on struct msi_region

 drivers/pci/msi.c   | 22 ++
 include/linux/msi.h | 14 ++
 2 files changed, 36 insertions(+), 0 deletions(-)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index d5f90d6..2643a29 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -67,6 +67,28 @@ int __weak arch_msi_check_device(struct pci_dev *dev, int nvec, int type)
 	return chip->check_device(chip, dev, nvec, type);
 }
 
+int __weak arch_msi_get_region_count(void)
+{
+	return 0;
+}
+
+int __weak arch_msi_get_region(int region_num, struct msi_region *region)
+{
+	return 0;
+}
+
+int msi_get_region_count(void)
+{
+	return arch_msi_get_region_count();
+}
+EXPORT_SYMBOL(msi_get_region_count);
+
+int msi_get_region(int region_num, struct msi_region *region)
+{
+	return arch_msi_get_region(region_num, region);
+}
+EXPORT_SYMBOL(msi_get_region);
+
 int __weak arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 {
 	struct msi_desc *entry;
diff --git a/include/linux/msi.h b/include/linux/msi.h
index b17ead8..ade1480 100644
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -51,6 +51,18 @@ struct msi_desc {
 };
 
 /*
+ * This structure is used to get
+ * - physical address
+ * - size
+ * of a msi region
+ */
+struct msi_region {
+	int region_num; /* MSI region number */
+	dma_addr_t addr; /* Address of MSI region */
+	size_t size; /* Size of MSI region */
+};
+
+/*
  * The arch hooks to setup up msi irqs. Those functions are
  * implemented as weak symbols so that they /can/ be overriden by
  * architecture specific code if needed.
@@ -64,6 +76,8 @@ void arch_restore_msi_irqs(struct pci_dev *dev, int irq);
 
 void default_teardown_msi_irqs(struct pci_dev *dev);
 void default_restore_msi_irqs(struct pci_dev *dev, int irq);
+int arch_msi_get_region_count(void);
+int arch_msi_get_region(int region_num, struct msi_region *region);
 
 struct msi_chip {
 	struct module *owner;
-- 
1.7.0.4
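For readers unfamiliar with the weak-symbol pattern this patch relies on, here is a minimal user-space sketch of it. It assumes GCC or Clang, where the kernel's __weak corresponds to __attribute__((weak)); the dma_addr_t typedef is a stand-in for the kernel type, and EXPORT_SYMBOL has no user-space equivalent, so it is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: user-space stand-in for the kernel's dma_addr_t. */
typedef uint64_t dma_addr_t;

struct msi_region {
	int region_num;		/* MSI region number */
	dma_addr_t addr;	/* Address of MSI region */
	size_t size;		/* Size of MSI region */
};

/*
 * Weak defaults: the linker uses these only when no other object file
 * provides a strong definition with the same name.
 */
__attribute__((weak)) int arch_msi_get_region_count(void)
{
	return 0;
}

__attribute__((weak)) int arch_msi_get_region(int region_num,
					      struct msi_region *region)
{
	(void)region_num;
	(void)region;
	return 0;
}

/* Generic entry points simply forward to whichever arch hook was linked. */
int msi_get_region_count(void)
{
	return arch_msi_get_region_count();
}

int msi_get_region(int region_num, struct msi_region *region)
{
	assert(region != NULL);
	return arch_msi_get_region(region_num, region);
}
```

One consequence visible in the sketch: the default arch_msi_get_region() returns 0 without touching *region, so a caller that wants to know whether any region exists has to check msi_get_region_count() first.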
[PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)
From: Bharat Bhushan <bharat.bhus...@freescale.com>

PAMU (the FSL IOMMU) has a concept of a primary window and subwindows. The primary window corresponds to the complete guest iova address space (including the MSI space); in IOMMU API terms this is the geometry. The iova base of each subwindow is determined by the number of subwindows (configurable through the iommu API). The MSI I/O page must lie within the geometry and within the maximum supported number of subwindows, so the MSI I/O page is set up just after the guest memory iova space.

Patches 1/9-4/9 (inclusive) define the interface to get:
 - Number of MSI regions (which is the number of MSI banks for powerpc)
 - MSI-region address range: the physical page which holds the address/addresses used for generating MSI interrupts, and the size of that page.

Patches 5/9-7/9 (inclusive) define the interface for setting up the MSI iova base for an MSI region (bank) for a device, so that this configured iova is used when the MSI message is composed. Earlier we used the iommu interface to get the configured iova, which was not correct, and Alex Williamson suggested this type of interface.

Patch 8/9 moves some common functions into a separate file so that they can be used by the FSL_PAMU implementation (the next patch uses them). They will also be used later for the iommu-none implementation. I believe we can do more of this but will take it step by step.

Finally, the last patch actually adds the support for FSL-PAMU :)

v1->v2
 - Added an interface for setting the msi iova for a msi region for a device. Earlier I added an iommu interface for the same, but as per review comments that is removed and a direct interface between vfio and msi is created instead.
 - Incorporated review comments (details are in the individual patches)

Bharat Bhushan (9):
  pci:msi: add weak function for returning msi region info
  pci: msi: expose msi region information functions
  powerpc: pci: Add arch specific msi region interface
  powerpc: msi: Extend the msi region interface to get info from fsl_msi
  pci/msi: interface to set an iova for a msi region
  powerpc: pci: Extend msi iova page setup to arch specific
  pci: msi: Extend msi iova setting interface to powerpc arch
  vfio: moving some functions in common file
  vfio pci: Add vfio iommu implementation for FSL_PAMU

 arch/powerpc/include/asm/machdep.h | 10 +
 arch/powerpc/kernel/msi.c          | 28 +
 arch/powerpc/sysdev/fsl_msi.c      | 132 +-
 arch/powerpc/sysdev/fsl_msi.h      | 25 +-
 drivers/pci/msi.c                  | 35 ++
 drivers/vfio/Kconfig               | 6 +
 drivers/vfio/Makefile              | 5 +-
 drivers/vfio/vfio_iommu_common.c   | 227
 drivers/vfio/vfio_iommu_common.h   | 27 +
 drivers/vfio/vfio_iommu_fsl_pamu.c | 1003
 drivers/vfio/vfio_iommu_type1.c    | 206 +
 include/linux/msi.h                | 14 +
 include/linux/pci.h                | 21 +
 include/uapi/linux/vfio.h          | 100
 14 files changed, 1623 insertions(+), 216 deletions(-)
 create mode 100644 drivers/vfio/vfio_iommu_common.c
 create mode 100644 drivers/vfio/vfio_iommu_common.h
 create mode 100644 drivers/vfio/vfio_iommu_fsl_pamu.c
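The window arithmetic described in this cover letter can be sketched as follows. This is a user-space illustration only. It assumes that the primary window (geometry) is split into equally sized subwindows and that both the window size and the subwindow count are powers of two; the cover letter says only that the subwindow base follows from the subwindow count. struct geometry_sketch and all function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: user-space stand-in for the kernel's dma_addr_t. */
typedef uint64_t dma_addr_t;

/* Hypothetical description of the primary window and its subdivision. */
struct geometry_sketch {
	dma_addr_t aperture_start;	/* iova where the primary window begins */
	uint64_t aperture_size;		/* assumed to be a power of two */
	uint32_t subwindow_count;	/* assumed to be a power of two */
};

static int is_pow2(uint64_t v)
{
	return v != 0 && (v & (v - 1)) == 0;
}

/* Size of one subwindow when the window is divided evenly. */
uint64_t subwindow_size(const struct geometry_sketch *g)
{
	assert(is_pow2(g->aperture_size) && is_pow2(g->subwindow_count));
	return g->aperture_size / g->subwindow_count;
}

/* iova base of subwindow 'index'. */
dma_addr_t subwindow_base(const struct geometry_sketch *g, uint32_t index)
{
	assert(index < g->subwindow_count);
	return g->aperture_start + (dma_addr_t)index * subwindow_size(g);
}

/*
 * Index of the first subwindow that starts at or after the end of guest
 * memory, i.e. a candidate slot for the MSI page "just after guest memory".
 * The caller must still check that the result is below subwindow_count.
 */
uint32_t first_subwindow_after(const struct geometry_sketch *g,
			       uint64_t guest_mem_size)
{
	uint64_t size = subwindow_size(g);

	return (uint32_t)((guest_mem_size + size - 1) / size);
}
```

For a 1 GiB window split into 4 subwindows, each subwindow covers 256 MiB; a guest with 512 MiB of memory occupies subwindows 0 and 1, so the MSI page would go into subwindow 2 under these assumptions.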
[PATCH 5/9 v2] pci/msi: interface to set an iova for a msi region
This patch defines an interface by which an MSI page can be mapped to a specific iova page. This is a requirement for aperture-type IOMMUs (like Freescale PAMU), where we map the MSI iova page just after the guest memory iova address range.

Signed-off-by: Bharat Bhushan <bharat.bhus...@freescale.com>
---
v2
 - new patch

 drivers/pci/msi.c   | 13 +
 include/linux/pci.h | 8
 2 files changed, 21 insertions(+), 0 deletions(-)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index 2643a29..040609f 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -77,6 +77,19 @@ int __weak arch_msi_get_region(int region_num, struct msi_region *region)
 	return 0;
 }
 
+int __weak arch_msi_set_iova(struct pci_dev *pdev, int region_num,
+			     dma_addr_t iova, bool set)
+{
+	return 0;
+}
+
+int msi_set_iova(struct pci_dev *pdev, int region_num,
+		 dma_addr_t iova, bool set)
+{
+	return arch_msi_set_iova(pdev, region_num, iova, set);
+}
+EXPORT_SYMBOL(msi_set_iova);
+
 int msi_get_region_count(void)
 {
 	return arch_msi_get_region_count();
diff --git a/include/linux/pci.h b/include/linux/pci.h
index c587034..c6d3e58 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1195,6 +1195,12 @@ static inline int msi_get_region(int region_num, struct msi_region *region)
 {
 	return 0;
 }
+
+static inline int msi_set_iova(struct pci_dev *pdev, int region_num,
+			       dma_addr_t iova, bool set)
+{
+	return 0;
+}
 #else
 int pci_enable_msi_block(struct pci_dev *dev, unsigned int nvec);
 int pci_enable_msi_block_auto(struct pci_dev *dev, unsigned int *maxvec);
@@ -1209,6 +1215,8 @@ void pci_restore_msi_state(struct pci_dev *dev);
 int pci_msi_enabled(void);
 int msi_get_region_count(void);
 int msi_get_region(int region_num, struct msi_region *region);
+int msi_set_iova(struct pci_dev *pdev, int region_num,
+		 dma_addr_t iova, bool set);
 #endif
 
 #ifdef CONFIG_PCIEPORTBUS
-- 
1.7.0.4
[PATCH 4/9 v2] powerpc: msi: Extend the msi region interface to get info from fsl_msi
The FSL MSI driver provides the interface to get:
 - Number of MSI regions (which is the number of MSI banks for powerpc)
 - The region address range: the physical page which holds the address/addresses used for generating MSI interrupts, and the size of that page.

These are required to create the IOMMU (Freescale PAMU) mapping for devices which are directly assigned using VFIO.

Signed-off-by: Bharat Bhushan <bharat.bhus...@freescale.com>
---
v1->v2
 - Atomic increment of bank index for parallel probe of msi node

 arch/powerpc/sysdev/fsl_msi.c | 42 +++-
 arch/powerpc/sysdev/fsl_msi.h | 11 -
 2 files changed, 45 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/sysdev/fsl_msi.c b/arch/powerpc/sysdev/fsl_msi.c
index 77efbae..eeebbf0 100644
--- a/arch/powerpc/sysdev/fsl_msi.c
+++ b/arch/powerpc/sysdev/fsl_msi.c
@@ -109,6 +109,34 @@ static int fsl_msi_init_allocator(struct fsl_msi *msi_data)
 	return 0;
 }
 
+static int fsl_msi_get_region_count(void)
+{
+	int count = 0;
+	struct fsl_msi *msi_data;
+
+	list_for_each_entry(msi_data, &msi_head, list)
+		count++;
+
+	return count;
+}
+
+static int fsl_msi_get_region(int region_num, struct msi_region *region)
+{
+	struct fsl_msi *msi_data;
+
+	list_for_each_entry(msi_data, &msi_head, list) {
+		if (msi_data->bank_index == region_num) {
+			region->region_num = msi_data->bank_index;
+			/* Setting PAGE_SIZE as MSIIR is a 4 byte register */
+			region->size = PAGE_SIZE;
+			region->addr = msi_data->msiir & ~(region->size - 1);
+			return 0;
+		}
+	}
+
+	return -ENODEV;
+}
+
 static int fsl_msi_check_device(struct pci_dev *pdev, int nvec, int type)
 {
 	if (type == PCI_CAP_ID_MSIX)
@@ -150,7 +178,8 @@ static void fsl_compose_msi_msg(struct pci_dev *pdev, int hwirq,
 	if (reg && (len == sizeof(u64)))
 		address = be64_to_cpup(reg);
 	else
-		address = fsl_pci_immrbar_base(hose) + msi_data->msiir_offset;
+		address = fsl_pci_immrbar_base(hose) +
+			   (msi_data->msiir & 0xf);
 
 	msg->address_lo = lower_32_bits(address);
 	msg->address_hi = upper_32_bits(address);
@@ -393,6 +422,7 @@ static int fsl_of_msi_probe(struct platform_device *dev)
 	const struct fsl_msi_feature *features;
 	int len;
 	u32 offset;
+	static atomic_t bank_index = ATOMIC_INIT(-1);
 
 	match = of_match_device(fsl_of_msi_ids, &dev->dev);
 	if (!match)
@@ -436,18 +466,15 @@ static int fsl_of_msi_probe(struct platform_device *dev)
 				dev->dev.of_node->full_name);
 			goto error_out;
 		}
-		msi->msiir_offset =
-			features->msiir_offset + (res.start & 0xf);
 
 		/*
 		 * First read the MSIIR/MSIIR1 offset from dts
 		 * On failure use the hardcode MSIIR offset
 		 */
 		if (of_address_to_resource(dev->dev.of_node, 1, &msiir))
-			msi->msiir_offset = features->msiir_offset +
-					    (res.start & MSIIR_OFFSET_MASK);
+			msi->msiir = res.start + features->msiir_offset;
 		else
-			msi->msiir_offset = msiir.start & MSIIR_OFFSET_MASK;
+			msi->msiir = msiir.start;
 	}
 
 	msi->feature = features->fsl_pic_ip;
@@ -521,6 +548,7 @@ static int fsl_of_msi_probe(struct platform_device *dev)
 		}
 	}
 
+	msi->bank_index = atomic_inc_return(&bank_index);
 	list_add_tail(&msi->list, &msi_head);
 
 	/* The multiple setting ppc_md.setup_msi_irqs will not harm things */
@@ -528,6 +556,8 @@ static int fsl_of_msi_probe(struct platform_device *dev)
 		ppc_md.setup_msi_irqs = fsl_setup_msi_irqs;
 		ppc_md.teardown_msi_irqs = fsl_teardown_msi_irqs;
 		ppc_md.msi_check_device = fsl_msi_check_device;
+		ppc_md.msi_get_region_count = fsl_msi_get_region_count;
+		ppc_md.msi_get_region = fsl_msi_get_region;
 	} else if (ppc_md.setup_msi_irqs != fsl_setup_msi_irqs) {
 		dev_err(&dev->dev, "Different MSI driver already installed!\n");
 		err = -ENODEV;
diff --git a/arch/powerpc/sysdev/fsl_msi.h b/arch/powerpc/sysdev/fsl_msi.h
index df9aa9f..a2cc5a2 100644
--- a/arch/powerpc/sysdev/fsl_msi.h
+++ b/arch/powerpc/sysdev/fsl_msi.h
@@ -31,14 +31,21 @@ struct fsl_msi {
 	struct irq_domain *irqhost;
 
 	unsigned long cascade_irq;
-
-	u32 msiir_offset; /* Offset of MSIIR, relative to start of CCSR */
+	phys_addr_t msiir; /* MSIIR Address in CCSR */
 	u32 ibs_shift; /* Shift of interrupt bit select */
 	u32 srs_shift; /* Shift of the shared interrupt
[PATCH 3/9 v2] powerpc: pci: Add arch specific msi region interface
This patch adds the interface to get the MSI region information from arch-specific code. The machine-specific code is not yet defined.

Signed-off-by: Bharat Bhushan <bharat.bhus...@freescale.com>
---
v1->v2
 - None

 arch/powerpc/include/asm/machdep.h | 8
 arch/powerpc/kernel/msi.c          | 18 ++
 2 files changed, 26 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h
index 8b48090..8d1b787 100644
--- a/arch/powerpc/include/asm/machdep.h
+++ b/arch/powerpc/include/asm/machdep.h
@@ -30,6 +30,7 @@ struct file;
 struct pci_controller;
 struct kimage;
 struct pci_host_bridge;
+struct msi_region;
 
 struct machdep_calls {
 	char		*name;
@@ -124,6 +125,13 @@ struct machdep_calls {
 	int		(*setup_msi_irqs)(struct pci_dev *dev,
 					  int nvec, int type);
 	void		(*teardown_msi_irqs)(struct pci_dev *dev);
+
+	/* Returns the number of MSI regions (banks) */
+	int		(*msi_get_region_count)(void);
+
+	/* Returns the requested region's address and size */
+	int		(*msi_get_region)(int region_num,
+					  struct msi_region *region);
 #endif
 
 	void		(*restart)(char *cmd);
diff --git a/arch/powerpc/kernel/msi.c b/arch/powerpc/kernel/msi.c
index 8bbc12d..1a67787 100644
--- a/arch/powerpc/kernel/msi.c
+++ b/arch/powerpc/kernel/msi.c
@@ -13,6 +13,24 @@
 
 #include <asm/machdep.h>
 
+int arch_msi_get_region_count(void)
+{
+	if (ppc_md.msi_get_region_count) {
+		pr_debug("msi: Using platform get_region_count routine.\n");
+		return ppc_md.msi_get_region_count();
+	}
+	return 0;
+}
+
+int arch_msi_get_region(int region_num, struct msi_region *region)
+{
+	if (ppc_md.msi_get_region) {
+		pr_debug("msi: Using platform get_region routine.\n");
+		return ppc_md.msi_get_region(region_num, region);
+	}
+	return 0;
+}
+
 int arch_msi_check_device(struct pci_dev* dev, int nvec, int type)
 {
 	if (!ppc_md.setup_msi_irqs || !ppc_md.teardown_msi_irqs) {
-- 
1.7.0.4
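The dispatch added in this patch is the usual optional-hook pattern: call the platform's function pointer if one is installed, otherwise return a default. A minimal user-space sketch follows. The sketch_ and demo_ names, the two-bank platform and its addresses are made up for illustration; only the NULL-check-then-call shape is taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: user-space stand-in for the kernel's dma_addr_t. */
typedef uint64_t dma_addr_t;

struct msi_region {
	int region_num;
	dma_addr_t addr;
	size_t size;
};

/* Cut-down stand-in for struct machdep_calls: only the two new hooks. */
struct machdep_calls_sketch {
	int (*msi_get_region_count)(void);
	int (*msi_get_region)(int region_num, struct msi_region *region);
};

/* Zero-initialised, so both hooks start out as NULL (no platform code). */
static struct machdep_calls_sketch ppc_md_sketch;

int sketch_arch_msi_get_region_count(void)
{
	if (ppc_md_sketch.msi_get_region_count)
		return ppc_md_sketch.msi_get_region_count();
	return 0;
}

int sketch_arch_msi_get_region(int region_num, struct msi_region *region)
{
	assert(region != NULL);

	if (ppc_md_sketch.msi_get_region)
		return ppc_md_sketch.msi_get_region(region_num, region);
	return 0;
}

/* Hypothetical platform with two MSI banks at made-up addresses. */
static int demo_region_count(void)
{
	return 2;
}

static int demo_get_region(int region_num, struct msi_region *region)
{
	if (region_num < 0 || region_num >= 2)
		return -1;
	region->region_num = region_num;
	region->addr = 0x10000000ULL + (dma_addr_t)region_num * 0x1000;
	region->size = 0x1000;
	return 0;
}

/* What a platform probe routine would do: fill in the hooks. */
void sketch_install_demo_platform(void)
{
	ppc_md_sketch.msi_get_region_count = demo_region_count;
	ppc_md_sketch.msi_get_region = demo_get_region;
}
```

Before sketch_install_demo_platform() runs, both wrappers report zero regions; afterwards they return whatever the platform supplies.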
[PATCH 8/9 v2] vfio: moving some functions in common file
Some functions defined in vfio_iommu_type1.c are generic (not specific to the type1 iommu), and we want to use them for the FSL IOMMU (PAMU) and, going forward, in the iommu-none driver. So I have created a new file named vfio_iommu_common.c and moved some of the generic functions into it.

I agree (with Alex Williamson and myself :-)) that some more functions can be moved to this new common file (with some changes in type1/fsl_pamu and others). But in this patch I avoided those changes and just moved the functions which are straightforward and allow me to get the fsl-powerpc vfio framework in place.

Signed-off-by: Bharat Bhushan <bharat.bhus...@freescale.com>
---
v1->v2
 - removed un-necessary header file inclusion
 - mark static function which are internal to *common.c

 drivers/vfio/Makefile            | 4 +-
 drivers/vfio/vfio_iommu_common.c | 227 ++
 drivers/vfio/vfio_iommu_common.h | 27 +
 drivers/vfio/vfio_iommu_type1.c  | 206 +--
 4 files changed, 257 insertions(+), 207 deletions(-)
 create mode 100644 drivers/vfio/vfio_iommu_common.c
 create mode 100644 drivers/vfio/vfio_iommu_common.h

diff --git a/drivers/vfio/Makefile b/drivers/vfio/Makefile
index 72bfabc..c5792ec 100644
--- a/drivers/vfio/Makefile
+++ b/drivers/vfio/Makefile
@@ -1,4 +1,4 @@
 obj-$(CONFIG_VFIO) += vfio.o
-obj-$(CONFIG_VFIO_IOMMU_TYPE1) += vfio_iommu_type1.o
-obj-$(CONFIG_VFIO_IOMMU_SPAPR_TCE) += vfio_iommu_spapr_tce.o
+obj-$(CONFIG_VFIO_IOMMU_TYPE1) += vfio_iommu_common.o vfio_iommu_type1.o
+obj-$(CONFIG_VFIO_IOMMU_SPAPR_TCE) += vfio_iommu_common.o vfio_iommu_spapr_tce.o
 obj-$(CONFIG_VFIO_PCI) += pci/
diff --git a/drivers/vfio/vfio_iommu_common.c b/drivers/vfio/vfio_iommu_common.c
new file mode 100644
index 000..08eea71
--- /dev/null
+++ b/drivers/vfio/vfio_iommu_common.c
@@ -0,0 +1,227 @@
+/*
+ * VFIO: Common code for vfio IOMMU support
+ *
+ * Copyright (C) 2012 Red Hat, Inc. All rights reserved.
+ * Author: Alex Williamson <alex.william...@redhat.com>
+ * Author: Bharat Bhushan <bharat.bhus...@freescale.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * Derived from original vfio:
+ * Copyright 2010 Cisco Systems, Inc. All rights reserved.
+ * Author: Tom Lyon, p...@cisco.com
+ */
+
+#include <linux/compat.h>
+#include <linux/iommu.h>
+#include <linux/module.h>
+#include <linux/mm.h>
+#include <linux/slab.h>
+#include <linux/workqueue.h>
+
+static bool disable_hugepages;
+module_param_named(disable_hugepages,
+		   disable_hugepages, bool, S_IRUGO | S_IWUSR);
+MODULE_PARM_DESC(disable_hugepages,
+		 "Disable VFIO IOMMU support for IOMMU hugepages.");
+
+struct vwork {
+	struct mm_struct	*mm;
+	long			npage;
+	struct work_struct	work;
+};
+
+/* delayed decrement/increment for locked_vm */
+static void vfio_lock_acct_bg(struct work_struct *work)
+{
+	struct vwork *vwork = container_of(work, struct vwork, work);
+	struct mm_struct *mm;
+
+	mm = vwork->mm;
+	down_write(&mm->mmap_sem);
+	mm->locked_vm += vwork->npage;
+	up_write(&mm->mmap_sem);
+	mmput(mm);
+	kfree(vwork);
+}
+
+void vfio_lock_acct(long npage)
+{
+	struct vwork *vwork;
+	struct mm_struct *mm;
+
+	if (!current->mm || !npage)
+		return; /* process exited or nothing to do */
+
+	if (down_write_trylock(&current->mm->mmap_sem)) {
+		current->mm->locked_vm += npage;
+		up_write(&current->mm->mmap_sem);
+		return;
+	}
+
+	/*
+	 * Couldn't get mmap_sem lock, so must setup to update
+	 * mm->locked_vm later. If locked_vm were atomic, we
+	 * wouldn't need this silliness
+	 */
+	vwork = kmalloc(sizeof(struct vwork), GFP_KERNEL);
+	if (!vwork)
+		return;
+	mm = get_task_mm(current);
+	if (!mm) {
+		kfree(vwork);
+		return;
+	}
+	INIT_WORK(&vwork->work, vfio_lock_acct_bg);
+	vwork->mm = mm;
+	vwork->npage = npage;
+	schedule_work(&vwork->work);
+}
+
+/*
+ * Some mappings aren't backed by a struct page, for example an mmap'd
+ * MMIO range for our own or another device. These use a different
+ * pfn conversion and shouldn't be tracked as locked pages.
+ */
+static bool is_invalid_reserved_pfn(unsigned long pfn)
+{
+	if (pfn_valid(pfn)) {
+		bool reserved;
+		struct page *tail = pfn_to_page(pfn);
+		struct page *head = compound_trans_head(tail);
+		reserved = !!(PageReserved(head));
+		if (head != tail) {
+			/*
+			 * head is not a dangling pointer
+			 * (compound_trans_head takes care of that)
+			 * but the