RE: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)
-Original Message- From: Bhushan Bharat-R65777 Sent: Wednesday, November 27, 2013 9:39 PM To: 'Alex Williamson' Cc: Wood Scott-B07421; linux-...@vger.kernel.org; ag...@suse.de; Yoder Stuart- B08248; iommu@lists.linux-foundation.org; bhelg...@google.com; linuxppc- d...@lists.ozlabs.org; linux-ker...@vger.kernel.org Subject: RE: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU) -Original Message- From: Alex Williamson [mailto:alex.william...@redhat.com] Sent: Monday, November 25, 2013 10:08 PM To: Bhushan Bharat-R65777 Cc: Wood Scott-B07421; linux-...@vger.kernel.org; ag...@suse.de; Yoder Stuart- B08248; iommu@lists.linux-foundation.org; bhelg...@google.com; linuxppc- d...@lists.ozlabs.org; linux-ker...@vger.kernel.org Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU) On Mon, 2013-11-25 at 05:33 +, Bharat Bhushan wrote: -Original Message- From: Alex Williamson [mailto:alex.william...@redhat.com] Sent: Friday, November 22, 2013 2:31 AM To: Wood Scott-B07421 Cc: Bhushan Bharat-R65777; linux-...@vger.kernel.org; ag...@suse.de; Yoder Stuart-B08248; iommu@lists.linux-foundation.org; bhelg...@google.com; linuxppc- d...@lists.ozlabs.org; linux-ker...@vger.kernel.org Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU) On Thu, 2013-11-21 at 14:47 -0600, Scott Wood wrote: On Thu, 2013-11-21 at 13:43 -0700, Alex Williamson wrote: On Thu, 2013-11-21 at 11:20 +, Bharat Bhushan wrote: -Original Message- From: Alex Williamson [mailto:alex.william...@redhat.com] Sent: Thursday, November 21, 2013 12:17 AM To: Bhushan Bharat-R65777 Cc: j...@8bytes.org; bhelg...@google.com; ag...@suse.de; Wood Scott-B07421; Yoder Stuart-B08248; iommu@lists.linux-foundation.org; linux- p...@vger.kernel.org; linuxppc-...@lists.ozlabs.org; linux- ker...@vger.kernel.org; Bhushan Bharat-R65777 Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU) Is VFIO_IOMMU_PAMU_GET_MSI_BANK_COUNT per aperture (ie. each vfio user has $COUNT regions at their disposal exclusively)? Number of msi-bank count is system wide and not per aperture, But will be setting windows for banks in the device aperture. So say if we are direct assigning 2 pci device (both have different iommu group, so 2 aperture in iommu) to VM. Now qemu can make only one call to know how many msi-banks are there but it must set sub-windows for all banks for both pci device in its respective aperture. I'm still confused. What I want to make sure of is that the banks are independent per aperture. For instance, if we have two separate userspace processes operating independently and they both chose to use msi bank zero for their device, that's bank zero within each aperture and doesn't interfere. Or another way to ask is can a malicious user interfere with other users by using the wrong bank. Thanks, They can interfere. Want to be sure of how they can interfere? What happens if more than one user selects the same MSI bank? Minimally, wouldn't that result in the IOMMU blocking transactions from the previous user once the new user activates their mapping? Yes and no; With current implementation yes but with a minor change no. Later in this response I will explain how. With this hardware, the only way to prevent that is to make sure that a bank is not shared by multiple protection contexts. For some of our users, though, I believe preventing this is less important than the performance benefit. So should we let this patch series in without protection? No. I think we need some sort of ownership model around the msi banks then. Otherwise there's nothing preventing another userspace from attempting an MSI based attack on other users, or perhaps even on the host. VFIO can't allow that. Thanks, We have very few (3 MSI bank on most of chips), so we can not assign one to each userspace. What we can do is host and userspace does not share a MSI bank while userspace will share a MSI bank. Then you probably need VFIO to own the MSI bank and program devices into it rather than exposing the MSI banks to userspace to let them have direct access. Overall idea of exposing the details of msi regions to userspace are 1) User space can define the aperture size to fit MSI mapping in IOMMU. 2) setup iova for a MSI banks; which is just after guest memory. But currently we expose the size and address of MSI banks, passing address is of no use and can be problematic. I am sorry, above information is not correct. Currently neither we
RE: [PATCH 1/9 v2] pci:msi: add weak function for returning msi region info
-Original Message- From: linux-pci-ow...@vger.kernel.org [mailto:linux-pci-ow...@vger.kernel.org] On Behalf Of Bjorn Helgaas Sent: Tuesday, November 26, 2013 5:06 AM To: Bhushan Bharat-R65777 Cc: alex.william...@redhat.com; j...@8bytes.org; ag...@suse.de; Wood Scott- B07421; Yoder Stuart-B08248; iommu@lists.linux-foundation.org; linux- p...@vger.kernel.org; linuxppc-...@lists.ozlabs.org; linux- ker...@vger.kernel.org; Bhushan Bharat-R65777 Subject: Re: [PATCH 1/9 v2] pci:msi: add weak function for returning msi region info On Tue, Nov 19, 2013 at 10:47:05AM +0530, Bharat Bhushan wrote: In Aperture type of IOMMU (like FSL PAMU), VFIO-iommu system need to know the MSI region to map its window in h/w. This patch just defines the required weak functions only and will be used by followup patches. Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- v1-v2 - Added description on struct msi_region drivers/pci/msi.c | 22 ++ include/linux/msi.h | 14 ++ 2 files changed, 36 insertions(+), 0 deletions(-) diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c index d5f90d6..2643a29 100644 --- a/drivers/pci/msi.c +++ b/drivers/pci/msi.c @@ -67,6 +67,28 @@ int __weak arch_msi_check_device(struct pci_dev *dev, int nvec, int type) return chip-check_device(chip, dev, nvec, type); } +int __weak arch_msi_get_region_count(void) { + return 0; +} + +int __weak arch_msi_get_region(int region_num, struct msi_region +*region) { + return 0; +} + +int msi_get_region_count(void) +{ + return arch_msi_get_region_count(); +} +EXPORT_SYMBOL(msi_get_region_count); + +int msi_get_region(int region_num, struct msi_region *region) { + return arch_msi_get_region(region_num, region); } +EXPORT_SYMBOL(msi_get_region); + int __weak arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type) { struct msi_desc *entry; diff --git a/include/linux/msi.h b/include/linux/msi.h index b17ead8..ade1480 100644 --- a/include/linux/msi.h +++ b/include/linux/msi.h @@ -51,6 +51,18 @@ struct msi_desc { }; /* + * This structure is used to get + * - physical address + * - size + * of a msi region + */ +struct msi_region { + int region_num; /* MSI region number */ + dma_addr_t addr; /* Address of MSI region */ + size_t size; /* Size of MSI region */ }; + +/* * The arch hooks to setup up msi irqs. Those functions are * implemented as weak symbols so that they /can/ be overriden by * architecture specific code if needed. @@ -64,6 +76,8 @@ void arch_restore_msi_irqs(struct pci_dev *dev, int irq); void default_teardown_msi_irqs(struct pci_dev *dev); void default_restore_msi_irqs(struct pci_dev *dev, int irq); +int arch_msi_get_region_count(void); +int arch_msi_get_region(int region_num, struct msi_region *region); It doesn't look like any of this (struct msi_region, msi_get_region(), msi_get_region_count()) is actually used by drivers/pci/msi.c, so I don't think it needs to be declared in generic code. It looks like it's only used in drivers/vfio/vfio_iommu_fsl_pamu.c, where you already know you have an FSL IOMMU, and you can just call FSL-specific interfaces directly. Thanks Bjorn, Want to be sure of what you are suggesting. What I understood is that we define these (struct msi_region, msi_get_region(), msi_get_region_count()) in arch/powerpc/include/fsl_msi.h (a new file). Include this header file directly in driver/vfio/vfio_iommu_fsl_pamu.c Same also applies for msi_set_iova() in patch-5 ? -Bharat Bjorn struct msi_chip { struct module *owner; -- 1.7.0.4 -- To unsubscribe from this list: send the line unsubscribe linux-pci in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[RFC][PATCHv6++ 01/13] of: introduce of_property_for_earch_phandle_with_args()
Stephen Warren swar...@wwwdotorg.org wrote @ Thu, 21 Nov 2013 19:57:00 +0100: On 11/21/2013 10:17 AM, Hiroshi Doyu wrote: Iterating over a property containing a list of phandles with arguments is a common operation for device drivers. This patch adds a new of_property_for_each_phandle_with_args() macro to make the iteration simpler. Signed-off-by: Hiroshi Doyu hd...@nvidia.com --- v6+: Use the description, which Grant Likely proposed, to be full enough that a future reader can figure out why a patch was written. http://lists.linuxfoundation.org/pipermail/iommu/2013-November/007062.html This new version only addresses one of the concerns that Grant had, namely the commit message. diff --git a/include/linux/of.h b/include/linux/of.h +#define of_property_for_each_phandle_with_args(np, list, cells, i, args) \ + for (i = 0; !of_parse_phandle_with_args(np, list, cells, i, args); i++) + Grant also wanted the actual implementation fixed so that it wasn't so inefficient. What this current patch does is basically: for every entry in the property: for every entry in the property before the current index: parse the phandle+specifier That's roughly O(n^2). (n is # entries in the property) Instead, what should happen is: for every entry in the property: parse the phandle+specifier yield the result That's roughly O(n). In other words, an implementation more along the lines of include/linux/of.h's: #define of_property_for_each_u32(np, propname, prop, p, u) \ for (prop = of_find_property(np, propname, NULL), \ p = of_prop_next_u32(prop, NULL, u); \ p; \ p = of_prop_next_u32(prop, p, u)) ... so you'd need functions like of_prop_first_specifier() and of_prop_next_specifier(), and perhaps some associated set of state variables, perhaps with all the state wrapped into a single struct for simplicity. Although I couldn't invent any struct to hold params and state here, I'd like you to review the following interface is ok or not. At first, I thought to refactor __of_parse_phandle_with_args() but it's a bit highly optimized by Stephen and it looked a bit hard to refactor without perf regressions. Instread, I introduced 2 new functions of_parse_{first,next}_phandle_with_args() to parse phandles. If this interface is ok, I'll include this into the next v7 series. -8-8-8-8-8-8-8-8-8- From: Hiroshi Doyu hd...@nvidia.com Iterating over a property containing a list of phandles with arguments is a common operation for device drivers. This patch adds a new of_property_for_each_phandle_with_args() macro to make the iteration simpler. Introduced of_parse_{first,next}_phandle_with_args(), where const __be32 **list is used to hold the next list to be processed as a state pramameter, and both of_parse_{first,next}_phandle_with_args() returns the remaining list in the number of cell(4 byte). If any error happens, list is set NULL not to proceed the rest. Signed-off-by: Hiroshi Doyu hd...@nvidia.com --- v6++: Optimized to avoid O(n^2), suggested by Stephen Warren. http://lists.linuxfoundation.org/pipermail/iommu/2013-November/007066.html v6+: Use the description, which Grant Likely proposed, to be full enough that a future reader can figure out why a patch was written. v5: New patch for v5. Signed-off-by: Hiroshi Doyu hd...@nvidia.com --- drivers/of/base.c | 82 ++ include/linux/of.h | 52 ++ 2 files changed, 134 insertions(+) diff --git a/drivers/of/base.c b/drivers/of/base.c index f807d0e..3e29b10 100644 --- a/drivers/of/base.c +++ b/drivers/of/base.c @@ -1201,6 +1201,88 @@ void of_print_phandle_args(const char *msg, const struct of_phandle_args *args) printk(\n); } +int __of_parse_next_phandle_with_args(const __be32 **plist, + const char *cells_name, int cell_count, + struct of_phandle_args *out_args) +{ + phandle phandle; + int i, count = 0, err; + struct device_node *dn; + const __be32 *list; + + BUG_ON(!out_args); + BUG_ON(!cells_name !cell_count); + + /* +* *plist should hold phandle, and it's updated to the next +* phandle at return if no error. +*/ + list = *plist; + out_args-np = NULL; + + phandle = be32_to_cpup(list); + if (!phandle) + goto err_out; + + dn = of_find_node_by_phandle(phandle); + if (!dn) + goto err_out; + + if (cells_name) { + err = of_property_read_u32(dn, cells_name, count); + if (err) + goto err_out; + } else { + count = cell_count; + } + + out_args-np = dn;