RE: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)

2013-11-28 Thread Bharat Bhushan


 -Original Message-
 From: Bhushan Bharat-R65777
 Sent: Wednesday, November 27, 2013 9:39 PM
 To: 'Alex Williamson'
 Cc: Wood Scott-B07421; linux-...@vger.kernel.org; ag...@suse.de; Yoder Stuart-
 B08248; iommu@lists.linux-foundation.org; bhelg...@google.com; linuxppc-
 d...@lists.ozlabs.org; linux-ker...@vger.kernel.org
 Subject: RE: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)
 
 
 
  -Original Message-
  From: Alex Williamson [mailto:alex.william...@redhat.com]
  Sent: Monday, November 25, 2013 10:08 PM
  To: Bhushan Bharat-R65777
  Cc: Wood Scott-B07421; linux-...@vger.kernel.org; ag...@suse.de; Yoder
  Stuart- B08248; iommu@lists.linux-foundation.org; bhelg...@google.com;
  linuxppc- d...@lists.ozlabs.org; linux-ker...@vger.kernel.org
  Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU
  (PAMU)
 
  On Mon, 2013-11-25 at 05:33 +, Bharat Bhushan wrote:
  
-Original Message-
From: Alex Williamson [mailto:alex.william...@redhat.com]
Sent: Friday, November 22, 2013 2:31 AM
To: Wood Scott-B07421
Cc: Bhushan Bharat-R65777; linux-...@vger.kernel.org;
ag...@suse.de; Yoder Stuart-B08248;
iommu@lists.linux-foundation.org; bhelg...@google.com; linuxppc-
d...@lists.ozlabs.org; linux-ker...@vger.kernel.org
Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale
IOMMU (PAMU)
   
On Thu, 2013-11-21 at 14:47 -0600, Scott Wood wrote:
 On Thu, 2013-11-21 at 13:43 -0700, Alex Williamson wrote:
  On Thu, 2013-11-21 at 11:20 +, Bharat Bhushan wrote:
  
-Original Message-
From: Alex Williamson [mailto:alex.william...@redhat.com]
Sent: Thursday, November 21, 2013 12:17 AM
To: Bhushan Bharat-R65777
Cc: j...@8bytes.org; bhelg...@google.com; ag...@suse.de;
Wood Scott-B07421; Yoder Stuart-B08248;
iommu@lists.linux-foundation.org; linux-
p...@vger.kernel.org; linuxppc-...@lists.ozlabs.org; linux-
ker...@vger.kernel.org; Bhushan Bharat-R65777
Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for
Freescale IOMMU (PAMU)
   
Is VFIO_IOMMU_PAMU_GET_MSI_BANK_COUNT per aperture (ie.
each vfio user has $COUNT regions at their disposal 
exclusively)?
  
   Number of msi-bank count is system wide and not per
   aperture, But will be
setting windows for banks in the device aperture.
   So say if we are direct assigning 2 pci device (both have
   different iommu
group, so 2 aperture in iommu) to VM.
   Now qemu can make only one call to know how many msi-banks
   are there but
it must set sub-windows for all banks for both pci device in its
respective aperture.
 
  I'm still confused.  What I want to make sure of is that the
  banks are independent per aperture.  For instance, if we have
  two separate userspace processes operating independently and
  they both chose to use msi bank zero for their device, that's
  bank zero within each aperture and doesn't interfere.  Or
  another way to ask is can a malicious user interfere with
  other users by
  using the wrong bank.
  Thanks,

 They can interfere.
  
   Want to be sure of how they can interfere?
 
  What happens if more than one user selects the same MSI bank?
  Minimally, wouldn't that result in the IOMMU blocking transactions
  from the previous user once the new user activates their mapping?
 
 Yes and no; With current implementation yes but with a minor change no. Later 
 in
 this response I will explain how.
 
 
 With this hardware, the only way to prevent that
 is to make sure that a bank is not shared by multiple protection
 contexts.
 For some of our users, though, I believe preventing this is less
 important than the performance benefit.
  
   So should we let this patch series in without protection?
 
  No.
 
   
I think we need some sort of ownership model around the msi banks then.
Otherwise there's nothing preventing another userspace from
attempting an MSI based attack on other users, or perhaps even on
the host.  VFIO can't allow that.  Thanks,
  
   We have very few (3 MSI bank on most of chips), so we can not assign
   one to each userspace. What we can do is host and userspace does not
   share a MSI bank while userspace will share a MSI bank.
 
  Then you probably need VFIO to own the MSI bank and program devices
  into it rather than exposing the MSI banks to userspace to let them have
 direct access.
 
 Overall idea of exposing the details of msi regions to userspace are
  1) User space can define the aperture size to fit MSI mapping in IOMMU.
  2) setup iova for a MSI banks; which is just after guest memory.
 
 But currently we expose the size and address of MSI banks, passing address
 is of no use and can be problematic.

I am sorry, above information is not correct. Currently neither we 

RE: [PATCH 1/9 v2] pci:msi: add weak function for returning msi region info

2013-11-28 Thread Bharat Bhushan


 -Original Message-
 From: linux-pci-ow...@vger.kernel.org [mailto:linux-pci-ow...@vger.kernel.org]
 On Behalf Of Bjorn Helgaas
 Sent: Tuesday, November 26, 2013 5:06 AM
 To: Bhushan Bharat-R65777
 Cc: alex.william...@redhat.com; j...@8bytes.org; ag...@suse.de; Wood Scott-
 B07421; Yoder Stuart-B08248; iommu@lists.linux-foundation.org; linux-
 p...@vger.kernel.org; linuxppc-...@lists.ozlabs.org; linux-
 ker...@vger.kernel.org; Bhushan Bharat-R65777
 Subject: Re: [PATCH 1/9 v2] pci:msi: add weak function for returning msi 
 region
 info
 
 On Tue, Nov 19, 2013 at 10:47:05AM +0530, Bharat Bhushan wrote:
  In Aperture type of IOMMU (like FSL PAMU), VFIO-iommu system need to
  know the MSI region to map its window in h/w. This patch just defines
  the required weak functions only and will be used by followup patches.
 
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
  v1-v2
   - Added description on struct msi_region
 
   drivers/pci/msi.c   |   22 ++
   include/linux/msi.h |   14 ++
   2 files changed, 36 insertions(+), 0 deletions(-)
 
  diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c index
  d5f90d6..2643a29 100644
  --- a/drivers/pci/msi.c
  +++ b/drivers/pci/msi.c
  @@ -67,6 +67,28 @@ int __weak arch_msi_check_device(struct pci_dev *dev, int
 nvec, int type)
  return chip-check_device(chip, dev, nvec, type);  }
 
  +int __weak arch_msi_get_region_count(void) {
  +   return 0;
  +}
  +
  +int __weak arch_msi_get_region(int region_num, struct msi_region
  +*region) {
  +   return 0;
  +}
  +
  +int msi_get_region_count(void)
  +{
  +   return arch_msi_get_region_count();
  +}
  +EXPORT_SYMBOL(msi_get_region_count);
  +
  +int msi_get_region(int region_num, struct msi_region *region) {
  +   return arch_msi_get_region(region_num, region); }
  +EXPORT_SYMBOL(msi_get_region);
  +
   int __weak arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int
  type)  {
  struct msi_desc *entry;
  diff --git a/include/linux/msi.h b/include/linux/msi.h index
  b17ead8..ade1480 100644
  --- a/include/linux/msi.h
  +++ b/include/linux/msi.h
  @@ -51,6 +51,18 @@ struct msi_desc {
   };
 
   /*
  + * This structure is used to get
  + * - physical address
  + * - size
  + * of a msi region
  + */
  +struct msi_region {
  +   int region_num; /* MSI region number */
  +   dma_addr_t addr; /* Address of MSI region */
  +   size_t size; /* Size of MSI region */ };
  +
  +/*
* The arch hooks to setup up msi irqs. Those functions are
* implemented as weak symbols so that they /can/ be overriden by
* architecture specific code if needed.
  @@ -64,6 +76,8 @@ void arch_restore_msi_irqs(struct pci_dev *dev, int
  irq);
 
   void default_teardown_msi_irqs(struct pci_dev *dev);  void
  default_restore_msi_irqs(struct pci_dev *dev, int irq);
  +int arch_msi_get_region_count(void);
  +int arch_msi_get_region(int region_num, struct msi_region *region);
 
 It doesn't look like any of this (struct msi_region, msi_get_region(),
 msi_get_region_count()) is actually used by drivers/pci/msi.c, so I don't 
 think
 it needs to be declared in generic code.  It looks like it's only used in
 drivers/vfio/vfio_iommu_fsl_pamu.c, where you already know you have an FSL
 IOMMU, and you can just call FSL-specific interfaces directly.

Thanks Bjorn,

Want to be sure of what you are suggesting.

What I understood is that we define these (struct msi_region, msi_get_region(), 
msi_get_region_count()) in arch/powerpc/include/fsl_msi.h (a new file). Include 
this header file directly in driver/vfio/vfio_iommu_fsl_pamu.c

Same also applies for msi_set_iova() in patch-5 ?

-Bharat

 
 Bjorn
 
 
   struct msi_chip {
  struct module *owner;
  --
  1.7.0.4
 
 
 --
 To unsubscribe from this list: send the line unsubscribe linux-pci in the 
 body
 of a message to majord...@vger.kernel.org More majordomo info at
 http://vger.kernel.org/majordomo-info.html


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[RFC][PATCHv6++ 01/13] of: introduce of_property_for_earch_phandle_with_args()

2013-11-28 Thread Hiroshi Doyu
Stephen Warren swar...@wwwdotorg.org wrote @ Thu, 21 Nov 2013 19:57:00 +0100:

 On 11/21/2013 10:17 AM, Hiroshi Doyu wrote:
  Iterating over a property containing a list of phandles with arguments
  is a common operation for device drivers. This patch adds a new
  of_property_for_each_phandle_with_args() macro to make the iteration
  simpler.
  
  Signed-off-by: Hiroshi Doyu hd...@nvidia.com
  ---
  v6+:
  Use the description, which Grant Likely proposed, to be full enough
  that a future reader can figure out why a patch was written.
http://lists.linuxfoundation.org/pipermail/iommu/2013-November/007062.html
 
 This new version only addresses one of the concerns that Grant had,
 namely the commit message.
 
  diff --git a/include/linux/of.h b/include/linux/of.h
 
  +#define of_property_for_each_phandle_with_args(np, list, cells, i, args) \
  +   for (i = 0; !of_parse_phandle_with_args(np, list, cells, i, args); i++)
  +
 
 Grant also wanted the actual implementation fixed so that it wasn't so
 inefficient.
 
 What this current patch does is basically:
 
 for every entry in the property:
 for every entry in the property before the current index:
 parse the phandle+specifier
 
 That's roughly O(n^2). (n is # entries in the property)
 
 Instead, what should happen is:
 
 for every entry in the property:
 parse the phandle+specifier
 yield the result
 
 That's roughly O(n).
 
 In other words, an implementation more along the lines of
 include/linux/of.h's:
 
 #define of_property_for_each_u32(np, propname, prop, p, u)  \
 for (prop = of_find_property(np, propname, NULL),   \
 p = of_prop_next_u32(prop, NULL, u);   \
 p;  \
 p = of_prop_next_u32(prop, p, u))
 
 ... so you'd need functions like of_prop_first_specifier() and
 of_prop_next_specifier(), and perhaps some associated set of state
 variables, perhaps with all the state wrapped into a single struct for
 simplicity.

Although I couldn't invent any struct to hold params and state here,
I'd like you to review the following interface is ok or not.

At first, I thought to refactor __of_parse_phandle_with_args() but
it's a bit highly optimized by Stephen and it looked a bit hard to
refactor without perf regressions. Instread, I introduced 2 new
functions of_parse_{first,next}_phandle_with_args() to parse
phandles.

If this interface is ok, I'll include this into the next v7 series.

-8-8-8-8-8-8-8-8-8-
From: Hiroshi Doyu hd...@nvidia.com

Iterating over a property containing a list of phandles with arguments
is a common operation for device drivers. This patch adds a new
of_property_for_each_phandle_with_args() macro to make the iteration
simpler.

Introduced of_parse_{first,next}_phandle_with_args(), where const
__be32 **list is used to hold the next list to be processed as a
state pramameter, and both of_parse_{first,next}_phandle_with_args()
returns the remaining list in the number of cell(4 byte). If any error
happens, list is set NULL not to proceed the rest.

Signed-off-by: Hiroshi Doyu hd...@nvidia.com
---
v6++:
Optimized to avoid O(n^2), suggested by Stephen Warren.
http://lists.linuxfoundation.org/pipermail/iommu/2013-November/007066.html

v6+:
Use the description, which Grant Likely proposed, to be full enough
that a future reader can figure out why a patch was written.

v5:
New patch for v5.

Signed-off-by: Hiroshi Doyu hd...@nvidia.com
---
 drivers/of/base.c  | 82 ++
 include/linux/of.h | 52 ++
 2 files changed, 134 insertions(+)

diff --git a/drivers/of/base.c b/drivers/of/base.c
index f807d0e..3e29b10 100644
--- a/drivers/of/base.c
+++ b/drivers/of/base.c
@@ -1201,6 +1201,88 @@ void of_print_phandle_args(const char *msg, const struct 
of_phandle_args *args)
printk(\n);
 }
 
+int __of_parse_next_phandle_with_args(const __be32 **plist,
+ const char *cells_name, int cell_count,
+ struct of_phandle_args *out_args)
+{
+   phandle phandle;
+   int i, count = 0, err;
+   struct device_node *dn;
+   const __be32 *list;
+
+   BUG_ON(!out_args);
+   BUG_ON(!cells_name  !cell_count);
+
+   /*
+* *plist should hold phandle, and it's updated to the next
+* phandle at return if no error.
+*/
+   list = *plist;
+   out_args-np = NULL;
+
+   phandle = be32_to_cpup(list);
+   if (!phandle)
+   goto err_out;
+
+   dn = of_find_node_by_phandle(phandle);
+   if (!dn)
+   goto err_out;
+
+   if (cells_name) {
+   err = of_property_read_u32(dn, cells_name, count);
+   if (err)
+   goto err_out;
+   } else {
+   count = cell_count;
+   }
+
+   out_args-np = dn;