Re: [PATCH] PCI/ATS: PASID and PRI are only enumerated in PF devices.

2020-07-20 Thread kernel test robot
Hi Ashok,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on pci/next]
[also build test ERROR on iommu/next linux/master linus/master v5.8-rc6 
next-20200720]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Ashok-Raj/PCI-ATS-PASID-and-PRI-are-only-enumerated-in-PF-devices/20200721-004510
base:   https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git next
config: arm64-defconfig (attached as .config)
compiler: aarch64-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross 
ARCH=arm64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All error/warnings (new ones prefixed by >>):

>> drivers/pci/ats.c:471:6: warning: no previous prototype for 
>> 'pci_pri_supported' [-Wmissing-prototypes]
 471 | bool pci_pri_supported(struct pci_dev *pdev)
 |  ^
   drivers/pci/ats.c: In function 'pci_pri_supported':
>> drivers/pci/ats.c:474:30: error: 'struct pci_dev' has no member named 
>> 'pri_cap'; did you mean 'pcie_cap'?
 474 |  return !!(pci_physfn(pdev)->pri_cap);
 |  ^~~
 |  pcie_cap
>> drivers/pci/ats.c:475:1: warning: control reaches end of non-void function 
>> [-Wreturn-type]
 475 | }
 | ^

vim +474 drivers/pci/ats.c

   463  
   464  /**
   465   * pci_pri_supported - Check if PRI is supported.
   466   * @pdev: PCI device structure
   467   *
   468   * Returns false when no PRI capability is present.
   469   * Returns true if PRI feature is supported and enabled
   470   */
 > 471  bool pci_pri_supported(struct pci_dev *pdev)
   472  {
   473  /* VFs share the PF PRI configuration */
 > 474  return !!(pci_physfn(pdev)->pri_cap);
 > 475  }
   476  EXPORT_SYMBOL_GPL(pci_pri_supported);
   477  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


.config.gz
Description: application/gzip
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

RE: [EXT] Re: [PATCH v2 00/12] ACPI/OF: Upgrade MSI/IOMMU ID mapping APIs

2020-07-20 Thread Makarand Pawagi



> -Original Message-
> From: Lorenzo Pieralisi 
> Sent: Monday, July 20, 2020 10:25 PM
> To: linux-arm-ker...@lists.infradead.org; Rob Herring ;
> Rafael J. Wysocki ; Bjorn Helgaas ;
> Catalin Marinas ; Will Deacon ;
> j...@8bytes.org
> Cc: Hanjun Guo ; Sudeep Holla
> ; Robin Murphy ; Marc
> Zyngier ; iommu@lists.linux-foundation.org; linux-
> a...@vger.kernel.org; devicet...@vger.kernel.org; linux-...@vger.kernel.org;
> Makarand Pawagi ; Diana Madalina Craciun (OSS)
> ; Laurentiu Tudor 
> Subject: [EXT] Re: [PATCH v2 00/12] ACPI/OF: Upgrade MSI/IOMMU ID mapping
> APIs
> 
> Caution: EXT Email
> 
> On Fri, Jun 19, 2020 at 09:20:01AM +0100, Lorenzo Pieralisi wrote:
> > This series is a v2 of a previous posting:
> >
> > v1 -> v2
> >
> > - Removed _rid() wrappers
> > - Fixed !CONFIG_ACPI compilation issue
> > - Converted of_pci_iommu_init() to use of_iommu_configure_dev_id()
> >
> > v1:
> > https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore
> > .kernel.org%2Flinux-arm-kernel%2F20200521130008.8266-1-lorenzo.pierali
> >
> si%40arm.com%2F&data=02%7C01%7Cmakarand.pawagi%40nxp.com%7C
> da7bade
> >
> c592846a1478808d82ccd9eb1%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%
> 7C1%7
> >
> C637308608924853281&sdata=ZoP3e2Q5Ijkz%2FtsGfgu9MYkmKXSHQExs5
> J64Sb
> > h51YA%3D&reserved=0
> >
> > Original cover letter
> > -
> >
> > Firmware bindings provided in the ACPI IORT table[1] and device tree
> > bindings define rules to carry out input/output ID mappings - ie
> > retrieving an IOMMU/MSI controller input ID for a device with a given
> > ID.
> >
> > At the moment these firmware bindings are used exclusively for PCI
> > devices and their requester ID to IOMMU/MSI id mapping but there is
> > nothing PCI specific in the ACPI and devicetree bindings that prevent
> > the firmware and kernel from using the firmware bindings to traslate
> > device IDs for any bus that requires its devices to carry out
> > input/output id translations.
> >
> > The Freescale FSL bus is an example whereby the input/output ID
> > translation kernel code put in place for PCI can be reused for devices
> > attached to the bus that are not PCI devices.
> >
> > This series updates the kernel code to make the MSI/IOMMU input/output
> > ID translation PCI agnostic and apply the resulting changes to the
> > device ID space provided by the Freescale FSL bus.
> >
> > [1]
> > https://eur01.safelinks.protection.outlook.com/?url=http%3A%2F%2Finfoc
> >
> enter.arm.com%2Fhelp%2Ftopic%2Fcom.arm.doc.den0049d%2FDEN0049D_IO_
> Rema
> >
> pping_Table.pdf&data=02%7C01%7Cmakarand.pawagi%40nxp.com%7Cda
> 7bade
> >
> c592846a1478808d82ccd9eb1%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%
> 7C1%7
> >
> C637308608924853281&sdata=RzpIo4AfAFsi2pdEuY%2FbnPAyase5cSmFIr5
> 2SX
> > aOrTg%3D&reserved=0
> >
> > Cc: Rob Herring 
> > Cc: "Rafael J. Wysocki" 
> > Cc: "Joerg Roedel 
> > Cc: Hanjun Guo 
> > Cc: Bjorn Helgaas 
> > Cc: Sudeep Holla 
> > Cc: Robin Murphy 
> > Cc: Catalin Marinas 
> > Cc: Will Deacon 
> > Cc: Marc Zyngier 
> >
> > Diana Craciun (2):
> >   of/irq: make of_msi_map_get_device_domain() bus agnostic
> >   bus/fsl-mc: Refactor the MSI domain creation in the DPRC driver
> >
> > Laurentiu Tudor (1):
> >   dt-bindings: arm: fsl: Add msi-map device-tree binding for fsl-mc
> > bus
> >
> > Lorenzo Pieralisi (8):
> >   ACPI/IORT: Make iort_match_node_callback walk the ACPI namespace for
> > NC
> >   ACPI/IORT: Make iort_get_device_domain IRQ domain agnostic
> >   ACPI/IORT: Make iort_msi_map_rid() PCI agnostic
> >   ACPI/IORT: Remove useless PCI bus walk
> >   ACPI/IORT: Add an input ID to acpi_dma_configure()
> >   of/iommu: Make of_map_rid() PCI agnostic
> >   of/device: Add input id to of_dma_configure()
> >   of/irq: Make of_msi_map_rid() PCI bus agnostic
> >
> > Makarand Pawagi (1):
> >   bus: fsl-mc: Add ACPI support for fsl-mc
> >
> >  .../devicetree/bindings/misc/fsl,qoriq-mc.txt |  50 +++-
> >  drivers/acpi/arm64/iort.c | 108 --
> >  drivers/acpi/scan.c   |   8 +-
> >  drivers/bus/fsl-mc/dprc-driver.c  |  31 ++---
> >  drivers/bus/fsl-mc/fsl-mc-bus.c   |  79 +
> >  drivers/bus/fsl-mc/fsl-mc-msi.c   |  36 --
> >  drivers/bus/fsl-mc/fsl-mc-private.h   |   6 +-
> >  drivers/iommu/of_iommu.c  |  81 +++--
> >  drivers/irqchip/irq-gic-v3-its-fsl-mc-msi.c   | 105 ++---
> >  drivers/of/base.c |  42 +++
> >  drivers/of/device.c   |   8 +-
> >  drivers/of/irq.c  |  34 +++---
> >  drivers/pci/msi.c |   9 +-
> >  include/acpi/acpi_bus.h   |   9 +-
> >  include/linux/acpi.h  |   7 ++
> >  include/linux/acpi_iort.h |  20 ++--
> >  include/linux/of.h|   4 +-
> >  include/li

Re: Re: [PATCH 04/21] dt-binding: mediatek: Add binding for mt8192 IOMMU and SMI

2020-07-20 Thread Yong Wu
On Mon, 2020-07-20 at 17:16 -0600, Rob Herring wrote:
> On Sat, Jul 11, 2020 at 02:48:29PM +0800, Yong Wu wrote:
> > This patch adds decriptions for mt8192 IOMMU and SMI.
> > 
> > mt8192 also is MTK IOMMU gen2 which uses ARM Short-Descriptor translation
> > table format. The M4U-SMI HW diagram is as below:
> > 
> >   EMI
> >|
> >   M4U
> >|
> >   
> >SMI Common
> >   
> >|
> >   +---+--+--+--+---+
> >   |   |  |  |   .. |   |
> >   |   |  |  |  |   |
> > larb0   larb1  larb2  larb4 ..  larb19   larb20
> > disp0   disp1   mdpvdec   IPE  IPE
> > 
> > All the connections are HW fixed, SW can NOT adjust it.
> > 
> > mt8192 M4U support 0~16GB iova range. we preassign different engines
> > into different iova ranges:
> > 
> > domain-id  module iova-range  larbs
> >0   disp0 ~ 4G  larb0/1
> >1   vcodec  4G ~ 8G larb4/5/7
> >2   cam/mdp 8G ~ 12G larb2/9/11/13/14/16/17/18/19/20
> >3   CCU00x4000_ ~ 0x43ff_ larb13: port 9/10
> >4   CCU10x4400_ ~ 0x47ff_ larb14: port 4/5
> 
> You probably want to use dma-ranges for defining these 
> address restrictions. 

Yes. Please see the commit message of [18/21] in this patchset.

> 
> How is the domain-id used or needed?

Here we assign different larb/ports in different iova ranges.
In the iommu driver, we will list the iova ranges as above and use the
domain-id to get the detailed iova range, then create a iommu domain for
each a iova range.

For the iommu masters, it only need use its special port in its dtsi
node, then the iova got from dma_alloc_attrs for that device will locate
in the special iova ranges.

> 
> Rob 
> 
> ___
> Linux-mediatek mailing list
> linux-media...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-mediatek

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2] iommu/mediatek: check 4GB mode by reading infracfg

2020-07-20 Thread Miles Chen
In previous discussion [1] and [2], we found that it is risky to
use max_pfn or totalram_pages to tell if 4GB mode is enabled.

Check 4GB mode by reading infracfg register, remove the usage
of the un-exported symbol max_pfn.

This is a step towards building mtk_iommu as a kernel module.

Change since v1:
1. remove the phandle usage, search for infracfg instead [3]
2. use infracfg instead of infracfg_regmap
3. move infracfg definitaions to linux/soc/mediatek/infracfg.h
4. update enable_4GB only when has_4gb_mode

[1] https://lkml.org/lkml/2020/6/3/733
[2] https://lkml.org/lkml/2020/6/4/136
[3] https://lkml.org/lkml/2020/7/15/1147

Cc: Mike Rapoport 
Cc: David Hildenbrand 
Cc: Yong Wu 
Cc: Yingjoe Chen 
Cc: Christoph Hellwig 
Cc: Yong Wu 
Cc: Chao Hao 
Cc: Rob Herring 
Cc: Matthias Brugger 
Signed-off-by: Miles Chen 
---
 drivers/iommu/mtk_iommu.c | 26 +-
 include/linux/soc/mediatek/infracfg.h |  3 +++
 2 files changed, 24 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 2be96f1cdbd2..16765f532853 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -3,7 +3,6 @@
  * Copyright (c) 2015-2016 MediaTek Inc.
  * Author: Yong Wu 
  */
-#include 
 #include 
 #include 
 #include 
@@ -15,13 +14,16 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -599,8 +601,10 @@ static int mtk_iommu_probe(struct platform_device *pdev)
struct resource *res;
resource_size_t ioaddr;
struct component_match  *match = NULL;
+   struct regmap   *infracfg;
void*protect;
int i, larb_nr, ret;
+   u32 val;
 
data = devm_kzalloc(dev, sizeof(*data), GFP_KERNEL);
if (!data)
@@ -614,10 +618,22 @@ static int mtk_iommu_probe(struct platform_device *pdev)
return -ENOMEM;
data->protect_base = ALIGN(virt_to_phys(protect), MTK_PROTECT_PA_ALIGN);
 
-   /* Whether the current dram is over 4GB */
-   data->enable_4GB = !!(max_pfn > (BIT_ULL(32) >> PAGE_SHIFT));
-   if (!data->plat_data->has_4gb_mode)
-   data->enable_4GB = false;
+   data->enable_4GB = false;
+   if (data->plat_data->has_4gb_mode) {
+   infracfg = syscon_regmap_lookup_by_compatible(
+   "mediatek,mt8173-infracfg");
+   if (IS_ERR(infracfg)) {
+   infracfg = syscon_regmap_lookup_by_compatible(
+   "mediatek,mt2712-infracfg");
+   if (IS_ERR(infracfg))
+   return PTR_ERR(infracfg);
+
+   }
+   ret = regmap_read(infracfg, REG_INFRA_MISC, &val);
+   if (ret)
+   return ret;
+   data->enable_4GB = !!(val & F_DDR_4GB_SUPPORT_EN);
+   }
 
res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
data->base = devm_ioremap_resource(dev, res);
diff --git a/include/linux/soc/mediatek/infracfg.h 
b/include/linux/soc/mediatek/infracfg.h
index fd25f0148566..233463d789c6 100644
--- a/include/linux/soc/mediatek/infracfg.h
+++ b/include/linux/soc/mediatek/infracfg.h
@@ -32,6 +32,9 @@
 #define MT7622_TOP_AXI_PROT_EN_WB  (BIT(2) | BIT(6) | \
 BIT(7) | BIT(8))
 
+#define REG_INFRA_MISC 0xf00
+#define F_DDR_4GB_SUPPORT_EN   BIT(13)
+
 int mtk_infracfg_set_bus_protection(struct regmap *infracfg, u32 mask,
bool reg_update);
 int mtk_infracfg_clear_bus_protection(struct regmap *infracfg, u32 mask,
-- 
2.18.0
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 6/9] dt-bindings: gpio: renesas,rcar-gpio: Add r8a774e1 support

2020-07-20 Thread Rob Herring
On Mon, 13 Jul 2020 22:35:17 +0100, Lad Prabhakar wrote:
> Document Renesas RZ/G2H (R8A774E1) GPIO blocks compatibility within the
> relevant dt-bindings.
> 
> Signed-off-by: Lad Prabhakar 
> ---
>  Documentation/devicetree/bindings/gpio/renesas,rcar-gpio.yaml | 1 +
>  1 file changed, 1 insertion(+)
> 

Acked-by: Rob Herring 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 8/9] dt-bindings: net: renesas,ravb: Add support for r8a774e1 SoC

2020-07-20 Thread Rob Herring
On Mon, 13 Jul 2020 22:35:19 +0100, Lad Prabhakar wrote:
> From: Marian-Cristian Rotariu 
> 
> Document RZ/G2H (R8A774E1) SoC bindings.
> 
> Signed-off-by: Marian-Cristian Rotariu 
> 
> Signed-off-by: Lad Prabhakar 
> ---
>  Documentation/devicetree/bindings/net/renesas,ravb.txt | 1 +
>  1 file changed, 1 insertion(+)
> 

Acked-by: Rob Herring 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/9] dt-bindings: iommu: renesas, ipmmu-vmsa: Add r8a774e1 support

2020-07-20 Thread Rob Herring
On Mon, Jul 13, 2020 at 10:35:12PM +0100, Lad Prabhakar wrote:
> Document RZ/G2H (R8A774E1) SoC bindings.
> 
> Signed-off-by: Lad Prabhakar 
> ---
>  Documentation/devicetree/bindings/iommu/renesas,ipmmu-vmsa.yaml | 1 +
>  1 file changed, 1 insertion(+)

Acked-by: Rob Herring 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 3/9] arm64: dts: renesas: r8a774e1: Add IPMMU device nodes

2020-07-20 Thread Rob Herring
On Mon, Jul 13, 2020 at 10:35:14PM +0100, Lad Prabhakar wrote:
> From: Marian-Cristian Rotariu 
> 
> Add RZ/G2H (R8A774E1) IPMMU nodes.
> 
> Signed-off-by: Marian-Cristian Rotariu 
> 
> Signed-off-by: Lad Prabhakar 
> ---
>  arch/arm64/boot/dts/renesas/r8a774e1.dtsi | 121 ++
>  1 file changed, 121 insertions(+)

Acked-by: Rob Herring 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v5 2/5] iommu/uapi: Add argsz for user filled data

2020-07-20 Thread Jacob Pan
On Fri, 17 Jul 2020 15:44:23 +0200
Auger Eric  wrote:

> Hi Jacob,
> 
> On 7/16/20 8:45 PM, Jacob Pan wrote:
> > As IOMMU UAPI gets extended, user data size may increase. To support
> > backward compatibiliy, this patch introduces a size field to each
> > UAPI data structures. It is *always* the responsibility for the
> > user to fill in the correct size. Padding fields are adjusted to
> > ensure 8 byte alignment.
> > 
> > Specific scenarios for user data handling are documented in:
> > Documentation/userspace-api/iommu.rst
> > 
> > Signed-off-by: Liu Yi L 
> > Signed-off-by: Jacob Pan 
> > ---
> >  include/uapi/linux/iommu.h | 12 +---
> >  1 file changed, 9 insertions(+), 3 deletions(-)
> > 
> > diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h
> > index e907b7091a46..d5e9014f690e 100644
> > --- a/include/uapi/linux/iommu.h
> > +++ b/include/uapi/linux/iommu.h
> > @@ -135,6 +135,7 @@ enum iommu_page_response_code {
> >  
> >  /**
> >   * struct iommu_page_response - Generic page response information
> > + * @argsz: User filled size of this data
> >   * @version: API version of this structure
> >   * @flags: encodes whether the corresponding fields are valid
> >   * (IOMMU_FAULT_PAGE_RESPONSE_* values)
> > @@ -143,6 +144,7 @@ enum iommu_page_response_code {
> >   * @code: response code from &enum iommu_page_response_code
> >   */
> >  struct iommu_page_response {
> > +   __u32   argsz;
> >  #define IOMMU_PAGE_RESP_VERSION_1  1  
> Don't you need to incr the version for all the modified structs?
not literal "flags" but @cache and @granularity are flags in reality. I
think that is OK. I also updated document to say "flags or equivalent".

> > __u32   version;
> >  #define IOMMU_PAGE_RESP_PASID_VALID(1 << 0)
> > @@ -218,6 +220,7 @@ struct iommu_inv_pasid_info {
> >  /**
> >   * struct iommu_cache_invalidate_info - First level/stage
> > invalidation
> >   * information
> > + * @argsz: User filled size of this data
> >   * @version: API version of this structure
> >   * @cache: bitfield that allows to select which caches to
> > invalidate
> >   * @granularity: defines the lowest granularity used for the
> > invalidation: @@ -246,6 +249,7 @@ struct iommu_inv_pasid_info {
> >   * must support the used granularity.
> >   */
> >  struct iommu_cache_invalidate_info {
> > +   __u32   argsz;
> >  #define IOMMU_CACHE_INVALIDATE_INFO_VERSION_1 1
> > __u32   version;  
> so there is no "flags" field in this struct. Is it OK?
> >  /* IOMMU paging structure cache */
> > @@ -255,7 +259,7 @@ struct iommu_cache_invalidate_info {
> >  #define IOMMU_CACHE_INV_TYPE_NR(3)
> > __u8cache;
> > __u8granularity;
> > -   __u8padding[2];
> > +   __u8padding[6];
> > union {
> > struct iommu_inv_pasid_info pasid_info;
> > struct iommu_inv_addr_info addr_info;
> > @@ -292,6 +296,7 @@ struct iommu_gpasid_bind_data_vtd {
> >  
> >  /**
> >   * struct iommu_gpasid_bind_data - Information about device and
> > guest PASID binding
> > + * @argsz: User filled size of this data
> >   * @version:   Version of this data structure
> >   * @format:PASID table entry format
> >   * @flags: Additional information on guest bind request
> > @@ -309,17 +314,18 @@ struct iommu_gpasid_bind_data_vtd {
> >   * PASID to host PASID based on this bind data.
> >   */
> >  struct iommu_gpasid_bind_data {
> > +   __u32 argsz;
> >  #define IOMMU_GPASID_BIND_VERSION_11
> > __u32 version;
> >  #define IOMMU_PASID_FORMAT_INTEL_VTD   1
> > __u32 format;
> > +   __u32 addr_width;
> >  #define IOMMU_SVA_GPASID_VAL   (1 << 0) /* guest PASID valid
> > */ __u64 flags;
> > __u64 gpgd;
> > __u64 hpasid;
> > __u64 gpasid;
> > -   __u32 addr_width;
> > -   __u8  padding[12];
> > +   __u8  padding[8];
> > /* Vendor specific data */
> > union {
> > struct iommu_gpasid_bind_data_vtd vtd;
> >   
> Thanks
> 
> Eric
> 

[Jacob Pan]
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v5 1/5] docs: IOMMU user API

2020-07-20 Thread Jacob Pan
On Fri, 17 Jul 2020 13:37:25 -0600
Alex Williamson  wrote:

> On Thu, 16 Jul 2020 11:45:13 -0700
> Jacob Pan  wrote:
> 
> > IOMMU UAPI is newly introduced to support communications between
> > guest virtual IOMMU and host IOMMU. There has been lots of
> > discussions on how it should work with VFIO UAPI and userspace in
> > general.
> > 
> > This document is indended to clarify the UAPI design and usage. The
> > mechanics of how future extensions should be achieved are also
> > covered in this documentation.
> > 
> > Signed-off-by: Liu Yi L 
> > Signed-off-by: Jacob Pan 
> > ---
> >  Documentation/userspace-api/iommu.rst | 339
> > ++ 1 file changed, 339 insertions(+)
> >  create mode 100644 Documentation/userspace-api/iommu.rst
> > 
> > diff --git a/Documentation/userspace-api/iommu.rst
> > b/Documentation/userspace-api/iommu.rst new file mode 100644
> > index ..efc3e1838235
> > --- /dev/null
> > +++ b/Documentation/userspace-api/iommu.rst
> > @@ -0,0 +1,339 @@
> > +.. SPDX-License-Identifier: GPL-2.0
> > +.. iommu:
> > +
> > +=
> > +IOMMU Userspace API
> > +=
> > +
> > +IOMMU UAPI is used for virtualization cases where communications
> > are +needed between physical and virtual IOMMU drivers. For native
> > +usage, IOMMU is a system device which does not need to communicate
> > +with user space directly.
> > +
> > +The primary use cases are guest Shared Virtual Address (SVA) and
> > +guest IO virtual address (IOVA), wherein a virtual IOMMU (vIOMMU)
> > is +required to communicate with the physical IOMMU in the host.
> > +
> > +.. contents:: :local:
> > +
> > +Functionalities
> > +===
> > +Communications of user and kernel involve both directions. The
> > +supported user-kernel APIs are as follows:
> > +
> > +1. Alloc/Free PASID
> > +2. Bind/unbind guest PASID (e.g. Intel VT-d)
> > +3. Bind/unbind guest PASID table (e.g. ARM sMMU)
> > +4. Invalidate IOMMU caches
> > +5. Service IO page faults (page request and response)
> > +
> > +Requirements
> > +
> > +The IOMMU UAPIs are generic and extensible to meet the following
> > +requirements:
> > +
> > +1. Emulated and para-virtualised vIOMMUs
> > +2. Multiple vendors (Intel VT-d, ARM sMMU, etc.)
> > +3. Extensions to the UAPI shall not break existing user space
> > +
> > +Interfaces
> > +==
> > +Although the data structures defined in IOMMU UAPI are
> > self-contained, +there is no user API functions introduced.
> > Instead, IOMMU UAPI is +designed to work with existing user driver
> > frameworks such as VFIO. +
> > +Extension Rules & Precautions
> > +-
> > +When IOMMU UAPI gets extended, the data structures can *only* be
> > +modified in two ways:
> > +
> > +1. Adding new fields by re-purposing the padding[] field. No size
> > change. +2. Adding new union members at the end. May increase in
> > size. +
> > +No new fields can be added *after* the variable sized union in
> > that it +will break backward compatibility when offset moves. In
> > both cases, a +new flag must be accompanied with a new field such
> > that the IOMMU +driver can process the data based on the new flag.
> > Version field is +only reserved for the unlikely event of UAPI
> > upgrade at its entirety. +
> > +It's *always* the caller's responsibility to indicate the size of
> > the +structure passed by setting argsz appropriately.
> > +Though at the same time, argsz is user provided data which is not
> > +trusted. The argsz field allows the user to indicate how much data
> > +they're providing, it's still the kernel's responsibility to
> > validate +whether it's correct and sufficient for the requested
> > operation. +
> > +Compatibility Checking
> > +--
> > +When IOMMU UAPI extension results in size increase, IOMMU UAPI core
> > +and vendor driver shall handle the following cases:
> > +
> > +1. User and kernel has exact size match
> > +2. An older user with older kernel header (smaller UAPI size)
> > running on a
> > +   newer kernel (larger UAPI size)
> > +3. A newer user with newer kernel header (larger UAPI size) running
> > +   on an older kernel.
> > +4. A malicious/misbehaving user pass illegal/invalid size but
> > within
> > +   range. The data may contain garbage.
> > +
> > +Feature Checking
> > +
> > +While launching a guest with vIOMMU, it is important to ensure
> > that host +can support the UAPI data structures to be used for
> > vIOMMU-pIOMMU +communications. Without upfront compatibility
> > checking, future faults +are difficult to report even in normal
> > conditions. For example, TLB +invalidations should always succeed.
> > There is no architectural way to +report back to the vIOMMU if the
> > UAPI data is incompatible. If that +happens, in order to protect
> > IOMMU iosolation guarantee, we have to +resort to not giving
> > completion status in vIOMMU. This may resu

RE: [PATCH] iommu/arm-smmu-v3: remove the approach of MSI polling for CMD SYNC

2020-07-20 Thread Song Bao Hua (Barry Song)



> -Original Message-
> From: Song Bao Hua (Barry Song)
> Sent: Friday, July 17, 2020 9:06 PM
> To: 'Robin Murphy' ; w...@kernel.org;
> j...@8bytes.org
> Cc: linux-ker...@vger.kernel.org; Linuxarm ;
> linux-arm-ker...@lists.infradead.org; iommu@lists.linux-foundation.org;
> Zengtao (B) 
> Subject: RE: [PATCH] iommu/arm-smmu-v3: remove the approach of MSI
> polling for CMD SYNC
> 
> 
> 
> > -Original Message-
> > From: Robin Murphy [mailto:robin.mur...@arm.com]
> > Sent: Friday, July 17, 2020 8:55 PM
> > To: Song Bao Hua (Barry Song) ;
> w...@kernel.org;
> > j...@8bytes.org
> > Cc: linux-ker...@vger.kernel.org; Linuxarm ;
> > linux-arm-ker...@lists.infradead.org; iommu@lists.linux-foundation.org;
> > Zengtao (B) 
> > Subject: Re: [PATCH] iommu/arm-smmu-v3: remove the approach of MSI
> > polling for CMD SYNC
> >
> > On 2020-07-17 00:07, Barry Song wrote:
> > > Before commit 587e6c10a7ce ("iommu/arm-smmu-v3: Reduce contention
> > during
> > > command-queue insertion"), msi polling perhaps performed better since
> > > it could run outside the spin_lock_irqsave() while the code polling cons
> > > reg was running in the lock.
> > >
> > > But after the great reorganization of smmu queue, neither of these two
> > > polling methods are running in a spinlock. And real tests show polling
> > > cons reg via sev means smaller latency. It is probably because polling
> > > by msi will ask hardware to write memory but sev polling depends on the
> > > update of register only.
> > >
> > > Using 16 threads to run netperf on hns3 100G NIC with UDP packet size
> > > in 32768bytes and set iommu to strict, TX throughput can improve from
> > > 25227.74Mbps to 27145.59Mbps by this patch. In this case, SMMU is
> super
> > > busy as hns3 sends map/unmap requests extremely frequently.
> >
> > How many different systems and SMMU implementations are those numbers
> > representative of? Given that we may have cases where the SMMU can use
> > MSIs but can't use SEV, so would have to fall back to inefficient
> > busy-polling, I'd be wary of removing this entirely. Allowing particular
> > platforms or SMMU implementations to suppress MSI functionality if they
> > know for sure it makes sense seems like a safer bet.
> >
> Hello Robin,
> 
> Thanks for taking a look. Actually I was really struggling with the good way 
> to
> make every platform happy.
> And I don't have other platforms to test and check if those platforms run
> better by sev polling. Even two
> platforms have completely same SMMU features, it is still possible they
> behave differently.
> So I simply sent this patch to get the discussion started to get opinions.
> 
> At the first beginning, I wanted to have a module parameter for users to 
> decide
> if msi polling should be disabled.
> But the module parameter might be totally ignored by linux distro.
> 
> --- a/drivers/iommu/arm-smmu-v3.c
> +++ b/drivers/iommu/arm-smmu-v3.c
> @@ -418,6 +418,11 @@ module_param_named(disable_bypass,
> disable_bypass, bool, S_IRUGO);  MODULE_PARM_DESC(disable_bypass,
>   "Disable bypass streams such that incoming transactions from devices
> that are not attached to an iommu domain will report an abort back to the
> device and will not be allowed to pass through the SMMU.");
> 
> +static bool disable_msipolling = 1;
> +module_param_named(disable_msipolling, disable_msipolling, bool,
> +S_IRUGO); MODULE_PARM_DESC(disable_msipolling,
> + "Don't use MSI to poll the completion of CMD_SYNC if it is slower than
> +SEV");
> +
>  enum pri_resp {
>   PRI_RESP_DENY = 0,
>   PRI_RESP_FAIL = 1,
> @@ -992,7 +997,7 @@ static void arm_smmu_cmdq_build_sync_cmd(u64
> *cmd, struct arm_smmu_device *smmu,
>* Beware that Hi16xx adds an extra 32 bits of goodness to its MSI
>* payload, so the write will zero the entire command on that platform.
>*/
> - if (smmu->features & ARM_SMMU_FEAT_MSI &&
> + if (!disable_msipolling && smmu->features & ARM_SMMU_FEAT_MSI &&
>   smmu->features & ARM_SMMU_FEAT_COHERENCY) {
>   ent.sync.msiaddr = q->base_dma + Q_IDX(&q->llq, prod) *
>  q->ent_dwords * 8;
> @@ -1332,7 +1337,7 @@ static int
> __arm_smmu_cmdq_poll_until_consumed(struct arm_smmu_device *smmu,
> static int arm_smmu_cmdq_poll_until_sync(struct arm_smmu_device *smmu,
>struct arm_smmu_ll_queue *llq)
>  {
> - if (smmu->features & ARM_SMMU_FEAT_MSI &&
> + if (!disable_msipolling && smmu->features & ARM_SMMU_FEAT_MSI &&
>   smmu->features & ARM_SMMU_FEAT_COHERENCY)
>   return __arm_smmu_cmdq_poll_until_msi(smmu, llq);
> 
> 
> Another option is that we don't use module parameter, alternatively, we check
> the vendor/chip ID,
> if the chip has better performance on sev polling, it may set 
> disable_msipolling
> to true.
> 
> You are very welcome to give your suggestions.

A possible way to do some chip-specific configuration would be setting 
sm

[PATCH 1/1] iommu/vt-d: Skip TE disabling on quirky gfx dedicated iommu

2020-07-20 Thread Lu Baolu
The VT-d spec requires (10.4.4 Global Command Register, TE field) that:

Hardware implementations supporting DMA draining must drain any in-flight
DMA read/write requests queued within the Root-Complex before completing
the translation enable command and reflecting the status of the command
through the TES field in the Global Status register.

Unfortunately, some integrated graphic devices fail to do so after some
kind of power state transition. As the result, the system might stuck in
iommu_disable_translation(), waiting for the completion of TE transition.

This provides a quirk list for those devices and skips TE disabling if
the qurik hits.

Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=208363
Tested-by: Koba Ko 
Cc: Ashok Raj 
Cc: sta...@vger.kernel.org
Signed-off-by: Lu Baolu 
---
 drivers/iommu/intel/dmar.c  |  1 +
 drivers/iommu/intel/iommu.c | 27 +++
 include/linux/dmar.h|  1 +
 include/linux/intel-iommu.h |  2 ++
 4 files changed, 31 insertions(+)

diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
index 683b812c5c47..16f47041f1bf 100644
--- a/drivers/iommu/intel/dmar.c
+++ b/drivers/iommu/intel/dmar.c
@@ -1102,6 +1102,7 @@ static int alloc_iommu(struct dmar_drhd_unit *drhd)
}
 
drhd->iommu = iommu;
+   iommu->drhd = drhd;
 
return 0;
 
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 98390a6d8113..11418b14cc3f 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -356,6 +356,7 @@ static int intel_iommu_strict;
 static int intel_iommu_superpage = 1;
 static int iommu_identity_mapping;
 static int intel_no_bounce;
+static int iommu_skip_te_disable;
 
 #define IDENTMAP_GFX   2
 #define IDENTMAP_AZALIA4
@@ -1633,6 +1634,10 @@ static void iommu_disable_translation(struct intel_iommu 
*iommu)
u32 sts;
unsigned long flag;
 
+   if (iommu_skip_te_disable && iommu->drhd->gfx_dedicated &&
+   (cap_read_drain(iommu->cap) || cap_write_drain(iommu->cap)))
+   return;
+
raw_spin_lock_irqsave(&iommu->register_lock, flag);
iommu->gcmd &= ~DMA_GCMD_TE;
writel(iommu->gcmd, iommu->reg + DMAR_GCMD_REG);
@@ -4043,6 +4048,7 @@ static void __init init_no_remapping_devices(void)
 
/* This IOMMU has *only* gfx devices. Either bypass it or
   set the gfx_mapped flag, as appropriate */
+   drhd->gfx_dedicated = 1;
if (!dmar_map_gfx) {
drhd->ignored = 1;
for_each_active_dev_scope(drhd->devices,
@@ -6160,6 +6166,27 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x0044, 
quirk_calpella_no_shadow_g
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x0062, 
quirk_calpella_no_shadow_gtt);
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x006a, 
quirk_calpella_no_shadow_gtt);
 
+static void quirk_igfx_skip_te_disable(struct pci_dev *dev)
+{
+   unsigned short ver;
+
+   if (!IS_GFX_DEVICE(dev))
+   return;
+
+   ver = (dev->device >> 8) & 0xff;
+   if (ver != 0x45 && ver != 0x46 && ver != 0x4c &&
+   ver != 0x4e && ver != 0x8a && ver != 0x98 &&
+   ver != 0x9a)
+   return;
+
+   if (risky_device(dev))
+   return;
+
+   pci_info(dev, "Skip IOMMU disabling for graphics\n");
+   iommu_skip_te_disable = 1;
+}
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_ANY_ID, 
quirk_igfx_skip_te_disable);
+
 /* On Tylersburg chipsets, some BIOSes have been known to enable the
ISOCH DMAR unit for the Azalia sound device, but not give it any
TLB entries, which causes it to deadlock. Check for that.  We do
diff --git a/include/linux/dmar.h b/include/linux/dmar.h
index d7bf029df737..65565820328a 100644
--- a/include/linux/dmar.h
+++ b/include/linux/dmar.h
@@ -48,6 +48,7 @@ struct dmar_drhd_unit {
u16 segment;/* PCI domain   */
u8  ignored:1;  /* ignore drhd  */
u8  include_all:1;
+   u8  gfx_dedicated:1;/* graphic dedicated*/
struct intel_iommu *iommu;
 };
 
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index bf6009a344f5..329629e1e9de 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -600,6 +600,8 @@ struct intel_iommu {
struct iommu_device iommu;  /* IOMMU core code handle */
int node;
u32 flags;  /* Software defined flags */
+
+   struct dmar_drhd_unit *drhd;
 };
 
 /* PCI domain-device relationship */
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v8 00/12] PCI: brcmstb: enable PCIe for STB chips

2020-07-20 Thread Florian Fainelli
On 7/15/20 7:35 AM, Jim Quinlan wrote:
> Patchset Summary:
>   Enhance a PCIe host controller driver.  Because of its unusual design
>   we are foced to change dev->dma_pfn_offset into a more general role
>   allowing multiple offsets.  See the 'v1' notes below for more info.

Christoph, Robin, are you happy with this version?

> 
> v8:
>   Commit: "device core: Introduce DMA range map, supplanting ..."
>   -- To satisfy a specific m68 compile configuration, I moved the 'struct
>  bus_dma_region; definition out of #ifdef CONFIG_HAS_DMA and also defined
>  three inline functions for !CONFIG_HAS_DMA (kernel test robot).
>   -- The sunXi drivers -- suc4i_csi, sun6i_csi, cedrus_hw -- set
>  a pfn_offset outside of_dma_configure() but the code offers no 
>  insight on the size of the translation window.  V7 had me using
>  SIZE_MAX as the size.  I have since contacted the sunXi maintainer and
>  he said that using a size of SZ_4G would cover sunXi configurations.
> 
> v7:
>   Commit: "device core: Introduce DMA range map, supplanting ..."
>   -- remove second kcalloc/copy in device.c (AndyS)
>   -- use PTR_ERR_OR_ZERO() and PHYS_PFN() (AndyS)
>   -- indentation, sizeof(struct ...) => sizeof(*r) (AndyS)
>   -- add pfn.h definitions: PFN_DMA_ADDR(), DMA_ADDR_PFN() (AndyS)
>   -- Fixed compile error in "sun6i_csi.c" (kernel test robot)
>   Commit "ata: ahci_brcm: Fix use of BCM7216 reset controller"
>   -- correct name of function in the commit msg (SergeiS)
>   
> v6:
>   Commit "device core: Introduce DMA range map":
>   -- of_dma_get_range() now takes a single argument and returns either
>  NULL, a valid map, or an ERR_PTR. (Robin)
>   -- offsets are no longer a PFN value but an actual address. (Robin)
>   -- the bus_dma_region struct stores the range size instead of
>  the cpu_end and pci_end values. (Robin)
>   -- devices that were setting a single offset with no boundaries
>  have been modified to have boundaries; in a few places
>  where this informatino was unavilable a /* FIXME: ... */
>  comment was added. (Robin)
>   -- dma_attach_offset_range() can be called when an offset
>  map already exists; if it's range is already present
>  nothing is done and success is returned. (Robin)
>   All commits:
>   -- Man name/style/corrections/etc changed (Bjorn)
>   -- rebase to Torvalds master
> 
> v5:
>   Commit "device core: Introduce multiple dma pfn offsets"
>   -- in of/address.c: "map_size = 0" => "*map_size = 0"
>   -- use kcalloc instead of kzalloc (AndyS)
>   -- use PHYS_ADDR_MAX instead of "~(phys_addr_t)0"
>   Commit "PCI: brcmstb: Set internal memory viewport sizes"
>   -- now gives error on missing dma-ranges property.
>   Commit "dt-bindings: PCI: Add bindings for more Brcmstb chips"
>   -- removed "Allof:" from brcm,scb-sizes definition (RobH)
>   All Commits:
>   -- indentation style, use max chars 100 (AndyS)
>   -- rebased to torvalds master
> 
> v4:
>   Commit "device core: Introduce multiple dma pfn offsets"
>   -- of_dma_get_range() does not take a dev param but instead
>  takes two "out" params: map and map_size.  We do this so
>  that the code that parses dma-ranges is separate from
>  the code that modifies 'dev'.   (Nicolas)
>   -- the separate case of having a single pfn offset has
>  been removed and is now processed by going through the
>  map array. (Nicolas)
>   -- move attach_uniform_dma_pfn_offset() from of/address.c to
>  dma/mapping.c so that it does not depend on CONFIG_OF. (Nicolas)
>   -- devm_kcalloc => devm_kzalloc (DanC)
>   -- add/fix assignment to dev->dma_pfn_offset_map for func
>  attach_uniform_dma_pfn_offset() (DanC, Nicolas)
>   -- s/struct dma_pfn_offset_region/struct bus_dma_region/ (Nicolas)
>   -- s/attach_uniform_dma_pfn_offset/dma_attach_uniform_pfn_offset/
>   -- s/attach_dma_pfn_offset_map/dma_attach_pfn_offset_map/
>   -- More use of PFN_{PHYS,DOWN,UP}. (AndyS)
>   Commit "of: Include a dev param in of_dma_get_range()"
>   -- this commit was sqaushed with "device core: Introduce ..."
> 
> v3:
>   Commit "device core: Introduce multiple dma pfn offsets"
>   Commit "arm: dma-mapping: Invoke dma offset func if needed"
>   -- The above two commits have been squashed.  More importantly,
>  the code has been modified so that the functionality for
>  multiple pfn offsets subsumes the use of dev->dma_pfn_offset.
>  In fact, dma_pfn_offset is removed and supplanted by
>  dma_pfn_offset_map, which is a pointer to an array.  The
>  more common case of a uniform offset is now handled as
>  a map with a single entry, while cases requiring multiple
>  pfn offsets use a map with multiple entries.  Code paths
>  that used to do this:
> 
>  dev->dma_pfn_offset = mydrivers_pfn_offset;
> 
>  have been changed to do this:
> 
>  attach_uniform_dma_pfn_offset(dev, pfn_offset);
> 
>   Commit "dt-bindings: PCI: Add bindings for more Brcmstb chips"
>   --

Re: [PATCH 04/21] dt-binding: mediatek: Add binding for mt8192 IOMMU and SMI

2020-07-20 Thread Rob Herring
On Sat, Jul 11, 2020 at 02:48:29PM +0800, Yong Wu wrote:
> This patch adds decriptions for mt8192 IOMMU and SMI.
> 
> mt8192 also is MTK IOMMU gen2 which uses ARM Short-Descriptor translation
> table format. The M4U-SMI HW diagram is as below:
> 
>   EMI
>|
>   M4U
>|
>   
>SMI Common
>   
>|
>   +---+--+--+--+---+
>   |   |  |  |   .. |   |
>   |   |  |  |  |   |
> larb0   larb1  larb2  larb4 ..  larb19   larb20
> disp0   disp1   mdpvdec   IPE  IPE
> 
> All the connections are HW fixed, SW can NOT adjust it.
> 
> mt8192 M4U support 0~16GB iova range. we preassign different engines
> into different iova ranges:
> 
> domain-id  module iova-range  larbs
>0   disp0 ~ 4G  larb0/1
>1   vcodec  4G ~ 8G larb4/5/7
>2   cam/mdp 8G ~ 12G larb2/9/11/13/14/16/17/18/19/20
>3   CCU00x4000_ ~ 0x43ff_ larb13: port 9/10
>4   CCU10x4400_ ~ 0x47ff_ larb14: port 4/5

You probably want to use dma-ranges for defining these 
address restrictions. 

How is the domain-id used or needed?

Rob 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v5 1/5] docs: IOMMU user API

2020-07-20 Thread Jacob Pan
Hi Eric,

On Fri, 17 Jul 2020 15:32:58 +0200
Auger Eric  wrote:

> Hi Jacob,
> 
> On 7/16/20 8:45 PM, Jacob Pan wrote:
> > IOMMU UAPI is newly introduced to support communications between
> > guest virtual IOMMU and host IOMMU. There has been lots of
> > discussions on how it should work with VFIO UAPI and userspace in
> > general.
> > 
> > This document is indended to clarify the UAPI design and usage.
> > The  
> intended
will fix.

> > mechanics of how future extensions should be achieved are also
> > covered in this documentation.
> > 
> > Signed-off-by: Liu Yi L 
> > Signed-off-by: Jacob Pan 
> > ---
> >  Documentation/userspace-api/iommu.rst | 339
> > ++ 1 file changed, 339 insertions(+)
> >  create mode 100644 Documentation/userspace-api/iommu.rst
> > 
> > diff --git a/Documentation/userspace-api/iommu.rst
> > b/Documentation/userspace-api/iommu.rst new file mode 100644
> > index ..efc3e1838235
> > --- /dev/null
> > +++ b/Documentation/userspace-api/iommu.rst
> > @@ -0,0 +1,339 @@
> > +.. SPDX-License-Identifier: GPL-2.0
> > +.. iommu:
> > +
> > +=
> > +IOMMU Userspace API
> > +=
> > +
> > +IOMMU UAPI is used for virtualization cases where communications
> > are +needed between physical and virtual IOMMU drivers. For native  
> s/native/baremetal?
Sounds good.

> > +usage, IOMMU is a system device which does not need to
> > communicate  
> the IOMMU?
Sounds good.

> > +with user space directly.
> > +
> > +The primary use cases are guest Shared Virtual Address (SVA) and
> > +guest IO virtual address (IOVA), wherein a virtual IOMMU (vIOMMU)
> > is +required to communicate with the physical IOMMU in the host.  
> wherin the vIOMMU implementation relies on the physical IOMMU and for
> this reason requires interactions with the host driver.
> 
Will do, it is more complete.

> > +
> > +.. contents:: :local:
> > +
> > +Functionalities
> > +===
> > +Communications of user and kernel involve both directions. The
> > +supported user-kernel APIs are as follows:
> > +
> > +1. Alloc/Free PASID
> > +2. Bind/unbind guest PASID (e.g. Intel VT-d)
> > +3. Bind/unbind guest PASID table (e.g. ARM sMMU)  
> s/sMMU/SMMU
Sounds good and all below.

> > +4. Invalidate IOMMU caches
> > +5. Service IO page faults (page request and response)  
> Report errors to the guest and serve page requests?
Yes, the UAPI does cover faults beyond PRS.

> > +
> > +Requirements
> > +
> > +The IOMMU UAPIs are generic and extensible to meet the following
> > +requirements:
> > +
> > +1. Emulated and para-virtualised vIOMMUs
> > +2. Multiple vendors (Intel VT-d, ARM sMMU, etc.)  
> SMMU
Ditto.

> > +3. Extensions to the UAPI shall not break existing user space
> > +
> > +Interfaces
> > +==
> > +Although the data structures defined in IOMMU UAPI are
> > self-contained, +there is no user API functions introduced.
> > Instead, IOMMU UAPI is +designed to work with existing user driver
> > frameworks such as VFIO. +
> > +Extension Rules & Precautions
> > +-
> > +When IOMMU UAPI gets extended, the data structures can *only* be
> > +modified in two ways:
> > +
> > +1. Adding new fields by re-purposing the padding[] field. No size
> > change. +2. Adding new union members at the end. May increase in
> > size.  
> May increase the structure sizes
Sounds good.

> > +
> > +No new fields can be added *after* the variable sized union in
> > that it +will break backward compatibility when offset moves. In
> > both cases, a +new flag must be accompanied with a new field such
> > that the IOMMU  
> a new flag must be introduced whenever a change affects the structure
> using either method?
Yours is more clear, will change to:

A new flag must be introduced whenever a change affects the structure
using either method. The IOMMU driver processes the data based on flags
which ensures backward compatibility.

> > +driver can process the data based on the new flag. Version field is
> > +only reserved for the unlikely event of UAPI upgrade at its
> > entirety. +
> > +It's *always* the caller's responsibility to indicate the size of
> > the +structure passed by setting argsz appropriately.
> > +Though at the same time, argsz is user provided data which is not
> > +trusted. The argsz field allows the user to indicate how much data
> > +they're providing, it's still the kernel's responsibility to
> > validate  
> he is providing
Sounds good.

> > +whether it's correct and sufficient for the requested operation.
> > +
> > +Compatibility Checking
> > +--
> > +When IOMMU UAPI extension results in size increase, IOMMU UAPI
> > core  
> in some structure size increase, the IOMMU UAPI code
Better

> > +and vendor driver shall handle the following cases:
> > +
> > +1. User and kernel has exact size match
> > +2. An older user with older kernel header (smaller UAPI size)
> > running o

Re: [PATCH 02/21] dt-binding: memory: mediatek: Extend LARB_NR_MAX to 32

2020-07-20 Thread Rob Herring
On Sat, 11 Jul 2020 14:48:27 +0800, Yong Wu wrote:
> Extend the max larb number definition as mt8192 has larb_nr over 16.
> 
> Signed-off-by: Yong Wu 
> ---
>  include/dt-bindings/memory/mtk-smi-larb-port.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 

Acked-by: Rob Herring 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 01/21] dt-binding: memory: mediatek: Add a common larb-port header file

2020-07-20 Thread Rob Herring
On Sat, 11 Jul 2020 14:48:26 +0800, Yong Wu wrote:
> Put all the macros about smi larb/port togethers, this is a preparing
> patch for extending LARB_NR and adding new dom-id support.
> 
> Signed-off-by: Yong Wu 
> ---
>  include/dt-bindings/memory/mt2712-larb-port.h  |  2 +-
>  include/dt-bindings/memory/mt6779-larb-port.h  |  2 +-
>  include/dt-bindings/memory/mt8173-larb-port.h  |  2 +-
>  include/dt-bindings/memory/mt8183-larb-port.h  |  2 +-
>  include/dt-bindings/memory/mtk-smi-larb-port.h | 15 +++
>  5 files changed, 19 insertions(+), 4 deletions(-)
>  create mode 100644 include/dt-bindings/memory/mtk-smi-larb-port.h
> 

Acked-by: Rob Herring 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] PCI/ATS: PASID and PRI are only enumerated in PF devices.

2020-07-20 Thread kernel test robot
Hi Ashok,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on pci/next]
[also build test ERROR on iommu/next linux/master linus/master v5.8-rc6 
next-20200720]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Ashok-Raj/PCI-ATS-PASID-and-PRI-are-only-enumerated-in-PF-devices/20200721-004510
base:   https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git next
config: x86_64-kexec (attached as .config)
compiler: gcc-9 (Debian 9.3.0-14) 9.3.0
reproduce (this is a W=1 build):
# save the attached .config to linux build tree
make W=1 ARCH=x86_64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

   drivers/iommu/intel/iommu.c: In function 'dmar_insert_one_dev_info':
>> drivers/iommu/intel/iommu.c:2557:8: error: implicit declaration of function 
>> 'pci_pri_supported'; did you mean 'pci_ats_supported'? 
>> [-Werror=implicit-function-declaration]
2557 |pci_pri_supported(pdev))
 |^
 |pci_ats_supported
   cc1: some warnings being treated as errors

vim +2557 drivers/iommu/intel/iommu.c

  2504  
  2505  static struct dmar_domain *dmar_insert_one_dev_info(struct intel_iommu 
*iommu,
  2506  int bus, int devfn,
  2507  struct device *dev,
  2508  struct dmar_domain 
*domain)
  2509  {
  2510  struct dmar_domain *found = NULL;
  2511  struct device_domain_info *info;
  2512  unsigned long flags;
  2513  int ret;
  2514  
  2515  info = alloc_devinfo_mem();
  2516  if (!info)
  2517  return NULL;
  2518  
  2519  if (!dev_is_real_dma_subdevice(dev)) {
  2520  info->bus = bus;
  2521  info->devfn = devfn;
  2522  info->segment = iommu->segment;
  2523  } else {
  2524  struct pci_dev *pdev = to_pci_dev(dev);
  2525  
  2526  info->bus = pdev->bus->number;
  2527  info->devfn = pdev->devfn;
  2528  info->segment = pci_domain_nr(pdev->bus);
  2529  }
  2530  
  2531  info->ats_supported = info->pasid_supported = 
info->pri_supported = 0;
  2532  info->ats_enabled = info->pasid_enabled = info->pri_enabled = 0;
  2533  info->ats_qdep = 0;
  2534  info->dev = dev;
  2535  info->domain = domain;
  2536  info->iommu = iommu;
  2537  info->pasid_table = NULL;
  2538  info->auxd_enabled = 0;
  2539  INIT_LIST_HEAD(&info->auxiliary_domains);
  2540  
  2541  if (dev && dev_is_pci(dev)) {
  2542  struct pci_dev *pdev = to_pci_dev(info->dev);
  2543  
  2544  if (ecap_dev_iotlb_support(iommu->ecap) &&
  2545  pci_ats_supported(pdev) &&
  2546  dmar_find_matched_atsr_unit(pdev))
  2547  info->ats_supported = 1;
  2548  
  2549  if (sm_supported(iommu)) {
  2550  if (pasid_supported(iommu)) {
  2551  int features = pci_pasid_features(pdev);
  2552  if (features >= 0)
  2553  info->pasid_supported = 
features | 1;
  2554  }
  2555  
  2556  if (info->ats_supported && 
ecap_prs(iommu->ecap) &&
> 2557  pci_pri_supported(pdev))
  2558  info->pri_supported = 1;
  2559  }
  2560  }
  2561  
  2562  spin_lock_irqsave(&device_domain_lock, flags);
  2563  if (dev)
  2564  found = find_domain(dev);
  2565  
  2566  if (!found) {
  2567  struct device_domain_info *info2;
  2568  info2 = dmar_search_domain_by_dev_info(info->segment, 
info->bus,
  2569 info->devfn);
  2570  if (info2) {
  2571  found  = info2->domain;
  2572  info2->dev = dev;
  2573  }
  2574  }
  2575  
  2576  if (found) {
  2577  spin_unlock_irqrestore(&device_domain_lock, flags);
  2578  free_devinfo_mem(info);
  2579 

Re: [Freedreno] arm64: Internal error: Oops: qcom_iommu_tlb_inv_context free_io_pgtable_ops on db410c

2020-07-20 Thread Naresh Kamboju
On Mon, 20 Jul 2020 at 21:27, Rob Clark  wrote:
>
> On Mon, Jul 20, 2020 at 4:28 AM Robin Murphy  wrote:
> >
> > On 2020-07-20 08:17, Arnd Bergmann wrote:
> > > On Mon, Jul 20, 2020 at 8:36 AM Naresh Kamboju
> > >  wrote:
<>
> > >> [5.444121] Unable to handle kernel NULL pointer dereference at
> > >> virtual address 0018
> > >> [5.456615]   ESR = 0x9604
> > >> [5.464471]   SET = 0, FnV = 0
> > >> [5.464487]   EA = 0, S1PTW = 0
> > >> [5.466521] Data abort info:
> > >> [5.469971]   ISV = 0, ISS = 0x0004
> > >> [5.472768]   CM = 0, WnR = 0
> > >> [5.476172] user pgtable: 4k pages, 48-bit VAs, pgdp=bacba000
> > >> [5.479349] [0018] pgd=, 
> > >> p4d=
> > >> [5.485820] Internal error: Oops: 9604 [#1] PREEMPT SMP
> > >> [5.492448] Modules linked in: crct10dif_ce adv7511(+)
> > >> qcom_spmi_temp_alarm cec msm(+) mdt_loader qcom_camss videobuf2_dma_sg
> > >> drm_kms_helper v4l2_fwnode videobuf2_memops videobuf2_v4l2 qcom_rng
> > >> videobuf2_common i2c_qcom_cci display_connector socinfo drm qrtr ns
> > >> rmtfs_mem fuse
> > >> [5.500256] CPU: 0 PID: 286 Comm: systemd-udevd Not tainted 5.8.0-rc5 
> > >> #1
> > >> [5.522484] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC 
> > >> (DT)
> > >> [5.529170] pstate: 2005 (nzCv daif -PAN -UAO BTYPE=--)
> > >> [5.535856] pc : qcom_iommu_tlb_inv_context+0x18/0xa8
> > >> [5.541148] lr : free_io_pgtable_ops+0x28/0x58
<>
> > >> [5.628297] Call trace:
> > >> [5.633592]  qcom_iommu_tlb_inv_context+0x18/0xa8
> > >
> > > This means that dev_iommu_fwspec_get() has returned NULL
> > > in qcom_iommu_tlb_inv_context(), either because dev->iommu
> > > is NULL, or because dev->iommu->fwspec is NULL.
> > >
> > > qcom_iommu_tlb_inv_context() does not check for a NULL
> > > pointer before using the returned object.
> > >
> > > The bug is either in the lack of error handling, or the fact
> > > that it's possible to get into this function for a device
> > > that has not been fully set up.
> >
> > Not quite - the device *was* properly set up, but has already been
> > properly torn down again in the removal path by iommu_release_device().
> > The problem is that qcom-iommu kept the device pointer as its TLB cookie
> > for the domain, but the domain has a longer lifespan than the validity
> > of that device - that's a fundamental design flaw in the driver.
>
> fwiw, I just sent "iommu/qcom: Use domain rather than dev as tlb
> cookie".. untested but looks like a straightforward enough change to
> switch over to using the domain rather than dev as cookie

The proposed patch tested and confirmed the reported problem fixed.

ref:
https://lore.kernel.org/linux-iommu/ca+g9fytj1rbycphxzrm-qm5ygtdlj1jd8vfzsqqvwi_dnjl...@mail.gmail.com/T/#m36a1fca18098f6c34275d928f9ba9c40c6d7fd63
https://lkft.validation.linaro.org/scheduler/job/1593950#L3392


>
> BR,
> -R


- Naresh
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu/qcom: Use domain rather than dev as tlb cookie

2020-07-20 Thread Naresh Kamboju
On Mon, 20 Jul 2020 at 21:21, Rob Clark  wrote:
>
> From: Rob Clark 
>
> The device may be torn down, but the domain should still be valid.  Lets
> use that as the tlb flush ops cookie.
>
> Fixes a problem reported in [1]

This proposed fix patch applied on top of linux mainline master
and boot test PASS on db410c.

The reported problem got fixed.

>
> [1] https://lkml.org/lkml/2020/7/20/104
>
> Signed-off-by: Rob Clark 

Reported-by: Naresh Kamboju 
Tested-by: Naresh Kamboju 

> ---
> Note I don't have a good setup to test this atm, but I think it should
> work.
>
>  drivers/iommu/qcom_iommu.c | 37 +
>  1 file changed, 17 insertions(+), 20 deletions(-)
>
> diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
> index c3e1fbd1988c..d176df569af8 100644
> --- a/drivers/iommu/qcom_iommu.c
> +++ b/drivers/iommu/qcom_iommu.c
> @@ -65,6 +65,7 @@ struct qcom_iommu_domain {
> struct mutex init_mutex; /* Protects iommu pointer */
> struct iommu_domain  domain;
> struct qcom_iommu_dev   *iommu;
> +   struct iommu_fwspec *fwspec;
>  };
>
>  static struct qcom_iommu_domain *to_qcom_iommu_domain(struct iommu_domain 
> *dom)
> @@ -84,9 +85,9 @@ static struct qcom_iommu_dev * to_iommu(struct device *dev)
> return dev_iommu_priv_get(dev);
>  }
>
> -static struct qcom_iommu_ctx * to_ctx(struct device *dev, unsigned asid)
> +static struct qcom_iommu_ctx * to_ctx(struct qcom_iommu_domain *d, unsigned 
> asid)
>  {
> -   struct qcom_iommu_dev *qcom_iommu = to_iommu(dev);
> +   struct qcom_iommu_dev *qcom_iommu = d->iommu;
> if (!qcom_iommu)
> return NULL;
> return qcom_iommu->ctxs[asid - 1];
> @@ -118,14 +119,12 @@ iommu_readq(struct qcom_iommu_ctx *ctx, unsigned reg)
>
>  static void qcom_iommu_tlb_sync(void *cookie)
>  {
> -   struct iommu_fwspec *fwspec;
> -   struct device *dev = cookie;
> +   struct qcom_iommu_domain *qcom_domain = cookie;
> +   struct iommu_fwspec *fwspec = qcom_domain->fwspec;
> unsigned i;
>
> -   fwspec = dev_iommu_fwspec_get(dev);
> -
> for (i = 0; i < fwspec->num_ids; i++) {
> -   struct qcom_iommu_ctx *ctx = to_ctx(dev, fwspec->ids[i]);
> +   struct qcom_iommu_ctx *ctx = to_ctx(qcom_domain, 
> fwspec->ids[i]);
> unsigned int val, ret;
>
> iommu_writel(ctx, ARM_SMMU_CB_TLBSYNC, 0);
> @@ -139,14 +138,12 @@ static void qcom_iommu_tlb_sync(void *cookie)
>
>  static void qcom_iommu_tlb_inv_context(void *cookie)
>  {
> -   struct device *dev = cookie;
> -   struct iommu_fwspec *fwspec;
> +   struct qcom_iommu_domain *qcom_domain = cookie;
> +   struct iommu_fwspec *fwspec = qcom_domain->fwspec;
> unsigned i;
>
> -   fwspec = dev_iommu_fwspec_get(dev);
> -
> for (i = 0; i < fwspec->num_ids; i++) {
> -   struct qcom_iommu_ctx *ctx = to_ctx(dev, fwspec->ids[i]);
> +   struct qcom_iommu_ctx *ctx = to_ctx(qcom_domain, 
> fwspec->ids[i]);
> iommu_writel(ctx, ARM_SMMU_CB_S1_TLBIASID, ctx->asid);
> }
>
> @@ -156,16 +153,14 @@ static void qcom_iommu_tlb_inv_context(void *cookie)
>  static void qcom_iommu_tlb_inv_range_nosync(unsigned long iova, size_t size,
> size_t granule, bool leaf, void 
> *cookie)
>  {
> -   struct device *dev = cookie;
> -   struct iommu_fwspec *fwspec;
> +   struct qcom_iommu_domain *qcom_domain = cookie;
> +   struct iommu_fwspec *fwspec = qcom_domain->fwspec;
> unsigned i, reg;
>
> reg = leaf ? ARM_SMMU_CB_S1_TLBIVAL : ARM_SMMU_CB_S1_TLBIVA;
>
> -   fwspec = dev_iommu_fwspec_get(dev);
> -
> for (i = 0; i < fwspec->num_ids; i++) {
> -   struct qcom_iommu_ctx *ctx = to_ctx(dev, fwspec->ids[i]);
> +   struct qcom_iommu_ctx *ctx = to_ctx(qcom_domain, 
> fwspec->ids[i]);
> size_t s = size;
>
> iova = (iova >> 12) << 12;
> @@ -256,7 +251,9 @@ static int qcom_iommu_init_domain(struct iommu_domain 
> *domain,
> };
>
> qcom_domain->iommu = qcom_iommu;
> -   pgtbl_ops = alloc_io_pgtable_ops(ARM_32_LPAE_S1, &pgtbl_cfg, dev);
> +   qcom_domain->fwspec = fwspec;
> +
> +   pgtbl_ops = alloc_io_pgtable_ops(ARM_32_LPAE_S1, &pgtbl_cfg, 
> qcom_domain);
> if (!pgtbl_ops) {
> dev_err(qcom_iommu->dev, "failed to allocate pagetable 
> ops\n");
> ret = -ENOMEM;
> @@ -269,7 +266,7 @@ static int qcom_iommu_init_domain(struct iommu_domain 
> *domain,
> domain->geometry.force_aperture = true;
>
> for (i = 0; i < fwspec->num_ids; i++) {
> -   struct qcom_iommu_ctx *ctx = to_ctx(dev, fwspec->ids[i]);
> +   struct qcom_iommu_ctx *ctx = to_ctx(qcom_domain, 
> fwspec->ids[i]);
>
> if (!ctx->secure_init) {
> ret = 

Re: [PATCH v2 00/12] ACPI/OF: Upgrade MSI/IOMMU ID mapping APIs

2020-07-20 Thread Lorenzo Pieralisi
On Fri, Jun 19, 2020 at 09:20:01AM +0100, Lorenzo Pieralisi wrote:
> This series is a v2 of a previous posting:
> 
> v1 -> v2
> 
> - Removed _rid() wrappers
> - Fixed !CONFIG_ACPI compilation issue
> - Converted of_pci_iommu_init() to use of_iommu_configure_dev_id()
> 
> v1: 
> https://lore.kernel.org/linux-arm-kernel/20200521130008.8266-1-lorenzo.pieral...@arm.com/
> 
> Original cover letter
> -
> 
> Firmware bindings provided in the ACPI IORT table[1] and device tree
> bindings define rules to carry out input/output ID mappings - ie
> retrieving an IOMMU/MSI controller input ID for a device with a given
> ID.
> 
> At the moment these firmware bindings are used exclusively for PCI
> devices and their requester ID to IOMMU/MSI id mapping but there is
> nothing PCI specific in the ACPI and devicetree bindings that prevent
> the firmware and kernel from using the firmware bindings to traslate
> device IDs for any bus that requires its devices to carry out
> input/output id translations.
> 
> The Freescale FSL bus is an example whereby the input/output ID
> translation kernel code put in place for PCI can be reused for devices
> attached to the bus that are not PCI devices.
> 
> This series updates the kernel code to make the MSI/IOMMU input/output
> ID translation PCI agnostic and apply the resulting changes to the
> device ID space provided by the Freescale FSL bus.
> 
> [1] 
> http://infocenter.arm.com/help/topic/com.arm.doc.den0049d/DEN0049D_IO_Remapping_Table.pdf
> 
> Cc: Rob Herring 
> Cc: "Rafael J. Wysocki" 
> Cc: "Joerg Roedel 
> Cc: Hanjun Guo 
> Cc: Bjorn Helgaas 
> Cc: Sudeep Holla 
> Cc: Robin Murphy 
> Cc: Catalin Marinas 
> Cc: Will Deacon 
> Cc: Marc Zyngier 
> 
> Diana Craciun (2):
>   of/irq: make of_msi_map_get_device_domain() bus agnostic
>   bus/fsl-mc: Refactor the MSI domain creation in the DPRC driver
> 
> Laurentiu Tudor (1):
>   dt-bindings: arm: fsl: Add msi-map device-tree binding for fsl-mc bus
> 
> Lorenzo Pieralisi (8):
>   ACPI/IORT: Make iort_match_node_callback walk the ACPI namespace for
> NC
>   ACPI/IORT: Make iort_get_device_domain IRQ domain agnostic
>   ACPI/IORT: Make iort_msi_map_rid() PCI agnostic
>   ACPI/IORT: Remove useless PCI bus walk
>   ACPI/IORT: Add an input ID to acpi_dma_configure()
>   of/iommu: Make of_map_rid() PCI agnostic
>   of/device: Add input id to of_dma_configure()
>   of/irq: Make of_msi_map_rid() PCI bus agnostic
> 
> Makarand Pawagi (1):
>   bus: fsl-mc: Add ACPI support for fsl-mc
> 
>  .../devicetree/bindings/misc/fsl,qoriq-mc.txt |  50 +++-
>  drivers/acpi/arm64/iort.c | 108 --
>  drivers/acpi/scan.c   |   8 +-
>  drivers/bus/fsl-mc/dprc-driver.c  |  31 ++---
>  drivers/bus/fsl-mc/fsl-mc-bus.c   |  79 +
>  drivers/bus/fsl-mc/fsl-mc-msi.c   |  36 --
>  drivers/bus/fsl-mc/fsl-mc-private.h   |   6 +-
>  drivers/iommu/of_iommu.c  |  81 +++--
>  drivers/irqchip/irq-gic-v3-its-fsl-mc-msi.c   | 105 ++---
>  drivers/of/base.c |  42 +++
>  drivers/of/device.c   |   8 +-
>  drivers/of/irq.c  |  34 +++---
>  drivers/pci/msi.c |   9 +-
>  include/acpi/acpi_bus.h   |   9 +-
>  include/linux/acpi.h  |   7 ++
>  include/linux/acpi_iort.h |  20 ++--
>  include/linux/of.h|   4 +-
>  include/linux/of_device.h |  16 ++-
>  include/linux/of_iommu.h  |   6 +-
>  include/linux/of_irq.h|  13 ++-
>  20 files changed, 451 insertions(+), 221 deletions(-)

Hi guys,

I think this series is ready for upstream (there are two ACKs missing
from Rafael on patch (5) and Bjorn on patch (3) - I asked for them), it
touches lots of subsystems so I am not really sure what's the best way
to pull it, more so given that it is also late in the cycle (I do think
it is best to merge it via a single tree, it does not make sense to
split it up in my opinion).

Please let me know.

Thanks,
Lorenzo
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH] PCI/ATS: PASID and PRI are only enumerated in PF devices.

2020-07-20 Thread Ashok Raj
PASID and PRI capabilities are only enumerated in PF devices. VF devices
do not enumerate these capabilites. IOMMU drivers also need to enumerate
them before enabling features in the IOMMU. Extending the same support as
PASID feature discovery (pci_pasid_features) for PRI.

Signed-off-by: Ashok Raj 

To: Bjorn Helgaas 
To: Joerg Roedel 
To: Lu Baolu 
Cc: sta...@vger.kernel.org
Cc: linux-...@vger.kernel.org
Cc: linux-ker...@vger.kernel.org
Cc: Ashok Raj 
Cc: iommu@lists.linux-foundation.org
---
 drivers/iommu/intel/iommu.c |  2 +-
 drivers/pci/ats.c   | 14 ++
 include/linux/pci-ats.h |  1 +
 3 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index d759e7234e98..276452f5e6a7 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -2560,7 +2560,7 @@ static struct dmar_domain 
*dmar_insert_one_dev_info(struct intel_iommu *iommu,
}
 
if (info->ats_supported && ecap_prs(iommu->ecap) &&
-   pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_PRI))
+   pci_pri_supported(pdev))
info->pri_supported = 1;
}
}
diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c
index b761c1f72f67..ffb4de8c5a77 100644
--- a/drivers/pci/ats.c
+++ b/drivers/pci/ats.c
@@ -461,6 +461,20 @@ int pci_pasid_features(struct pci_dev *pdev)
 }
 EXPORT_SYMBOL_GPL(pci_pasid_features);
 
+/**
+ * pci_pri_supported - Check if PRI is supported.
+ * @pdev: PCI device structure
+ *
+ * Returns false when no PRI capability is present.
+ * Returns true if PRI feature is supported and enabled
+ */
+bool pci_pri_supported(struct pci_dev *pdev)
+{
+   /* VFs share the PF PRI configuration */
+   return !!(pci_physfn(pdev)->pri_cap);
+}
+EXPORT_SYMBOL_GPL(pci_pri_supported);
+
 #define PASID_NUMBER_SHIFT 8
 #define PASID_NUMBER_MASK  (0x1f << PASID_NUMBER_SHIFT)
 /**
diff --git a/include/linux/pci-ats.h b/include/linux/pci-ats.h
index f75c307f346d..073d57292445 100644
--- a/include/linux/pci-ats.h
+++ b/include/linux/pci-ats.h
@@ -28,6 +28,7 @@ int pci_enable_pri(struct pci_dev *pdev, u32 reqs);
 void pci_disable_pri(struct pci_dev *pdev);
 int pci_reset_pri(struct pci_dev *pdev);
 int pci_prg_resp_pasid_required(struct pci_dev *pdev);
+bool pci_pri_supported(struct pci_dev *pdev);
 #endif /* CONFIG_PCI_PRI */
 
 #ifdef CONFIG_PCI_PASID
-- 
2.7.4

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [Freedreno] arm64: Internal error: Oops: qcom_iommu_tlb_inv_context free_io_pgtable_ops on db410c

2020-07-20 Thread Rob Clark
On Mon, Jul 20, 2020 at 4:28 AM Robin Murphy  wrote:
>
> On 2020-07-20 08:17, Arnd Bergmann wrote:
> > On Mon, Jul 20, 2020 at 8:36 AM Naresh Kamboju
> >  wrote:
> >>
> >> This kernel oops while boot linux mainline kernel on arm64  db410c device.
> >>
> >> metadata:
> >>git branch: master
> >>git repo: 
> >> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> >>git commit: f8456690ba8eb18ea4714e68554e242a04f65cff
> >>git describe: v5.8-rc5-48-gf8456690ba8e
> >>make_kernelversion: 5.8.0-rc5
> >>kernel-config:
> >> https://builds.tuxbuild.com/2aLnwV7BLStU0t1R1QPwHQ/kernel.config
> >
> > Thanks for the report. Adding freedreno folks to Cc, as this may have 
> > something
> > to do with that driver.
> >
> >>
> >> [5.444121] Unable to handle kernel NULL pointer dereference at
> >> virtual address 0018
> >> [5.456615]   ESR = 0x9604
> >> [5.464471]   SET = 0, FnV = 0
> >> [5.464487]   EA = 0, S1PTW = 0
> >> [5.466521] Data abort info:
> >> [5.469971]   ISV = 0, ISS = 0x0004
> >> [5.472768]   CM = 0, WnR = 0
> >> [5.476172] user pgtable: 4k pages, 48-bit VAs, pgdp=bacba000
> >> [5.479349] [0018] pgd=, 
> >> p4d=
> >> [5.485820] Internal error: Oops: 9604 [#1] PREEMPT SMP
> >> [5.492448] Modules linked in: crct10dif_ce adv7511(+)
> >> qcom_spmi_temp_alarm cec msm(+) mdt_loader qcom_camss videobuf2_dma_sg
> >> drm_kms_helper v4l2_fwnode videobuf2_memops videobuf2_v4l2 qcom_rng
> >> videobuf2_common i2c_qcom_cci display_connector socinfo drm qrtr ns
> >> rmtfs_mem fuse
> >> [5.500256] CPU: 0 PID: 286 Comm: systemd-udevd Not tainted 5.8.0-rc5 #1
> >> [5.522484] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
> >> [5.529170] pstate: 2005 (nzCv daif -PAN -UAO BTYPE=--)
> >> [5.535856] pc : qcom_iommu_tlb_inv_context+0x18/0xa8
> >> [5.541148] lr : free_io_pgtable_ops+0x28/0x58
> >> [5.546350] sp : 80001219b5f0
> >> [5.550689] x29: 80001219b5f0 x28: 0013
> >> [5.554078] x27: 0100 x26: 36add3b8
> >> [5.559459] x25: 8915e910 x24: 3a5458c0
> >> [5.564753] x23: 0003 x22: 36a37058
> >> [5.570049] x21: 36a3a100 x20: 36a3a480
> >> [5.575344] x19: 36a37158 x18: 
> >> [5.580639] x17:  x16: 
> >> [5.585935] x15: 0004 x14: 0368
> >> [5.591229] x13:  x12: 39c61798
> >> [5.596525] x11: 39c616d0 x10: 4000
> >> [5.601820] x9 :  x8 : 39c616f8
> >> [5.607114] x7 :  x6 : 09f699a0
> >> [5.612410] x5 : 80001219b520 x4 : 36a3a000
> >> [5.617705] x3 : 09f69904 x2 : 
> >> [5.623001] x1 : 8000107e27e8 x0 : 3a545810
> >> [5.628297] Call trace:
> >> [5.633592]  qcom_iommu_tlb_inv_context+0x18/0xa8
> >
> > This means that dev_iommu_fwspec_get() has returned NULL
> > in qcom_iommu_tlb_inv_context(), either because dev->iommu
> > is NULL, or because dev->iommu->fwspec is NULL.
> >
> > qcom_iommu_tlb_inv_context() does not check for a NULL
> > pointer before using the returned object.
> >
> > The bug is either in the lack of error handling, or the fact
> > that it's possible to get into this function for a device
> > that has not been fully set up.
>
> Not quite - the device *was* properly set up, but has already been
> properly torn down again in the removal path by iommu_release_device().
> The problem is that qcom-iommu kept the device pointer as its TLB cookie
> for the domain, but the domain has a longer lifespan than the validity
> of that device - that's a fundamental design flaw in the driver.

fwiw, I just sent "iommu/qcom: Use domain rather than dev as tlb
cookie".. untested but looks like a straightforward enough change to
switch over to using the domain rather than dev as cookie

BR,
-R

>
> Robin.
>
> >> [5.635764]  free_io_pgtable_ops+0x28/0x58
> >> [5.640624]  qcom_iommu_domain_free+0x38/0x60
> >> [5.644617]  iommu_group_release+0x4c/0x70
> >> [5.649045]  kobject_put+0x6c/0x120
> >> [5.653035]  kobject_del+0x64/0x90
> >> [5.656421]  kobject_put+0xfc/0x120
> >> [5.659893]  iommu_group_remove_device+0xdc/0xf0
> >> [5.663281]  iommu_release_device+0x44/0x70
> >> [5.668142]  iommu_bus_notifier+0xbc/0xd0
> >> [5.672048]  notifier_call_chain+0x54/0x98
> >> [5.676214]  blocking_notifier_call_chain+0x48/0x70
> >> [5.680209]  device_del+0x26c/0x3a0
> >> [5.684981]  platform_device_del.part.0+0x1c/0x88
> >> [5.688453]  platform_device_unregister+0x24/0x40
> >> [5.693316]  of_platform_device_destroy+0xe4/0xf8
> >> [5.698002]  device_for_each_child+0x5c/0xa8
> >> [5.702689]  of_platform_depopulate+0x3c/0x80
> 

[PATCH] iommu/qcom: Use domain rather than dev as tlb cookie

2020-07-20 Thread Rob Clark
From: Rob Clark 

The device may be torn down, but the domain should still be valid.  Lets
use that as the tlb flush ops cookie.

Fixes a problem reported in [1]

[1] https://lkml.org/lkml/2020/7/20/104

Signed-off-by: Rob Clark 
---
Note I don't have a good setup to test this atm, but I think it should
work.

 drivers/iommu/qcom_iommu.c | 37 +
 1 file changed, 17 insertions(+), 20 deletions(-)

diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
index c3e1fbd1988c..d176df569af8 100644
--- a/drivers/iommu/qcom_iommu.c
+++ b/drivers/iommu/qcom_iommu.c
@@ -65,6 +65,7 @@ struct qcom_iommu_domain {
struct mutex init_mutex; /* Protects iommu pointer */
struct iommu_domain  domain;
struct qcom_iommu_dev   *iommu;
+   struct iommu_fwspec *fwspec;
 };
 
 static struct qcom_iommu_domain *to_qcom_iommu_domain(struct iommu_domain *dom)
@@ -84,9 +85,9 @@ static struct qcom_iommu_dev * to_iommu(struct device *dev)
return dev_iommu_priv_get(dev);
 }
 
-static struct qcom_iommu_ctx * to_ctx(struct device *dev, unsigned asid)
+static struct qcom_iommu_ctx * to_ctx(struct qcom_iommu_domain *d, unsigned 
asid)
 {
-   struct qcom_iommu_dev *qcom_iommu = to_iommu(dev);
+   struct qcom_iommu_dev *qcom_iommu = d->iommu;
if (!qcom_iommu)
return NULL;
return qcom_iommu->ctxs[asid - 1];
@@ -118,14 +119,12 @@ iommu_readq(struct qcom_iommu_ctx *ctx, unsigned reg)
 
 static void qcom_iommu_tlb_sync(void *cookie)
 {
-   struct iommu_fwspec *fwspec;
-   struct device *dev = cookie;
+   struct qcom_iommu_domain *qcom_domain = cookie;
+   struct iommu_fwspec *fwspec = qcom_domain->fwspec;
unsigned i;
 
-   fwspec = dev_iommu_fwspec_get(dev);
-
for (i = 0; i < fwspec->num_ids; i++) {
-   struct qcom_iommu_ctx *ctx = to_ctx(dev, fwspec->ids[i]);
+   struct qcom_iommu_ctx *ctx = to_ctx(qcom_domain, 
fwspec->ids[i]);
unsigned int val, ret;
 
iommu_writel(ctx, ARM_SMMU_CB_TLBSYNC, 0);
@@ -139,14 +138,12 @@ static void qcom_iommu_tlb_sync(void *cookie)
 
 static void qcom_iommu_tlb_inv_context(void *cookie)
 {
-   struct device *dev = cookie;
-   struct iommu_fwspec *fwspec;
+   struct qcom_iommu_domain *qcom_domain = cookie;
+   struct iommu_fwspec *fwspec = qcom_domain->fwspec;
unsigned i;
 
-   fwspec = dev_iommu_fwspec_get(dev);
-
for (i = 0; i < fwspec->num_ids; i++) {
-   struct qcom_iommu_ctx *ctx = to_ctx(dev, fwspec->ids[i]);
+   struct qcom_iommu_ctx *ctx = to_ctx(qcom_domain, 
fwspec->ids[i]);
iommu_writel(ctx, ARM_SMMU_CB_S1_TLBIASID, ctx->asid);
}
 
@@ -156,16 +153,14 @@ static void qcom_iommu_tlb_inv_context(void *cookie)
 static void qcom_iommu_tlb_inv_range_nosync(unsigned long iova, size_t size,
size_t granule, bool leaf, void 
*cookie)
 {
-   struct device *dev = cookie;
-   struct iommu_fwspec *fwspec;
+   struct qcom_iommu_domain *qcom_domain = cookie;
+   struct iommu_fwspec *fwspec = qcom_domain->fwspec;
unsigned i, reg;
 
reg = leaf ? ARM_SMMU_CB_S1_TLBIVAL : ARM_SMMU_CB_S1_TLBIVA;
 
-   fwspec = dev_iommu_fwspec_get(dev);
-
for (i = 0; i < fwspec->num_ids; i++) {
-   struct qcom_iommu_ctx *ctx = to_ctx(dev, fwspec->ids[i]);
+   struct qcom_iommu_ctx *ctx = to_ctx(qcom_domain, 
fwspec->ids[i]);
size_t s = size;
 
iova = (iova >> 12) << 12;
@@ -256,7 +251,9 @@ static int qcom_iommu_init_domain(struct iommu_domain 
*domain,
};
 
qcom_domain->iommu = qcom_iommu;
-   pgtbl_ops = alloc_io_pgtable_ops(ARM_32_LPAE_S1, &pgtbl_cfg, dev);
+   qcom_domain->fwspec = fwspec;
+
+   pgtbl_ops = alloc_io_pgtable_ops(ARM_32_LPAE_S1, &pgtbl_cfg, 
qcom_domain);
if (!pgtbl_ops) {
dev_err(qcom_iommu->dev, "failed to allocate pagetable ops\n");
ret = -ENOMEM;
@@ -269,7 +266,7 @@ static int qcom_iommu_init_domain(struct iommu_domain 
*domain,
domain->geometry.force_aperture = true;
 
for (i = 0; i < fwspec->num_ids; i++) {
-   struct qcom_iommu_ctx *ctx = to_ctx(dev, fwspec->ids[i]);
+   struct qcom_iommu_ctx *ctx = to_ctx(qcom_domain, 
fwspec->ids[i]);
 
if (!ctx->secure_init) {
ret = qcom_scm_restore_sec_cfg(qcom_iommu->sec_id, 
ctx->asid);
@@ -419,7 +416,7 @@ static void qcom_iommu_detach_dev(struct iommu_domain 
*domain, struct device *de
 
pm_runtime_get_sync(qcom_iommu->dev);
for (i = 0; i < fwspec->num_ids; i++) {
-   struct qcom_iommu_ctx *ctx = to_ctx(dev, fwspec->ids[i]);
+   struct qcom_iommu_ctx *ctx = to_ctx(qcom_domain, 
fwspec->ids[i]);
 
/* Disable the context bank: *

Re: [PATCH v2 2/5] iommu/arm-smmu: Emulate bypass by using context banks

2020-07-20 Thread Jordan Crouse
On Mon, Jul 20, 2020 at 09:58:42AM +0100, Will Deacon wrote:
> On Thu, Jul 16, 2020 at 05:16:16PM -0700, Bjorn Andersson wrote:
> > Some firmware found on various Qualcomm platforms traps writes to S2CR
> > of type BYPASS and writes FAULT into the register. This prevents us from
> > marking the streams for the display controller as BYPASS to allow
> > continued scanout of the screen through the initialization of the ARM
> > SMMU.
> > 
> > This adds a Qualcomm specific cfg_probe function, which probes the
> > behavior of the S2CR registers and if found faulty enables the related
> > quirk. Based on this quirk context banks are allocated for IDENTITY
> > domains as well, but with ARM_SMMU_SCTLR_M omitted.
> > 
> > The result is valid stream mappings, without translation.
> > 
> > Tested-by: John Stultz 
> > Tested-by: Vinod Koul 
> > Signed-off-by: Bjorn Andersson 
> > ---
> > 
> > Changes since v1:
> > - Picked up tested-by
> > 
> >  drivers/iommu/arm-smmu-qcom.c | 21 +
> >  drivers/iommu/arm-smmu.c  | 14 --
> >  drivers/iommu/arm-smmu.h  |  3 +++
> >  3 files changed, 36 insertions(+), 2 deletions(-)
> 
> [...]
> 
> > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> > index fb85e716ae9a..5d5fe6741ed4 100644
> > --- a/drivers/iommu/arm-smmu.c
> > +++ b/drivers/iommu/arm-smmu.c
> > @@ -654,7 +654,9 @@ static void arm_smmu_write_context_bank(struct 
> > arm_smmu_device *smmu, int idx)
> >  
> > /* SCTLR */
> > reg = ARM_SMMU_SCTLR_CFIE | ARM_SMMU_SCTLR_CFRE | ARM_SMMU_SCTLR_AFE |
> > - ARM_SMMU_SCTLR_TRE | ARM_SMMU_SCTLR_M;
> > + ARM_SMMU_SCTLR_TRE;
> > +   if (cfg->m)
> > +   reg |= ARM_SMMU_SCTLR_M;
> > if (stage1)
> > reg |= ARM_SMMU_SCTLR_S1_ASIDPNE;
> > if (IS_ENABLED(CONFIG_CPU_BIG_ENDIAN))
> > @@ -678,7 +680,11 @@ static int arm_smmu_init_domain_context(struct 
> > iommu_domain *domain,
> > if (smmu_domain->smmu)
> > goto out_unlock;
> >  
> > -   if (domain->type == IOMMU_DOMAIN_IDENTITY) {
> > +   /*
> > +* Nothing to do for IDENTITY domains,unless disabled context banks are
> > +* used to emulate bypass mappings on Qualcomm platforms.
> > +*/
> > +   if (domain->type == IOMMU_DOMAIN_IDENTITY && !smmu->qcom_bypass_quirk) {
> 
> Given that the other thread [1] with Jordan (why haven't you cc'd him?! --
> adding him now) has identified the need for a callback to allocate the
> context bank, why don't we use the same sort of idea here? If the impl
> provides a CB allocator function, call it irrespective of the domain type.
> If it allocates a domain even for an identity domain, then we can install
> if with SCTLR.M clear.

Here is what I have so far for the context bank allocator.  I think its a good
start, but it still feels a bit half baked, so comments definitely welcome.

https://lists.linuxfoundation.org/pipermail/iommu/2020-July/046754.html
https://lists.linuxfoundation.org/pipermail/iommu/2020-July/046752.html

> Will
> 
> [1] https://lore.kernel.org/r/20200716151625.ga14...@jcrouse1-lnx.qualcomm.com

-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v10 08/13] drm/msm: Add a context pointer to the submitqueue

2020-07-20 Thread Jordan Crouse
Each submitqueue is attached to a context. Add a pointer to the
context to the submitqueue at create time and refcount it so
that it stays around through the life of the queue.

GPU submissions can access the active context via the submitqueue
instead of requiring it to be passed around from function to
function.

Signed-off-by: Jordan Crouse 
---

 drivers/gpu/drm/msm/adreno/a5xx_gpu.c   | 12 +---
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c   |  5 ++---
 drivers/gpu/drm/msm/adreno/adreno_gpu.c |  5 ++---
 drivers/gpu/drm/msm/adreno/adreno_gpu.h |  3 +--
 drivers/gpu/drm/msm/msm_drv.c   |  3 ++-
 drivers/gpu/drm/msm/msm_drv.h   |  8 
 drivers/gpu/drm/msm/msm_gem.h   |  1 +
 drivers/gpu/drm/msm/msm_gem_submit.c|  8 
 drivers/gpu/drm/msm/msm_gpu.c   |  9 -
 drivers/gpu/drm/msm/msm_gpu.h   |  7 +++
 drivers/gpu/drm/msm/msm_submitqueue.c   |  8 +++-
 11 files changed, 39 insertions(+), 30 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
index 9e63a190642c..eff2439ea57b 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
@@ -43,8 +43,7 @@ static void a5xx_flush(struct msm_gpu *gpu, struct 
msm_ringbuffer *ring)
gpu_write(gpu, REG_A5XX_CP_RB_WPTR, wptr);
 }
 
-static void a5xx_submit_in_rb(struct msm_gpu *gpu, struct msm_gem_submit 
*submit,
-   struct msm_file_private *ctx)
+static void a5xx_submit_in_rb(struct msm_gpu *gpu, struct msm_gem_submit 
*submit)
 {
struct msm_drm_private *priv = gpu->dev->dev_private;
struct msm_ringbuffer *ring = submit->ring;
@@ -57,7 +56,7 @@ static void a5xx_submit_in_rb(struct msm_gpu *gpu, struct 
msm_gem_submit *submit
case MSM_SUBMIT_CMD_IB_TARGET_BUF:
break;
case MSM_SUBMIT_CMD_CTX_RESTORE_BUF:
-   if (priv->lastctx == ctx)
+   if (priv->lastctx == submit->queue->ctx)
break;
/* fall-thru */
case MSM_SUBMIT_CMD_BUF:
@@ -103,8 +102,7 @@ static void a5xx_submit_in_rb(struct msm_gpu *gpu, struct 
msm_gem_submit *submit
msm_gpu_retire(gpu);
 }
 
-static void a5xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit,
-   struct msm_file_private *ctx)
+static void a5xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
 {
struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
struct a5xx_gpu *a5xx_gpu = to_a5xx_gpu(adreno_gpu);
@@ -114,7 +112,7 @@ static void a5xx_submit(struct msm_gpu *gpu, struct 
msm_gem_submit *submit,
 
if (IS_ENABLED(CONFIG_DRM_MSM_GPU_SUDO) && submit->in_rb) {
priv->lastctx = NULL;
-   a5xx_submit_in_rb(gpu, submit, ctx);
+   a5xx_submit_in_rb(gpu, submit);
return;
}
 
@@ -148,7 +146,7 @@ static void a5xx_submit(struct msm_gpu *gpu, struct 
msm_gem_submit *submit,
case MSM_SUBMIT_CMD_IB_TARGET_BUF:
break;
case MSM_SUBMIT_CMD_CTX_RESTORE_BUF:
-   if (priv->lastctx == ctx)
+   if (priv->lastctx == submit->queue->ctx)
break;
/* fall-thru */
case MSM_SUBMIT_CMD_BUF:
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index c5a3e4d4c007..5eabb0109577 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -81,8 +81,7 @@ static void get_stats_counter(struct msm_ringbuffer *ring, 
u32 counter,
OUT_RING(ring, upper_32_bits(iova));
 }
 
-static void a6xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit,
-   struct msm_file_private *ctx)
+static void a6xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
 {
unsigned int index = submit->seqno % MSM_GPU_SUBMIT_STATS_COUNT;
struct msm_drm_private *priv = gpu->dev->dev_private;
@@ -115,7 +114,7 @@ static void a6xx_submit(struct msm_gpu *gpu, struct 
msm_gem_submit *submit,
case MSM_SUBMIT_CMD_IB_TARGET_BUF:
break;
case MSM_SUBMIT_CMD_CTX_RESTORE_BUF:
-   if (priv->lastctx == ctx)
+   if (priv->lastctx == submit->queue->ctx)
break;
/* fall-thru */
case MSM_SUBMIT_CMD_BUF:
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c 
b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index e23641a5ec84..b38a8126541a 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -457,8 +457,7 @@ void adreno_recover(struct msm_gpu *gpu)
}
 }
 
-void adreno_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit,
-   struct msm_file_private *ctx

[PATCH v10 12/13] drm/msm/a6xx: Add support for per-instance pagetables

2020-07-20 Thread Jordan Crouse
Add support for using per-instance pagetables if all the dependencies are
available.

Signed-off-by: Jordan Crouse 
---

 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 53 +++
 drivers/gpu/drm/msm/adreno/a6xx_gpu.h |  1 +
 drivers/gpu/drm/msm/msm_ringbuffer.h  |  1 +
 3 files changed, 55 insertions(+)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 5eabb0109577..57c6cdec7e9a 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -81,6 +81,41 @@ static void get_stats_counter(struct msm_ringbuffer *ring, 
u32 counter,
OUT_RING(ring, upper_32_bits(iova));
 }
 
+static void a6xx_set_pagetable(struct a6xx_gpu *a6xx_gpu,
+   struct msm_ringbuffer *ring, struct msm_file_private *ctx)
+{
+   phys_addr_t ttbr;
+   u32 asid;
+   u64 memptr = rbmemptr(ring, ttbr0);
+
+   if (ctx == a6xx_gpu->cur_ctx)
+   return;
+
+   if (msm_iommu_pagetable_params(ctx->aspace->mmu, &ttbr, &asid))
+   return;
+
+   /* Execute the table update */
+   OUT_PKT7(ring, CP_SMMU_TABLE_UPDATE, 4);
+   OUT_RING(ring, CP_SMMU_TABLE_UPDATE_0_TTBR0_LO(lower_32_bits(ttbr)));
+   OUT_RING(ring,
+   CP_SMMU_TABLE_UPDATE_1_TTBR0_HI(upper_32_bits(ttbr)) |
+   CP_SMMU_TABLE_UPDATE_1_ASID(asid));
+   OUT_RING(ring, CP_SMMU_TABLE_UPDATE_2_CONTEXTIDR(0));
+   OUT_RING(ring, CP_SMMU_TABLE_UPDATE_3_CONTEXTBANK(0));
+
+   /*
+* Write the new TTBR0 to the memstore. This is good for debugging.
+*/
+   OUT_PKT7(ring, CP_MEM_WRITE, 4);
+   OUT_RING(ring, CP_MEM_WRITE_0_ADDR_LO(lower_32_bits(memptr)));
+   OUT_RING(ring, CP_MEM_WRITE_1_ADDR_HI(upper_32_bits(memptr)));
+   OUT_RING(ring, lower_32_bits(ttbr));
+   OUT_RING(ring, (asid << 16) | upper_32_bits(ttbr));
+
+
+   a6xx_gpu->cur_ctx = ctx;
+}
+
 static void a6xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
 {
unsigned int index = submit->seqno % MSM_GPU_SUBMIT_STATS_COUNT;
@@ -90,6 +125,8 @@ static void a6xx_submit(struct msm_gpu *gpu, struct 
msm_gem_submit *submit)
struct msm_ringbuffer *ring = submit->ring;
unsigned int i;
 
+   a6xx_set_pagetable(a6xx_gpu, ring, submit->queue->ctx);
+
get_stats_counter(ring, REG_A6XX_RBBM_PERFCTR_CP_0_LO,
rbmemptr_stats(ring, index, cpcycles_start));
 
@@ -696,6 +733,8 @@ static int a6xx_hw_init(struct msm_gpu *gpu)
/* Always come up on rb 0 */
a6xx_gpu->cur_ring = gpu->rb[0];
 
+   a6xx_gpu->cur_ctx = NULL;
+
/* Enable the SQE_to start the CP engine */
gpu_write(gpu, REG_A6XX_CP_SQE_CNTL, 1);
 
@@ -1008,6 +1047,19 @@ static unsigned long a6xx_gpu_busy(struct msm_gpu *gpu)
return (unsigned long)busy_time;
 }
 
+static struct msm_gem_address_space *
+a6xx_create_private_address_space(struct msm_gpu *gpu)
+{
+   struct msm_mmu *mmu;
+
+   mmu = msm_iommu_pagetable_create(gpu->aspace->mmu);
+   if (IS_ERR(mmu))
+   return msm_gem_address_space_get(gpu->aspace);
+
+   return msm_gem_address_space_create(mmu,
+   "gpu", 0x1ULL, 0x1ULL);
+}
+
 static const struct adreno_gpu_funcs funcs = {
.base = {
.get_param = adreno_get_param,
@@ -1031,6 +1083,7 @@ static const struct adreno_gpu_funcs funcs = {
.gpu_state_put = a6xx_gpu_state_put,
 #endif
.create_address_space = adreno_iommu_create_address_space,
+   .create_private_address_space = 
a6xx_create_private_address_space,
},
.get_timestamp = a6xx_get_timestamp,
 };
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
index 03ba60d5b07f..da22d7549d9b 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
@@ -19,6 +19,7 @@ struct a6xx_gpu {
uint64_t sqe_iova;
 
struct msm_ringbuffer *cur_ring;
+   struct msm_file_private *cur_ctx;
 
struct a6xx_gmu gmu;
 };
diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.h 
b/drivers/gpu/drm/msm/msm_ringbuffer.h
index 7764373d0ed2..0987d6bf848c 100644
--- a/drivers/gpu/drm/msm/msm_ringbuffer.h
+++ b/drivers/gpu/drm/msm/msm_ringbuffer.h
@@ -31,6 +31,7 @@ struct msm_rbmemptrs {
volatile uint32_t fence;
 
volatile struct msm_gpu_submit_stats stats[MSM_GPU_SUBMIT_STATS_COUNT];
+   volatile u64 ttbr0;
 };
 
 struct msm_ringbuffer {
-- 
2.25.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v10 13/13] arm: dts: qcom: sm845: Set the compatible string for the GPU SMMU

2020-07-20 Thread Jordan Crouse
Set the qcom,adreno-smmu compatible string for the GPU SMMU to enable
split pagetables and per-instance pagetables for drm/msm.

Signed-off-by: Jordan Crouse 
---

 arch/arm64/boot/dts/qcom/sdm845.dtsi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi 
b/arch/arm64/boot/dts/qcom/sdm845.dtsi
index 759cdd0b002b..d883144360aa 100644
--- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
+++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
@@ -3981,7 +3981,7 @@ opp-25700 {
};
 
adreno_smmu: iommu@504 {
-   compatible = "qcom,sdm845-smmu-v2", "qcom,smmu-v2";
+   compatible = "qcom,adreno-smmu", "qcom,smmu-v2";
reg = <0 0x504 0 0x1>;
#iommu-cells = <1>;
#global-interrupts = <2>;
-- 
2.25.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v10 00/13] iommu/arm-smmu: Add Adreno SMMU specific implementation

2020-07-20 Thread Jordan Crouse
(reworded the summary to reflect ongoing changes in the code)

This series adds an Adreno SMMU implementation to arm-smmu to allow GPU hardware
pagetable switching.

The Adreno GPU has built in capabilities to switch the TTBR0 pagetable during
runtime to allow each individual instance or application to have its own
pagetable.  In order to take advantage of the HW capabilities there are certain
requirements needed of the SMMU hardware.

This series adds support for an Adreno specific arm-smmu implementation. The new
implementation 1) ensures that the GPU domain is always assigned context bank 0,
2) enables split pagetable support (TTBR1) so that the instance specific
pagetable can be swapped while the global memory remains in place and 3) shares
the current pagetable configuration with the GPU driver to allow it to create
its own io-pgtable instances.

The series then adds the drm/msm code to enable these features. For targets that
support it allocate new pagetables using the io-pgtable configuration shared by
the arm-smmu driver and swap them in during runtime.

This version of the series merges the previous patchset(s) [1] and [2]
with the following improvements:

  - arm-smmu: add implementation hook to allocate context banks
  - arm-smmu: Match the GPU domain by stream ID instead of compatible string
  - arm-smmu: Make DOMAIN_ATTR_PGTABLE_CFG bi-directional. The leaf driver
queries the configuration to create a pagetable and then sends the newly
created configuration back to the smmu-driver to enable TTBR0
  - drm/msm: Add context reference counting for submissions
  - drm/msm: Use dummy functions to skip TLB operations on per-instance
pagetables

[1] https://lists.linuxfoundation.org/pipermail/iommu/2020-June/045653.html
[2] https://lists.linuxfoundation.org/pipermail/iommu/2020-June/045659.html


Jordan Crouse (13):
  iommu/arm-smmu: Pass io-pgtable config to implementation specific
function
  iommu/arm-smmu: Add support for split pagetables
  iommu/arm-smmu: Add implementation hooks to configure contexts
  iommu/arm-smmu-qcom: Add implementation for the adreno GPU SMMU
  iommu: Add a domain attribute to get/set a pagetable configuration
  iommu/arm-smmu-qcom: Get and set the pagetable config for split
pagetables
  dt-bindings: arm-smmu: Add compatible string for Adreno GPU SMMU
  drm/msm: Add a context pointer to the submitqueue
  drm/msm: Set the global virtual address range from the IOMMU domain
  drm/msm: Add support to create a local pagetable
  drm/msm: Add support for private address space instances
  drm/msm/a6xx: Add support for per-instance pagetables
  arm: dts: qcom: sm845: Set the compatible string for the GPU SMMU

 .../devicetree/bindings/iommu/arm,smmu.yaml   |   4 +
 arch/arm64/boot/dts/qcom/sdm845.dtsi  |   2 +-
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c |  12 +-
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c |  58 -
 drivers/gpu/drm/msm/adreno/a6xx_gpu.h |   1 +
 drivers/gpu/drm/msm/adreno/adreno_gpu.c   |  18 +-
 drivers/gpu/drm/msm/adreno/adreno_gpu.h   |   3 +-
 drivers/gpu/drm/msm/msm_drv.c |  16 +-
 drivers/gpu/drm/msm/msm_drv.h |  13 ++
 drivers/gpu/drm/msm/msm_gem.h |   1 +
 drivers/gpu/drm/msm/msm_gem_submit.c  |   8 +-
 drivers/gpu/drm/msm/msm_gem_vma.c |   9 +
 drivers/gpu/drm/msm/msm_gpu.c |  26 ++-
 drivers/gpu/drm/msm/msm_gpu.h |  12 +-
 drivers/gpu/drm/msm/msm_gpummu.c  |   2 +-
 drivers/gpu/drm/msm/msm_iommu.c   | 198 +-
 drivers/gpu/drm/msm/msm_mmu.h |  16 +-
 drivers/gpu/drm/msm/msm_ringbuffer.h  |   1 +
 drivers/gpu/drm/msm/msm_submitqueue.c |   8 +-
 drivers/iommu/arm-smmu-impl.c |   6 +-
 drivers/iommu/arm-smmu-qcom.c | 130 +++-
 drivers/iommu/arm-smmu.c  | 108 +-
 drivers/iommu/arm-smmu.h  |  65 +-
 include/linux/iommu.h |   1 +
 24 files changed, 619 insertions(+), 99 deletions(-)

-- 
2.25.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v10 01/13] iommu/arm-smmu: Pass io-pgtable config to implementation specific function

2020-07-20 Thread Jordan Crouse
Construct the io-pgtable config before calling the implementation specific
init_context function and pass it so the implementation specific function
can get a chance to change it before the io-pgtable is created.

Signed-off-by: Jordan Crouse 
---

 drivers/iommu/arm-smmu-impl.c |  3 ++-
 drivers/iommu/arm-smmu.c  | 11 ++-
 drivers/iommu/arm-smmu.h  |  3 ++-
 3 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/arm-smmu-impl.c b/drivers/iommu/arm-smmu-impl.c
index c75b9d957b70..a20e426d81ac 100644
--- a/drivers/iommu/arm-smmu-impl.c
+++ b/drivers/iommu/arm-smmu-impl.c
@@ -68,7 +68,8 @@ static int cavium_cfg_probe(struct arm_smmu_device *smmu)
return 0;
 }
 
-static int cavium_init_context(struct arm_smmu_domain *smmu_domain)
+static int cavium_init_context(struct arm_smmu_domain *smmu_domain,
+   struct io_pgtable_cfg *pgtbl_cfg)
 {
struct cavium_smmu *cs = container_of(smmu_domain->smmu,
  struct cavium_smmu, smmu);
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 243bc4cb2705..0e2c65ee9e5a 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -797,11 +797,6 @@ static int arm_smmu_init_domain_context(struct 
iommu_domain *domain,
cfg->asid = cfg->cbndx;
 
smmu_domain->smmu = smmu;
-   if (smmu->impl && smmu->impl->init_context) {
-   ret = smmu->impl->init_context(smmu_domain);
-   if (ret)
-   goto out_unlock;
-   }
 
pgtbl_cfg = (struct io_pgtable_cfg) {
.pgsize_bitmap  = smmu->pgsize_bitmap,
@@ -812,6 +807,12 @@ static int arm_smmu_init_domain_context(struct 
iommu_domain *domain,
.iommu_dev  = smmu->dev,
};
 
+   if (smmu->impl && smmu->impl->init_context) {
+   ret = smmu->impl->init_context(smmu_domain, &pgtbl_cfg);
+   if (ret)
+   goto out_clear_smmu;
+   }
+
if (smmu_domain->non_strict)
pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
 
diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h
index d172c024be61..38b041530a4f 100644
--- a/drivers/iommu/arm-smmu.h
+++ b/drivers/iommu/arm-smmu.h
@@ -383,7 +383,8 @@ struct arm_smmu_impl {
u64 val);
int (*cfg_probe)(struct arm_smmu_device *smmu);
int (*reset)(struct arm_smmu_device *smmu);
-   int (*init_context)(struct arm_smmu_domain *smmu_domain);
+   int (*init_context)(struct arm_smmu_domain *smmu_domain,
+   struct io_pgtable_cfg *cfg);
void (*tlb_sync)(struct arm_smmu_device *smmu, int page, int sync,
 int status);
int (*def_domain_type)(struct device *dev);
-- 
2.25.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v10 03/13] iommu/arm-smmu: Add implementation hooks to configure contexts

2020-07-20 Thread Jordan Crouse
Add a new hook to allow implementations to implement their own context
bank allocation scheme and update the existing init_context function to
take the device pointer.

These modifications will be used by the upcoming Adreno SMMU
implementation to identify the GPU device and properly configure it
for pagetable switching.

Signed-off-by: Jordan Crouse 
---

 drivers/iommu/arm-smmu-impl.c |  2 +-
 drivers/iommu/arm-smmu.c  | 46 ---
 drivers/iommu/arm-smmu.h  | 28 -
 3 files changed, 44 insertions(+), 32 deletions(-)

diff --git a/drivers/iommu/arm-smmu-impl.c b/drivers/iommu/arm-smmu-impl.c
index a20e426d81ac..b71b14685cc9 100644
--- a/drivers/iommu/arm-smmu-impl.c
+++ b/drivers/iommu/arm-smmu-impl.c
@@ -69,7 +69,7 @@ static int cavium_cfg_probe(struct arm_smmu_device *smmu)
 }
 
 static int cavium_init_context(struct arm_smmu_domain *smmu_domain,
-   struct io_pgtable_cfg *pgtbl_cfg)
+   struct io_pgtable_cfg *pgtbl_cfg, struct device *dev)
 {
struct cavium_smmu *cs = container_of(smmu_domain->smmu,
  struct cavium_smmu, smmu);
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 8798428a4c8d..fff536a44faa 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -93,16 +93,6 @@ struct arm_smmu_cb {
struct arm_smmu_cfg *cfg;
 };
 
-struct arm_smmu_master_cfg {
-   struct arm_smmu_device  *smmu;
-   s16 smendx[];
-};
-#define INVALID_SMENDX -1
-#define cfg_smendx(cfg, fw, i) \
-   (i >= fw->num_ids ? INVALID_SMENDX : cfg->smendx[i])
-#define for_each_cfg_sme(cfg, fw, i, idx) \
-   for (i = 0; idx = cfg_smendx(cfg, fw, i), i < fw->num_ids; ++i)
-
 static bool using_legacy_binding, using_generic_binding;
 
 static inline int arm_smmu_rpm_get(struct arm_smmu_device *smmu)
@@ -237,19 +227,6 @@ static int arm_smmu_register_legacy_master(struct device 
*dev,
 }
 #endif /* CONFIG_ARM_SMMU_LEGACY_DT_BINDINGS */
 
-static int __arm_smmu_alloc_bitmap(unsigned long *map, int start, int end)
-{
-   int idx;
-
-   do {
-   idx = find_next_zero_bit(map, end, start);
-   if (idx == end)
-   return -ENOSPC;
-   } while (test_and_set_bit(idx, map));
-
-   return idx;
-}
-
 static void __arm_smmu_free_bitmap(unsigned long *map, int idx)
 {
clear_bit(idx, map);
@@ -668,7 +645,8 @@ static void arm_smmu_write_context_bank(struct 
arm_smmu_device *smmu, int idx)
 }
 
 static int arm_smmu_init_domain_context(struct iommu_domain *domain,
-   struct arm_smmu_device *smmu)
+   struct arm_smmu_device *smmu,
+   struct device *dev)
 {
int irq, start, ret = 0;
unsigned long ias, oas;
@@ -782,10 +760,20 @@ static int arm_smmu_init_domain_context(struct 
iommu_domain *domain,
ret = -EINVAL;
goto out_unlock;
}
-   ret = __arm_smmu_alloc_bitmap(smmu->context_map, start,
+
+   smmu_domain->smmu = smmu;
+
+   if (smmu->impl && smmu->impl->alloc_context_bank)
+   ret = smmu->impl->alloc_context_bank(smmu_domain, dev,
+   start, smmu->num_context_banks);
+   else
+   ret = __arm_smmu_alloc_bitmap(smmu->context_map, start,
  smmu->num_context_banks);
-   if (ret < 0)
+
+   if (ret < 0) {
+   smmu_domain->smmu = NULL;
goto out_unlock;
+   }
 
cfg->cbndx = ret;
if (smmu->version < ARM_SMMU_V2) {
@@ -800,8 +788,6 @@ static int arm_smmu_init_domain_context(struct iommu_domain 
*domain,
else
cfg->asid = cfg->cbndx;
 
-   smmu_domain->smmu = smmu;
-
pgtbl_cfg = (struct io_pgtable_cfg) {
.pgsize_bitmap  = smmu->pgsize_bitmap,
.ias= ias,
@@ -812,7 +798,7 @@ static int arm_smmu_init_domain_context(struct iommu_domain 
*domain,
};
 
if (smmu->impl && smmu->impl->init_context) {
-   ret = smmu->impl->init_context(smmu_domain, &pgtbl_cfg);
+   ret = smmu->impl->init_context(smmu_domain, &pgtbl_cfg, dev);
if (ret)
goto out_clear_smmu;
}
@@ -1190,7 +1176,7 @@ static int arm_smmu_attach_dev(struct iommu_domain 
*domain, struct device *dev)
return ret;
 
/* Ensure that the domain is finalised */
-   ret = arm_smmu_init_domain_context(domain, smmu);
+   ret = arm_smmu_init_domain_context(domain, smmu, dev);
if (ret < 0)
goto rpm_put;
 
diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h
index 5f2de20e883b..d10d745a0290 100644
--- a/drivers/iommu/arm-smmu.h
+++ b/drivers/iommu/arm-smmu.h
@@ -347

[PATCH v10 07/13] dt-bindings: arm-smmu: Add compatible string for Adreno GPU SMMU

2020-07-20 Thread Jordan Crouse
Every Qcom Adreno GPU has an embedded SMMU for its own use. These
devices depend on unique features such as split pagetables,
different stall/halt requirements and other settings. Identify them
with a compatible string so that they can be identified in the
arm-smmu implementation specific code.

Signed-off-by: Jordan Crouse 
Reviewed-by: Rob Herring 
---

 Documentation/devicetree/bindings/iommu/arm,smmu.yaml | 4 
 1 file changed, 4 insertions(+)

diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml 
b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
index d7ceb4c34423..e52a1b146c97 100644
--- a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
+++ b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
@@ -38,6 +38,10 @@ properties:
   - qcom,sc7180-smmu-500
   - qcom,sdm845-smmu-500
   - const: arm,mmu-500
+  - description: Qcom Adreno GPUs implementing "arm,smmu-v2"
+items:
+  - const: qcom,adreno-smmu
+  - const: qcom,smmu-v2
   - items:
   - const: arm,mmu-500
   - const: arm,smmu-v2
-- 
2.25.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v10 09/13] drm/msm: Set the global virtual address range from the IOMMU domain

2020-07-20 Thread Jordan Crouse
Use the aperture settings from the IOMMU domain to set up the virtual
address range for the GPU. This allows us to transparently deal with
IOMMU side features (like split pagetables).

Signed-off-by: Jordan Crouse 
---

 drivers/gpu/drm/msm/adreno/adreno_gpu.c | 13 +++--
 drivers/gpu/drm/msm/msm_iommu.c |  7 +++
 2 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c 
b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index b38a8126541a..f9e3badf2fca 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -192,9 +192,18 @@ adreno_iommu_create_address_space(struct msm_gpu *gpu,
struct iommu_domain *iommu = iommu_domain_alloc(&platform_bus_type);
struct msm_mmu *mmu = msm_iommu_new(&pdev->dev, iommu);
struct msm_gem_address_space *aspace;
+   u64 start, size;
 
-   aspace = msm_gem_address_space_create(mmu, "gpu", SZ_16M,
-   0x - SZ_16M);
+   /*
+* Use the aperture start or SZ_16M, whichever is greater. This will
+* ensure that we align with the allocated pagetable range while still
+* allowing room in the lower 32 bits for GMEM and whatnot
+*/
+   start = max_t(u64, SZ_16M, iommu->geometry.aperture_start);
+   size = iommu->geometry.aperture_end - start + 1;
+
+   aspace = msm_gem_address_space_create(mmu, "gpu",
+   start & GENMASK(48, 0), size);
 
if (IS_ERR(aspace) && !IS_ERR(mmu))
mmu->funcs->destroy(mmu);
diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c
index 3a381a9674c9..1b6635504069 100644
--- a/drivers/gpu/drm/msm/msm_iommu.c
+++ b/drivers/gpu/drm/msm/msm_iommu.c
@@ -36,6 +36,10 @@ static int msm_iommu_map(struct msm_mmu *mmu, uint64_t iova,
struct msm_iommu *iommu = to_msm_iommu(mmu);
size_t ret;
 
+   /* The arm-smmu driver expects the addresses to be sign extended */
+   if (iova & BIT_ULL(48))
+   iova |= GENMASK_ULL(63, 49);
+
ret = iommu_map_sg(iommu->domain, iova, sgt->sgl, sgt->nents, prot);
WARN_ON(!ret);
 
@@ -46,6 +50,9 @@ static int msm_iommu_unmap(struct msm_mmu *mmu, uint64_t 
iova, size_t len)
 {
struct msm_iommu *iommu = to_msm_iommu(mmu);
 
+   if (iova & BIT_ULL(48))
+   iova |= GENMASK_ULL(63, 49);
+
iommu_unmap(iommu->domain, iova, len);
 
return 0;
-- 
2.25.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v10 10/13] drm/msm: Add support to create a local pagetable

2020-07-20 Thread Jordan Crouse
Add support to create a io-pgtable for use by targets that support
per-instance pagetables. In order to support per-instance pagetables the
GPU SMMU device needs to have the qcom,adreno-smmu compatible string and
split pagetables enabled.

Signed-off-by: Jordan Crouse 
---

 drivers/gpu/drm/msm/msm_gpummu.c |   2 +-
 drivers/gpu/drm/msm/msm_iommu.c  | 191 ++-
 drivers/gpu/drm/msm/msm_mmu.h|  16 ++-
 3 files changed, 206 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gpummu.c b/drivers/gpu/drm/msm/msm_gpummu.c
index 310a31b05faa..aab121f4beb7 100644
--- a/drivers/gpu/drm/msm/msm_gpummu.c
+++ b/drivers/gpu/drm/msm/msm_gpummu.c
@@ -102,7 +102,7 @@ struct msm_mmu *msm_gpummu_new(struct device *dev, struct 
msm_gpu *gpu)
}
 
gpummu->gpu = gpu;
-   msm_mmu_init(&gpummu->base, dev, &funcs);
+   msm_mmu_init(&gpummu->base, dev, &funcs, MSM_MMU_GPUMMU);
 
return &gpummu->base;
 }
diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c
index 1b6635504069..8cf8c7f7a665 100644
--- a/drivers/gpu/drm/msm/msm_iommu.c
+++ b/drivers/gpu/drm/msm/msm_iommu.c
@@ -4,15 +4,202 @@
  * Author: Rob Clark 
  */
 
+#include 
 #include "msm_drv.h"
 #include "msm_mmu.h"
 
 struct msm_iommu {
struct msm_mmu base;
struct iommu_domain *domain;
+   atomic_t pagetables;
 };
+
 #define to_msm_iommu(x) container_of(x, struct msm_iommu, base)
 
+struct msm_iommu_pagetable {
+   struct msm_mmu base;
+   struct msm_mmu *parent;
+   struct io_pgtable_ops *pgtbl_ops;
+   phys_addr_t ttbr;
+   u32 asid;
+};
+static struct msm_iommu_pagetable *to_pagetable(struct msm_mmu *mmu)
+{
+   return container_of(mmu, struct msm_iommu_pagetable, base);
+}
+
+static int msm_iommu_pagetable_unmap(struct msm_mmu *mmu, u64 iova,
+   size_t size)
+{
+   struct msm_iommu_pagetable *pagetable = to_pagetable(mmu);
+   struct io_pgtable_ops *ops = pagetable->pgtbl_ops;
+   size_t unmapped = 0;
+
+   /* Unmap the block one page at a time */
+   while (size) {
+   unmapped += ops->unmap(ops, iova, 4096, NULL);
+   iova += 4096;
+   size -= 4096;
+   }
+
+   iommu_flush_tlb_all(to_msm_iommu(pagetable->parent)->domain);
+
+   return (unmapped == size) ? 0 : -EINVAL;
+}
+
+static int msm_iommu_pagetable_map(struct msm_mmu *mmu, u64 iova,
+   struct sg_table *sgt, size_t len, int prot)
+{
+   struct msm_iommu_pagetable *pagetable = to_pagetable(mmu);
+   struct io_pgtable_ops *ops = pagetable->pgtbl_ops;
+   struct scatterlist *sg;
+   size_t mapped = 0;
+   u64 addr = iova;
+   unsigned int i;
+
+   for_each_sg(sgt->sgl, sg, sgt->nents, i) {
+   size_t size = sg->length;
+   phys_addr_t phys = sg_phys(sg);
+
+   /* Map the block one page at a time */
+   while (size) {
+   if (ops->map(ops, addr, phys, 4096, prot)) {
+   msm_iommu_pagetable_unmap(mmu, iova, mapped);
+   return -EINVAL;
+   }
+
+   phys += 4096;
+   addr += 4096;
+   size -= 4096;
+   mapped += 4096;
+   }
+   }
+
+   return 0;
+}
+
+static void msm_iommu_pagetable_destroy(struct msm_mmu *mmu)
+{
+   struct msm_iommu_pagetable *pagetable = to_pagetable(mmu);
+   struct msm_iommu *iommu = to_msm_iommu(pagetable->parent);
+
+   /*
+* If this is the last attached pagetable for the parent,
+* disable TTBR0 in the arm-smmu driver
+*/
+   if (atomic_dec_return(&iommu->pagetables) == 0)
+   iommu_domain_set_attr(iommu->domain,
+   DOMAIN_ATTR_PGTABLE_CFG, NULL);
+
+   free_io_pgtable_ops(pagetable->pgtbl_ops);
+   kfree(pagetable);
+}
+
+int msm_iommu_pagetable_params(struct msm_mmu *mmu,
+   phys_addr_t *ttbr, int *asid)
+{
+   struct msm_iommu_pagetable *pagetable;
+
+   if (mmu->type != MSM_MMU_IOMMU_PAGETABLE)
+   return -EINVAL;
+
+   pagetable = to_pagetable(mmu);
+
+   if (ttbr)
+   *ttbr = pagetable->ttbr;
+
+   if (asid)
+   *asid = pagetable->asid;
+
+   return 0;
+}
+
+static const struct msm_mmu_funcs pagetable_funcs = {
+   .map = msm_iommu_pagetable_map,
+   .unmap = msm_iommu_pagetable_unmap,
+   .destroy = msm_iommu_pagetable_destroy,
+};
+
+static void msm_iommu_tlb_flush_all(void *cookie)
+{
+}
+
+static void msm_iommu_tlb_flush_walk(unsigned long iova, size_t size,
+   size_t granule, void *cookie)
+{
+}
+
+static void msm_iommu_tlb_add_page(struct iommu_iotlb_gather *gather,
+   unsigned long iova, size_t granule, void *cookie)
+{
+}
+
+static const struct iommu_flush_ops null_tl

[PATCH v10 02/13] iommu/arm-smmu: Add support for split pagetables

2020-07-20 Thread Jordan Crouse
Enable TTBR1 for a context bank if IO_PGTABLE_QUIRK_ARM_TTBR1 is selected
by the io-pgtable configuration.

Signed-off-by: Jordan Crouse 
---

 drivers/iommu/arm-smmu.c | 21 -
 drivers/iommu/arm-smmu.h | 25 +++--
 2 files changed, 35 insertions(+), 11 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 0e2c65ee9e5a..8798428a4c8d 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -555,11 +555,15 @@ static void arm_smmu_init_context_bank(struct 
arm_smmu_domain *smmu_domain,
cb->ttbr[0] = pgtbl_cfg->arm_v7s_cfg.ttbr;
cb->ttbr[1] = 0;
} else {
-   cb->ttbr[0] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr;
-   cb->ttbr[0] |= FIELD_PREP(ARM_SMMU_TTBRn_ASID,
- cfg->asid);
+   cb->ttbr[0] = FIELD_PREP(ARM_SMMU_TTBRn_ASID,
+   cfg->asid);
cb->ttbr[1] = FIELD_PREP(ARM_SMMU_TTBRn_ASID,
-cfg->asid);
+   cfg->asid);
+
+   if (pgtbl_cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1)
+   cb->ttbr[1] |= pgtbl_cfg->arm_lpae_s1_cfg.ttbr;
+   else
+   cb->ttbr[0] |= pgtbl_cfg->arm_lpae_s1_cfg.ttbr;
}
} else {
cb->ttbr[0] = pgtbl_cfg->arm_lpae_s2_cfg.vttbr;
@@ -824,7 +828,14 @@ static int arm_smmu_init_domain_context(struct 
iommu_domain *domain,
 
/* Update the domain's page sizes to reflect the page table format */
domain->pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
-   domain->geometry.aperture_end = (1UL << ias) - 1;
+
+   if (pgtbl_cfg.quirks & IO_PGTABLE_QUIRK_ARM_TTBR1) {
+   domain->geometry.aperture_start = ~0UL << ias;
+   domain->geometry.aperture_end = ~0UL;
+   } else {
+   domain->geometry.aperture_end = (1UL << ias) - 1;
+   }
+
domain->geometry.force_aperture = true;
 
/* Initialise the context bank with our page table cfg */
diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h
index 38b041530a4f..5f2de20e883b 100644
--- a/drivers/iommu/arm-smmu.h
+++ b/drivers/iommu/arm-smmu.h
@@ -168,10 +168,12 @@ enum arm_smmu_cbar_type {
 #define ARM_SMMU_CB_TCR0x30
 #define ARM_SMMU_TCR_EAE   BIT(31)
 #define ARM_SMMU_TCR_EPD1  BIT(23)
+#define ARM_SMMU_TCR_A1BIT(22)
 #define ARM_SMMU_TCR_TG0   GENMASK(15, 14)
 #define ARM_SMMU_TCR_SH0   GENMASK(13, 12)
 #define ARM_SMMU_TCR_ORGN0 GENMASK(11, 10)
 #define ARM_SMMU_TCR_IRGN0 GENMASK(9, 8)
+#define ARM_SMMU_TCR_EPD0  BIT(7)
 #define ARM_SMMU_TCR_T0SZ  GENMASK(5, 0)
 
 #define ARM_SMMU_VTCR_RES1 BIT(31)
@@ -347,12 +349,23 @@ struct arm_smmu_domain {
 
 static inline u32 arm_smmu_lpae_tcr(struct io_pgtable_cfg *cfg)
 {
-   return ARM_SMMU_TCR_EPD1 |
-  FIELD_PREP(ARM_SMMU_TCR_TG0, cfg->arm_lpae_s1_cfg.tcr.tg) |
-  FIELD_PREP(ARM_SMMU_TCR_SH0, cfg->arm_lpae_s1_cfg.tcr.sh) |
-  FIELD_PREP(ARM_SMMU_TCR_ORGN0, cfg->arm_lpae_s1_cfg.tcr.orgn) |
-  FIELD_PREP(ARM_SMMU_TCR_IRGN0, cfg->arm_lpae_s1_cfg.tcr.irgn) |
-  FIELD_PREP(ARM_SMMU_TCR_T0SZ, cfg->arm_lpae_s1_cfg.tcr.tsz);
+   u32 tcr = FIELD_PREP(ARM_SMMU_TCR_TG0, cfg->arm_lpae_s1_cfg.tcr.tg) |
+   FIELD_PREP(ARM_SMMU_TCR_SH0, cfg->arm_lpae_s1_cfg.tcr.sh) |
+   FIELD_PREP(ARM_SMMU_TCR_ORGN0, cfg->arm_lpae_s1_cfg.tcr.orgn) |
+   FIELD_PREP(ARM_SMMU_TCR_IRGN0, cfg->arm_lpae_s1_cfg.tcr.irgn) |
+   FIELD_PREP(ARM_SMMU_TCR_T0SZ, cfg->arm_lpae_s1_cfg.tcr.tsz);
+
+   /*
+   * When TTBR1 is selected shift the TCR fields by 16 bits and disable
+   * translation in TTBR0
+   */
+   if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1) {
+   tcr = (tcr << 16) & ~ARM_SMMU_TCR_A1;
+   tcr |= ARM_SMMU_TCR_EPD0;
+   } else
+   tcr |= ARM_SMMU_TCR_EPD1;
+
+   return tcr;
 }
 
 static inline u32 arm_smmu_lpae_tcr2(struct io_pgtable_cfg *cfg)
-- 
2.25.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v10 05/13] iommu: Add a domain attribute to get/set a pagetable configuration

2020-07-20 Thread Jordan Crouse
Add domain attribute DOMAIN_ATTR_PGTABLE_CFG. This will be used by
arm-smmu to share the current pagetable configuration with the
leaf driver and to allow the leaf driver to set up a new pagetable
configuration under certain circumstances.

Signed-off-by: Jordan Crouse 
---

 include/linux/iommu.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 5657d4fef9f2..e8d59ad15611 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -124,6 +124,7 @@ enum iommu_attr {
DOMAIN_ATTR_FSL_PAMUV1,
DOMAIN_ATTR_NESTING,/* two stages of translation */
DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE,
+   DOMAIN_ATTR_PGTABLE_CFG,
DOMAIN_ATTR_MAX,
 };
 
-- 
2.25.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v10 11/13] drm/msm: Add support for private address space instances

2020-07-20 Thread Jordan Crouse
Add support for allocating private address space instances. Targets that
support per-context pagetables should implement their own function to
allocate private address spaces.

The default will return a pointer to the global address space.

Signed-off-by: Jordan Crouse 
---

 drivers/gpu/drm/msm/msm_drv.c | 13 +++--
 drivers/gpu/drm/msm/msm_drv.h |  5 +
 drivers/gpu/drm/msm/msm_gem_vma.c |  9 +
 drivers/gpu/drm/msm/msm_gpu.c | 17 +
 drivers/gpu/drm/msm/msm_gpu.h |  5 +
 5 files changed, 43 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index 556198d4ba5f..c0328abea52d 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -603,7 +603,7 @@ static int context_init(struct drm_device *dev, struct 
drm_file *file)
kref_init(&ctx->ref);
msm_submitqueue_init(dev, ctx);
 
-   ctx->aspace = priv->gpu ? priv->gpu->aspace : NULL;
+   ctx->aspace = msm_gpu_create_private_address_space(priv->gpu);
file->driver_priv = ctx;
 
return 0;
@@ -786,18 +786,19 @@ static int msm_ioctl_gem_cpu_fini(struct drm_device *dev, 
void *data,
 }
 
 static int msm_ioctl_gem_info_iova(struct drm_device *dev,
-   struct drm_gem_object *obj, uint64_t *iova)
+   struct drm_file *file, struct drm_gem_object *obj,
+   uint64_t *iova)
 {
-   struct msm_drm_private *priv = dev->dev_private;
+   struct msm_file_private *ctx = file->driver_priv;
 
-   if (!priv->gpu)
+   if (!ctx->aspace)
return -EINVAL;
 
/*
 * Don't pin the memory here - just get an address so that userspace can
 * be productive
 */
-   return msm_gem_get_iova(obj, priv->gpu->aspace, iova);
+   return msm_gem_get_iova(obj, ctx->aspace, iova);
 }
 
 static int msm_ioctl_gem_info(struct drm_device *dev, void *data,
@@ -836,7 +837,7 @@ static int msm_ioctl_gem_info(struct drm_device *dev, void 
*data,
args->value = msm_gem_mmap_offset(obj);
break;
case MSM_INFO_GET_IOVA:
-   ret = msm_ioctl_gem_info_iova(dev, obj, &args->value);
+   ret = msm_ioctl_gem_info_iova(dev, file, obj, &args->value);
break;
case MSM_INFO_SET_NAME:
/* length check should leave room for terminating null: */
diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h
index ab5f77261816..df400f9ec38c 100644
--- a/drivers/gpu/drm/msm/msm_drv.h
+++ b/drivers/gpu/drm/msm/msm_drv.h
@@ -250,6 +250,10 @@ int msm_gem_map_vma(struct msm_gem_address_space *aspace,
 void msm_gem_close_vma(struct msm_gem_address_space *aspace,
struct msm_gem_vma *vma);
 
+
+struct msm_gem_address_space *
+msm_gem_address_space_get(struct msm_gem_address_space *aspace);
+
 void msm_gem_address_space_put(struct msm_gem_address_space *aspace);
 
 struct msm_gem_address_space *
@@ -435,6 +439,7 @@ static inline void msm_file_private_destroy(struct kref 
*kref)
struct msm_file_private *ctx = container_of(kref,
struct msm_file_private, ref);
 
+   msm_gem_address_space_put(ctx->aspace);
kfree(ctx);
 }
 
diff --git a/drivers/gpu/drm/msm/msm_gem_vma.c 
b/drivers/gpu/drm/msm/msm_gem_vma.c
index 5f6a11211b64..29cc1305cf37 100644
--- a/drivers/gpu/drm/msm/msm_gem_vma.c
+++ b/drivers/gpu/drm/msm/msm_gem_vma.c
@@ -27,6 +27,15 @@ void msm_gem_address_space_put(struct msm_gem_address_space 
*aspace)
kref_put(&aspace->kref, msm_gem_address_space_destroy);
 }
 
+struct msm_gem_address_space *
+msm_gem_address_space_get(struct msm_gem_address_space *aspace)
+{
+   if (!IS_ERR_OR_NULL(aspace))
+   kref_get(&aspace->kref);
+
+   return aspace;
+}
+
 /* Actually unmap memory for the vma */
 void msm_gem_purge_vma(struct msm_gem_address_space *aspace,
struct msm_gem_vma *vma)
diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index a1f3da6550e5..aabbd7908ee5 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -823,6 +823,23 @@ static int get_clocks(struct platform_device *pdev, struct 
msm_gpu *gpu)
return 0;
 }
 
+/* Return a new address space for a msm_drm_private instance */
+struct msm_gem_address_space *
+msm_gpu_create_private_address_space(struct msm_gpu *gpu)
+{
+   if (!gpu)
+   return NULL;
+
+   /*
+* If the target doesn't support private address spaces then return
+* the global one
+*/
+   if (!gpu->funcs->create_private_address_space)
+   return msm_gem_address_space_get(gpu->aspace);
+
+   return gpu->funcs->create_private_address_space(gpu);
+}
+
 int msm_gpu_init(struct drm_device *drm, struct platform_device *pdev,
struct msm_gpu *gpu, const struct msm_gpu_funcs *funcs,
const char *na

[PATCH v10 04/13] iommu/arm-smmu-qcom: Add implementation for the adreno GPU SMMU

2020-07-20 Thread Jordan Crouse
Add a special implementation for the SMMU attached to most Adreno GPU
target triggered from the qcom,adreno-smmu compatible string.

The new Adreno SMMU implementation will enable split pagetables
(TTBR1) for the domain attached to the GPU device (SID 0) and
hard code it context bank 0 so the GPU hardware can implement
per-instance pagetables.

Signed-off-by: Jordan Crouse 
---

 drivers/iommu/arm-smmu-impl.c |  3 ++
 drivers/iommu/arm-smmu-qcom.c | 83 ++-
 drivers/iommu/arm-smmu.h  |  1 +
 3 files changed, 85 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/arm-smmu-impl.c b/drivers/iommu/arm-smmu-impl.c
index b71b14685cc9..3bb1ef4e85f7 100644
--- a/drivers/iommu/arm-smmu-impl.c
+++ b/drivers/iommu/arm-smmu-impl.c
@@ -176,5 +176,8 @@ struct arm_smmu_device *arm_smmu_impl_init(struct 
arm_smmu_device *smmu)
of_device_is_compatible(np, "qcom,sc7180-smmu-500"))
return qcom_smmu_impl_init(smmu);
 
+   if (of_device_is_compatible(smmu->dev->of_node, "qcom,adreno-smmu"))
+   return qcom_adreno_smmu_impl_init(smmu);
+
return smmu;
 }
diff --git a/drivers/iommu/arm-smmu-qcom.c b/drivers/iommu/arm-smmu-qcom.c
index be4318044f96..b9a5c5369e86 100644
--- a/drivers/iommu/arm-smmu-qcom.c
+++ b/drivers/iommu/arm-smmu-qcom.c
@@ -12,6 +12,67 @@ struct qcom_smmu {
struct arm_smmu_device smmu;
 };
 
+#define QCOM_ADRENO_SMMU_GPU_SID 0
+
+static bool qcom_adreno_smmu_is_gpu_device(struct device *dev)
+{
+   struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
+   struct arm_smmu_master_cfg *cfg = dev_iommu_priv_get(dev);
+   int idx, i;
+
+   /*
+* The GPU will always use SID 0 so that is a handy way to uniquely
+* identify it and configure it for per-instance pagetables
+*/
+   for_each_cfg_sme(cfg, fwspec, i, idx) {
+   u16 sid = FIELD_GET(ARM_SMMU_SMR_ID, fwspec->ids[i]);
+
+   if (sid == QCOM_ADRENO_SMMU_GPU_SID)
+   return true;
+   }
+
+   return false;
+}
+
+static int qcom_adreno_smmu_alloc_context_bank(struct arm_smmu_domain 
*smmu_domain,
+   struct device *dev, int start, int count)
+{
+   struct arm_smmu_device *smmu = smmu_domain->smmu;
+
+   /*
+* Assign context bank 0 to the GPU device so the GPU hardware can
+* switch pagetables
+*/
+   if (qcom_adreno_smmu_is_gpu_device(dev)) {
+   if (start > 0 || test_bit(0, smmu->context_map))
+   return -ENOSPC;
+
+   set_bit(0, smmu->context_map);
+   return 0;
+   }
+
+   return __arm_smmu_alloc_bitmap(smmu->context_map, start, count);
+}
+
+static int qcom_adreno_smmu_init_context(struct arm_smmu_domain *smmu_domain,
+   struct io_pgtable_cfg *pgtbl_cfg, struct device *dev)
+{
+   /* Only enable split pagetables for the GPU device (SID 0) */
+   if (!qcom_adreno_smmu_is_gpu_device(dev))
+   return 0;
+
+   /*
+* All targets that use the qcom,adreno-smmu compatible string *should*
+* be AARCH64 stage 1 but double check because the arm-smmu code assumes
+* that is the case when the TTBR1 quirk is enabled
+*/
+   if ((smmu_domain->stage == ARM_SMMU_DOMAIN_S1) &&
+   (smmu_domain->cfg.fmt == ARM_SMMU_CTX_FMT_AARCH64))
+   pgtbl_cfg->quirks |= IO_PGTABLE_QUIRK_ARM_TTBR1;
+
+   return 0;
+}
+
 static const struct of_device_id qcom_smmu_client_of_match[] __maybe_unused = {
{ .compatible = "qcom,adreno" },
{ .compatible = "qcom,mdp4" },
@@ -65,7 +126,15 @@ static const struct arm_smmu_impl qcom_smmu_impl = {
.reset = qcom_smmu500_reset,
 };
 
-struct arm_smmu_device *qcom_smmu_impl_init(struct arm_smmu_device *smmu)
+static const struct arm_smmu_impl qcom_adreno_smmu_impl = {
+   .init_context = qcom_adreno_smmu_init_context,
+   .def_domain_type = qcom_smmu_def_domain_type,
+   .reset = qcom_smmu500_reset,
+   .alloc_context_bank = qcom_adreno_smmu_alloc_context_bank,
+};
+
+static struct arm_smmu_device *qcom_smmu_create(struct arm_smmu_device *smmu,
+   const struct arm_smmu_impl *impl)
 {
struct qcom_smmu *qsmmu;
 
@@ -75,8 +144,18 @@ struct arm_smmu_device *qcom_smmu_impl_init(struct 
arm_smmu_device *smmu)
 
qsmmu->smmu = *smmu;
 
-   qsmmu->smmu.impl = &qcom_smmu_impl;
+   qsmmu->smmu.impl = impl;
devm_kfree(smmu->dev, smmu);
 
return &qsmmu->smmu;
 }
+
+struct arm_smmu_device *qcom_smmu_impl_init(struct arm_smmu_device *smmu)
+{
+   return qcom_smmu_create(smmu, &qcom_smmu_impl);
+}
+
+struct arm_smmu_device *qcom_adreno_smmu_impl_init(struct arm_smmu_device 
*smmu)
+{
+   return qcom_smmu_create(smmu, &qcom_adreno_smmu_impl);
+}
diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h
index d10d745a0290..9f81c1fffe1e 100644
--- a/drivers/iommu/arm-smmu.h

[PATCH v10 06/13] iommu/arm-smmu-qcom: Get and set the pagetable config for split pagetables

2020-07-20 Thread Jordan Crouse
The Adreno GPU has the capability to manage its own pagetables and switch
them dynamically from the hardware. To do this the GPU uses TTBR1 for
"global" GPU memory and creates local pagetables for each context and
switches them dynamically with the GPU.

Use DOMAIN_ATTR_PGTABLE_CFG to get the current configuration for the
TTBR1 pagetable from the smmu driver so the leaf driver can create
compatible pagetables for use with TTBR0.

Because TTBR0 is disabled by default when TTBR1 is enabled the GPU
driver can pass the configuration of one of the newly created pagetables
back through DOMAIN_ATTR_PGTABLE_CFG as a trigger to enable translation on
TTBR0.

Signed-off-by: Jordan Crouse 
---

 drivers/iommu/arm-smmu-qcom.c | 47 +++
 drivers/iommu/arm-smmu.c  | 32 ++--
 drivers/iommu/arm-smmu.h  | 10 
 3 files changed, 81 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/arm-smmu-qcom.c b/drivers/iommu/arm-smmu-qcom.c
index b9a5c5369e86..9a0c64ca9cb6 100644
--- a/drivers/iommu/arm-smmu-qcom.c
+++ b/drivers/iommu/arm-smmu-qcom.c
@@ -34,6 +34,52 @@ static bool qcom_adreno_smmu_is_gpu_device(struct device 
*dev)
return false;
 }
 
+/*
+ * Local implementation to configure TTBR0 wil the specified pagetable config.
+ * The GPU driver will call this to enable TTBR0 when per-instance pagetables
+ * are active
+ */
+static int qcom_adreno_smmu_set_pgtable_cfg(struct arm_smmu_domain 
*smmu_domain,
+   struct io_pgtable_cfg *pgtbl_cfg)
+{
+   struct io_pgtable *pgtable = 
io_pgtable_ops_to_pgtable(smmu_domain->pgtbl_ops);
+   struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
+   struct arm_smmu_cb *cb = &smmu_domain->smmu->cbs[cfg->cbndx];
+
+   /* The domain must have split pagetables already enabled */
+   if (cb->tcr[0] & ARM_SMMU_TCR_EPD1)
+   return -EINVAL;
+
+   /* If the pagetable config is NULL, disable TTBR0 */
+   if (!pgtbl_cfg) {
+   /* Do nothing if it is already disabled */
+   if ((cb->tcr[0] & ARM_SMMU_TCR_EPD0))
+   return -EINVAL;
+
+   /* Set TCR to the original configuration */
+   cb->tcr[0] = arm_smmu_lpae_tcr(&pgtable->cfg);
+   cb->ttbr[0] = FIELD_PREP(ARM_SMMU_TTBRn_ASID, cb->cfg->asid);
+   } else {
+   u32 tcr = cb->tcr[0];
+
+   /* FIXME: What sort of validation do we need to do here? */
+
+   /* Don't call this again if TTBR0 is already enabled */
+   if (!(cb->tcr[0] & ARM_SMMU_TCR_EPD0))
+   return -EINVAL;
+
+   tcr |= arm_smmu_lpae_tcr(pgtbl_cfg);
+   tcr &= ~(ARM_SMMU_TCR_EPD0 | ARM_SMMU_TCR_EPD1);
+
+   cb->tcr[0] = tcr;
+   cb->ttbr[0] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr;
+   cb->ttbr[0] |= FIELD_PREP(ARM_SMMU_TTBRn_ASID, cb->cfg->asid);
+   }
+
+   arm_smmu_write_context_bank(smmu_domain->smmu, cb->cfg->cbndx);
+   return 0;
+}
+
 static int qcom_adreno_smmu_alloc_context_bank(struct arm_smmu_domain 
*smmu_domain,
struct device *dev, int start, int count)
 {
@@ -131,6 +177,7 @@ static const struct arm_smmu_impl qcom_adreno_smmu_impl = {
.def_domain_type = qcom_smmu_def_domain_type,
.reset = qcom_smmu500_reset,
.alloc_context_bank = qcom_adreno_smmu_alloc_context_bank,
+   .set_pgtable_cfg = qcom_adreno_smmu_set_pgtable_cfg,
 };
 
 static struct arm_smmu_device *qcom_smmu_create(struct arm_smmu_device *smmu,
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index fff536a44faa..e1036ae54a8d 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -86,13 +86,6 @@ struct arm_smmu_smr {
boolvalid;
 };
 
-struct arm_smmu_cb {
-   u64 ttbr[2];
-   u32 tcr[2];
-   u32 mair[2];
-   struct arm_smmu_cfg *cfg;
-};
-
 static bool using_legacy_binding, using_generic_binding;
 
 static inline int arm_smmu_rpm_get(struct arm_smmu_device *smmu)
@@ -558,7 +551,7 @@ static void arm_smmu_init_context_bank(struct 
arm_smmu_domain *smmu_domain,
}
 }
 
-static void arm_smmu_write_context_bank(struct arm_smmu_device *smmu, int idx)
+void arm_smmu_write_context_bank(struct arm_smmu_device *smmu, int idx)
 {
u32 reg;
bool stage1;
@@ -1515,6 +1508,18 @@ static int arm_smmu_domain_get_attr(struct iommu_domain 
*domain,
case DOMAIN_ATTR_NESTING:
*(int *)data = (smmu_domain->stage == 
ARM_SMMU_DOMAIN_NESTED);
return 0;
+   case DOMAIN_ATTR_PGTABLE_CFG: {
+   struct io_pgtable *pgtable;
+   struct io_pgtable_cfg *dest = data;
+
+   if (!smmu_domain->pgtbl_ops)
+   return

Re: [PATCH v8 00/12] iommu: Shared Virtual Addressing for SMMUv3 (PT sharing part)

2020-07-20 Thread Jean-Philippe Brucker
On Mon, Jul 20, 2020 at 12:11:17PM +0100, Will Deacon wrote:
> On Thu, Jun 18, 2020 at 05:51:13PM +0200, Jean-Philippe Brucker wrote:
> > Since v7 [1], I split the series into three parts to ease review. This
> > first one adds page table sharing to the SMMUv3 driver. The second one
> > adds support for I/O page faults through PRI and Stall, and the last one
> > adds additional and optional features (DVM, VHE and HTTU). SVA needs the
> > three parts to work. No significant change apart from that, I just
> > addressed the previous comments.
> > 
> > I'd rather everything went through the IOMMU tree but I'm assuming patch
> > 1 will also go through the x86 tree as part of [2]. It is definitely
> > required by patch 3 which is required by patch 11. I don't know how this
> > kind of conflict is usually resolved, but if it's a problem I could
> > further shrink the series to only patches 4-10 this cycle.
> 
> Modulo my review comments, I think most of this looks alright from the SMMU
> side. However, I would really like it if the SVA driver parts could be moved
> into a separate file (e.g. arm-smmu-v3-sva.c), with a separate config option
> (dependent on the current one, so you can easily build a driver without SVA
> support). Does that sound remotely feasible?

Yes it makes sense. It requires moving some definitions from arm-smmu-v3.c
to a .h but should be straightforward, I'll give it a try.

Thanks,
Jean

> If so, I think it would really
> help in terms of maintainability, since the SVA model is really all about
> the mm, whereas the driver model is all about the device. This makes it
> really hard to read when you have to keep working out whether the current
> 'handle' is an mm_struct or an arm_smmu_device.
> 
> Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


RE: [PATCH v5 15/15] iommu/vt-d: Support reporting nesting capability info

2020-07-20 Thread Liu, Yi L
Hi Eric,

> From: Auger Eric 
> Sent: Saturday, July 18, 2020 1:14 AM
> 
> Hi Yi,
> 
> Missing a proper commit message. You can comment on the fact you only
> support the case where all the physical iomms have the same CAP/ECAP MASKS

got it. will add it. it looks like the subject is straightforward, so I removed 
commit
message.

> 
> On 7/12/20 1:21 PM, Liu Yi L wrote:
> > Cc: Kevin Tian 
> > CC: Jacob Pan 
> > Cc: Alex Williamson 
> > Cc: Eric Auger 
> > Cc: Jean-Philippe Brucker 
> > Cc: Joerg Roedel 
> > Cc: Lu Baolu 
> > Signed-off-by: Liu Yi L 
> > Signed-off-by: Jacob Pan 
> > ---
> > v2 -> v3:
> > *) remove cap/ecap_mask in iommu_nesting_info.
> > ---
> >  drivers/iommu/intel/iommu.c | 81
> +++--
> >  include/linux/intel-iommu.h | 16 +
> >  2 files changed, 95 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> > index a9504cb..9f7ad1a 100644
> > --- a/drivers/iommu/intel/iommu.c
> > +++ b/drivers/iommu/intel/iommu.c
> > @@ -5659,12 +5659,16 @@ static inline bool iommu_pasid_support(void)
> >  static inline bool nested_mode_support(void)
> >  {
> > struct dmar_drhd_unit *drhd;
> > -   struct intel_iommu *iommu;
> > +   struct intel_iommu *iommu, *prev = NULL;
> > bool ret = true;
> >
> > rcu_read_lock();
> > for_each_active_iommu(iommu, drhd) {
> > -   if (!sm_supported(iommu) || !ecap_nest(iommu->ecap)) {
> > +   if (!prev)
> > +   prev = iommu;
> > +   if (!sm_supported(iommu) || !ecap_nest(iommu->ecap) ||
> > +   (VTD_CAP_MASK & (iommu->cap ^ prev->cap)) ||
> > +   (VTD_ECAP_MASK & (iommu->ecap ^ prev->ecap))) {
> > ret = false;
> > break;> }
> > @@ -6079,6 +6083,78 @@ intel_iommu_domain_set_attr(struct iommu_domain
> *domain,
> > return ret;
> >  }
> >
> > +static int intel_iommu_get_nesting_info(struct iommu_domain *domain,
> > +   struct iommu_nesting_info *info)
> > +{
> > +   struct dmar_domain *dmar_domain = to_dmar_domain(domain);
> > +   u64 cap = VTD_CAP_MASK, ecap = VTD_ECAP_MASK;
> > +   struct device_domain_info *domain_info;
> > +   struct iommu_nesting_info_vtd vtd;
> > +   unsigned long flags;
> > +   unsigned int size;
> > +
> > +   if (domain->type != IOMMU_DOMAIN_UNMANAGED ||
> > +   !(dmar_domain->flags & DOMAIN_FLAG_NESTING_MODE))
> > +   return -ENODEV;
> > +
> > +   if (!info)
> > +   return -EINVAL;
> > +
> > +   size = sizeof(struct iommu_nesting_info) +
> > +   sizeof(struct iommu_nesting_info_vtd);
> > +   /*
> > +* if provided buffer size is smaller than expected, should
> > +* return 0 and also the expected buffer size to caller.
> > +*/
> > +   if (info->size < size) {
> > +   info->size = size;
> > +   return 0;
> > +   }
> > +
> > +   spin_lock_irqsave(&device_domain_lock, flags);
> > +   /*
> > +* arbitrary select the first domain_info as all nesting
> > +* related capabilities should be consistent across iommu
> > +* units.
> > +*/
> > +   domain_info = list_first_entry(&dmar_domain->devices,
> > +  struct device_domain_info, link);
> > +   cap &= domain_info->iommu->cap;
> > +   ecap &= domain_info->iommu->ecap;
> > +   spin_unlock_irqrestore(&device_domain_lock, flags);
> > +
> > +   info->format = IOMMU_PASID_FORMAT_INTEL_VTD;
> > +   info->features = IOMMU_NESTING_FEAT_SYSWIDE_PASID |
> > +IOMMU_NESTING_FEAT_BIND_PGTBL |
> > +IOMMU_NESTING_FEAT_CACHE_INVLD;
> > +   info->addr_width = dmar_domain->gaw;
> > +   info->pasid_bits = ilog2(intel_pasid_max_id);
> > +   info->padding = 0;
> > +   vtd.flags = 0;
> > +   vtd.padding = 0;
> > +   vtd.cap_reg = cap;
> > +   vtd.ecap_reg = ecap;
> > +
> > +   memcpy(info->data, &vtd, sizeof(vtd));
> > +   return 0;
> > +}
> > +
> > +static int intel_iommu_domain_get_attr(struct iommu_domain *domain,
> > +  enum iommu_attr attr, void *data)
> > +{
> > +   switch (attr) {
> > +   case DOMAIN_ATTR_NESTING:
> > +   {
> > +   struct iommu_nesting_info *info =
> > +   (struct iommu_nesting_info *)data;
> > +
> > +   return intel_iommu_get_nesting_info(domain, info);
> > +   }
> > +   default:
> > +   return -ENODEV;
> -ENOENT?

arm_smmu_domain_get_attr() is using -ENODEV, so I used the same. I can
modify it if -ENOENT is better. :-)

Regards,
Yi Liu

> > +   }
> > +}
> > +
> >  /*
> >   * Check that the device does not live on an external facing PCI port that 
> > is
> >   * marked as untrusted. Such devices should not be able to apply quirks and
> > @@ -6101,6 +6177,7 @@ const struct iommu_ops intel_iommu_ops = {
> > .domain_alloc   = intel_iommu_domain_alloc,
> > .domain_free= intel_iommu_domain_free,
> 

RE: [PATCH v5 13/15] vfio/pci: Expose PCIe PASID capability to guest

2020-07-20 Thread Liu, Yi L
Hi Eric,

> From: Auger Eric 
> Sent: Monday, July 20, 2020 8:35 PM
> 
> Yi,
> 
> On 7/12/20 1:21 PM, Liu Yi L wrote:
> > This patch exposes PCIe PASID capability to guest for assigned devices.
> > Existing vfio_pci driver hides it from guest by setting the capability
> > length as 0 in pci_ext_cap_length[].
> >
> > And this patch only exposes PASID capability for devices which has PCIe
> > PASID extended struture in its configuration space. So VFs, will will
> s/will//

got it.

> > not see PASID capability on VFs as VF doesn't implement PASID extended
> suggested rewording: VFs will not expose the PASID capability as they do
> not implement the PASID extended structure in their config space?

make sense. will modify it.

> > structure in its configuration space. For VF, it is a TODO in future.
> > Related discussion can be found in below link:
> >
> > https://lkml.org/lkml/2020/4/7/693
> 
> >
> > Cc: Kevin Tian 
> > CC: Jacob Pan 
> > Cc: Alex Williamson 
> > Cc: Eric Auger 
> > Cc: Jean-Philippe Brucker 
> > Cc: Joerg Roedel 
> > Cc: Lu Baolu 
> > Signed-off-by: Liu Yi L 
> > ---
> > v1 -> v2:
> > *) added in v2, but it was sent in a separate patchseries before
> > ---
> >  drivers/vfio/pci/vfio_pci_config.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/vfio/pci/vfio_pci_config.c 
> > b/drivers/vfio/pci/vfio_pci_config.c
> > index d98843f..07ff2e6 100644
> > --- a/drivers/vfio/pci/vfio_pci_config.c
> > +++ b/drivers/vfio/pci/vfio_pci_config.c
> > @@ -95,7 +95,7 @@ static const u16 pci_ext_cap_length[PCI_EXT_CAP_ID_MAX +
> 1] = {
> > [PCI_EXT_CAP_ID_LTR]=   PCI_EXT_CAP_LTR_SIZEOF,
> > [PCI_EXT_CAP_ID_SECPCI] =   0,  /* not yet */
> > [PCI_EXT_CAP_ID_PMUX]   =   0,  /* not yet */
> > -   [PCI_EXT_CAP_ID_PASID]  =   0,  /* not yet */
> > +   [PCI_EXT_CAP_ID_PASID]  =   PCI_EXT_CAP_PASID_SIZEOF,
> >  };
> >
> >  /*
> >
> Reviewed-by: Eric Auger 

thanks.

Regards,
Yi Liu

> Thanks
> 
> Eric

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


RE: [PATCH v5 09/15] iommu/vt-d: Check ownership for PASIDs from user-space

2020-07-20 Thread Liu, Yi L
Eric,

> From: Auger Eric 
> Sent: Monday, July 20, 2020 8:38 PM
> 
> Yi,
> 
> On 7/20/20 12:18 PM, Liu, Yi L wrote:
> > Hi Eric,
> >
> >> From: Auger Eric 
> >> Sent: Monday, July 20, 2020 12:06 AM
> >>
> >> Hi Yi,
> >>
> >> On 7/12/20 1:21 PM, Liu Yi L wrote:
> >>> When an IOMMU domain with nesting attribute is used for guest SVA, a
> >>> system-wide PASID is allocated for binding with the device and the domain.
> >>> For security reason, we need to check the PASID passsed from user-space.
> >> passed
> >
> > got it.
> >
> >>> e.g. page table bind/unbind and PASID related cache invalidation.
> >>>
> >>> Cc: Kevin Tian 
> >>> CC: Jacob Pan 
> >>> Cc: Alex Williamson 
> >>> Cc: Eric Auger 
> >>> Cc: Jean-Philippe Brucker 
> >>> Cc: Joerg Roedel 
> >>> Cc: Lu Baolu 
> >>> Signed-off-by: Liu Yi L 
> >>> Signed-off-by: Jacob Pan 
> >>> ---
> >>>  drivers/iommu/intel/iommu.c | 10 ++
> >>>  drivers/iommu/intel/svm.c   |  7 +--
> >>>  2 files changed, 15 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> >>> index 4d54198..a9504cb 100644
> >>> --- a/drivers/iommu/intel/iommu.c
> >>> +++ b/drivers/iommu/intel/iommu.c
> >>> @@ -5436,6 +5436,7 @@ intel_iommu_sva_invalidate(struct iommu_domain
> >> *domain, struct device *dev,
> >>>   int granu = 0;
> >>>   u64 pasid = 0;
> >>>   u64 addr = 0;
> >>> + void *pdata;
> >>>
> >>>   granu = to_vtd_granularity(cache_type, inv_info->granularity);
> >>>   if (granu == -EINVAL) {
> >>> @@ -5456,6 +5457,15 @@ intel_iommu_sva_invalidate(struct iommu_domain
> >> *domain, struct device *dev,
> >>>(inv_info->granu.addr_info.flags &
> >> IOMMU_INV_ADDR_FLAGS_PASID))
> >>>   pasid = inv_info->granu.addr_info.pasid;
> >>>
> >>> + pdata = ioasid_find(dmar_domain->ioasid_sid, pasid, NULL);
> >>> + if (!pdata) {
> >>> + ret = -EINVAL;
> >>> + goto out_unlock;
> >>> + } else if (IS_ERR(pdata)) {
> >>> + ret = PTR_ERR(pdata);
> >>> + goto out_unlock;
> >>> + }
> >>> +
> >>>   switch (BIT(cache_type)) {
> >>>   case IOMMU_CACHE_INV_TYPE_IOTLB:
> >>>   /* HW will ignore LSB bits based on address mask */
> >>> diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
> >>> index d2c0e1a..212dee0 100644
> >>> --- a/drivers/iommu/intel/svm.c
> >>> +++ b/drivers/iommu/intel/svm.c
> >>> @@ -319,7 +319,7 @@ int intel_svm_bind_gpasid(struct iommu_domain
> *domain,
> >> struct device *dev,
> >>>   dmar_domain = to_dmar_domain(domain);
> >>>
> >>>   mutex_lock(&pasid_mutex);
> >>> - svm = ioasid_find(INVALID_IOASID_SET, data->hpasid, NULL);
> I meant while using INVALID_IOASID_SET instead of the actual
> dmar_domain->ioasid_sid. But I think I've now recovered, the asset is
> simply not used ;-)

oh, I think should be using dmar_domain->ioasid_sid from the beginning.
does it look good so far? :-)

Regards,
Yi Liu

> >> I do not get what the call was supposed to do before that patch?
> >
> > you mean patch 10/15 by "that patch", right? the ownership check should
> > be done as to prevent illegal bind request from userspace. before patch
> > 10/15, it should be added.
> >
> >>> + svm = ioasid_find(dmar_domain->ioasid_sid, data->hpasid, NULL);
> >>>   if (IS_ERR(svm)) {
> >>>   ret = PTR_ERR(svm);
> >>>   goto out;
> >>> @@ -436,6 +436,7 @@ int intel_svm_unbind_gpasid(struct iommu_domain
> >> *domain,
> >>>   struct device *dev, ioasid_t pasid)
> >>>  {
> >>>   struct intel_iommu *iommu = intel_svm_device_to_iommu(dev);
> >>> + struct dmar_domain *dmar_domain;
> >>>   struct intel_svm_dev *sdev;
> >>>   struct intel_svm *svm;
> >>>   int ret = -EINVAL;
> >>> @@ -443,8 +444,10 @@ int intel_svm_unbind_gpasid(struct iommu_domain
> >> *domain,
> >>>   if (WARN_ON(!iommu))
> >>>   return -EINVAL;
> >>>
> >>> + dmar_domain = to_dmar_domain(domain);
> >>> +
> >>>   mutex_lock(&pasid_mutex);
> >>> - svm = ioasid_find(INVALID_IOASID_SET, pasid, NULL);
> >>> + svm = ioasid_find(dmar_domain->ioasid_sid, pasid, NULL);
> >> just to make sure, about the locking, can't domain->ioasid_sid change
> >> under the hood?
> >
> > I guess not. intel_svm_unbind_gpasid() and iommu_domain_set_attr()
> > is called by vfio today, and within vfio, there is vfio_iommu->lock.
> OK
> 
> Thanks
> 
> Eric
> >
> > Regards,
> > Yi Liu
> >
> >>
> >> Thanks
> >>
> >> Eric
> >>>   if (!svm) {
> >>>   ret = -EINVAL;
> >>>   goto out;
> >>>
> >

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v5 09/15] iommu/vt-d: Check ownership for PASIDs from user-space

2020-07-20 Thread Auger Eric
Yi,

On 7/20/20 12:18 PM, Liu, Yi L wrote:
> Hi Eric,
> 
>> From: Auger Eric 
>> Sent: Monday, July 20, 2020 12:06 AM
>>
>> Hi Yi,
>>
>> On 7/12/20 1:21 PM, Liu Yi L wrote:
>>> When an IOMMU domain with nesting attribute is used for guest SVA, a
>>> system-wide PASID is allocated for binding with the device and the domain.
>>> For security reason, we need to check the PASID passsed from user-space.
>> passed
> 
> got it.
> 
>>> e.g. page table bind/unbind and PASID related cache invalidation.
>>>
>>> Cc: Kevin Tian 
>>> CC: Jacob Pan 
>>> Cc: Alex Williamson 
>>> Cc: Eric Auger 
>>> Cc: Jean-Philippe Brucker 
>>> Cc: Joerg Roedel 
>>> Cc: Lu Baolu 
>>> Signed-off-by: Liu Yi L 
>>> Signed-off-by: Jacob Pan 
>>> ---
>>>  drivers/iommu/intel/iommu.c | 10 ++
>>>  drivers/iommu/intel/svm.c   |  7 +--
>>>  2 files changed, 15 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
>>> index 4d54198..a9504cb 100644
>>> --- a/drivers/iommu/intel/iommu.c
>>> +++ b/drivers/iommu/intel/iommu.c
>>> @@ -5436,6 +5436,7 @@ intel_iommu_sva_invalidate(struct iommu_domain
>> *domain, struct device *dev,
>>> int granu = 0;
>>> u64 pasid = 0;
>>> u64 addr = 0;
>>> +   void *pdata;
>>>
>>> granu = to_vtd_granularity(cache_type, inv_info->granularity);
>>> if (granu == -EINVAL) {
>>> @@ -5456,6 +5457,15 @@ intel_iommu_sva_invalidate(struct iommu_domain
>> *domain, struct device *dev,
>>>  (inv_info->granu.addr_info.flags &
>> IOMMU_INV_ADDR_FLAGS_PASID))
>>> pasid = inv_info->granu.addr_info.pasid;
>>>
>>> +   pdata = ioasid_find(dmar_domain->ioasid_sid, pasid, NULL);
>>> +   if (!pdata) {
>>> +   ret = -EINVAL;
>>> +   goto out_unlock;
>>> +   } else if (IS_ERR(pdata)) {
>>> +   ret = PTR_ERR(pdata);
>>> +   goto out_unlock;
>>> +   }
>>> +
>>> switch (BIT(cache_type)) {
>>> case IOMMU_CACHE_INV_TYPE_IOTLB:
>>> /* HW will ignore LSB bits based on address mask */
>>> diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
>>> index d2c0e1a..212dee0 100644
>>> --- a/drivers/iommu/intel/svm.c
>>> +++ b/drivers/iommu/intel/svm.c
>>> @@ -319,7 +319,7 @@ int intel_svm_bind_gpasid(struct iommu_domain *domain,
>> struct device *dev,
>>> dmar_domain = to_dmar_domain(domain);
>>>
>>> mutex_lock(&pasid_mutex);
>>> -   svm = ioasid_find(INVALID_IOASID_SET, data->hpasid, NULL);
I meant while using INVALID_IOASID_SET instead of the actual
dmar_domain->ioasid_sid. But I think I've now recovered, the asset is
simply not used ;-)
>> I do not get what the call was supposed to do before that patch?
> 
> you mean patch 10/15 by "that patch", right? the ownership check should
> be done as to prevent illegal bind request from userspace. before patch
> 10/15, it should be added.
> 
>>> +   svm = ioasid_find(dmar_domain->ioasid_sid, data->hpasid, NULL);
>>> if (IS_ERR(svm)) {
>>> ret = PTR_ERR(svm);
>>> goto out;
>>> @@ -436,6 +436,7 @@ int intel_svm_unbind_gpasid(struct iommu_domain
>> *domain,
>>> struct device *dev, ioasid_t pasid)
>>>  {
>>> struct intel_iommu *iommu = intel_svm_device_to_iommu(dev);
>>> +   struct dmar_domain *dmar_domain;
>>> struct intel_svm_dev *sdev;
>>> struct intel_svm *svm;
>>> int ret = -EINVAL;
>>> @@ -443,8 +444,10 @@ int intel_svm_unbind_gpasid(struct iommu_domain
>> *domain,
>>> if (WARN_ON(!iommu))
>>> return -EINVAL;
>>>
>>> +   dmar_domain = to_dmar_domain(domain);
>>> +
>>> mutex_lock(&pasid_mutex);
>>> -   svm = ioasid_find(INVALID_IOASID_SET, pasid, NULL);
>>> +   svm = ioasid_find(dmar_domain->ioasid_sid, pasid, NULL);
>> just to make sure, about the locking, can't domain->ioasid_sid change
>> under the hood?
> 
> I guess not. intel_svm_unbind_gpasid() and iommu_domain_set_attr()
> is called by vfio today, and within vfio, there is vfio_iommu->lock.
OK

Thanks

Eric
> 
> Regards,
> Yi Liu
> 
>>
>> Thanks
>>
>> Eric
>>> if (!svm) {
>>> ret = -EINVAL;
>>> goto out;
>>>
> 

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v5 13/15] vfio/pci: Expose PCIe PASID capability to guest

2020-07-20 Thread Auger Eric
Yi,

On 7/12/20 1:21 PM, Liu Yi L wrote:
> This patch exposes PCIe PASID capability to guest for assigned devices.
> Existing vfio_pci driver hides it from guest by setting the capability
> length as 0 in pci_ext_cap_length[].
> 
> And this patch only exposes PASID capability for devices which has PCIe
> PASID extended struture in its configuration space. So VFs, will will
s/will//
> not see PASID capability on VFs as VF doesn't implement PASID extended
suggested rewording: VFs will not expose the PASID capability as they do
not implement the PASID extended structure in their config space?
> structure in its configuration space. For VF, it is a TODO in future.
> Related discussion can be found in below link:
> 
> https://lkml.org/lkml/2020/4/7/693

> 
> Cc: Kevin Tian 
> CC: Jacob Pan 
> Cc: Alex Williamson 
> Cc: Eric Auger 
> Cc: Jean-Philippe Brucker 
> Cc: Joerg Roedel 
> Cc: Lu Baolu 
> Signed-off-by: Liu Yi L 
> ---
> v1 -> v2:
> *) added in v2, but it was sent in a separate patchseries before
> ---
>  drivers/vfio/pci/vfio_pci_config.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/vfio/pci/vfio_pci_config.c 
> b/drivers/vfio/pci/vfio_pci_config.c
> index d98843f..07ff2e6 100644
> --- a/drivers/vfio/pci/vfio_pci_config.c
> +++ b/drivers/vfio/pci/vfio_pci_config.c
> @@ -95,7 +95,7 @@ static const u16 pci_ext_cap_length[PCI_EXT_CAP_ID_MAX + 1] 
> = {
>   [PCI_EXT_CAP_ID_LTR]=   PCI_EXT_CAP_LTR_SIZEOF,
>   [PCI_EXT_CAP_ID_SECPCI] =   0,  /* not yet */
>   [PCI_EXT_CAP_ID_PMUX]   =   0,  /* not yet */
> - [PCI_EXT_CAP_ID_PASID]  =   0,  /* not yet */
> + [PCI_EXT_CAP_ID_PASID]  =   PCI_EXT_CAP_PASID_SIZEOF,
>  };
>  
>  /*
> 
Reviewed-by: Eric Auger 

Thanks

Eric

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v5 12/15] vfio/type1: Add vSVA support for IOMMU-backed mdevs

2020-07-20 Thread Auger Eric
Yi,

On 7/12/20 1:21 PM, Liu Yi L wrote:
> Recent years, mediated device pass-through framework (e.g. vfio-mdev)
> is used to achieve flexible device sharing across domains (e.g. VMs).
> Also there are hardware assisted mediated pass-through solutions from
> platform vendors. e.g. Intel VT-d scalable mode which supports Intel
> Scalable I/O Virtualization technology. Such mdevs are called IOMMU-
> backed mdevs as there are IOMMU enforced DMA isolation for such mdevs.
there is IOMMU enforced DMA isolation
> In kernel, IOMMU-backed mdevs are exposed to IOMMU layer by aux-domain
> concept, which means mdevs are protected by an iommu domain which is
> auxiliary to the domain that the kernel driver primarily uses for DMA
> API. Details can be found in the KVM presentation as below:
> 
> https://events19.linuxfoundation.org/wp-content/uploads/2017/12/\
> Hardware-Assisted-Mediated-Pass-Through-with-VFIO-Kevin-Tian-Intel.pdf
> 
> This patch extends NESTING_IOMMU ops to IOMMU-backed mdev devices. The
> main requirement is to use the auxiliary domain associated with mdev.

So as a result vSVM becomes functional for scalable mode mediated
devices, right?
> 
> Cc: Kevin Tian 
> CC: Jacob Pan 
> CC: Jun Tian 
> Cc: Alex Williamson 
> Cc: Eric Auger 
> Cc: Jean-Philippe Brucker 
> Cc: Joerg Roedel 
> Cc: Lu Baolu 
> Signed-off-by: Liu Yi L 
> ---
> v1 -> v2:
> *) check the iommu_device to ensure the handling mdev is IOMMU-backed
> ---
>  drivers/vfio/vfio_iommu_type1.c | 39 +++
>  1 file changed, 35 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index 960cc59..f1f1ae2 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -2373,20 +2373,41 @@ static int vfio_iommu_resv_refresh(struct vfio_iommu 
> *iommu,
>   return ret;
>  }
>  
> +static struct device *vfio_get_iommu_device(struct vfio_group *group,
> + struct device *dev)
> +{
> + if (group->mdev_group)
> + return vfio_mdev_get_iommu_device(dev);
> + else
> + return dev;
> +}
> +
>  static int vfio_dev_bind_gpasid_fn(struct device *dev, void *data)
>  {
>   struct domain_capsule *dc = (struct domain_capsule *)data;
>   unsigned long arg = *(unsigned long *)dc->data;
> + struct device *iommu_device;
> +
> + iommu_device = vfio_get_iommu_device(dc->group, dev);
> + if (!iommu_device)
> + return -EINVAL;
>  
> - return iommu_sva_bind_gpasid(dc->domain, dev, (void __user *)arg);
> + return iommu_sva_bind_gpasid(dc->domain, iommu_device,
> +  (void __user *)arg);
>  }
>  
>  static int vfio_dev_unbind_gpasid_fn(struct device *dev, void *data)
>  {
>   struct domain_capsule *dc = (struct domain_capsule *)data;
>   unsigned long arg = *(unsigned long *)dc->data;
> + struct device *iommu_device;
>  
> - iommu_sva_unbind_gpasid(dc->domain, dev, (void __user *)arg);
> + iommu_device = vfio_get_iommu_device(dc->group, dev);
> + if (!iommu_device)
> + return -EINVAL;
> +
> + iommu_sva_unbind_gpasid(dc->domain, iommu_device,
> + (void __user *)arg);
>   return 0;
>  }
>  
> @@ -2395,8 +2416,13 @@ static int __vfio_dev_unbind_gpasid_fn(struct device 
> *dev, void *data)
>   struct domain_capsule *dc = (struct domain_capsule *)data;
>   struct iommu_gpasid_bind_data *unbind_data =
>   (struct iommu_gpasid_bind_data *)dc->data;
> + struct device *iommu_device;
> +
> + iommu_device = vfio_get_iommu_device(dc->group, dev);
> + if (!iommu_device)
> + return -EINVAL;
>  
> - __iommu_sva_unbind_gpasid(dc->domain, dev, unbind_data);
> + __iommu_sva_unbind_gpasid(dc->domain, iommu_device, unbind_data);
>   return 0;
>  }
>  
> @@ -3077,8 +3103,13 @@ static int vfio_dev_cache_invalidate_fn(struct device 
> *dev, void *data)
>  {
>   struct domain_capsule *dc = (struct domain_capsule *)data;
>   unsigned long arg = *(unsigned long *)dc->data;
> + struct device *iommu_device;
> +
> + iommu_device = vfio_get_iommu_device(dc->group, dev);
> + if (!iommu_device)
> + return -EINVAL;
>  
> - iommu_cache_invalidate(dc->domain, dev, (void __user *)arg);
> + iommu_cache_invalidate(dc->domain, iommu_device, (void __user *)arg);
>   return 0;
>  }
>  
> 
Besides,

Looks grood to me

Reviewed-by: Eric Auger 

Eric

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v11 0/5] NVIDIA ARM SMMU Implementation

2020-07-20 Thread Will Deacon
On Sat, 18 Jul 2020 12:34:52 -0700, Krishna Reddy wrote:
> Changes in v11:
> Addressed Rob comment on DT binding patch to set min/maxItems of reg property 
> in else part.
> Rebased on top of 
> https://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git/log/?h=for-joerg/arm-smmu/updates.
> 
> Changes in v10:
> Perform SMMU base ioremap before calling implementation init.
> Check for Global faults across both ARM MMU-500s during global interrupt.
> Check for context faults across all contexts of both ARM MMU-500s during 
> context fault interrupt.
> Add new DT binding nvidia,smmu-500 for NVIDIA implementation.
> https://lkml.org/lkml/2020/7/8/57
> 
> [...]

Applied to will (for-joerg/arm-smmu/updates), thanks!

[1/5] iommu/arm-smmu: move TLB timeout and spin count macros
  https://git.kernel.org/will/c/cd8479cf0de9
[2/5] iommu/arm-smmu: ioremap smmu mmio region before implementation init
  https://git.kernel.org/will/c/6c019f4e697e
[3/5] iommu/arm-smmu: add NVIDIA implementation for ARM MMU-500 usage
  https://git.kernel.org/will/c/aab5a1c88276
[4/5] dt-bindings: arm-smmu: add binding for Tegra194 SMMU
  https://git.kernel.org/will/c/3d2deb0cdb69
[5/5] iommu/arm-smmu: Add global/context fault implementation hooks
  https://git.kernel.org/will/c/aa7ec73297df

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: arm64: Internal error: Oops: qcom_iommu_tlb_inv_context free_io_pgtable_ops on db410c

2020-07-20 Thread Robin Murphy

On 2020-07-20 08:17, Arnd Bergmann wrote:

On Mon, Jul 20, 2020 at 8:36 AM Naresh Kamboju
 wrote:


This kernel oops while boot linux mainline kernel on arm64  db410c device.

metadata:
   git branch: master
   git repo: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
   git commit: f8456690ba8eb18ea4714e68554e242a04f65cff
   git describe: v5.8-rc5-48-gf8456690ba8e
   make_kernelversion: 5.8.0-rc5
   kernel-config:
https://builds.tuxbuild.com/2aLnwV7BLStU0t1R1QPwHQ/kernel.config


Thanks for the report. Adding freedreno folks to Cc, as this may have something
to do with that driver.



[5.444121] Unable to handle kernel NULL pointer dereference at
virtual address 0018
[5.456615]   ESR = 0x9604
[5.464471]   SET = 0, FnV = 0
[5.464487]   EA = 0, S1PTW = 0
[5.466521] Data abort info:
[5.469971]   ISV = 0, ISS = 0x0004
[5.472768]   CM = 0, WnR = 0
[5.476172] user pgtable: 4k pages, 48-bit VAs, pgdp=bacba000
[5.479349] [0018] pgd=, p4d=
[5.485820] Internal error: Oops: 9604 [#1] PREEMPT SMP
[5.492448] Modules linked in: crct10dif_ce adv7511(+)
qcom_spmi_temp_alarm cec msm(+) mdt_loader qcom_camss videobuf2_dma_sg
drm_kms_helper v4l2_fwnode videobuf2_memops videobuf2_v4l2 qcom_rng
videobuf2_common i2c_qcom_cci display_connector socinfo drm qrtr ns
rmtfs_mem fuse
[5.500256] CPU: 0 PID: 286 Comm: systemd-udevd Not tainted 5.8.0-rc5 #1
[5.522484] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
[5.529170] pstate: 2005 (nzCv daif -PAN -UAO BTYPE=--)
[5.535856] pc : qcom_iommu_tlb_inv_context+0x18/0xa8
[5.541148] lr : free_io_pgtable_ops+0x28/0x58
[5.546350] sp : 80001219b5f0
[5.550689] x29: 80001219b5f0 x28: 0013
[5.554078] x27: 0100 x26: 36add3b8
[5.559459] x25: 8915e910 x24: 3a5458c0
[5.564753] x23: 0003 x22: 36a37058
[5.570049] x21: 36a3a100 x20: 36a3a480
[5.575344] x19: 36a37158 x18: 
[5.580639] x17:  x16: 
[5.585935] x15: 0004 x14: 0368
[5.591229] x13:  x12: 39c61798
[5.596525] x11: 39c616d0 x10: 4000
[5.601820] x9 :  x8 : 39c616f8
[5.607114] x7 :  x6 : 09f699a0
[5.612410] x5 : 80001219b520 x4 : 36a3a000
[5.617705] x3 : 09f69904 x2 : 
[5.623001] x1 : 8000107e27e8 x0 : 3a545810
[5.628297] Call trace:
[5.633592]  qcom_iommu_tlb_inv_context+0x18/0xa8


This means that dev_iommu_fwspec_get() has returned NULL
in qcom_iommu_tlb_inv_context(), either because dev->iommu
is NULL, or because dev->iommu->fwspec is NULL.

qcom_iommu_tlb_inv_context() does not check for a NULL
pointer before using the returned object.

The bug is either in the lack of error handling, or the fact
that it's possible to get into this function for a device
that has not been fully set up.


Not quite - the device *was* properly set up, but has already been 
properly torn down again in the removal path by iommu_release_device(). 
The problem is that qcom-iommu kept the device pointer as its TLB cookie 
for the domain, but the domain has a longer lifespan than the validity 
of that device - that's a fundamental design flaw in the driver.


Robin.


[5.635764]  free_io_pgtable_ops+0x28/0x58
[5.640624]  qcom_iommu_domain_free+0x38/0x60
[5.644617]  iommu_group_release+0x4c/0x70
[5.649045]  kobject_put+0x6c/0x120
[5.653035]  kobject_del+0x64/0x90
[5.656421]  kobject_put+0xfc/0x120
[5.659893]  iommu_group_remove_device+0xdc/0xf0
[5.663281]  iommu_release_device+0x44/0x70
[5.668142]  iommu_bus_notifier+0xbc/0xd0
[5.672048]  notifier_call_chain+0x54/0x98
[5.676214]  blocking_notifier_call_chain+0x48/0x70
[5.680209]  device_del+0x26c/0x3a0
[5.684981]  platform_device_del.part.0+0x1c/0x88
[5.688453]  platform_device_unregister+0x24/0x40
[5.693316]  of_platform_device_destroy+0xe4/0xf8
[5.698002]  device_for_each_child+0x5c/0xa8
[5.702689]  of_platform_depopulate+0x3c/0x80
[5.707144]  msm_pdev_probe+0x1c4/0x308 [msm]


It was triggered by a failure in msm_pdev_probe(), which was
calls of_platform_depopulate() in its error handling code.
This is a combination of two problems:

a) Whatever caused msm_pdev_probe() to fail means that
the gpu won't be usable, though it should not have caused the
kernel to crash.

b) the error handling itself causing additional problems due
to failed unwinding.


[5.711286]  platform_drv_probe+0x54/0xa8
[5.715624]  really_probe+0xd8/0x320
[5.719617]  driver_probe_device+0x58/0xb8
[5.723263]  device_driver_attach+0x74/0x80
[5.727168]  __driver_attach+0x58/0xe0
[5.731248]  bus_for_each_d

Re: [PATCH v8 00/12] iommu: Shared Virtual Addressing for SMMUv3 (PT sharing part)

2020-07-20 Thread Will Deacon
On Thu, Jun 18, 2020 at 05:51:13PM +0200, Jean-Philippe Brucker wrote:
> Since v7 [1], I split the series into three parts to ease review. This
> first one adds page table sharing to the SMMUv3 driver. The second one
> adds support for I/O page faults through PRI and Stall, and the last one
> adds additional and optional features (DVM, VHE and HTTU). SVA needs the
> three parts to work. No significant change apart from that, I just
> addressed the previous comments.
> 
> I'd rather everything went through the IOMMU tree but I'm assuming patch
> 1 will also go through the x86 tree as part of [2]. It is definitely
> required by patch 3 which is required by patch 11. I don't know how this
> kind of conflict is usually resolved, but if it's a problem I could
> further shrink the series to only patches 4-10 this cycle.

Modulo my review comments, I think most of this looks alright from the SMMU
side. However, I would really like it if the SVA driver parts could be moved
into a separate file (e.g. arm-smmu-v3-sva.c), with a separate config option
(dependent on the current one, so you can easily build a driver without SVA
support). Does that sound remotely feasible? If so, I think it would really
help in terms of maintainability, since the SVA model is really all about
the mm, whereas the driver model is all about the device. This makes it
really hard to read when you have to keep working out whether the current
'handle' is an mm_struct or an arm_smmu_device.

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


RE: [PATCH v5 11/15] vfio/type1: Allow invalidating first-level/stage IOMMU cache

2020-07-20 Thread Liu, Yi L
Hi Eric,

> From: Auger Eric 
> Sent: Monday, July 20, 2020 5:42 PM
> 
> Yi,
> 
> On 7/12/20 1:21 PM, Liu Yi L wrote:
> > This patch provides an interface allowing the userspace to invalidate
> > IOMMU cache for first-level page table. It is required when the first
> > level IOMMU page table is not managed by the host kernel in the nested
> > translation setup.
> >
> > Cc: Kevin Tian 
> > CC: Jacob Pan 
> > Cc: Alex Williamson 
> > Cc: Eric Auger 
> > Cc: Jean-Philippe Brucker 
> > Cc: Joerg Roedel 
> > Cc: Lu Baolu 
> > Signed-off-by: Liu Yi L 
> > Signed-off-by: Eric Auger 
> > Signed-off-by: Jacob Pan 
> > ---
> > v1 -> v2:
> > *) rename from "vfio/type1: Flush stage-1 IOMMU cache for nesting type"
> > *) rename vfio_cache_inv_fn() to vfio_dev_cache_invalidate_fn()
> > *) vfio_dev_cache_inv_fn() always successful
> > *) remove VFIO_IOMMU_CACHE_INVALIDATE, and reuse
> VFIO_IOMMU_NESTING_OP
> > ---
> >  drivers/vfio/vfio_iommu_type1.c | 50
> +
> >  include/uapi/linux/vfio.h   |  3 +++
> >  2 files changed, 53 insertions(+)
> >
> > diff --git a/drivers/vfio/vfio_iommu_type1.c 
> > b/drivers/vfio/vfio_iommu_type1.c
> > index f0f21ff..960cc59 100644
> > --- a/drivers/vfio/vfio_iommu_type1.c
> > +++ b/drivers/vfio/vfio_iommu_type1.c
> > @@ -3073,6 +3073,53 @@ static long vfio_iommu_handle_pgtbl_op(struct
> vfio_iommu *iommu,
> > return ret;
> >  }
> >
> > +static int vfio_dev_cache_invalidate_fn(struct device *dev, void *data)
> > +{
> > +   struct domain_capsule *dc = (struct domain_capsule *)data;
> > +   unsigned long arg = *(unsigned long *)dc->data;
> > +
> > +   iommu_cache_invalidate(dc->domain, dev, (void __user *)arg);
> > +   return 0;
> > +}
> > +
> > +static long vfio_iommu_invalidate_cache(struct vfio_iommu *iommu,
> > +   unsigned long arg)
> > +{
> > +   struct domain_capsule dc = { .data = &arg };
> > +   struct vfio_group *group;
> > +   struct vfio_domain *domain;
> > +   int ret = 0;
> > +   struct iommu_nesting_info *info;
> > +
> > +   mutex_lock(&iommu->lock);
> > +   /*
> > +* Cache invalidation is required for any nesting IOMMU,
> > +* so no need to check system-wide PASID support.
> > +*/
> > +   info = iommu->nesting_info;
> > +   if (!info || !(info->features & IOMMU_NESTING_FEAT_CACHE_INVLD)) {
> > +   ret = -EOPNOTSUPP;
> > +   goto out_unlock;
> > +   }
> > +
> > +   group = vfio_find_nesting_group(iommu);
> so I see you reuse it here. But still wondering if you cant't directly
> set dc.domain and dc.group group below using list_firt_entry?

I guess yes for current implementation. I also considered if I can
get a helper function to retrun a dc with group and domain field
initialized as it is common code used by both bind/unbind and cache_inv
path. perhaps something like get_domain_capsule_for_nesting()

> > +   if (!group) {
> > +   ret = -EINVAL;
> > +   goto out_unlock;
> > +   }
> > +
> > +   domain = list_first_entry(&iommu->domain_list,
> > + struct vfio_domain, next);
> > +   dc.group = group;
> > +   dc.domain = domain->domain;
> > +   iommu_group_for_each_dev(group->iommu_group, &dc,
> > +vfio_dev_cache_invalidate_fn);
> > +
> > +out_unlock:
> > +   mutex_unlock(&iommu->lock);
> > +   return ret;
> > +}
> > +
> >  static long vfio_iommu_type1_nesting_op(struct vfio_iommu *iommu,
> > unsigned long arg)
> >  {
> > @@ -3095,6 +3142,9 @@ static long vfio_iommu_type1_nesting_op(struct
> vfio_iommu *iommu,
> > case VFIO_IOMMU_NESTING_OP_UNBIND_PGTBL:
> > ret = vfio_iommu_handle_pgtbl_op(iommu, false, arg + minsz);
> > break;
> > +   case VFIO_IOMMU_NESTING_OP_CACHE_INVLD:
> > +   ret = vfio_iommu_invalidate_cache(iommu, arg + minsz);
> > +   break;
> > default:
> > ret = -EINVAL;
> > }
> > diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> > index a8ad786..845a5800 100644
> > --- a/include/uapi/linux/vfio.h
> > +++ b/include/uapi/linux/vfio.h
> > @@ -1225,6 +1225,8 @@ struct vfio_iommu_type1_pasid_request {
> >   * +-+---+
> >   * | UNBIND_PGTBL|  struct iommu_gpasid_bind_data|
> >   * +-+---+
> > + * | CACHE_INVLD |  struct iommu_cache_invalidate_info   |
> > + * +-+---+
> >   *
> >   * returns: 0 on success, -errno on failure.
> >   */
> > @@ -1237,6 +1239,7 @@ struct vfio_iommu_type1_nesting_op {
> >
> >  #define VFIO_IOMMU_NESTING_OP_BIND_PGTBL   (0)
> >  #define VFIO_IOMMU_NESTING_OP_UNBIND_PGTBL (1)
> > +#define VFIO_IOMMU_NESTING_OP_CACHE_INVLD  (2)
> >
> >  #define VFIO_IOMMU_NESTING_OP  _IO(VFIO_TYPE, VFIO_BASE + 19)
> >
> >
> Otherwise looks good to me

th

RE: [PATCH v5 10/15] vfio/type1: Support binding guest page tables to PASID

2020-07-20 Thread Liu, Yi L
Hi Eric,

> From: Auger Eric 
> Sent: Monday, July 20, 2020 5:37 PM
> 
> Yi,
> 
> On 7/12/20 1:21 PM, Liu Yi L wrote:
> > Nesting translation allows two-levels/stages page tables, with 1st level
> > for guest translations (e.g. GVA->GPA), 2nd level for host translations
> > (e.g. GPA->HPA). This patch adds interface for binding guest page tables
> > to a PASID. This PASID must have been allocated to user space before the
> by the userspace?

yes, it is. will modify it.

> > binding request.
> >
> > Cc: Kevin Tian 
> > CC: Jacob Pan 
> > Cc: Alex Williamson 
> > Cc: Eric Auger 
> > Cc: Jean-Philippe Brucker 
> > Cc: Joerg Roedel 
> > Cc: Lu Baolu 
> > Signed-off-by: Jean-Philippe Brucker 
> > Signed-off-by: Liu Yi L 
> > Signed-off-by: Jacob Pan 
> > ---
> > v3 -> v4:
> > *) address comments from Alex on v3
> >
> > v2 -> v3:
> > *) use __iommu_sva_unbind_gpasid() for unbind call issued by VFIO
> >https://lore.kernel.org/linux-iommu/1592931837-58223-6-git-send-email-
> jacob.jun@linux.intel.com/
> >
> > v1 -> v2:
> > *) rename subject from "vfio/type1: Bind guest page tables to host"
> > *) remove VFIO_IOMMU_BIND, introduce VFIO_IOMMU_NESTING_OP to support
> bind/
> >unbind guet page table
> > *) replaced vfio_iommu_for_each_dev() with a group level loop since this
> >series enforces one group per container w/ nesting type as start.
> > *) rename vfio_bind/unbind_gpasid_fn() to vfio_dev_bind/unbind_gpasid_fn()
> > *) vfio_dev_unbind_gpasid() always successful
> > *) use vfio_mm->pasid_lock to avoid race between PASID free and page table
> >bind/unbind
> > ---
> >  drivers/vfio/vfio_iommu_type1.c | 166
> 
> >  drivers/vfio/vfio_pasid.c   |  26 +++
> >  include/linux/vfio.h|  20 +
> >  include/uapi/linux/vfio.h   |  31 
> >  4 files changed, 243 insertions(+)
> >
> > diff --git a/drivers/vfio/vfio_iommu_type1.c 
> > b/drivers/vfio/vfio_iommu_type1.c
> > index 55b4065..f0f21ff 100644
> > --- a/drivers/vfio/vfio_iommu_type1.c
> > +++ b/drivers/vfio/vfio_iommu_type1.c
> > @@ -149,6 +149,30 @@ struct vfio_regions {
> >  #define DIRTY_BITMAP_PAGES_MAX  ((u64)INT_MAX)
> >  #define DIRTY_BITMAP_SIZE_MAX
> DIRTY_BITMAP_BYTES(DIRTY_BITMAP_PAGES_MAX)
> >
> > +struct domain_capsule {
> > +   struct vfio_group *group;
> > +   struct iommu_domain *domain;
> > +   void *data;
> > +};
> > +
> > +/* iommu->lock must be held */
> > +static struct vfio_group *vfio_find_nesting_group(struct vfio_iommu *iommu)
> > +{
> > +   struct vfio_domain *d;
> > +   struct vfio_group *group = NULL;
> > +
> > +   if (!iommu->nesting_info)
> > +   return NULL;
> > +
> > +   /* only support singleton container with nesting type */
> > +   list_for_each_entry(d, &iommu->domain_list, next) {
> > +   list_for_each_entry(group, &d->group_list, next) {
> > +   break;
> use list_first_entry?

yeah, based on the discussion in below link, we only support singleton
container with nesting type, also if no group in container, the nesting_info
will be cleared. so yes, list_first_entry will work as well.

https://lkml.org/lkml/2020/5/15/1028

> > +   }
> > +   }
> > +   return group;
> > +}
> > +
> >  static int put_pfn(unsigned long pfn, int prot);
> >
> >  static struct vfio_group *vfio_iommu_find_iommu_group(struct vfio_iommu
> *iommu,
> > @@ -2349,6 +2373,48 @@ static int vfio_iommu_resv_refresh(struct vfio_iommu
> *iommu,
> > return ret;
> >  }
> >
> > +static int vfio_dev_bind_gpasid_fn(struct device *dev, void *data)
> > +{
> > +   struct domain_capsule *dc = (struct domain_capsule *)data;
> > +   unsigned long arg = *(unsigned long *)dc->data;
> > +
> > +   return iommu_sva_bind_gpasid(dc->domain, dev, (void __user *)arg);
> > +}
> > +
> > +static int vfio_dev_unbind_gpasid_fn(struct device *dev, void *data)
> > +{
> > +   struct domain_capsule *dc = (struct domain_capsule *)data;
> > +   unsigned long arg = *(unsigned long *)dc->data;
> > +
> > +   iommu_sva_unbind_gpasid(dc->domain, dev, (void __user *)arg);
> > +   return 0;
> > +}
> > +
> > +static int __vfio_dev_unbind_gpasid_fn(struct device *dev, void *data)
> > +{
> > +   struct domain_capsule *dc = (struct domain_capsule *)data;
> > +   struct iommu_gpasid_bind_data *unbind_data =
> > +   (struct iommu_gpasid_bind_data *)dc->data;
> > +
> > +   __iommu_sva_unbind_gpasid(dc->domain, dev, unbind_data);
> > +   return 0;
> > +}
> > +
> > +static void vfio_group_unbind_gpasid_fn(ioasid_t pasid, void *data)
> > +{
> > +   struct domain_capsule *dc = (struct domain_capsule *)data;
> > +   struct iommu_gpasid_bind_data unbind_data;
> > +
> > +   unbind_data.argsz = offsetof(struct iommu_gpasid_bind_data, vendor);
> > +   unbind_data.flags = 0;
> > +   unbind_data.hpasid = pasid;
> > +
> > +   dc->data = &unbind_data;
> > +
> > +   iommu_group_for_each_dev(dc->group->iommu_group,
> > +dc, __vfio_dev_unbin

RE: [PATCH v5 09/15] iommu/vt-d: Check ownership for PASIDs from user-space

2020-07-20 Thread Liu, Yi L
Hi Eric,

> From: Auger Eric 
> Sent: Monday, July 20, 2020 12:06 AM
> 
> Hi Yi,
> 
> On 7/12/20 1:21 PM, Liu Yi L wrote:
> > When an IOMMU domain with nesting attribute is used for guest SVA, a
> > system-wide PASID is allocated for binding with the device and the domain.
> > For security reason, we need to check the PASID passsed from user-space.
> passed

got it.

> > e.g. page table bind/unbind and PASID related cache invalidation.
> >
> > Cc: Kevin Tian 
> > CC: Jacob Pan 
> > Cc: Alex Williamson 
> > Cc: Eric Auger 
> > Cc: Jean-Philippe Brucker 
> > Cc: Joerg Roedel 
> > Cc: Lu Baolu 
> > Signed-off-by: Liu Yi L 
> > Signed-off-by: Jacob Pan 
> > ---
> >  drivers/iommu/intel/iommu.c | 10 ++
> >  drivers/iommu/intel/svm.c   |  7 +--
> >  2 files changed, 15 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> > index 4d54198..a9504cb 100644
> > --- a/drivers/iommu/intel/iommu.c
> > +++ b/drivers/iommu/intel/iommu.c
> > @@ -5436,6 +5436,7 @@ intel_iommu_sva_invalidate(struct iommu_domain
> *domain, struct device *dev,
> > int granu = 0;
> > u64 pasid = 0;
> > u64 addr = 0;
> > +   void *pdata;
> >
> > granu = to_vtd_granularity(cache_type, inv_info->granularity);
> > if (granu == -EINVAL) {
> > @@ -5456,6 +5457,15 @@ intel_iommu_sva_invalidate(struct iommu_domain
> *domain, struct device *dev,
> >  (inv_info->granu.addr_info.flags &
> IOMMU_INV_ADDR_FLAGS_PASID))
> > pasid = inv_info->granu.addr_info.pasid;
> >
> > +   pdata = ioasid_find(dmar_domain->ioasid_sid, pasid, NULL);
> > +   if (!pdata) {
> > +   ret = -EINVAL;
> > +   goto out_unlock;
> > +   } else if (IS_ERR(pdata)) {
> > +   ret = PTR_ERR(pdata);
> > +   goto out_unlock;
> > +   }
> > +
> > switch (BIT(cache_type)) {
> > case IOMMU_CACHE_INV_TYPE_IOTLB:
> > /* HW will ignore LSB bits based on address mask */
> > diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
> > index d2c0e1a..212dee0 100644
> > --- a/drivers/iommu/intel/svm.c
> > +++ b/drivers/iommu/intel/svm.c
> > @@ -319,7 +319,7 @@ int intel_svm_bind_gpasid(struct iommu_domain *domain,
> struct device *dev,
> > dmar_domain = to_dmar_domain(domain);
> >
> > mutex_lock(&pasid_mutex);
> > -   svm = ioasid_find(INVALID_IOASID_SET, data->hpasid, NULL);
> I do not get what the call was supposed to do before that patch?

you mean patch 10/15 by "that patch", right? the ownership check should
be done as to prevent illegal bind request from userspace. before patch
10/15, it should be added.

> > +   svm = ioasid_find(dmar_domain->ioasid_sid, data->hpasid, NULL);
> > if (IS_ERR(svm)) {
> > ret = PTR_ERR(svm);
> > goto out;
> > @@ -436,6 +436,7 @@ int intel_svm_unbind_gpasid(struct iommu_domain
> *domain,
> > struct device *dev, ioasid_t pasid)
> >  {
> > struct intel_iommu *iommu = intel_svm_device_to_iommu(dev);
> > +   struct dmar_domain *dmar_domain;
> > struct intel_svm_dev *sdev;
> > struct intel_svm *svm;
> > int ret = -EINVAL;
> > @@ -443,8 +444,10 @@ int intel_svm_unbind_gpasid(struct iommu_domain
> *domain,
> > if (WARN_ON(!iommu))
> > return -EINVAL;
> >
> > +   dmar_domain = to_dmar_domain(domain);
> > +
> > mutex_lock(&pasid_mutex);
> > -   svm = ioasid_find(INVALID_IOASID_SET, pasid, NULL);
> > +   svm = ioasid_find(dmar_domain->ioasid_sid, pasid, NULL);
> just to make sure, about the locking, can't domain->ioasid_sid change
> under the hood?

I guess not. intel_svm_unbind_gpasid() and iommu_domain_set_attr()
is called by vfio today, and within vfio, there is vfio_iommu->lock.

Regards,
Yi Liu

> 
> Thanks
> 
> Eric
> > if (!svm) {
> > ret = -EINVAL;
> > goto out;
> >

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v5 11/15] vfio/type1: Allow invalidating first-level/stage IOMMU cache

2020-07-20 Thread Auger Eric
Yi,

On 7/12/20 1:21 PM, Liu Yi L wrote:
> This patch provides an interface allowing the userspace to invalidate
> IOMMU cache for first-level page table. It is required when the first
> level IOMMU page table is not managed by the host kernel in the nested
> translation setup.
> 
> Cc: Kevin Tian 
> CC: Jacob Pan 
> Cc: Alex Williamson 
> Cc: Eric Auger 
> Cc: Jean-Philippe Brucker 
> Cc: Joerg Roedel 
> Cc: Lu Baolu 
> Signed-off-by: Liu Yi L 
> Signed-off-by: Eric Auger 
> Signed-off-by: Jacob Pan 
> ---
> v1 -> v2:
> *) rename from "vfio/type1: Flush stage-1 IOMMU cache for nesting type"
> *) rename vfio_cache_inv_fn() to vfio_dev_cache_invalidate_fn()
> *) vfio_dev_cache_inv_fn() always successful
> *) remove VFIO_IOMMU_CACHE_INVALIDATE, and reuse VFIO_IOMMU_NESTING_OP
> ---
>  drivers/vfio/vfio_iommu_type1.c | 50 
> +
>  include/uapi/linux/vfio.h   |  3 +++
>  2 files changed, 53 insertions(+)
> 
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index f0f21ff..960cc59 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -3073,6 +3073,53 @@ static long vfio_iommu_handle_pgtbl_op(struct 
> vfio_iommu *iommu,
>   return ret;
>  }
>  
> +static int vfio_dev_cache_invalidate_fn(struct device *dev, void *data)
> +{
> + struct domain_capsule *dc = (struct domain_capsule *)data;
> + unsigned long arg = *(unsigned long *)dc->data;
> +
> + iommu_cache_invalidate(dc->domain, dev, (void __user *)arg);
> + return 0;
> +}
> +
> +static long vfio_iommu_invalidate_cache(struct vfio_iommu *iommu,
> + unsigned long arg)
> +{
> + struct domain_capsule dc = { .data = &arg };
> + struct vfio_group *group;
> + struct vfio_domain *domain;
> + int ret = 0;
> + struct iommu_nesting_info *info;
> +
> + mutex_lock(&iommu->lock);
> + /*
> +  * Cache invalidation is required for any nesting IOMMU,
> +  * so no need to check system-wide PASID support.
> +  */
> + info = iommu->nesting_info;
> + if (!info || !(info->features & IOMMU_NESTING_FEAT_CACHE_INVLD)) {
> + ret = -EOPNOTSUPP;
> + goto out_unlock;
> + }
> +
> + group = vfio_find_nesting_group(iommu);
so I see you reuse it here. But still wondering if you cant't directly
set dc.domain and dc.group group below using list_firt_entry?
> + if (!group) {
> + ret = -EINVAL;
> + goto out_unlock;
> + }
> +
> + domain = list_first_entry(&iommu->domain_list,
> +   struct vfio_domain, next);
> + dc.group = group;
> + dc.domain = domain->domain;
> + iommu_group_for_each_dev(group->iommu_group, &dc,
> +  vfio_dev_cache_invalidate_fn);
> +
> +out_unlock:
> + mutex_unlock(&iommu->lock);
> + return ret;
> +}
> +
>  static long vfio_iommu_type1_nesting_op(struct vfio_iommu *iommu,
>   unsigned long arg)
>  {
> @@ -3095,6 +3142,9 @@ static long vfio_iommu_type1_nesting_op(struct 
> vfio_iommu *iommu,
>   case VFIO_IOMMU_NESTING_OP_UNBIND_PGTBL:
>   ret = vfio_iommu_handle_pgtbl_op(iommu, false, arg + minsz);
>   break;
> + case VFIO_IOMMU_NESTING_OP_CACHE_INVLD:
> + ret = vfio_iommu_invalidate_cache(iommu, arg + minsz);
> + break;
>   default:
>   ret = -EINVAL;
>   }
> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> index a8ad786..845a5800 100644
> --- a/include/uapi/linux/vfio.h
> +++ b/include/uapi/linux/vfio.h
> @@ -1225,6 +1225,8 @@ struct vfio_iommu_type1_pasid_request {
>   * +-+---+
>   * | UNBIND_PGTBL|  struct iommu_gpasid_bind_data|
>   * +-+---+
> + * | CACHE_INVLD |  struct iommu_cache_invalidate_info   |
> + * +-+---+
>   *
>   * returns: 0 on success, -errno on failure.
>   */
> @@ -1237,6 +1239,7 @@ struct vfio_iommu_type1_nesting_op {
>  
>  #define VFIO_IOMMU_NESTING_OP_BIND_PGTBL (0)
>  #define VFIO_IOMMU_NESTING_OP_UNBIND_PGTBL   (1)
> +#define VFIO_IOMMU_NESTING_OP_CACHE_INVLD(2)
>  
>  #define VFIO_IOMMU_NESTING_OP_IO(VFIO_TYPE, VFIO_BASE + 19)
>  
> 
Otherwise looks good to me

Thanks

Eric

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v5 10/15] vfio/type1: Support binding guest page tables to PASID

2020-07-20 Thread Auger Eric
Yi,

On 7/12/20 1:21 PM, Liu Yi L wrote:
> Nesting translation allows two-levels/stages page tables, with 1st level
> for guest translations (e.g. GVA->GPA), 2nd level for host translations
> (e.g. GPA->HPA). This patch adds interface for binding guest page tables
> to a PASID. This PASID must have been allocated to user space before the
by the userspace?
> binding request.
> 
> Cc: Kevin Tian 
> CC: Jacob Pan 
> Cc: Alex Williamson 
> Cc: Eric Auger 
> Cc: Jean-Philippe Brucker 
> Cc: Joerg Roedel 
> Cc: Lu Baolu 
> Signed-off-by: Jean-Philippe Brucker 
> Signed-off-by: Liu Yi L 
> Signed-off-by: Jacob Pan 
> ---
> v3 -> v4:
> *) address comments from Alex on v3
> 
> v2 -> v3:
> *) use __iommu_sva_unbind_gpasid() for unbind call issued by VFIO
>
> https://lore.kernel.org/linux-iommu/1592931837-58223-6-git-send-email-jacob.jun@linux.intel.com/
> 
> v1 -> v2:
> *) rename subject from "vfio/type1: Bind guest page tables to host"
> *) remove VFIO_IOMMU_BIND, introduce VFIO_IOMMU_NESTING_OP to support bind/
>unbind guet page table
> *) replaced vfio_iommu_for_each_dev() with a group level loop since this
>series enforces one group per container w/ nesting type as start.
> *) rename vfio_bind/unbind_gpasid_fn() to vfio_dev_bind/unbind_gpasid_fn()
> *) vfio_dev_unbind_gpasid() always successful
> *) use vfio_mm->pasid_lock to avoid race between PASID free and page table
>bind/unbind
> ---
>  drivers/vfio/vfio_iommu_type1.c | 166 
> 
>  drivers/vfio/vfio_pasid.c   |  26 +++
>  include/linux/vfio.h|  20 +
>  include/uapi/linux/vfio.h   |  31 
>  4 files changed, 243 insertions(+)
> 
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index 55b4065..f0f21ff 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -149,6 +149,30 @@ struct vfio_regions {
>  #define DIRTY_BITMAP_PAGES_MAX((u64)INT_MAX)
>  #define DIRTY_BITMAP_SIZE_MAX 
> DIRTY_BITMAP_BYTES(DIRTY_BITMAP_PAGES_MAX)
>  
> +struct domain_capsule {
> + struct vfio_group *group;
> + struct iommu_domain *domain;
> + void *data;
> +};
> +
> +/* iommu->lock must be held */
> +static struct vfio_group *vfio_find_nesting_group(struct vfio_iommu *iommu)
> +{
> + struct vfio_domain *d;
> + struct vfio_group *group = NULL;
> +
> + if (!iommu->nesting_info)
> + return NULL;
> +
> + /* only support singleton container with nesting type */
> + list_for_each_entry(d, &iommu->domain_list, next) {
> + list_for_each_entry(group, &d->group_list, next) {
> + break;
use list_first_entry?
> + }
> + }
> + return group;
> +}
> +
>  static int put_pfn(unsigned long pfn, int prot);
>  
>  static struct vfio_group *vfio_iommu_find_iommu_group(struct vfio_iommu 
> *iommu,
> @@ -2349,6 +2373,48 @@ static int vfio_iommu_resv_refresh(struct vfio_iommu 
> *iommu,
>   return ret;
>  }
>  
> +static int vfio_dev_bind_gpasid_fn(struct device *dev, void *data)
> +{
> + struct domain_capsule *dc = (struct domain_capsule *)data;
> + unsigned long arg = *(unsigned long *)dc->data;
> +
> + return iommu_sva_bind_gpasid(dc->domain, dev, (void __user *)arg);
> +}
> +
> +static int vfio_dev_unbind_gpasid_fn(struct device *dev, void *data)
> +{
> + struct domain_capsule *dc = (struct domain_capsule *)data;
> + unsigned long arg = *(unsigned long *)dc->data;
> +
> + iommu_sva_unbind_gpasid(dc->domain, dev, (void __user *)arg);
> + return 0;
> +}
> +
> +static int __vfio_dev_unbind_gpasid_fn(struct device *dev, void *data)
> +{
> + struct domain_capsule *dc = (struct domain_capsule *)data;
> + struct iommu_gpasid_bind_data *unbind_data =
> + (struct iommu_gpasid_bind_data *)dc->data;
> +
> + __iommu_sva_unbind_gpasid(dc->domain, dev, unbind_data);
> + return 0;
> +}
> +
> +static void vfio_group_unbind_gpasid_fn(ioasid_t pasid, void *data)
> +{
> + struct domain_capsule *dc = (struct domain_capsule *)data;
> + struct iommu_gpasid_bind_data unbind_data;
> +
> + unbind_data.argsz = offsetof(struct iommu_gpasid_bind_data, vendor);
> + unbind_data.flags = 0;
> + unbind_data.hpasid = pasid;
> +
> + dc->data = &unbind_data;
> +
> + iommu_group_for_each_dev(dc->group->iommu_group,
> +  dc, __vfio_dev_unbind_gpasid_fn);
> +}
> +
>  static void vfio_iommu_type1_detach_group(void *iommu_data,
> struct iommu_group *iommu_group)
>  {
> @@ -2392,6 +2458,21 @@ static void vfio_iommu_type1_detach_group(void 
> *iommu_data,
>   if (!group)
>   continue;
>  
> + if (iommu->nesting_info && iommu->vmm &&
> + (iommu->nesting_info->features &
> + IOMMU_NESTING_FEAT_BIND_PGTBL)) {
> +  

RE: [PATCH v5 08/15] iommu: Pass domain to sva_unbind_gpasid()

2020-07-20 Thread Liu, Yi L
Hi Eric,

> From: Auger Eric 
> Sent: Sunday, July 19, 2020 11:38 PM
> 
> Yi,
> 
> On 7/12/20 1:21 PM, Liu Yi L wrote:
> > From: Yi Sun 
> >
> > Current interface is good enough for SVA virtualization on an assigned
> > physical PCI device, but when it comes to mediated devices, a physical
> > device may attached with multiple aux-domains. Also, for guest unbind,
> > the PASID to be unbind should be allocated to the VM. This check
> > requires to know the ioasid_set which is associated with the domain.
> >
> > So this interface needs to pass in domain info. Then the iommu driver
> > is able to know which domain will be used for the 2nd stage
> > translation of the nesting mode and also be able to do PASID ownership
> > check. This patch passes @domain per the above reason.
> >
> > Cc: Kevin Tian 
> > CC: Jacob Pan 
> > Cc: Alex Williamson 
> > Cc: Eric Auger 
> > Cc: Jean-Philippe Brucker 
> > Cc: Joerg Roedel 
> > Cc: Lu Baolu 
> > Signed-off-by: Yi Sun 
> > Signed-off-by: Liu Yi L 
> > ---
> > v2 -> v3:
> > *) pass in domain info only
> > *) use ioasid_t for pasid instead of int type
> >
> > v1 -> v2:
> > *) added in v2.
> > ---
> >  drivers/iommu/intel/svm.c   | 3 ++-
> >  drivers/iommu/iommu.c   | 2 +-
> >  include/linux/intel-iommu.h | 3 ++-
> >  include/linux/iommu.h   | 3 ++-
> >  4 files changed, 7 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
> > index b9a9c55..d2c0e1a 100644
> > --- a/drivers/iommu/intel/svm.c
> > +++ b/drivers/iommu/intel/svm.c
> > @@ -432,7 +432,8 @@ int intel_svm_bind_gpasid(struct iommu_domain *domain,
> struct device *dev,
> > return ret;
> >  }
> >
> > -int intel_svm_unbind_gpasid(struct device *dev, int pasid)
> > +int intel_svm_unbind_gpasid(struct iommu_domain *domain,
> > +   struct device *dev, ioasid_t pasid)
> int -> ioasid_t proto change is not described in the commit message,

oops, yes. btw. I noticed there is another thread which is going to use
u32 for pasid. perhaps I need to drop such change.

https://lore.kernel.org/linux-iommu/1594684087-61184-2-git-send-email-fenghua...@intel.com/#Z30drivers:iommu:iommu.c

> >  {
> > struct intel_iommu *iommu = intel_svm_device_to_iommu(dev);
> > struct intel_svm_dev *sdev;
> > diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index
> > 7910249..d3e554c 100644
> > --- a/drivers/iommu/iommu.c
> > +++ b/drivers/iommu/iommu.c
> > @@ -2151,7 +2151,7 @@ int __iommu_sva_unbind_gpasid(struct iommu_domain
> *domain, struct device *dev,
> > if (unlikely(!domain->ops->sva_unbind_gpasid))
> > return -ENODEV;
> >
> > -   return domain->ops->sva_unbind_gpasid(dev, data->hpasid);
> > +   return domain->ops->sva_unbind_gpasid(domain, dev, data->hpasid);
> >  }
> >  EXPORT_SYMBOL_GPL(__iommu_sva_unbind_gpasid);
> >
> > diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
> > index 0d0ab32..18f292e 100644
> > --- a/include/linux/intel-iommu.h
> > +++ b/include/linux/intel-iommu.h
> > @@ -738,7 +738,8 @@ extern int intel_svm_enable_prq(struct intel_iommu
> > *iommu);  extern int intel_svm_finish_prq(struct intel_iommu *iommu);
> > int intel_svm_bind_gpasid(struct iommu_domain *domain, struct device *dev,
> >   struct iommu_gpasid_bind_data *data); -int
> > intel_svm_unbind_gpasid(struct device *dev, int pasid);
> > +int intel_svm_unbind_gpasid(struct iommu_domain *domain,
> > +   struct device *dev, ioasid_t pasid);
> >  struct iommu_sva *intel_svm_bind(struct device *dev, struct mm_struct *mm,
> >  void *drvdata);
> >  void intel_svm_unbind(struct iommu_sva *handle); diff --git
> > a/include/linux/iommu.h b/include/linux/iommu.h index e84a1d5..ca5edd8
> > 100644
> > --- a/include/linux/iommu.h
> > +++ b/include/linux/iommu.h
> > @@ -303,7 +303,8 @@ struct iommu_ops {
> > int (*sva_bind_gpasid)(struct iommu_domain *domain,
> > struct device *dev, struct iommu_gpasid_bind_data 
> > *data);
> >
> > -   int (*sva_unbind_gpasid)(struct device *dev, int pasid);
> > +   int (*sva_unbind_gpasid)(struct iommu_domain *domain,
> > +struct device *dev, ioasid_t pasid);
> >
> > int (*def_domain_type)(struct device *dev);
> >
> >
> Besides
> Reviewed-by: Eric Auger 
>

thanks.

Regards,
Yi Liu

> Eric

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 5/5] iommu/arm-smmu: Setup identity domain for boot mappings

2020-07-20 Thread Will Deacon
On Thu, Jul 16, 2020 at 05:16:19PM -0700, Bjorn Andersson wrote:
> With many Qualcomm platforms not having functional S2CR BYPASS a
> temporary IOMMU domain, without translation, needs to be allocated in
> order to allow these memory transactions.
> 
> Unfortunately the boot loader uses the first few context banks, so
> rather than overwriting a active bank the last context bank is used and
> streams are diverted here during initialization.
> 
> This also performs the readback of SMR registers for the Qualcomm
> platform, to trigger the mechanism.
> 
> This is based on prior work by Thierry Reding and Laurentiu Tudor.
> 
> Tested-by: John Stultz 
> Tested-by: Vinod Koul 
> Signed-off-by: Bjorn Andersson 
> ---
> 
> Changes since v1:
> - Rebased to avoid conflict
> - Picked up tested-by
> 
>  drivers/iommu/arm-smmu-qcom.c | 11 +
>  drivers/iommu/arm-smmu.c  | 79 +--

Perhaps the CB allocator callback can help to reduce the changes to the core
driver here. What do you think?

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 2/5] iommu/arm-smmu: Emulate bypass by using context banks

2020-07-20 Thread Will Deacon
On Thu, Jul 16, 2020 at 05:16:16PM -0700, Bjorn Andersson wrote:
> Some firmware found on various Qualcomm platforms traps writes to S2CR
> of type BYPASS and writes FAULT into the register. This prevents us from
> marking the streams for the display controller as BYPASS to allow
> continued scanout of the screen through the initialization of the ARM
> SMMU.
> 
> This adds a Qualcomm specific cfg_probe function, which probes the
> behavior of the S2CR registers and if found faulty enables the related
> quirk. Based on this quirk context banks are allocated for IDENTITY
> domains as well, but with ARM_SMMU_SCTLR_M omitted.
> 
> The result is valid stream mappings, without translation.
> 
> Tested-by: John Stultz 
> Tested-by: Vinod Koul 
> Signed-off-by: Bjorn Andersson 
> ---
> 
> Changes since v1:
> - Picked up tested-by
> 
>  drivers/iommu/arm-smmu-qcom.c | 21 +
>  drivers/iommu/arm-smmu.c  | 14 --
>  drivers/iommu/arm-smmu.h  |  3 +++
>  3 files changed, 36 insertions(+), 2 deletions(-)

[...]

> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index fb85e716ae9a..5d5fe6741ed4 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -654,7 +654,9 @@ static void arm_smmu_write_context_bank(struct 
> arm_smmu_device *smmu, int idx)
>  
>   /* SCTLR */
>   reg = ARM_SMMU_SCTLR_CFIE | ARM_SMMU_SCTLR_CFRE | ARM_SMMU_SCTLR_AFE |
> -   ARM_SMMU_SCTLR_TRE | ARM_SMMU_SCTLR_M;
> +   ARM_SMMU_SCTLR_TRE;
> + if (cfg->m)
> + reg |= ARM_SMMU_SCTLR_M;
>   if (stage1)
>   reg |= ARM_SMMU_SCTLR_S1_ASIDPNE;
>   if (IS_ENABLED(CONFIG_CPU_BIG_ENDIAN))
> @@ -678,7 +680,11 @@ static int arm_smmu_init_domain_context(struct 
> iommu_domain *domain,
>   if (smmu_domain->smmu)
>   goto out_unlock;
>  
> - if (domain->type == IOMMU_DOMAIN_IDENTITY) {
> + /*
> +  * Nothing to do for IDENTITY domains,unless disabled context banks are
> +  * used to emulate bypass mappings on Qualcomm platforms.
> +  */
> + if (domain->type == IOMMU_DOMAIN_IDENTITY && !smmu->qcom_bypass_quirk) {

Given that the other thread [1] with Jordan (why haven't you cc'd him?! --
adding him now) has identified the need for a callback to allocate the
context bank, why don't we use the same sort of idea here? If the impl
provides a CB allocator function, call it irrespective of the domain type.
If it allocates a domain even for an identity domain, then we can install
if with SCTLR.M clear.

Will

[1] https://lore.kernel.org/r/20200716151625.ga14...@jcrouse1-lnx.qualcomm.com
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


RE: [PATCH v5 07/15] vfio/type1: Add VFIO_IOMMU_PASID_REQUEST (alloc/free)

2020-07-20 Thread Liu, Yi L
Hi Eric,

> From: Auger Eric 
> Sent: Sunday, July 19, 2020 11:39 PM
> 
> Yi,
> 
> On 7/12/20 1:21 PM, Liu Yi L wrote:
> > This patch allows user space to request PASID allocation/free, e.g.
> > when serving the request from the guest.
> >
> > PASIDs that are not freed by userspace are automatically freed when
> > the IOASID set is destroyed when process exits.
> >
> > Cc: Kevin Tian 
> > CC: Jacob Pan 
> > Cc: Alex Williamson 
> > Cc: Eric Auger 
> > Cc: Jean-Philippe Brucker 
> > Cc: Joerg Roedel 
> > Cc: Lu Baolu 
> > Signed-off-by: Liu Yi L 
> > Signed-off-by: Yi Sun 
> > Signed-off-by: Jacob Pan 
> > ---
> > v4 -> v5:
> > *) address comments from Eric Auger.
> > *) the comments for the PASID_FREE request is addressed in patch 5/15 of
> >this series.
> >
> > v3 -> v4:
> > *) address comments from v3, except the below comment against the range
> >of PASID_FREE request. needs more help on it.
> > "> +if (req.range.min > req.range.max)
> >
> > Is it exploitable that a user can spin the kernel for a long time in
> > the case of a free by calling this with [0, MAX_UINT] regardless of
> > their actual allocations?"
> >
> > https://lore.kernel.org/linux-iommu/20200702151832.048b4...@x1.home/
> >
> > v1 -> v2:
> > *) move the vfio_mm related code to be a seprate module
> > *) use a single structure for alloc/free, could support a range of
> > PASIDs
> > *) fetch vfio_mm at group_attach time instead of at iommu driver open
> > time
> > ---
> >  drivers/vfio/Kconfig|  1 +
> >  drivers/vfio/vfio_iommu_type1.c | 85
> +
> >  drivers/vfio/vfio_pasid.c   | 10 +
> >  include/linux/vfio.h|  6 +++
> >  include/uapi/linux/vfio.h   | 37 ++
> >  5 files changed, 139 insertions(+)
> >
> > diff --git a/drivers/vfio/Kconfig b/drivers/vfio/Kconfig index
> > 3d8a108..95d90c6 100644
> > --- a/drivers/vfio/Kconfig
> > +++ b/drivers/vfio/Kconfig
> > @@ -2,6 +2,7 @@
> >  config VFIO_IOMMU_TYPE1
> > tristate
> > depends on VFIO
> > +   select VFIO_PASID if (X86)
> > default n
> >
> >  config VFIO_IOMMU_SPAPR_TCE
> > diff --git a/drivers/vfio/vfio_iommu_type1.c
> > b/drivers/vfio/vfio_iommu_type1.c index ed80104..55b4065 100644
> > --- a/drivers/vfio/vfio_iommu_type1.c
> > +++ b/drivers/vfio/vfio_iommu_type1.c
> > @@ -76,6 +76,7 @@ struct vfio_iommu {
> > booldirty_page_tracking;
> > boolpinned_page_dirty_scope;
> > struct iommu_nesting_info   *nesting_info;
> > +   struct vfio_mm  *vmm;
> >  };
> >
> >  struct vfio_domain {
> > @@ -1937,6 +1938,11 @@ static void vfio_iommu_iova_insert_copy(struct
> > vfio_iommu *iommu,
> >
> >  static void vfio_iommu_release_nesting_info(struct vfio_iommu *iommu)
> > {
> > +   if (iommu->vmm) {
> > +   vfio_mm_put(iommu->vmm);
> > +   iommu->vmm = NULL;
> > +   }
> > +
> > kfree(iommu->nesting_info);
> > iommu->nesting_info = NULL;
> >  }
> > @@ -2071,6 +2077,26 @@ static int vfio_iommu_type1_attach_group(void
> *iommu_data,
> > iommu->nesting_info);
> > if (ret)
> > goto out_detach;
> > +
> > +   if (iommu->nesting_info->features &
> > +   IOMMU_NESTING_FEAT_SYSWIDE_PASID)
> {
> > +   struct vfio_mm *vmm;
> > +   int sid;
> > +
> > +   vmm = vfio_mm_get_from_task(current);
> > +   if (IS_ERR(vmm)) {
> > +   ret = PTR_ERR(vmm);
> > +   goto out_detach;
> > +   }
> > +   iommu->vmm = vmm;
> > +
> > +   sid = vfio_mm_ioasid_sid(vmm);
> > +   ret = iommu_domain_set_attr(domain->domain,
> > +   DOMAIN_ATTR_IOASID_SID,
> > +   &sid);
> > +   if (ret)
> > +   goto out_detach;
> > +   }
> > }
> >
> > /* Get aperture info */
> > @@ -2855,6 +2881,63 @@ static int vfio_iommu_type1_dirty_pages(struct
> vfio_iommu *iommu,
> > return -EINVAL;
> >  }
> >
> > +static int vfio_iommu_type1_pasid_alloc(struct vfio_iommu *iommu,
> > +   unsigned int min,
> > +   unsigned int max)
> > +{
> > +   int ret = -EOPNOTSUPP;
> > +
> > +   mutex_lock(&iommu->lock);
> > +   if (iommu->vmm)
> > +   ret = vfio_pasid_alloc(iommu->vmm, min, max);
> > +   mutex_unlock(&iommu->lock);
> > +   return ret;
> > +}
> > +
> > +static int vfio_iommu_type1_pasid_free(struct vfio_iommu *iommu,
> > +  unsigned int min,
> > +  unsigned int max)
> > +{
> > +   int ret = -EOPNOTSUPP;
> > +
> > +   mutex_lock(&iommu->lock);
> > +   if (iommu->vmm) {
>

RE: [PATCH v5 04/15] vfio/type1: Report iommu nesting info to userspace

2020-07-20 Thread Liu, Yi L
Hi Eric,

> From: Auger Eric 
> Sent: Monday, July 20, 2020 4:33 PM
> 
> Yi,
> 
> On 7/20/20 9:51 AM, Liu, Yi L wrote:
> > Hi Eric,
> >
> >> From: Auger Eric 
> >> Sent: Saturday, July 18, 2020 1:34 AM
> >>
> >> Yi,
> >>
> >> On 7/12/20 1:20 PM, Liu Yi L wrote:
> >>> This patch exports iommu nesting capability info to user space
> >>> through VFIO. User space is expected to check this info for supported 
> >>> uAPIs (e.g.
> >> it is not only to check the supported uAPIS but rather to know which
> >> callbacks it must call upon vIOMMU events and which features are
> >> supported by the physical IOMMU.
> >
> > yes, will refine the description per your comment.
> >
> >>> PASID alloc/free, bind page table, and cache invalidation) and the
> >>> vendor specific format information for first level/stage page table
> >>> that will be bound to.
> >>>
> >>> The nesting info is available only after the nesting iommu type is
> >>> set for a container.
> >> to NESTED type
> >
> > you mean "The nesting info is available only after container set to be 
> > NESTED
> type."
> >
> > right?
> correct

got you.

> >
> >>  Current implementation imposes one limitation - one
> >>> nesting container should include at most one group. The philosophy
> >>> of vfio container is having all groups/devices within the container
> >>> share the same IOMMU context. When vSVA is enabled, one IOMMU
> >>> context could include one 2nd-level address space and multiple 1st-level 
> >>> address
> spaces.
> >>> While the 2nd-level address space is reasonably sharable by multiple
> >>> groups , blindly sharing 1st-level address spaces across all groups
> >>> within the container might instead break the guest expectation. In
> >>> the future sub/ super container concept might be introduced to allow
> >>> partial address space sharing within an IOMMU context. But for now
> >>> let's go with this restriction by requiring singleton container for
> >>> using nesting iommu features. Below link has the related discussion about 
> >>> this
> decision.
> >>
> >> Maybe add a note about SMMU related changes spotted by Jean.
> >
> > I guess you mean the comments Jean gave in patch 3/15, right? I'll
> > copy his comments and also give the below link as well.
> >
> > https://lore.kernel.org/kvm/20200717090900.GC4850@myrica/
> correct

I see.

Regards,
Yi Liu

> Thanks
> 
> Eric
> >
> >>>
> >>> https://lkml.org/lkml/2020/5/15/1028
> >>>
> >>> Cc: Kevin Tian 
> >>> CC: Jacob Pan 
> >>> Cc: Alex Williamson 
> >>> Cc: Eric Auger 
> >>> Cc: Jean-Philippe Brucker 
> >>> Cc: Joerg Roedel 
> >>> Cc: Lu Baolu 
> >>> Signed-off-by: Liu Yi L 
> >>> ---
> >>> v4 -> v5:
> >>> *) address comments from Eric Auger.
> >>> *) return struct iommu_nesting_info for
> >> VFIO_IOMMU_TYPE1_INFO_CAP_NESTING as
> >>>cap is much "cheap", if needs extension in future, just define another 
> >>> cap.
> >>>https://lore.kernel.org/kvm/20200708132947.5b7ee...@x1.home/
> >>>
> >>> v3 -> v4:
> >>> *) address comments against v3.
> >>>
> >>> v1 -> v2:
> >>> *) added in v2
> >>> ---
> >>>  drivers/vfio/vfio_iommu_type1.c | 102
> >> +++-
> >>>  include/uapi/linux/vfio.h   |  19 
> >>>  2 files changed, 109 insertions(+), 12 deletions(-)
> >>>
> >>> diff --git a/drivers/vfio/vfio_iommu_type1.c
> >>> b/drivers/vfio/vfio_iommu_type1.c index 3bd70ff..ed80104 100644
> >>> --- a/drivers/vfio/vfio_iommu_type1.c
> >>> +++ b/drivers/vfio/vfio_iommu_type1.c
> >>> @@ -62,18 +62,20 @@ MODULE_PARM_DESC(dma_entry_limit,
> >>>"Maximum number of user DMA mappings per container (65535).");
> >>>
> >>>  struct vfio_iommu {
> >>> - struct list_headdomain_list;
> >>> - struct list_headiova_list;
> >>> - struct vfio_domain  *external_domain; /* domain for external user */
> >>> - struct mutexlock;
> >>> - struct rb_root  dma_list;
> >>> - struct blocking_notifier_head notifier;
> >>> - unsigned intdma_avail;
> >>> - uint64_tpgsize_bitmap;
> >>> - boolv2;
> >>> - boolnesting;
> >>> - booldirty_page_tracking;
> >>> - boolpinned_page_dirty_scope;
> >>> + struct list_headdomain_list;
> >>> + struct list_headiova_list;
> >>> + /* domain for external user */
> >>> + struct vfio_domain  *external_domain;
> >>> + struct mutexlock;
> >>> + struct rb_root  dma_list;
> >>> + struct blocking_notifier_head   notifier;
> >>> + unsigned intdma_avail;
> >>> + uint64_tpgsize_bitmap;
> >>> + boolv2;
> >>> + boolnesting;
> >>> + booldirty_page_tracking;
> >>> + boolpinned_page_dirty_scope;
> >>> + struct iommu_nesting_info   *nesting_info;
> >>>  };
> >>>
> >>>  struct vfio_domain {
> >>> 

RE: [PATCH v5 05/15] vfio: Add PASID allocation/free support

2020-07-20 Thread Liu, Yi L
Hi Eric,

> From: Auger Eric 
> Sent: Monday, July 20, 2020 4:26 PM
[...]
> >>> +int vfio_pasid_alloc(struct vfio_mm *vmm, int min, int max) {
> >>> + ioasid_t pasid;
> >>> + struct vfio_pasid *vid;
> >>> +
> >>> + pasid = ioasid_alloc(vmm->ioasid_sid, min, max, NULL);
> >>> + if (pasid == INVALID_IOASID)
> >>> + return -ENOSPC;
> >>> +
> >>> + vid = kzalloc(sizeof(*vid), GFP_KERNEL);
> >>> + if (!vid) {
> >>> + ioasid_free(pasid);
> >>> + return -ENOMEM;
> >>> + }
> >>> +
> >>> + vid->pasid = pasid;
> >>> +
> >>> + mutex_lock(&vmm->pasid_lock);
> >>> + vfio_link_pasid(vmm, vid);
> >>> + mutex_unlock(&vmm->pasid_lock);
> >>> +
> >>> + return pasid;
> >>> +}
> >> I am not totally convinced by your previous reply on EXPORT_SYMBOL_GP()
> >> irrelevance in this patch. But well ;-)
> >
> > I recalled my memory, I think it's made per a comment from Chris.
> > I guess it may make sense to export symbols together with the exact
> > driver user of it in this patch as well :-) but maybe I misunderstood
> > him. if yes, I may add the symbol export back in this patch.
> >
> > https://lore.kernel.org/linux-iommu/20200331075331.ga26...@infradead.org/#t
> OK I don't know the best practice here. Anyway there is no caller at
> this stage either. I think you may describe in the commit message the
> vfio_iommu_type1 will be the eventual user of the exported functions in
> this module, what are the exact exported functions here. You may also
> explain the motivation behind creating a separate module.

got it. will add it.

Regards,
Yi Liu


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v5 04/15] vfio/type1: Report iommu nesting info to userspace

2020-07-20 Thread Auger Eric
Yi,

On 7/20/20 9:51 AM, Liu, Yi L wrote:
> Hi Eric,
> 
>> From: Auger Eric 
>> Sent: Saturday, July 18, 2020 1:34 AM
>>
>> Yi,
>>
>> On 7/12/20 1:20 PM, Liu Yi L wrote:
>>> This patch exports iommu nesting capability info to user space through
>>> VFIO. User space is expected to check this info for supported uAPIs (e.g.
>> it is not only to check the supported uAPIS but rather to know which
>> callbacks it must call upon vIOMMU events and which features are
>> supported by the physical IOMMU.
> 
> yes, will refine the description per your comment.
> 
>>> PASID alloc/free, bind page table, and cache invalidation) and the vendor
>>> specific format information for first level/stage page table that will be
>>> bound to.
>>>
>>> The nesting info is available only after the nesting iommu type is set
>>> for a container.
>> to NESTED type
> 
> you mean "The nesting info is available only after container set to be NESTED 
> type."
> 
> right?
correct
> 
>>  Current implementation imposes one limitation - one
>>> nesting container should include at most one group. The philosophy of
>>> vfio container is having all groups/devices within the container share
>>> the same IOMMU context. When vSVA is enabled, one IOMMU context could
>>> include one 2nd-level address space and multiple 1st-level address spaces.
>>> While the 2nd-level address space is reasonably sharable by multiple groups
>>> , blindly sharing 1st-level address spaces across all groups within the
>>> container might instead break the guest expectation. In the future sub/
>>> super container concept might be introduced to allow partial address space
>>> sharing within an IOMMU context. But for now let's go with this restriction
>>> by requiring singleton container for using nesting iommu features. Below
>>> link has the related discussion about this decision.
>>
>> Maybe add a note about SMMU related changes spotted by Jean.
> 
> I guess you mean the comments Jean gave in patch 3/15, right? I'll
> copy his comments and also give the below link as well.
> 
> https://lore.kernel.org/kvm/20200717090900.GC4850@myrica/
correct

Thanks

Eric
> 
>>>
>>> https://lkml.org/lkml/2020/5/15/1028
>>>
>>> Cc: Kevin Tian 
>>> CC: Jacob Pan 
>>> Cc: Alex Williamson 
>>> Cc: Eric Auger 
>>> Cc: Jean-Philippe Brucker 
>>> Cc: Joerg Roedel 
>>> Cc: Lu Baolu 
>>> Signed-off-by: Liu Yi L 
>>> ---
>>> v4 -> v5:
>>> *) address comments from Eric Auger.
>>> *) return struct iommu_nesting_info for
>> VFIO_IOMMU_TYPE1_INFO_CAP_NESTING as
>>>cap is much "cheap", if needs extension in future, just define another 
>>> cap.
>>>https://lore.kernel.org/kvm/20200708132947.5b7ee...@x1.home/
>>>
>>> v3 -> v4:
>>> *) address comments against v3.
>>>
>>> v1 -> v2:
>>> *) added in v2
>>> ---
>>>  drivers/vfio/vfio_iommu_type1.c | 102
>> +++-
>>>  include/uapi/linux/vfio.h   |  19 
>>>  2 files changed, 109 insertions(+), 12 deletions(-)
>>>
>>> diff --git a/drivers/vfio/vfio_iommu_type1.c 
>>> b/drivers/vfio/vfio_iommu_type1.c
>>> index 3bd70ff..ed80104 100644
>>> --- a/drivers/vfio/vfio_iommu_type1.c
>>> +++ b/drivers/vfio/vfio_iommu_type1.c
>>> @@ -62,18 +62,20 @@ MODULE_PARM_DESC(dma_entry_limit,
>>>  "Maximum number of user DMA mappings per container (65535).");
>>>
>>>  struct vfio_iommu {
>>> -   struct list_headdomain_list;
>>> -   struct list_headiova_list;
>>> -   struct vfio_domain  *external_domain; /* domain for external user */
>>> -   struct mutexlock;
>>> -   struct rb_root  dma_list;
>>> -   struct blocking_notifier_head notifier;
>>> -   unsigned intdma_avail;
>>> -   uint64_tpgsize_bitmap;
>>> -   boolv2;
>>> -   boolnesting;
>>> -   booldirty_page_tracking;
>>> -   boolpinned_page_dirty_scope;
>>> +   struct list_headdomain_list;
>>> +   struct list_headiova_list;
>>> +   /* domain for external user */
>>> +   struct vfio_domain  *external_domain;
>>> +   struct mutexlock;
>>> +   struct rb_root  dma_list;
>>> +   struct blocking_notifier_head   notifier;
>>> +   unsigned intdma_avail;
>>> +   uint64_tpgsize_bitmap;
>>> +   boolv2;
>>> +   boolnesting;
>>> +   booldirty_page_tracking;
>>> +   boolpinned_page_dirty_scope;
>>> +   struct iommu_nesting_info   *nesting_info;
>>>  };
>>>
>>>  struct vfio_domain {
>>> @@ -130,6 +132,9 @@ struct vfio_regions {
>>>  #define IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu)\
>>> (!list_empty(&iommu->domain_list))
>>>
>>> +#define CONTAINER_HAS_DOMAIN(iommu)(((iommu)->external_domain) || \
>>> +(!list_empty(&(io

Re: [PATCH v5 05/15] vfio: Add PASID allocation/free support

2020-07-20 Thread Auger Eric
Hi Yi,

On 7/20/20 10:03 AM, Liu, Yi L wrote:
> Hi Eric,
> 
>> From: Auger Eric 
>> Sent: Sunday, July 19, 2020 11:39 PM
>>
>> Yi,
>>
>> On 7/12/20 1:21 PM, Liu Yi L wrote:
>>> Shared Virtual Addressing (a.k.a Shared Virtual Memory) allows sharing
>>> multiple process virtual address spaces with the device for simplified
>>> programming model. PASID is used to tag an virtual address space in
>>> DMA requests and to identify the related translation structure in
>>> IOMMU. When a PASID-capable device is assigned to a VM, we want the
>>> same capability of using PASID to tag guest process virtual address
>>> spaces to achieve virtual SVA (vSVA).
>>>
>>> PASID management for guest is vendor specific. Some vendors (e.g.
>>> Intel
>>> VT-d) requires system-wide managed PASIDs cross all devices,
>>> regardless
>> across?
> 
> yep. will correct it.
> 
>>> of whether a device is used by host or assigned to guest. Other
>>> vendors (e.g. ARM SMMU) may allow PASIDs managed per-device thus could
>>> be fully delegated to the guest for assigned devices.
>>>
>>> For system-wide managed PASIDs, this patch introduces a vfio module to
>>> handle explicit PASID alloc/free requests from guest. Allocated PASIDs
>>> are associated to a process (or, mm_struct) in IOASID core. A vfio_mm
>>> object is introduced to track mm_struct. Multiple VFIO containers
>>> within a process share the same vfio_mm object.
>>>
>>> A quota mechanism is provided to prevent malicious user from
>>> exhausting available PASIDs. Currently the quota is a global parameter
>>> applied to all VFIO devices. In the future per-device quota might be
>>> supported too.
>>>
>>> Cc: Kevin Tian 
>>> CC: Jacob Pan 
>>> Cc: Eric Auger 
>>> Cc: Jean-Philippe Brucker 
>>> Cc: Joerg Roedel 
>>> Cc: Lu Baolu 
>>> Suggested-by: Alex Williamson 
>>> Signed-off-by: Liu Yi L 
>>> ---
>>> v4 -> v5:
>>> *) address comments from Eric Auger.
>>> *) address the comments from Alex on the pasid free range support. Added
>>>per vfio_mm pasid r-b tree.
>>>https://lore.kernel.org/kvm/20200709082751.32074...@x1.home/
>>>
>>> v3 -> v4:
>>> *) fix lock leam in vfio_mm_get_from_task()
>>> *) drop pasid_quota field in struct vfio_mm
>>> *) vfio_mm_get_from_task() returns ERR_PTR(-ENOTTY) when
>>> !CONFIG_VFIO_PASID
>>>
>>> v1 -> v2:
>>> *) added in v2, split from the pasid alloc/free support of v1
>>> ---
>>>  drivers/vfio/Kconfig  |   5 +
>>>  drivers/vfio/Makefile |   1 +
>>>  drivers/vfio/vfio_pasid.c | 235
>> ++
>>>  include/linux/vfio.h  |  28 ++
>>>  4 files changed, 269 insertions(+)
>>>  create mode 100644 drivers/vfio/vfio_pasid.c
>>>
>>> diff --git a/drivers/vfio/Kconfig b/drivers/vfio/Kconfig index
>>> fd17db9..3d8a108 100644
>>> --- a/drivers/vfio/Kconfig
>>> +++ b/drivers/vfio/Kconfig
>>> @@ -19,6 +19,11 @@ config VFIO_VIRQFD
>>> depends on VFIO && EVENTFD
>>> default n
>>>
>>> +config VFIO_PASID
>>> +   tristate
>>> +   depends on IOASID && VFIO
>>> +   default n
>>> +
>>>  menuconfig VFIO
>>> tristate "VFIO Non-Privileged userspace driver framework"
>>> depends on IOMMU_API
>>> diff --git a/drivers/vfio/Makefile b/drivers/vfio/Makefile index
>>> de67c47..bb836a3 100644
>>> --- a/drivers/vfio/Makefile
>>> +++ b/drivers/vfio/Makefile
>>> @@ -3,6 +3,7 @@ vfio_virqfd-y := virqfd.o
>>>
>>>  obj-$(CONFIG_VFIO) += vfio.o
>>>  obj-$(CONFIG_VFIO_VIRQFD) += vfio_virqfd.o
>>> +obj-$(CONFIG_VFIO_PASID) += vfio_pasid.o
>>>  obj-$(CONFIG_VFIO_IOMMU_TYPE1) += vfio_iommu_type1.o
>>>  obj-$(CONFIG_VFIO_IOMMU_SPAPR_TCE) += vfio_iommu_spapr_tce.o
>>>  obj-$(CONFIG_VFIO_SPAPR_EEH) += vfio_spapr_eeh.o diff --git
>>> a/drivers/vfio/vfio_pasid.c b/drivers/vfio/vfio_pasid.c new file mode
>>> 100644 index 000..66e6054e
>>> --- /dev/null
>>> +++ b/drivers/vfio/vfio_pasid.c
>>> @@ -0,0 +1,235 @@
>>> +// SPDX-License-Identifier: GPL-2.0-only
>>> +/*
>>> + * Copyright (C) 2020 Intel Corporation.
>>> + * Author: Liu Yi L 
>>> + *
>>> + */
>>> +
>>> +#include 
>>> +#include 
>>> +#include 
>>> +#include 
>>> +#include 
>>> +#include 
>>> +
>>> +#define DRIVER_VERSION  "0.1"
>>> +#define DRIVER_AUTHOR   "Liu Yi L "
>>> +#define DRIVER_DESC "PASID management for VFIO bus drivers"
>>> +
>>> +#define VFIO_DEFAULT_PASID_QUOTA   1000
>>> +static int pasid_quota = VFIO_DEFAULT_PASID_QUOTA;
>>> +module_param_named(pasid_quota, pasid_quota, uint, 0444);
>>> +MODULE_PARM_DESC(pasid_quota,
>>> +" Set the quota for max number of PASIDs that an application is
>>> +allowed to request (default 1000)");
>> s/ Set/Set
> 
> got it.
> 
>>> +
>>> +struct vfio_mm_token {
>>> +   unsigned long long val;
>>> +};
>>> +
>>> +struct vfio_mm {
>>> +   struct kref kref;
>>> +   int ioasid_sid;
>>> +   struct mutexpasid_lock;
>>> +   struct rb_root  pasid_list;
>>> +   struct list_headnext;
>>> +   struct vfio_mm_tokentoken;
>>> +};
>>> +
>>> +static struct mutex

RE: [PATCH v5 06/15] iommu/vt-d: Support setting ioasid set to domain

2020-07-20 Thread Liu, Yi L
Hi Eric,

> From: Auger Eric 
> Sent: Sunday, July 19, 2020 11:38 PM
> 
> Yi,
> 
> On 7/12/20 1:21 PM, Liu Yi L wrote:
> > From IOMMU p.o.v., PASIDs allocated and managed by external components
> > (e.g. VFIO) will be passed in for gpasid_bind/unbind operation. IOMMU
> > needs some knowledge to check the PASID ownership, hence add an
> > interface for those components to tell the PASID owner.
> >
> > In latest kernel design, PASID ownership is managed by IOASID set
> > where the PASID is allocated from. This patch adds support for setting
> > ioasid set ID to the domains used for nesting/vSVA. Subsequent SVA
> > operations on the PASID will be checked against its IOASID set for proper
> ownership.
> Subsequent SVA operations will check the PASID against its IOASID set for 
> proper
> ownership.

got it.

> >
> > Cc: Kevin Tian 
> > CC: Jacob Pan 
> > Cc: Alex Williamson 
> > Cc: Eric Auger 
> > Cc: Jean-Philippe Brucker 
> > Cc: Joerg Roedel 
> > Cc: Lu Baolu 
> > Signed-off-by: Liu Yi L 
> > Signed-off-by: Jacob Pan 
> > ---
> > v4 -> v5:
> > *) address comments from Eric Auger.
> > ---
> >  drivers/iommu/intel/iommu.c | 22 ++
> > include/linux/intel-iommu.h |  4 
> >  include/linux/iommu.h   |  1 +
> >  3 files changed, 27 insertions(+)
> >
> > diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> > index 72ae6a2..4d54198 100644
> > --- a/drivers/iommu/intel/iommu.c
> > +++ b/drivers/iommu/intel/iommu.c
> > @@ -1793,6 +1793,7 @@ static struct dmar_domain *alloc_domain(int flags)
> > if (first_level_by_default())
> > domain->flags |= DOMAIN_FLAG_USE_FIRST_LEVEL;
> > domain->has_iotlb_device = false;
> > +   domain->ioasid_sid = INVALID_IOASID_SET;
> > INIT_LIST_HEAD(&domain->devices);
> >
> > return domain;
> > @@ -6039,6 +6040,27 @@ intel_iommu_domain_set_attr(struct iommu_domain
> *domain,
> > }
> > spin_unlock_irqrestore(&device_domain_lock, flags);
> > break;
> > +   case DOMAIN_ATTR_IOASID_SID:
> > +   {
> > +   int sid = *(int *)data;
> > +
> > +   if (!(dmar_domain->flags & DOMAIN_FLAG_NESTING_MODE)) {
> > +   ret = -ENODEV;
> > +   break;
> > +   }
> > +   spin_lock_irqsave(&device_domain_lock, flags);
> I think the lock should be taken before the DOMAIN_FLAG_NESTING_MODE check.
> Otherwise, the flags can be theretically changed inbetween the check and the 
> test
> below?

I see. will correct it.

Thanks,
Yi Liu

> Thanks
> 
> Eric
> > +   if (dmar_domain->ioasid_sid != INVALID_IOASID_SET &&
> > +   dmar_domain->ioasid_sid != sid) {
> > +   pr_warn_ratelimited("multi ioasid_set (%d:%d) setting",
> > +   dmar_domain->ioasid_sid, sid);
> > +   ret = -EBUSY;
> > +   spin_unlock_irqrestore(&device_domain_lock, flags);
> > +   break;
> > +   }
> > +   dmar_domain->ioasid_sid = sid;
> > +   spin_unlock_irqrestore(&device_domain_lock, flags);
> > +   break;
> > +   }
> > default:
> > ret = -EINVAL;
> > break;
> > diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
> > index 3f23c26..0d0ab32 100644
> > --- a/include/linux/intel-iommu.h
> > +++ b/include/linux/intel-iommu.h
> > @@ -549,6 +549,10 @@ struct dmar_domain {
> >2 == 1GiB, 3 == 512GiB, 4 == 1TiB */
> > u64 max_addr;   /* maximum mapped address */
> >
> > +   int ioasid_sid; /*
> > +* the ioasid set which tracks all
> > +* PASIDs used by the domain.
> > +*/
> > int default_pasid;  /*
> >  * The default pasid used for non-SVM
> >  * traffic on mediated devices.
> > diff --git a/include/linux/iommu.h b/include/linux/iommu.h index
> > 7ca9d48..e84a1d5 100644
> > --- a/include/linux/iommu.h
> > +++ b/include/linux/iommu.h
> > @@ -124,6 +124,7 @@ enum iommu_attr {
> > DOMAIN_ATTR_FSL_PAMUV1,
> > DOMAIN_ATTR_NESTING,/* two stages of translation */
> > DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE,
> > +   DOMAIN_ATTR_IOASID_SID,
> > DOMAIN_ATTR_MAX,
> >  };
> >
> >

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


RE: [PATCH v5 05/15] vfio: Add PASID allocation/free support

2020-07-20 Thread Liu, Yi L
Hi Eric,

> From: Auger Eric 
> Sent: Sunday, July 19, 2020 11:39 PM
> 
> Yi,
> 
> On 7/12/20 1:21 PM, Liu Yi L wrote:
> > Shared Virtual Addressing (a.k.a Shared Virtual Memory) allows sharing
> > multiple process virtual address spaces with the device for simplified
> > programming model. PASID is used to tag an virtual address space in
> > DMA requests and to identify the related translation structure in
> > IOMMU. When a PASID-capable device is assigned to a VM, we want the
> > same capability of using PASID to tag guest process virtual address
> > spaces to achieve virtual SVA (vSVA).
> >
> > PASID management for guest is vendor specific. Some vendors (e.g.
> > Intel
> > VT-d) requires system-wide managed PASIDs cross all devices,
> > regardless
> across?

yep. will correct it.

> > of whether a device is used by host or assigned to guest. Other
> > vendors (e.g. ARM SMMU) may allow PASIDs managed per-device thus could
> > be fully delegated to the guest for assigned devices.
> >
> > For system-wide managed PASIDs, this patch introduces a vfio module to
> > handle explicit PASID alloc/free requests from guest. Allocated PASIDs
> > are associated to a process (or, mm_struct) in IOASID core. A vfio_mm
> > object is introduced to track mm_struct. Multiple VFIO containers
> > within a process share the same vfio_mm object.
> >
> > A quota mechanism is provided to prevent malicious user from
> > exhausting available PASIDs. Currently the quota is a global parameter
> > applied to all VFIO devices. In the future per-device quota might be
> > supported too.
> >
> > Cc: Kevin Tian 
> > CC: Jacob Pan 
> > Cc: Eric Auger 
> > Cc: Jean-Philippe Brucker 
> > Cc: Joerg Roedel 
> > Cc: Lu Baolu 
> > Suggested-by: Alex Williamson 
> > Signed-off-by: Liu Yi L 
> > ---
> > v4 -> v5:
> > *) address comments from Eric Auger.
> > *) address the comments from Alex on the pasid free range support. Added
> >per vfio_mm pasid r-b tree.
> >https://lore.kernel.org/kvm/20200709082751.32074...@x1.home/
> >
> > v3 -> v4:
> > *) fix lock leam in vfio_mm_get_from_task()
> > *) drop pasid_quota field in struct vfio_mm
> > *) vfio_mm_get_from_task() returns ERR_PTR(-ENOTTY) when
> > !CONFIG_VFIO_PASID
> >
> > v1 -> v2:
> > *) added in v2, split from the pasid alloc/free support of v1
> > ---
> >  drivers/vfio/Kconfig  |   5 +
> >  drivers/vfio/Makefile |   1 +
> >  drivers/vfio/vfio_pasid.c | 235
> ++
> >  include/linux/vfio.h  |  28 ++
> >  4 files changed, 269 insertions(+)
> >  create mode 100644 drivers/vfio/vfio_pasid.c
> >
> > diff --git a/drivers/vfio/Kconfig b/drivers/vfio/Kconfig index
> > fd17db9..3d8a108 100644
> > --- a/drivers/vfio/Kconfig
> > +++ b/drivers/vfio/Kconfig
> > @@ -19,6 +19,11 @@ config VFIO_VIRQFD
> > depends on VFIO && EVENTFD
> > default n
> >
> > +config VFIO_PASID
> > +   tristate
> > +   depends on IOASID && VFIO
> > +   default n
> > +
> >  menuconfig VFIO
> > tristate "VFIO Non-Privileged userspace driver framework"
> > depends on IOMMU_API
> > diff --git a/drivers/vfio/Makefile b/drivers/vfio/Makefile index
> > de67c47..bb836a3 100644
> > --- a/drivers/vfio/Makefile
> > +++ b/drivers/vfio/Makefile
> > @@ -3,6 +3,7 @@ vfio_virqfd-y := virqfd.o
> >
> >  obj-$(CONFIG_VFIO) += vfio.o
> >  obj-$(CONFIG_VFIO_VIRQFD) += vfio_virqfd.o
> > +obj-$(CONFIG_VFIO_PASID) += vfio_pasid.o
> >  obj-$(CONFIG_VFIO_IOMMU_TYPE1) += vfio_iommu_type1.o
> >  obj-$(CONFIG_VFIO_IOMMU_SPAPR_TCE) += vfio_iommu_spapr_tce.o
> >  obj-$(CONFIG_VFIO_SPAPR_EEH) += vfio_spapr_eeh.o diff --git
> > a/drivers/vfio/vfio_pasid.c b/drivers/vfio/vfio_pasid.c new file mode
> > 100644 index 000..66e6054e
> > --- /dev/null
> > +++ b/drivers/vfio/vfio_pasid.c
> > @@ -0,0 +1,235 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/*
> > + * Copyright (C) 2020 Intel Corporation.
> > + * Author: Liu Yi L 
> > + *
> > + */
> > +
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +
> > +#define DRIVER_VERSION  "0.1"
> > +#define DRIVER_AUTHOR   "Liu Yi L "
> > +#define DRIVER_DESC "PASID management for VFIO bus drivers"
> > +
> > +#define VFIO_DEFAULT_PASID_QUOTA   1000
> > +static int pasid_quota = VFIO_DEFAULT_PASID_QUOTA;
> > +module_param_named(pasid_quota, pasid_quota, uint, 0444);
> > +MODULE_PARM_DESC(pasid_quota,
> > +" Set the quota for max number of PASIDs that an application is
> > +allowed to request (default 1000)");
> s/ Set/Set

got it.

> > +
> > +struct vfio_mm_token {
> > +   unsigned long long val;
> > +};
> > +
> > +struct vfio_mm {
> > +   struct kref kref;
> > +   int ioasid_sid;
> > +   struct mutexpasid_lock;
> > +   struct rb_root  pasid_list;
> > +   struct list_headnext;
> > +   struct vfio_mm_tokentoken;
> > +};
> > +
> > +static struct mutexvfio_mm_lock;
> > +static struct list_head   

RE: [PATCH v5 04/15] vfio/type1: Report iommu nesting info to userspace

2020-07-20 Thread Liu, Yi L
Hi Eric,

> From: Auger Eric 
> Sent: Saturday, July 18, 2020 1:34 AM
> 
> Yi,
> 
> On 7/12/20 1:20 PM, Liu Yi L wrote:
> > This patch exports iommu nesting capability info to user space through
> > VFIO. User space is expected to check this info for supported uAPIs (e.g.
> it is not only to check the supported uAPIS but rather to know which
> callbacks it must call upon vIOMMU events and which features are
> supported by the physical IOMMU.

yes, will refine the description per your comment.

> > PASID alloc/free, bind page table, and cache invalidation) and the vendor
> > specific format information for first level/stage page table that will be
> > bound to.
> >
> > The nesting info is available only after the nesting iommu type is set
> > for a container.
> to NESTED type

you mean "The nesting info is available only after container set to be NESTED 
type."

right?

>  Current implementation imposes one limitation - one
> > nesting container should include at most one group. The philosophy of
> > vfio container is having all groups/devices within the container share
> > the same IOMMU context. When vSVA is enabled, one IOMMU context could
> > include one 2nd-level address space and multiple 1st-level address spaces.
> > While the 2nd-level address space is reasonably sharable by multiple groups
> > , blindly sharing 1st-level address spaces across all groups within the
> > container might instead break the guest expectation. In the future sub/
> > super container concept might be introduced to allow partial address space
> > sharing within an IOMMU context. But for now let's go with this restriction
> > by requiring singleton container for using nesting iommu features. Below
> > link has the related discussion about this decision.
> 
> Maybe add a note about SMMU related changes spotted by Jean.

I guess you mean the comments Jean gave in patch 3/15, right? I'll
copy his comments and also give the below link as well.

https://lore.kernel.org/kvm/20200717090900.GC4850@myrica/

> >
> > https://lkml.org/lkml/2020/5/15/1028
> >
> > Cc: Kevin Tian 
> > CC: Jacob Pan 
> > Cc: Alex Williamson 
> > Cc: Eric Auger 
> > Cc: Jean-Philippe Brucker 
> > Cc: Joerg Roedel 
> > Cc: Lu Baolu 
> > Signed-off-by: Liu Yi L 
> > ---
> > v4 -> v5:
> > *) address comments from Eric Auger.
> > *) return struct iommu_nesting_info for
> VFIO_IOMMU_TYPE1_INFO_CAP_NESTING as
> >cap is much "cheap", if needs extension in future, just define another 
> > cap.
> >https://lore.kernel.org/kvm/20200708132947.5b7ee...@x1.home/
> >
> > v3 -> v4:
> > *) address comments against v3.
> >
> > v1 -> v2:
> > *) added in v2
> > ---
> >  drivers/vfio/vfio_iommu_type1.c | 102
> +++-
> >  include/uapi/linux/vfio.h   |  19 
> >  2 files changed, 109 insertions(+), 12 deletions(-)
> >
> > diff --git a/drivers/vfio/vfio_iommu_type1.c 
> > b/drivers/vfio/vfio_iommu_type1.c
> > index 3bd70ff..ed80104 100644
> > --- a/drivers/vfio/vfio_iommu_type1.c
> > +++ b/drivers/vfio/vfio_iommu_type1.c
> > @@ -62,18 +62,20 @@ MODULE_PARM_DESC(dma_entry_limit,
> >  "Maximum number of user DMA mappings per container (65535).");
> >
> >  struct vfio_iommu {
> > -   struct list_headdomain_list;
> > -   struct list_headiova_list;
> > -   struct vfio_domain  *external_domain; /* domain for external user */
> > -   struct mutexlock;
> > -   struct rb_root  dma_list;
> > -   struct blocking_notifier_head notifier;
> > -   unsigned intdma_avail;
> > -   uint64_tpgsize_bitmap;
> > -   boolv2;
> > -   boolnesting;
> > -   booldirty_page_tracking;
> > -   boolpinned_page_dirty_scope;
> > +   struct list_headdomain_list;
> > +   struct list_headiova_list;
> > +   /* domain for external user */
> > +   struct vfio_domain  *external_domain;
> > +   struct mutexlock;
> > +   struct rb_root  dma_list;
> > +   struct blocking_notifier_head   notifier;
> > +   unsigned intdma_avail;
> > +   uint64_tpgsize_bitmap;
> > +   boolv2;
> > +   boolnesting;
> > +   booldirty_page_tracking;
> > +   boolpinned_page_dirty_scope;
> > +   struct iommu_nesting_info   *nesting_info;
> >  };
> >
> >  struct vfio_domain {
> > @@ -130,6 +132,9 @@ struct vfio_regions {
> >  #define IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu)\
> > (!list_empty(&iommu->domain_list))
> >
> > +#define CONTAINER_HAS_DOMAIN(iommu)(((iommu)->external_domain) || \
> > +(!list_empty(&(iommu)->domain_list)))
> > +
> >  #define DIRTY_BITMAP_BYTES(n)  (ALIGN(n, BITS_PER_TYPE(u64)) /
> BITS_PER_BYTE)
> 

RE: [PATCH v5 03/15] iommu/smmu: Report empty domain nesting info

2020-07-20 Thread Liu, Yi L
Hi Eric,

> From: Auger Eric 
> 
> Yi,
> 
> On 7/12/20 1:20 PM, Liu Yi L wrote:
> > This patch is added as instead of returning a boolean for
> > DOMAIN_ATTR_NESTING,
> > iommu_domain_get_attr() should return an iommu_nesting_info handle.
> 
> you may add in the commit message you return an empty nesting info struct for 
> now
> as true nesting is not yet supported by the SMMUs.

will do.

> Besides:
> Reviewed-by: Eric Auger 

thanks.

Regards,
Yi Liu

> Thanks
> 
> Eric
> >
> > Cc: Will Deacon 
> > Cc: Robin Murphy 
> > Cc: Eric Auger 
> > Cc: Jean-Philippe Brucker 
> > Suggested-by: Jean-Philippe Brucker 
> > Signed-off-by: Liu Yi L 
> > Signed-off-by: Jacob Pan 
> > ---
> > v4 -> v5:
> > *) address comments from Eric Auger.
> > ---
> >  drivers/iommu/arm-smmu-v3.c | 29 +++--
> >  drivers/iommu/arm-smmu.c| 29 +++--
> >  2 files changed, 54 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> > index f578677..ec815d7 100644
> > --- a/drivers/iommu/arm-smmu-v3.c
> > +++ b/drivers/iommu/arm-smmu-v3.c
> > @@ -3019,6 +3019,32 @@ static struct iommu_group
> *arm_smmu_device_group(struct device *dev)
> > return group;
> >  }
> >
> > +static int arm_smmu_domain_nesting_info(struct arm_smmu_domain
> *smmu_domain,
> > +   void *data)
> > +{
> > +   struct iommu_nesting_info *info = (struct iommu_nesting_info *)data;
> > +   unsigned int size;
> > +
> > +   if (!info || smmu_domain->stage != ARM_SMMU_DOMAIN_NESTED)
> > +   return -ENODEV;
> > +
> > +   size = sizeof(struct iommu_nesting_info);
> > +
> > +   /*
> > +* if provided buffer size is smaller than expected, should
> > +* return 0 and also the expected buffer size to caller.
> > +*/
> > +   if (info->size < size) {
> > +   info->size = size;
> > +   return 0;
> > +   }
> > +
> > +   /* report an empty iommu_nesting_info for now */
> > +   memset(info, 0x0, size);
> > +   info->size = size;
> > +   return 0;
> > +}
> > +
> >  static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
> > enum iommu_attr attr, void *data)  { @@ -
> 3028,8 +3054,7 @@
> > static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
> > case IOMMU_DOMAIN_UNMANAGED:
> > switch (attr) {
> > case DOMAIN_ATTR_NESTING:
> > -   *(int *)data = (smmu_domain->stage ==
> ARM_SMMU_DOMAIN_NESTED);
> > -   return 0;
> > +   return arm_smmu_domain_nesting_info(smmu_domain,
> data);
> > default:
> > return -ENODEV;
> > }
> > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index
> > 243bc4c..09e2f1b 100644
> > --- a/drivers/iommu/arm-smmu.c
> > +++ b/drivers/iommu/arm-smmu.c
> > @@ -1506,6 +1506,32 @@ static struct iommu_group
> *arm_smmu_device_group(struct device *dev)
> > return group;
> >  }
> >
> > +static int arm_smmu_domain_nesting_info(struct arm_smmu_domain
> *smmu_domain,
> > +   void *data)
> > +{
> > +   struct iommu_nesting_info *info = (struct iommu_nesting_info *)data;
> > +   unsigned int size;
> > +
> > +   if (!info || smmu_domain->stage != ARM_SMMU_DOMAIN_NESTED)
> > +   return -ENODEV;
> > +
> > +   size = sizeof(struct iommu_nesting_info);
> > +
> > +   /*
> > +* if provided buffer size is smaller than expected, should
> > +* return 0 and also the expected buffer size to caller.
> > +*/
> > +   if (info->size < size) {
> > +   info->size = size;
> > +   return 0;
> > +   }
> > +
> > +   /* report an empty iommu_nesting_info for now */
> > +   memset(info, 0x0, size);
> > +   info->size = size;
> > +   return 0;
> > +}
> > +
> >  static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
> > enum iommu_attr attr, void *data)  { @@ -
> 1515,8 +1541,7 @@
> > static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
> > case IOMMU_DOMAIN_UNMANAGED:
> > switch (attr) {
> > case DOMAIN_ATTR_NESTING:
> > -   *(int *)data = (smmu_domain->stage ==
> ARM_SMMU_DOMAIN_NESTED);
> > -   return 0;
> > +   return arm_smmu_domain_nesting_info(smmu_domain,
> data);
> > default:
> > return -ENODEV;
> > }
> >

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


RE: [PATCH v5 02/15] iommu: Report domain nesting info

2020-07-20 Thread Liu, Yi L
Hi Eric,

> From: Auger Eric 
> Sent: Saturday, July 18, 2020 12:29 AM
> 
> Hi Yi,
> 
> On 7/12/20 1:20 PM, Liu Yi L wrote:
> > IOMMUs that support nesting translation needs report the capability info
> s/needs/need to report

yep.

> > to userspace, e.g. the format of first level/stage paging structures.
> It gives information about requirements the userspace needs to implement
> plus other features characterizing the physical implementation.

got it. will add it in next version.

> >
> > This patch reports nesting info by DOMAIN_ATTR_NESTING. Caller can get
> > nesting info after setting DOMAIN_ATTR_NESTING.
> I guess you meant after selecting VFIO_TYPE1_NESTING_IOMMU?

yes, it is. ok, perhaps, it's better to say get nesting info after selecting
VFIO_TYPE1_NESTING_IOMMU.

> >
> > Cc: Kevin Tian 
> > CC: Jacob Pan 
> > Cc: Alex Williamson 
> > Cc: Eric Auger 
> > Cc: Jean-Philippe Brucker 
> > Cc: Joerg Roedel 
> > Cc: Lu Baolu 
> > Signed-off-by: Liu Yi L 
> > Signed-off-by: Jacob Pan 
> > ---
> > v4 -> v5:
> > *) address comments from Eric Auger.
> >
> > v3 -> v4:
> > *) split the SMMU driver changes to be a separate patch
> > *) move the @addr_width and @pasid_bits from vendor specific
> >part to generic part.
> > *) tweak the description for the @features field of struct
> >iommu_nesting_info.
> > *) add description on the @data[] field of struct iommu_nesting_info
> >
> > v2 -> v3:
> > *) remvoe cap/ecap_mask in iommu_nesting_info.
> > *) reuse DOMAIN_ATTR_NESTING to get nesting info.
> > *) return an empty iommu_nesting_info for SMMU drivers per Jean'
> >suggestion.
> > ---
> >  include/uapi/linux/iommu.h | 77
> ++
> >  1 file changed, 77 insertions(+)
> >
> > diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h
> > index 1afc661..d2a47c4 100644
> > --- a/include/uapi/linux/iommu.h
> > +++ b/include/uapi/linux/iommu.h
> > @@ -332,4 +332,81 @@ struct iommu_gpasid_bind_data {
> > } vendor;
> >  };
> >
> > +/*
> > + * struct iommu_nesting_info - Information for nesting-capable IOMMU.
> > + *user space should check it before using
> > + *nesting capability.
> > + *
> > + * @size:  size of the whole structure
> > + * @format:PASID table entry format, the same definition as struct
> > + * iommu_gpasid_bind_data @format.
> > + * @features:  supported nesting features.
> > + * @flags: currently reserved for future extension.
> > + * @addr_width:The output addr width of first level/stage translation
> > + * @pasid_bits:Maximum supported PASID bits, 0 represents no PASID
> > + * support.
> > + * @data:  vendor specific cap info. data[] structure type can be deduced
> > + * from @format field.
> > + *
> > + *
> +===+===
> ===+
> > + * | feature   |  Notes   |
> > + *
> +===+===
> ===+
> > + * | SYSWIDE_PASID |  PASIDs are managed in system-wide, instead of per   |
> s/in system-wide/system-wide ?

got it.

> > + * |   |  device. When a device is assigned to userspace or   |
> > + * |   |  VM, proper uAPI (userspace driver framework uAPI,   |
> > + * |   |  e.g. VFIO) must be used to allocate/free PASIDs for |
> > + * |   |  the assigned device.
> Isn't it possible to be more explicit, something like:
>   |
> System-wide PASID management is mandated by the physical IOMMU. All
> PASIDs allocation must be mediated through the TBD API.

yep, I can add it.

> > + * +---+--+
> > + * | BIND_PGTBL|  The owner of the first level/stage page table must  |
> > + * |   |  explicitly bind the page table to associated PASID  |
> > + * |   |  (either the one specified in bind request or the|
> > + * |   |  default PASID of iommu domain), through userspace   |
> > + * |   |  driver framework uAPI (e.g. VFIO_IOMMU_NESTING_OP). |
> As per your answer in https://lkml.org/lkml/2020/7/6/383, I now
> understand ARM would not expose that BIND_PGTBL nesting feature,

yes, that's my point.

> I still
> think the above wording is a bit confusing. Maybe you may explicitly
> talk about the PASID *entry* that needs to be passed from guest to host.
> On ARM we directly pass the PASID table but when reading the above
> description I fail to determine if this does not fit that description.

yes, I can do it.

> > + * +---+--+
> > + * | CACHE_INVLD   |  The owner of the first level/stage page table must  |
> > + * |   |  explicitly invalidate the IOMMU cache through uAPI  |
> > + * |   |  provided by userspace driver framework (e.g. VFIO)  |
>

Re: arm64: Internal error: Oops: qcom_iommu_tlb_inv_context free_io_pgtable_ops on db410c

2020-07-20 Thread Arnd Bergmann
On Mon, Jul 20, 2020 at 8:36 AM Naresh Kamboju
 wrote:
>
> This kernel oops while boot linux mainline kernel on arm64  db410c device.
>
> metadata:
>   git branch: master
>   git repo: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
>   git commit: f8456690ba8eb18ea4714e68554e242a04f65cff
>   git describe: v5.8-rc5-48-gf8456690ba8e
>   make_kernelversion: 5.8.0-rc5
>   kernel-config:
> https://builds.tuxbuild.com/2aLnwV7BLStU0t1R1QPwHQ/kernel.config

Thanks for the report. Adding freedreno folks to Cc, as this may have something
to do with that driver.

>
> [5.444121] Unable to handle kernel NULL pointer dereference at
> virtual address 0018
> [5.456615]   ESR = 0x9604
> [5.464471]   SET = 0, FnV = 0
> [5.464487]   EA = 0, S1PTW = 0
> [5.466521] Data abort info:
> [5.469971]   ISV = 0, ISS = 0x0004
> [5.472768]   CM = 0, WnR = 0
> [5.476172] user pgtable: 4k pages, 48-bit VAs, pgdp=bacba000
> [5.479349] [0018] pgd=, p4d=
> [5.485820] Internal error: Oops: 9604 [#1] PREEMPT SMP
> [5.492448] Modules linked in: crct10dif_ce adv7511(+)
> qcom_spmi_temp_alarm cec msm(+) mdt_loader qcom_camss videobuf2_dma_sg
> drm_kms_helper v4l2_fwnode videobuf2_memops videobuf2_v4l2 qcom_rng
> videobuf2_common i2c_qcom_cci display_connector socinfo drm qrtr ns
> rmtfs_mem fuse
> [5.500256] CPU: 0 PID: 286 Comm: systemd-udevd Not tainted 5.8.0-rc5 #1
> [5.522484] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
> [5.529170] pstate: 2005 (nzCv daif -PAN -UAO BTYPE=--)
> [5.535856] pc : qcom_iommu_tlb_inv_context+0x18/0xa8
> [5.541148] lr : free_io_pgtable_ops+0x28/0x58
> [5.546350] sp : 80001219b5f0
> [5.550689] x29: 80001219b5f0 x28: 0013
> [5.554078] x27: 0100 x26: 36add3b8
> [5.559459] x25: 8915e910 x24: 3a5458c0
> [5.564753] x23: 0003 x22: 36a37058
> [5.570049] x21: 36a3a100 x20: 36a3a480
> [5.575344] x19: 36a37158 x18: 
> [5.580639] x17:  x16: 
> [5.585935] x15: 0004 x14: 0368
> [5.591229] x13:  x12: 39c61798
> [5.596525] x11: 39c616d0 x10: 4000
> [5.601820] x9 :  x8 : 39c616f8
> [5.607114] x7 :  x6 : 09f699a0
> [5.612410] x5 : 80001219b520 x4 : 36a3a000
> [5.617705] x3 : 09f69904 x2 : 
> [5.623001] x1 : 8000107e27e8 x0 : 3a545810
> [5.628297] Call trace:
> [5.633592]  qcom_iommu_tlb_inv_context+0x18/0xa8

This means that dev_iommu_fwspec_get() has returned NULL
in qcom_iommu_tlb_inv_context(), either because dev->iommu
is NULL, or because dev->iommu->fwspec is NULL.

qcom_iommu_tlb_inv_context() does not check for a NULL
pointer before using the returned object.

The bug is either in the lack of error handling, or the fact
that it's possible to get into this function for a device
that has not been fully set up.

> [5.635764]  free_io_pgtable_ops+0x28/0x58
> [5.640624]  qcom_iommu_domain_free+0x38/0x60
> [5.644617]  iommu_group_release+0x4c/0x70
> [5.649045]  kobject_put+0x6c/0x120
> [5.653035]  kobject_del+0x64/0x90
> [5.656421]  kobject_put+0xfc/0x120
> [5.659893]  iommu_group_remove_device+0xdc/0xf0
> [5.663281]  iommu_release_device+0x44/0x70
> [5.668142]  iommu_bus_notifier+0xbc/0xd0
> [5.672048]  notifier_call_chain+0x54/0x98
> [5.676214]  blocking_notifier_call_chain+0x48/0x70
> [5.680209]  device_del+0x26c/0x3a0
> [5.684981]  platform_device_del.part.0+0x1c/0x88
> [5.688453]  platform_device_unregister+0x24/0x40
> [5.693316]  of_platform_device_destroy+0xe4/0xf8
> [5.698002]  device_for_each_child+0x5c/0xa8
> [5.702689]  of_platform_depopulate+0x3c/0x80
> [5.707144]  msm_pdev_probe+0x1c4/0x308 [msm]

It was triggered by a failure in msm_pdev_probe(), which was
calls of_platform_depopulate() in its error handling code.
This is a combination of two problems:

a) Whatever caused msm_pdev_probe() to fail means that
the gpu won't be usable, though it should not have caused the
kernel to crash.

b) the error handling itself causing additional problems due
to failed unwinding.

> [5.711286]  platform_drv_probe+0x54/0xa8
> [5.715624]  really_probe+0xd8/0x320
> [5.719617]  driver_probe_device+0x58/0xb8
> [5.723263]  device_driver_attach+0x74/0x80
> [5.727168]  __driver_attach+0x58/0xe0
> [5.731248]  bus_for_each_dev+0x70/0xc0
> [5.735067]  driver_attach+0x24/0x30
> [5.738801]  bus_add_driver+0x14c/0x1f0
> [5.742619]  driver_register+0x64/0x120
> [5.746178]  __platform_driver_register+0x48/0x58
> [5.750099]  msm_drm_register+0x58/0x70 [msm]
> [5.754861]  do_o