Re: [PATCH] iommu/arm-smmu: Report USF more clearly

2019-09-13 Thread Russell King - ARM Linux admin
On Fri, Sep 13, 2019 at 12:48:37PM +0100, Robin Murphy wrote:
> Although CONFIG_ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT is a welcome tool
> for smoking out inadequate firmware, the failure mode is non-obvious
> and can be confusing for end users. Add some special-case reporting of
> Unidentified Stream Faults to help clarify this particular symptom.

Having encountered this on a board that turned up this week, it may
be better to use the hex representation of the stream ID, especially
as it seems normal for the stream ID to be made up of implementation
defined bitfields.

If we want to stick with decimal, maybe masking the stream ID with
the number of allowable bits would be a good idea, so that the
decimal value remains meaningful should other bits be non-zero?
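
As a rough illustration of the two suggestions above (hex output, and masking to the implemented width), here is a minimal userspace sketch. The 15-bit width, the EXAMPLE_SID_BITS name and both helper functions are assumptions made up for illustration; they are not part of the patch or of the driver.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumed stream ID width; real hardware reports its own value. */
#define EXAMPLE_SID_BITS 15u

/* Keep only the low `bits` bits of a raw GFSYNR1 value. */
static uint32_t example_stream_id(uint32_t gfsynr1, unsigned int bits)
{
	uint32_t mask = (bits >= 32) ? UINT32_MAX : ((UINT32_C(1) << bits) - 1);

	return gfsynr1 & mask;
}

/* Format the masked stream ID in hex so that bitfields stay readable. */
static int example_format_stream_id(char *buf, size_t len, uint32_t gfsynr1)
{
	return snprintf(buf, len, "Stream ID 0x%04x",
			(unsigned int)example_stream_id(gfsynr1, EXAMPLE_SID_BITS));
}
```

With masking in place, stray upper bits in the register no longer change the reported value, whichever radix is printed.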

> CC: Douglas Anderson 
> Signed-off-by: Robin Murphy 
> ---
>  drivers/iommu/arm-smmu.c | 5 +
>  drivers/iommu/arm-smmu.h | 2 ++
>  2 files changed, 7 insertions(+)
> 
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index b7cf24402a94..76ac8c180695 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -499,6 +499,11 @@ static irqreturn_t arm_smmu_global_fault(int irq, void *dev)
>  	dev_err_ratelimited(smmu->dev,
>  		"\tGFSR 0x%08x, GFSYNR0 0x%08x, GFSYNR1 0x%08x, GFSYNR2 0x%08x\n",
>  		gfsr, gfsynr0, gfsynr1, gfsynr2);
> +	if (IS_ENABLED(CONFIG_ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT) &&
> +	    (gfsr & sGFSR_USF))
> +		dev_err_ratelimited(smmu->dev,
> +			"Stream ID %hu may not be described by firmware, try booting with \"arm-smmu.disable_bypass=0\"\n",
> +			(u16)gfsynr1);
>  
>   arm_smmu_gr0_write(smmu, ARM_SMMU_GR0_sGFSR, gfsr);
>   return IRQ_HANDLED;
> diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h
> index c9c13b5785f2..46f7e161e83e 100644
> --- a/drivers/iommu/arm-smmu.h
> +++ b/drivers/iommu/arm-smmu.h
> @@ -79,6 +79,8 @@
>  #define ID7_MINOR			GENMASK(3, 0)
>  
>  #define ARM_SMMU_GR0_sGFSR		0x48
> +#define sGFSR_USF			BIT(2)

I do wonder if this is another instance where writing "(1 << 1)"
would have resulted in less chance of a mistake being made...
wrapping stuff up into macros is not always better!

9.6.15 SMMU_sGFSR, Global Fault Status Register

The SMMU_sGFSR bit assignments are:

USF, bit[1]   Unidentified stream fault. The possible values of this
  bit are:
  0  No Unidentified stream fault.
  1  Unidentified stream fault.

So this wants to be:

#define sGFSR_USF   BIT(1)
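
To make the off-by-one concrete, here is a small userspace sketch. EXAMPLE_BIT() is a stand-in I am assuming for the kernel's BIT() macro, and example_is_usf() is a hypothetical helper, not driver code.

```c
#include <stdint.h>

/* Userspace stand-in for the kernel's BIT() macro. */
#define EXAMPLE_BIT(n) (UINT32_C(1) << (n))

/* USF is bit[1] of SMMU_sGFSR per the register description quoted above. */
#define EXAMPLE_SGFSR_USF EXAMPLE_BIT(1)

/* Return nonzero when the Unidentified Stream Fault bit is set. */
static int example_is_usf(uint32_t gfsr)
{
	return (gfsr & EXAMPLE_SGFSR_USF) != 0;
}
```

BIT(1) has the value 2 and BIT(2) has the value 4, so a test against BIT(2) would look at the neighbouring bit and miss a fault reported in bit[1].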


-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu/arm-smmu: Report USF more clearly

2019-09-13 Thread Doug Anderson
Hi,

On Fri, Sep 13, 2019 at 4:48 AM Robin Murphy  wrote:
>
> Although CONFIG_ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT is a welcome tool
> for smoking out inadequate firmware, the failure mode is non-obvious
> and can be confusing for end users. Add some special-case reporting of
> Unidentified Stream Faults to help clarify this particular symptom.
>
> CC: Douglas Anderson 

nit that I believe that "Cc" (lowercase 2nd c) is correct.

> Signed-off-by: Robin Murphy 
> ---
>  drivers/iommu/arm-smmu.c | 5 +
>  drivers/iommu/arm-smmu.h | 2 ++
>  2 files changed, 7 insertions(+)
>
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index b7cf24402a94..76ac8c180695 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -499,6 +499,11 @@ static irqreturn_t arm_smmu_global_fault(int irq, void *dev)
>  	dev_err_ratelimited(smmu->dev,
>  		"\tGFSR 0x%08x, GFSYNR0 0x%08x, GFSYNR1 0x%08x, GFSYNR2 0x%08x\n",
>  		gfsr, gfsynr0, gfsynr1, gfsynr2);
> +	if (IS_ENABLED(CONFIG_ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT) &&
> +	    (gfsr & sGFSR_USF))
> +		dev_err_ratelimited(smmu->dev,
> +			"Stream ID %hu may not be described by firmware, try booting with \"arm-smmu.disable_bypass=0\"\n",
> +			(u16)gfsynr1);

In general it seems like a sane idea to surface an error like this.  I
guess a few nits:

1. "By firmware" might be a bit misleading.  In most cases I'm aware
of the problem is in the device tree that was bundled together with
the kernel.  If there are actually cases where firmware has baked in a
device tree and it got this wrong then we might want to spend time
figuring out what to do about it.

2. Presumably booting with "arm-smmu.disable_bypass=0" is in most
cases the least desirable option available.  I always consider kernel
command line parameters as something of a last resort for
configuration and would only be something that an end user might do
if they were given a kernel compiled by someone else (like if someone
were taking a prebuilt Linux distro and trying to install it onto a
generic PC).  Are you seeing cases where this is happening?  If people
are compiling their own kernel I'd argue that telling them to set
"CONFIG_ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT" to "no" is much better
than trying to jam a command line option on.  Command line options
don't scale well.

3. Any chance you could make it more obvious that this change is
undesirable and a last resort?  AKA:

"Stream ID x blocked for security reasons; allow anyway by booting
with arm-smmu.disable_bypass=0"

-Doug


Re: [PATCH v1 2/2] of: Let of_for_each_phandle fallback to non-negative cell_count

2019-09-13 Thread Rob Herring
On Sat, 24 Aug 2019 15:28:46 +0200, Uwe Kleine-König wrote:
> Referencing device tree nodes from a property allows to pass arguments.
> This is for example used for referencing gpios. This looks as follows:
> 
>   gpio_ctrl: gpio-controller {
>   #gpio-cells = <2>
>   ...
>   }
> 
>   someothernode {
>   gpios = <&gpio_ctrl 5 0 &gpio_ctrl 3 0>;
>   ...
>   }
> 
> To know the number of arguments this must be either fixed, or the
> referenced node is checked for a $cells_name (here: "#gpio-cells")
> property and with this information the start of the second reference can
> be determined.
> 
> Currently regulators are referenced with no additional arguments. To
> allow some optional arguments without having to change all referenced
> nodes this change introduces a way to specify a default cell_count. So
> when a phandle is parsed we check for the $cells_name property and use
> it as before if present. If it is not present we fall back to
> cells_count if non-negative and only fail if cells_count is smaller than
> zero.
> 
> Signed-off-by: Uwe Kleine-König 
> ---
>  drivers/of/base.c | 25 +
>  1 file changed, 17 insertions(+), 8 deletions(-)
> 

Applied both patches, thanks.

Rob
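
The fallback rule described in the quoted commit message can be sketched in plain C as follows. This is only an illustration under stated assumptions: struct example_node and example_arg_count() are hypothetical stand-ins, not the kernel's struct device_node or any function in drivers/of/base.c.

```c
/* Hypothetical stand-in for a referenced node; not struct device_node. */
struct example_node {
	int has_cells_prop;	/* nonzero if the "#...-cells" property exists */
	int cells_prop;		/* value of that property when present */
};

/*
 * Decide how many argument cells follow a phandle.
 * Returns 0 and stores the count in *out, or -1 if it cannot be determined.
 */
static int example_arg_count(const struct example_node *node,
			     int cell_count, int *out)
{
	if (node->has_cells_prop) {
		/* The property wins whenever the referenced node has it. */
		*out = node->cells_prop;
		return 0;
	}
	if (cell_count >= 0) {
		/* No property: fall back to the caller's default. */
		*out = cell_count;
		return 0;
	}
	/* No property and a negative default: fail as before. */
	return -1;
}
```

Passing a negative default keeps the old strict behaviour, which is what callers that rely on a mandatory cells property would want.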


Re: iommu/amd: Flushing and locking fixes

2019-09-13 Thread Sironi, Filippo


> On 11. Sep 2019, at 13:34, Joerg Roedel  wrote:
> 
> Hi Filippo,
> 
> On Tue, Sep 10, 2019 at 07:49:20PM +0200, Filippo Sironi wrote:
>> This patch series introduces patches to take the domain lock whenever we call
>> functions that end up calling __domain_flush_pages.  Holding the domain lock
>> is necessary since __domain_flush_pages traverses the device list, which is
>> protected by the domain lock.
>> 
>> The first patch in the series adds a completion right after an IOTLB flush in
>> attach_device.
> 
> Thanks for pointing out these locking issues and your fixes. I have been
> looking into it a bit and it seems there are more problems to take care
> of.
> 
> The first problem is the racy access to domain->updated, which is best
> fixed by moving that info onto the stack instead of keeping it in the
> domain structure.
> 
> Other than that, I think your patches are kind of the big hammer
> approach to fix it. As they are, they destroy the scalability of the
> dma-api path. So we need something more fine-grained, also if we keep in
> mind that the actual cases where we need to flush something in the
> dma-api path are very rare. The default should be to not take any lock
> in that path.
> 
> How does the attached patch look to you? It is completely untested but
> should give an idea of a better way to fix these locking issues.
> 
> Regards,
> 
>   Joerg
> 
> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
> index 61de81965c44..bb93a2bbb73d 100644
> --- a/drivers/iommu/amd_iommu.c
> +++ b/drivers/iommu/amd_iommu.c
> @@ -1435,9 +1435,10 @@ static void free_pagetable(struct protection_domain *domain)
>  * another level increases the size of the address space by 9 bits to a size up
>  * to 64 bits.
>  */
> -static void increase_address_space(struct protection_domain *domain,
> +static bool increase_address_space(struct protection_domain *domain,
>  gfp_t gfp)
> {
> + bool updated = false;
>   unsigned long flags;
>   u64 *pte;
> 
> @@ -1455,27 +1456,30 @@ static void increase_address_space(struct protection_domain *domain,
>   iommu_virt_to_phys(domain->pt_root));
>   domain->pt_root  = pte;
>   domain->mode    += 1;
> - domain->updated  = true;
> + updated  = true;
> 
> out:
>   spin_unlock_irqrestore(&domain->lock, flags);
> 
> - return;
> + return updated;
> }
> 
> static u64 *alloc_pte(struct protection_domain *domain,
> unsigned long address,
> unsigned long page_size,
> u64 **pte_page,
> -   gfp_t gfp)
> +   gfp_t gfp,
> +   bool *updated)
> {
>   int level, end_lvl;
>   u64 *pte, *page;
> 
>   BUG_ON(!is_power_of_2(page_size));
> 
> + *updated = false;
> +
>   while (address > PM_LEVEL_SIZE(domain->mode))
> - increase_address_space(domain, gfp);
> + *updated = increase_address_space(domain, gfp) || *updated;
> 
>   level   = domain->mode - 1;
>   pte = &domain->pt_root[PM_LEVEL_INDEX(level, address)];
> @@ -1501,7 +1505,7 @@ static u64 *alloc_pte(struct protection_domain *domain,
>   if (cmpxchg64(pte, __pte, __npte) != __pte)
>   free_page((unsigned long)page);
>   else if (pte_level == PAGE_MODE_7_LEVEL)
> - domain->updated = true;
> + *updated = true;
> 
>   continue;
>   }
> @@ -1617,6 +1621,7 @@ static int iommu_map_page(struct protection_domain *dom,
>   struct page *freelist = NULL;
>   u64 __pte, *pte;
>   int i, count;
> + bool updated;
> 
>   BUG_ON(!IS_ALIGNED(bus_addr, page_size));
>   BUG_ON(!IS_ALIGNED(phys_addr, page_size));
> @@ -1625,7 +1630,7 @@ static int iommu_map_page(struct protection_domain *dom,
>   return -EINVAL;
> 
>   count = PAGE_SIZE_PTE_COUNT(page_size);
> - pte   = alloc_pte(dom, bus_addr, page_size, NULL, gfp);
> + pte   = alloc_pte(dom, bus_addr, page_size, NULL, gfp, &updated);
> 
>   if (!pte)
>   return -ENOMEM;
> @@ -1634,7 +1639,7 @@ static int iommu_map_page(struct protection_domain *dom,
>   freelist = free_clear_pte(&pte[i], pte[i], freelist);
> 
>   if (freelist != NULL)
> - dom->updated = true;
> + updated = true;
> 
>   if (count > 1) {
>   __pte = PAGE_SIZE_PTE(__sme_set(phys_addr), page_size);
> @@ -1650,7 +1655,8 @@ static int iommu_map_page(struct protection_domain *dom,
>   for (i = 0; i < count; ++i)
>   pte[i] = __pte;
> 
> - update_domain(dom);
> + if (updated)
> + update_domain(dom);
> 
>   /* Everything flushed out, free pages now */
>   free_page_list(freelist);
> @@ -2041,6 +2047,13 @@ static int __attach_device(struct iommu_dev_data *dev_data,
> 

[PATCH 3/4] iommu/amd: Introduce first_pte_l7() helper

2019-09-13 Thread Andrei Dulea via iommu
Given an arbitrary pte that is part of a large mapping, this function
returns the first pte of the series (and optionally the mapped size and
number of PTEs)
It will be re-used in a subsequent patch to replace an existing L7
mapping.

Signed-off-by: Andrei Dulea 
---
 drivers/iommu/amd_iommu.c | 40 ---
 1 file changed, 29 insertions(+), 11 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index c7e28a8d25d1..a227e7a9b8b7 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -501,6 +501,29 @@ static void iommu_uninit_device(struct device *dev)
 */
 }
 
+/*
+ * Helper function to get the first pte of a large mapping
+ */
+static u64 *first_pte_l7(u64 *pte, unsigned long *page_size,
+unsigned long *count)
+{
+   unsigned long pte_mask, pg_size, cnt;
+   u64 *fpte;
+
+   pg_size  = PTE_PAGE_SIZE(*pte);
+   cnt  = PAGE_SIZE_PTE_COUNT(pg_size);
+   pte_mask = ~((cnt << 3) - 1);
+   fpte = (u64 *)(((unsigned long)pte) & pte_mask);
+
+   if (page_size)
+   *page_size = pg_size;
+
+   if (count)
+   *count = cnt;
+
+   return fpte;
+}
+
 /****************************************************************************
  *
  * Interrupt handling functions
@@ -1567,17 +1590,12 @@ static u64 *fetch_pte(struct protection_domain *domain,
*page_size = PTE_LEVEL_PAGE_SIZE(level);
}
 
-   if (PM_PTE_LEVEL(*pte) == 0x07) {
-   unsigned long pte_mask;
-
-   /*
-* If we have a series of large PTEs, make
-* sure to return a pointer to the first one.
-*/
-   *page_size = pte_mask = PTE_PAGE_SIZE(*pte);
-   pte_mask   = ~((PAGE_SIZE_PTE_COUNT(pte_mask) << 3) - 1);
-   pte= (u64 *)(((unsigned long)pte) & pte_mask);
-   }
+   /*
+* If we have a series of large PTEs, make
+* sure to return a pointer to the first one.
+*/
+   if (PM_PTE_LEVEL(*pte) == PAGE_MODE_7_LEVEL)
+   pte = first_pte_l7(pte, page_size, NULL);
 
return pte;
 }
-- 
2.17.1
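
The mask arithmetic used by first_pte_l7() can be checked in isolation with a small userspace sketch. It assumes 8-byte PTEs (hence the shift by 3, as in the patch) and a power-of-two count; example_first_pte_addr() is a hypothetical name and works on plain addresses rather than real page tables.

```c
#include <stdint.h>

/*
 * Round a PTE address down to the first entry of a run of `count`
 * replicated 8-byte PTEs. `count` is assumed to be a power of two.
 */
static uintptr_t example_first_pte_addr(uintptr_t pte_addr, unsigned long count)
{
	/* count * 8 bytes is the size of the run; clear the offset within it. */
	uintptr_t mask = ~(((uintptr_t)count << 3) - 1);

	return pte_addr & mask;
}
```

For a run of 8 PTEs the run is 64 bytes long, so an entry at 0x1038 maps back to the first entry at 0x1000.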



[PATCH 2/4] iommu/amd: Fix downgrading default page-sizes in alloc_pte()

2019-09-13 Thread Andrei Dulea via iommu
Downgrading an existing large mapping to a mapping using smaller
page-sizes works only for the mappings created with page-mode 7 (i.e.
non-default page size).

Treat large mappings created with page-mode 0 (i.e. default page size)
like a non-present mapping and allow them to be overwritten in alloc_pte().

While at it, make sure that we flush the TLB only if we change an
existing mapping, otherwise we might end up acting on garbage PTEs.

Signed-off-by: Andrei Dulea 
---
 drivers/iommu/amd_iommu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 138547446345..c7e28a8d25d1 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -1490,6 +1490,7 @@ static u64 *alloc_pte(struct protection_domain *domain,
pte_level = PM_PTE_LEVEL(__pte);
 
if (!IOMMU_PTE_PRESENT(__pte) ||
+   pte_level == PAGE_MODE_NONE ||
pte_level == PAGE_MODE_7_LEVEL) {
page = (u64 *)get_zeroed_page(gfp);
if (!page)
@@ -1500,7 +1501,7 @@ static u64 *alloc_pte(struct protection_domain *domain,
/* pte could have been changed somewhere. */
if (cmpxchg64(pte, __pte, __npte) != __pte)
free_page((unsigned long)page);
-   else if (pte_level == PAGE_MODE_7_LEVEL)
+   else if (IOMMU_PTE_PRESENT(__pte))
domain->updated = true;
 
continue;
-- 
2.17.1



[PATCH 4/4] iommu/amd: Unmap all L7 PTEs when downgrading page-sizes

2019-09-13 Thread Andrei Dulea via iommu
When replacing a large mapping created with page-mode 7 (i.e.
non-default page size), tear down the entire series of replicated PTEs.
Otherwise the leftover PTEs keep providing access to the old mapping, and
the fetch_pte() code path can return a PDE entry of the newly re-mapped
range.

While at it, make sure that we flush the TLB in case alloc_pte() fails
and returns NULL at a lower level.

Fixes: 6d568ef9a622 ("iommu/amd: Allow downgrading page-sizes in alloc_pte()")
Signed-off-by: Andrei Dulea 
---
 drivers/iommu/amd_iommu.c | 30 +++---
 1 file changed, 27 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index a227e7a9b8b7..fda9923542c9 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -1512,10 +1512,32 @@ static u64 *alloc_pte(struct protection_domain *domain,
__pte = *pte;
pte_level = PM_PTE_LEVEL(__pte);
 
-   if (!IOMMU_PTE_PRESENT(__pte) ||
-   pte_level == PAGE_MODE_NONE ||
+   /*
+* If we replace a series of large PTEs, we need
+* to tear down all of them.
+*/
+   if (IOMMU_PTE_PRESENT(__pte) &&
pte_level == PAGE_MODE_7_LEVEL) {
+   unsigned long count, i;
+   u64 *lpte;
+
+   lpte = first_pte_l7(pte, NULL, &count);
+
+   /*
+* Unmap the replicated PTEs that still match the
+* original large mapping
+*/
+   for (i = 0; i < count; ++i)
+   cmpxchg64(&lpte[i], __pte, 0ULL);
+
+   domain->updated = true;
+   continue;
+   }
+
+   if (!IOMMU_PTE_PRESENT(__pte) ||
+   pte_level == PAGE_MODE_NONE) {
page = (u64 *)get_zeroed_page(gfp);
+
if (!page)
return NULL;
 
@@ -1646,8 +1668,10 @@ static int iommu_map_page(struct protection_domain *dom,
count = PAGE_SIZE_PTE_COUNT(page_size);
pte   = alloc_pte(dom, bus_addr, page_size, NULL, gfp);
 
-   if (!pte)
+   if (!pte) {
+   update_domain(dom);
return -ENOMEM;
+   }
 
for (i = 0; i < count; ++i)
freelist = free_clear_pte(&pte[i], pte[i], freelist);
-- 
2.17.1



[PATCH 1/4] iommu/amd: Fix pages leak in free_pagetable()

2019-09-13 Thread Andrei Dulea via iommu
Take into account the gathered freelist in free_sub_pt(), otherwise we
end up leaking all those pages.

Fixes: 409afa44f9ba ("iommu/amd: Introduce free_sub_pt() function")
Signed-off-by: Andrei Dulea 
---
 drivers/iommu/amd_iommu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 1ed3b98324ba..138547446345 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -1425,7 +1425,7 @@ static void free_pagetable(struct protection_domain *domain)
BUG_ON(domain->mode < PAGE_MODE_NONE ||
   domain->mode > PAGE_MODE_6_LEVEL);
 
-   free_sub_pt(root, domain->mode, freelist);
+   freelist = free_sub_pt(root, domain->mode, freelist);
 
free_page_list(freelist);
 }
-- 
2.17.1
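
The bug class fixed here (a helper that returns the new list head, with the caller dropping the return value) can be reproduced in a few lines of userspace C. struct example_page, example_gather() and example_count() are hypothetical names used only for this sketch; they do not mirror the real free_sub_pt() implementation.

```c
#include <stddef.h>

/* Hypothetical list node standing in for a page queued for freeing. */
struct example_page {
	struct example_page *next;
};

/*
 * Push `page` onto `freelist` and return the new head.
 * The caller's `freelist` variable is not modified.
 */
static struct example_page *example_gather(struct example_page *page,
					   struct example_page *freelist)
{
	page->next = freelist;
	return page;
}

/* Count the entries on a freelist. */
static size_t example_count(const struct example_page *freelist)
{
	size_t n = 0;

	for (; freelist; freelist = freelist->next)
		n++;
	return n;
}
```

If the caller ignores the return value, its own list pointer stays empty and nothing gathered by the helper is ever freed, which is the leak the one-line fix addresses.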



[PATCH 0/4] iommu/amd: re-mapping fixes

2019-09-13 Thread Andrei Dulea via iommu
This patch series tries to address a few issues encountered when
replacing existing mappings:
.> pages leak in free_pagetable()
.> allow downgrading default page-sizes in alloc_pte()
.> tear-down all the replicated PTEs of a large mapping when downgrading
to smaller mappings

Andrei Dulea (4):
  iommu/amd: Fix pages leak in free_pagetable()
  iommu/amd: Fix downgrading default page-sizes in alloc_pte()
  iommu/amd: Introduce first_pte_l7() helper
  iommu/amd: Unmap all L7 PTEs when downgrading page-sizes

 drivers/iommu/amd_iommu.c | 73 +++
 1 file changed, 58 insertions(+), 15 deletions(-)

-- 
2.17.1



Re: [PATCH] iommu/arm-smmu: Report USF more clearly

2019-09-13 Thread Robin Murphy

On 13/09/2019 15:35, Qian Cai wrote:
> On Fri, 2019-09-13 at 12:48 +0100, Robin Murphy wrote:
>> Although CONFIG_ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT is a welcome tool
>> for smoking out inadequate firmware, the failure mode is non-obvious
>> and can be confusing for end users. Add some special-case reporting of
>> Unidentified Stream Faults to help clarify this particular symptom.
>>
>> CC: Douglas Anderson 
>> Signed-off-by: Robin Murphy 
>> ---
>>   drivers/iommu/arm-smmu.c | 5 +
>>   drivers/iommu/arm-smmu.h | 2 ++
>>   2 files changed, 7 insertions(+)
>>
>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
>> index b7cf24402a94..76ac8c180695 100644
>> --- a/drivers/iommu/arm-smmu.c
>> +++ b/drivers/iommu/arm-smmu.c
>> @@ -499,6 +499,11 @@ static irqreturn_t arm_smmu_global_fault(int irq, void *dev)
>>  	dev_err_ratelimited(smmu->dev,
>>  		"\tGFSR 0x%08x, GFSYNR0 0x%08x, GFSYNR1 0x%08x, GFSYNR2 0x%08x\n",
>>  		gfsr, gfsynr0, gfsynr1, gfsynr2);
>> +	if (IS_ENABLED(CONFIG_ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT) &&
>> +	    (gfsr & sGFSR_USF))
>> +		dev_err_ratelimited(smmu->dev,
>> +			"Stream ID %hu may not be described by firmware, try booting with \"arm-smmu.disable_bypass=0\"\n",
>> +			(u16)gfsynr1);
>
> dev_err_once(), i.e., don't need to remind people to set
> "arm-smmu.disable_bypass=0" multiple times.

Indeed, but in many cases it then quickly gets buried by an unending
storm of repeated faults (not every console has capture and scrollback...)

Given that it's a "this is why your machine is on fire" kind of message,
I figured that it's probably best to err on the side of visibility.

Robin.

>>  
>>  	arm_smmu_gr0_write(smmu, ARM_SMMU_GR0_sGFSR, gfsr);
>>  	return IRQ_HANDLED;
>> diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h
>> index c9c13b5785f2..46f7e161e83e 100644
>> --- a/drivers/iommu/arm-smmu.h
>> +++ b/drivers/iommu/arm-smmu.h
>> @@ -79,6 +79,8 @@
>>  #define ID7_MINOR			GENMASK(3, 0)
>>  
>>  #define ARM_SMMU_GR0_sGFSR		0x48
>> +#define sGFSR_USF			BIT(2)
>> +
>>  #define ARM_SMMU_GR0_sGFSYNR0		0x50
>>  #define ARM_SMMU_GR0_sGFSYNR1		0x54
>>  #define ARM_SMMU_GR0_sGFSYNR2		0x58



Re: [PATCH] iommu/arm-smmu: Report USF more clearly

2019-09-13 Thread Qian Cai
On Fri, 2019-09-13 at 12:48 +0100, Robin Murphy wrote:
> Although CONFIG_ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT is a welcome tool
> for smoking out inadequate firmware, the failure mode is non-obvious
> and can be confusing for end users. Add some special-case reporting of
> Unidentified Stream Faults to help clarify this particular symptom.
> 
> CC: Douglas Anderson 
> Signed-off-by: Robin Murphy 
> ---
>  drivers/iommu/arm-smmu.c | 5 +
>  drivers/iommu/arm-smmu.h | 2 ++
>  2 files changed, 7 insertions(+)
> 
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index b7cf24402a94..76ac8c180695 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -499,6 +499,11 @@ static irqreturn_t arm_smmu_global_fault(int irq, void *dev)
>  	dev_err_ratelimited(smmu->dev,
>  		"\tGFSR 0x%08x, GFSYNR0 0x%08x, GFSYNR1 0x%08x, GFSYNR2 0x%08x\n",
>  		gfsr, gfsynr0, gfsynr1, gfsynr2);
> +	if (IS_ENABLED(CONFIG_ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT) &&
> +	    (gfsr & sGFSR_USF))
> +		dev_err_ratelimited(smmu->dev,
> +			"Stream ID %hu may not be described by firmware, try booting with \"arm-smmu.disable_bypass=0\"\n",
> +			(u16)gfsynr1);

dev_err_once(), i.e., don't need to remind people to set
"arm-smmu.disable_bypass=0" multiple times.

>  
>   arm_smmu_gr0_write(smmu, ARM_SMMU_GR0_sGFSR, gfsr);
>   return IRQ_HANDLED;
> diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h
> index c9c13b5785f2..46f7e161e83e 100644
> --- a/drivers/iommu/arm-smmu.h
> +++ b/drivers/iommu/arm-smmu.h
> @@ -79,6 +79,8 @@
>  #define ID7_MINOR			GENMASK(3, 0)
>  
>  #define ARM_SMMU_GR0_sGFSR		0x48
> +#define sGFSR_USF			BIT(2)
> +
>  #define ARM_SMMU_GR0_sGFSYNR0		0x50
>  #define ARM_SMMU_GR0_sGFSYNR1		0x54
>  #define ARM_SMMU_GR0_sGFSYNR2		0x58


Re: [PATCH] iommu/arm-smmu: Report USF more clearly

2019-09-13 Thread Robin Murphy

On 13/09/2019 12:48, Robin Murphy wrote:
> Although CONFIG_ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT is a welcome tool
> for smoking out inadequate firmware, the failure mode is non-obvious
> and can be confusing for end users. Add some special-case reporting of
> Unidentified Stream Faults to help clarify this particular symptom.
>
> CC: Douglas Anderson 
> Signed-off-by: Robin Murphy 
> ---
>   drivers/iommu/arm-smmu.c | 5 +
>   drivers/iommu/arm-smmu.h | 2 ++
>   2 files changed, 7 insertions(+)
>
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index b7cf24402a94..76ac8c180695 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -499,6 +499,11 @@ static irqreturn_t arm_smmu_global_fault(int irq, void *dev)
>  	dev_err_ratelimited(smmu->dev,
>  		"\tGFSR 0x%08x, GFSYNR0 0x%08x, GFSYNR1 0x%08x, GFSYNR2 0x%08x\n",
>  		gfsr, gfsynr0, gfsynr1, gfsynr2);
> +	if (IS_ENABLED(CONFIG_ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT) &&
> +	    (gfsr & sGFSR_USF))
> +		dev_err_ratelimited(smmu->dev,
> +			"Stream ID %hu may not be described by firmware, try booting with \"arm-smmu.disable_bypass=0\"\n",
> +			(u16)gfsynr1);
>  
>  	arm_smmu_gr0_write(smmu, ARM_SMMU_GR0_sGFSR, gfsr);
>  	return IRQ_HANDLED;
> diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h
> index c9c13b5785f2..46f7e161e83e 100644
> --- a/drivers/iommu/arm-smmu.h
> +++ b/drivers/iommu/arm-smmu.h
> @@ -79,6 +79,8 @@
>  #define ID7_MINOR			GENMASK(3, 0)
>  
>  #define ARM_SMMU_GR0_sGFSR		0x48
> +#define sGFSR_USF			BIT(2)

Sigh... and of course what I actually meant here was that this is the
2nd bit, which is bit 1, which is also 2. I blame Friday :(

Robin.

> +
>  #define ARM_SMMU_GR0_sGFSYNR0		0x50
>  #define ARM_SMMU_GR0_sGFSYNR1		0x54
>  #define ARM_SMMU_GR0_sGFSYNR2		0x58




[PATCH] iommu/arm-smmu: Report USF more clearly

2019-09-13 Thread Robin Murphy
Although CONFIG_ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT is a welcome tool
for smoking out inadequate firmware, the failure mode is non-obvious
and can be confusing for end users. Add some special-case reporting of
Unidentified Stream Faults to help clarify this particular symptom.

CC: Douglas Anderson 
Signed-off-by: Robin Murphy 
---
 drivers/iommu/arm-smmu.c | 5 +
 drivers/iommu/arm-smmu.h | 2 ++
 2 files changed, 7 insertions(+)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index b7cf24402a94..76ac8c180695 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -499,6 +499,11 @@ static irqreturn_t arm_smmu_global_fault(int irq, void *dev)
 	dev_err_ratelimited(smmu->dev,
 		"\tGFSR 0x%08x, GFSYNR0 0x%08x, GFSYNR1 0x%08x, GFSYNR2 0x%08x\n",
 		gfsr, gfsynr0, gfsynr1, gfsynr2);
+	if (IS_ENABLED(CONFIG_ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT) &&
+	    (gfsr & sGFSR_USF))
+		dev_err_ratelimited(smmu->dev,
+			"Stream ID %hu may not be described by firmware, try booting with \"arm-smmu.disable_bypass=0\"\n",
+			(u16)gfsynr1);
 
arm_smmu_gr0_write(smmu, ARM_SMMU_GR0_sGFSR, gfsr);
return IRQ_HANDLED;
diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h
index c9c13b5785f2..46f7e161e83e 100644
--- a/drivers/iommu/arm-smmu.h
+++ b/drivers/iommu/arm-smmu.h
@@ -79,6 +79,8 @@
 #define ID7_MINOR  GENMASK(3, 0)
 
 #define ARM_SMMU_GR0_sGFSR 0x48
+#define sGFSR_USF  BIT(2)
+
 #define ARM_SMMU_GR0_sGFSYNR0  0x50
 #define ARM_SMMU_GR0_sGFSYNR1  0x54
 #define ARM_SMMU_GR0_sGFSYNR2  0x58
-- 
2.21.0.dirty



Re: [PATCH v1 1/2] iommu: pass cell_count = -1 to of_for_each_phandle with cells_name

2019-09-13 Thread Joerg Roedel
On Thu, Sep 12, 2019 at 09:43:53AM +0200, Uwe Kleine-König wrote:
> On Tue, Sep 03, 2019 at 02:52:10PM +0200, Joerg Roedel wrote:
> > Acked-by: Joerg Roedel 
> 
> Does this ack mean that Rob is expected to apply this together with
> patch 2?

"Expected" is a strong word. I'd more phrase it like I am fine with this
patch going through his tree.

Regards,

Joerg


Re: [PATCH RFC 11/14] arm64: Move the ASID allocator code in a separate file

2019-09-13 Thread Guo Ren
Another idea is to separate remote TLB invalidation into two instructions:

 - sfence.vma.b.asyc
 - sfence.vma.b.barrier // wait until all async TLB invalidate operations
   have finished for all harts.

(I remember someone suggested separating them into two instructions after the
session. Anup? Is the idea right?)

Actually, I never considered async TLB invalidation before, because our current
light IOMMU did not need it.

Thanks to everyone who attended the session :) Let's continue the talk.


On Thu, Sep 12, 2019 at 22:59, Guo Ren wrote:

> Thx Will for reply.
>
> On Thu, Sep 12, 2019 at 3:03 PM Will Deacon  wrote:
> >
> > On Sun, Sep 08, 2019 at 07:52:55AM +0800, Guo Ren wrote:
> > > On Mon, Jun 24, 2019 at 6:40 PM Will Deacon  wrote:
> > > > > I'll keep my system use the same ASID for SMP + IOMMU :P
> > > >
> > > > You will want a separate allocator for that:
> > > >
> > > > https://lkml.kernel.org/r/20190610184714.6786-2-jean-philippe.bruc...@arm.com
> > >
> > > Yes, it is hard to maintain ASID between IOMMU and CPUMMU or different
> > > system, because it's difficult to synchronize the IO_ASID when the CPU
> > > ASID is rollover.
> > > But we could still use hardware broadcast TLB invalidation instruction
> > > to uniformly manage the ASID and IO_ASID, or OTHER_ASID in our IOMMU.
> >
> > That's probably a bad idea, because you'll likely stall execution on the
> > CPU until the IOTLB has completed invalidation. In the case of ATS, I think
> > an endpoint ATC is permitted to take over a minute to respond. In reality, I
> > suspect the worst you'll ever see would be in the msec range, but that's
> > still an unacceptable period of time to hold a CPU.
> Just as I said in the session, IOTLB invalidation delay is
> another topic. My main proposal is to introduce stage1.pgd and
> stage2.pgd as address space identifiers between different TLB systems
> based on vmid, asid. The last part of my slides shows how to
> translate stage1/2.pgd to as/vmid in a PCI ATS system, and the method
> could work with SMMU-v3 and Intel VT-d. (I regret that there was
> no time to show you all of the slides.)
>
> In our light IOMMU implementation, there's no IOTLB invalidation delay
> problem, because the IOMMU is very close to the CPU MMU and the
> interconnect's delay is the same as for the SMP CPUs' MMUs (no PCI, VM
> supported).
>
> To solve the problem, we could define an async mode in sfence.vma.b
> that finishes with a per_cpu_irq/exception.
>
> --
> Best Regards
>  Guo Ren
>
> ML: https://lore.kernel.org/linux-csky/
>