[PATCH v2 RFC/RFT 0/5] Save single pages from CMA area

2019-03-26 Thread Nicolin Chen
This series of patches tries to save single pages in the CMA area by
bypassing all single-page CMA allocations and allocating normal pages
instead, as all addresses within one single page are contiguous.

We once applied PATCH-5 but had to revert it, as not all of the
callers actually handled the fallback allocations. Per Robin's
suggestion, let's stuff alloc_pages()/free_page() fallbacks into those
callers before having PATCH-5.
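
Roughly, the pattern this series stuffs into each caller looks like the
sketch below. It is an illustration only; the helper names and
signatures are made up for the example, and only
dma_alloc_from_contiguous()/dma_release_from_contiguous(),
alloc_pages() and __free_pages() are the real APIs involved:

	/*
	 * Minimal sketch of the caller-side fallback pattern, assuming
	 * dma_alloc_from_contiguous() may now return NULL for
	 * single-page requests.
	 */
	static struct page *example_alloc(struct device *dev, size_t size,
					  gfp_t gfp)
	{
		unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT;
		struct page *page;

		page = dma_alloc_from_contiguous(dev, count, get_order(size),
						 gfp & __GFP_NOWARN);
		if (!page)
			page = alloc_pages(gfp, get_order(size)); /* fallback */

		return page;
	}

	static void example_free(struct device *dev, struct page *page,
				 size_t size)
	{
		unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT;

		/* Returns false when the page is not from the CMA area. */
		if (!dma_release_from_contiguous(dev, page, count))
			__free_pages(page, get_order(size));
	}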

Changelog
v1->v2:
 * PATCH-2: Initialized page pointer to NULL

Nicolin Chen (5):
  ARM: dma-mapping: Add fallback normal page allocations
  dma-remap: Run alloc_pages() on failure
  iommu: amd_iommu: Add fallback normal page allocations
  arm64: dma-mapping: Add fallback normal page allocations
  dma-contiguous: Do not allocate a single page from CMA area

 arch/arm/mm/dma-mapping.c   | 13 ++++++++++---
 arch/arm64/mm/dma-mapping.c | 19 ++++++++++++-------
 drivers/iommu/amd_iommu.c   |  3 +++
 kernel/dma/contiguous.c     | 22 +++++++++++++++++++---
 kernel/dma/remap.c          |  4 ++--
 5 files changed, 46 insertions(+), 15 deletions(-)

-- 
2.17.1



[PATCH v2 RFC/RFT 2/5] dma-remap: Run alloc_pages() on failure

2019-03-26 Thread Nicolin Chen
The CMA allocation will skip allocations of single pages to save CMA
resources. This requires its callers to fall back to normal page
allocations on their own.

So this patch moves the alloc_pages() call to the fallback routines.

Signed-off-by: Nicolin Chen 
---
Changelog
v1->v2:
 * PATCH-2: Initialized page pointer to NULL

 kernel/dma/remap.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/dma/remap.c b/kernel/dma/remap.c
index 2b750f13bc8f..c2076c6d6c17 100644
--- a/kernel/dma/remap.c
+++ b/kernel/dma/remap.c
@@ -109,14 +109,14 @@ int __init dma_atomic_pool_init(gfp_t gfp, pgprot_t prot)
 {
unsigned int pool_size_order = get_order(atomic_pool_size);
unsigned long nr_pages = atomic_pool_size >> PAGE_SHIFT;
-   struct page *page;
+   struct page *page = NULL;
void *addr;
int ret;
 
if (dev_get_cma_area(NULL))
page = dma_alloc_from_contiguous(NULL, nr_pages,
 pool_size_order, false);
-   else
+   if (!page)
page = alloc_pages(gfp, pool_size_order);
if (!page)
goto out;
-- 
2.17.1



[PATCH v2 RFC/RFT 5/5] dma-contiguous: Do not allocate a single page from CMA area

2019-03-26 Thread Nicolin Chen
The addresses within a single page are always contiguous, so it is
not really necessary to allocate one single page from the CMA area.
Since the CMA area has a limited, predefined size, it may run out of
space in heavy use cases, where there might be quite a lot of CMA
pages being allocated just for single pages.

However, there is also a concern that a device might care where a
page comes from -- it might expect the page to come from the CMA area
and act differently if it does not.

This patch tries to skip one-page allocations and returns NULL so as
to let callers allocate normal pages, unless the device has its own
CMA area. This would save resources in the CMA area for further CMA
allocations, and it would also reduce the CMA fragmentation that
results from trivial allocations.
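
From a caller's point of view, the resulting behavior can be
summarized as below (a descriptive sketch only, not code taken from
this patch):

	page = dma_alloc_from_contiguous(dev, count, align, false);
	/*
	 * - dev has its own CMA area (dev->cma_area set): allocations of
	 *   any size, including count == 1, still come from that area.
	 * - otherwise, count > 1: the default global CMA area is used,
	 *   as before.
	 * - otherwise (count == 1, no per-device area): NULL is returned
	 *   and the caller is expected to fall back to alloc_pages().
	 */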

Signed-off-by: Nicolin Chen 
---
 kernel/dma/contiguous.c | 22 +++++++++++++++++++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c
index b2a87905846d..09074bd04793 100644
--- a/kernel/dma/contiguous.c
+++ b/kernel/dma/contiguous.c
@@ -186,16 +186,32 @@ int __init dma_contiguous_reserve_area(phys_addr_t size, phys_addr_t base,
  *
  * This function allocates memory buffer for specified device. It uses
  * device specific contiguous memory area if available or the default
- * global one. Requires architecture specific dev_get_cma_area() helper
- * function.
+ * global one.
+ *
+ * However, it skips one-page allocations from the global area. Since
+ * the addresses within one page are always contiguous, there is no
+ * need to waste CMA pages on them; this also helps reduce CMA area
+ * fragmentation. A caller should therefore fall back to allocating a
+ * normal page upon a NULL return value.
+ *
+ * Requires architecture specific dev_get_cma_area() helper function.
  */
 struct page *dma_alloc_from_contiguous(struct device *dev, size_t count,
   unsigned int align, bool no_warn)
 {
+   struct cma *cma;
+
if (align > CONFIG_CMA_ALIGNMENT)
align = CONFIG_CMA_ALIGNMENT;
 
-   return cma_alloc(dev_get_cma_area(dev), count, align, no_warn);
+   if (dev && dev->cma_area)
+   cma = dev->cma_area;
+   else if (count > 1)
+   cma = dma_contiguous_default_area;
+   else
+   return NULL;
+
+   return cma_alloc(cma, count, align, no_warn);
 }
 
 /**
-- 
2.17.1



[PATCH v2 RFC/RFT 4/5] arm64: dma-mapping: Add fallback normal page allocations

2019-03-26 Thread Nicolin Chen
The CMA allocation will skip allocations of single pages to save CMA
resources. This requires its callers to fall back to normal page
allocations on their own. So this patch adds the fallback routines.

Signed-off-by: Nicolin Chen 
---
 arch/arm64/mm/dma-mapping.c | 19 ++++++++++++-------
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index 78c0a72f822c..be230254 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -156,17 +156,20 @@ static void *__iommu_alloc_attrs(struct device *dev, size_t size,
}
} else if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
pgprot_t prot = arch_dma_mmap_pgprot(dev, PAGE_KERNEL, attrs);
+   unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT;
struct page *page;
 
-   page = dma_alloc_from_contiguous(dev, size >> PAGE_SHIFT,
-   get_order(size), gfp & __GFP_NOWARN);
+   page = dma_alloc_from_contiguous(dev, count, get_order(size),
+gfp & __GFP_NOWARN);
+   if (!page)
+   page = alloc_pages(gfp, get_order(size));
if (!page)
return NULL;
 
*handle = iommu_dma_map_page(dev, page, 0, iosize, ioprot);
if (*handle == DMA_MAPPING_ERROR) {
-   dma_release_from_contiguous(dev, page,
-   size >> PAGE_SHIFT);
+   if (!dma_release_from_contiguous(dev, page, count))
+   __free_pages(page, get_order(size));
return NULL;
}
addr = dma_common_contiguous_remap(page, size, VM_USERMAP,
@@ -178,8 +181,8 @@ static void *__iommu_alloc_attrs(struct device *dev, size_t size,
memset(addr, 0, size);
} else {
iommu_dma_unmap_page(dev, *handle, iosize, 0, attrs);
-   dma_release_from_contiguous(dev, page,
-   size >> PAGE_SHIFT);
+   if (!dma_release_from_contiguous(dev, page, count))
+   __free_pages(page, get_order(size));
}
} else {
pgprot_t prot = arch_dma_mmap_pgprot(dev, PAGE_KERNEL, attrs);
@@ -201,6 +204,7 @@ static void *__iommu_alloc_attrs(struct device *dev, size_t size,
 static void __iommu_free_attrs(struct device *dev, size_t size, void *cpu_addr,
   dma_addr_t handle, unsigned long attrs)
 {
+   unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT;
size_t iosize = size;
 
size = PAGE_ALIGN(size);
@@ -222,7 +226,8 @@ static void __iommu_free_attrs(struct device *dev, size_t size, void *cpu_addr,
struct page *page = vmalloc_to_page(cpu_addr);
 
iommu_dma_unmap_page(dev, handle, iosize, 0, attrs);
-   dma_release_from_contiguous(dev, page, size >> PAGE_SHIFT);
+   if (!dma_release_from_contiguous(dev, page, count))
+   __free_pages(page, get_order(size));
dma_common_free_remap(cpu_addr, size, VM_USERMAP);
} else if (is_vmalloc_addr(cpu_addr)){
struct vm_struct *area = find_vm_area(cpu_addr);
-- 
2.17.1



[PATCH v2 RFC/RFT 1/5] ARM: dma-mapping: Add fallback normal page allocations

2019-03-26 Thread Nicolin Chen
The CMA allocation will skip allocations of single pages to save CMA
resources. This requires its callers to fall back to normal page
allocations on their own. So this patch adds the fallback routines.

Signed-off-by: Nicolin Chen 
---
 arch/arm/mm/dma-mapping.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 8a90f298af96..febaf637a25b 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -589,6 +589,8 @@ static void *__alloc_from_contiguous(struct device *dev, size_t size,
void *ptr = NULL;
 
page = dma_alloc_from_contiguous(dev, count, order, gfp & __GFP_NOWARN);
+   if (!page)
+   page = alloc_pages(gfp, order);
if (!page)
return NULL;
 
@@ -600,7 +602,8 @@ static void *__alloc_from_contiguous(struct device *dev, size_t size,
if (PageHighMem(page)) {
ptr = __dma_alloc_remap(page, size, GFP_KERNEL, prot, caller);
if (!ptr) {
-   dma_release_from_contiguous(dev, page, count);
+   if (!dma_release_from_contiguous(dev, page, count))
+   __free_pages(page, get_order(size));
return NULL;
}
} else {
@@ -622,7 +625,8 @@ static void __free_from_contiguous(struct device *dev, struct page *page,
else
__dma_remap(page, size, PAGE_KERNEL);
}
-   dma_release_from_contiguous(dev, page, size >> PAGE_SHIFT);
+   if (!dma_release_from_contiguous(dev, page, size >> PAGE_SHIFT))
+   __free_pages(page, get_order(size));
 }
 
 static inline pgprot_t __get_dma_pgprot(unsigned long attrs, pgprot_t prot)
@@ -1295,6 +1299,8 @@ static struct page **__iommu_alloc_buffer(struct device *dev, size_t size,
 
page = dma_alloc_from_contiguous(dev, count, order,
 gfp & __GFP_NOWARN);
+   if (!page)
+   page = alloc_pages(gfp, order);
if (!page)
goto error;
 
@@ -1369,7 +1375,8 @@ static int __iommu_free_buffer(struct device *dev, struct page **pages,
int i;
 
if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
-   dma_release_from_contiguous(dev, pages[0], count);
+   if (!dma_release_from_contiguous(dev, pages[0], count))
+   __free_pages(pages[0], get_order(size));
} else {
for (i = 0; i < count; i++)
if (pages[i])
-- 
2.17.1



[PATCH v2 RFC/RFT 3/5] iommu: amd_iommu: Add fallback normal page allocations

2019-03-26 Thread Nicolin Chen
The CMA allocation will skip allocations of single pages to save CMA
resources. This requires its callers to fall back to normal page
allocations on their own. So this patch adds the fallback routines.

Note: the amd_iommu driver uses dma_alloc_from_contiguous() as a
  fallback allocation and uses alloc_pages() as its first-round
  allocation, which is the reverse order compared with the other
  callers. So the alloc_pages() added by this change becomes a second
  fallback, though it is unlikely to succeed since the first
  alloc_pages() has already failed.
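
For reference, the allocation path in alloc_coherent() then roughly
looks like the sketch below (heavily simplified, not the literal
driver code):

	page = alloc_pages(flag | __GFP_NOWARN, get_order(size));
	if (!page) {
		if (!gfpflags_allow_blocking(flag))
			return NULL;	/* (atomic path differs) */

		/* existing fallback: try the CMA area */
		page = dma_alloc_from_contiguous(dev, size >> PAGE_SHIFT,
						 get_order(size),
						 flag & __GFP_NOWARN);
		/* second fallback added by this patch; unlikely to succeed
		 * since alloc_pages() already failed above */
		if (!page)
			page = alloc_pages(flag | __GFP_NOWARN,
					   get_order(size));
		if (!page)
			return NULL;
	}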

Signed-off-by: Nicolin Chen 
---
 drivers/iommu/amd_iommu.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 21cb088d6687..2aa4818f5249 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -2701,6 +2701,9 @@ static void *alloc_coherent(struct device *dev, size_t size,
 
page = dma_alloc_from_contiguous(dev, size >> PAGE_SHIFT,
get_order(size), flag & __GFP_NOWARN);
+   if (!page)
+   page = alloc_pages(flag | __GFP_NOWARN,
+  get_order(size));
if (!page)
return NULL;
}
-- 
2.17.1



Re: [PATCH RFC/RFT 2/5] dma-remap: Run alloc_pages() on failure

2019-03-26 Thread Nicolin Chen
On Tue, Mar 26, 2019 at 03:49:56PM -0700, Nicolin Chen wrote:
> @@ -116,7 +116,7 @@ int __init dma_atomic_pool_init(gfp_t gfp, pgprot_t prot)
>   if (dev_get_cma_area(NULL))
>   page = dma_alloc_from_contiguous(NULL, nr_pages,
>pool_size_order, false);
> - else
> + if (!page)
>   page = alloc_pages(gfp, pool_size_order);

Just realized a missing change in this patch, will send v2.
Please ignore this version.


[PATCH RFC/RFT 4/5] arm64: dma-mapping: Add fallback normal page allocations

2019-03-26 Thread Nicolin Chen
The CMA allocation will skip allocations of single pages to save CMA
resources. This requires its callers to fall back to normal page
allocations on their own. So this patch adds the fallback routines.

Signed-off-by: Nicolin Chen 
---
 arch/arm64/mm/dma-mapping.c | 19 ++++++++++++-------
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index 78c0a72f822c..be230254 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -156,17 +156,20 @@ static void *__iommu_alloc_attrs(struct device *dev, size_t size,
}
} else if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
pgprot_t prot = arch_dma_mmap_pgprot(dev, PAGE_KERNEL, attrs);
+   unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT;
struct page *page;
 
-   page = dma_alloc_from_contiguous(dev, size >> PAGE_SHIFT,
-   get_order(size), gfp & __GFP_NOWARN);
+   page = dma_alloc_from_contiguous(dev, count, get_order(size),
+gfp & __GFP_NOWARN);
+   if (!page)
+   page = alloc_pages(gfp, get_order(size));
if (!page)
return NULL;
 
*handle = iommu_dma_map_page(dev, page, 0, iosize, ioprot);
if (*handle == DMA_MAPPING_ERROR) {
-   dma_release_from_contiguous(dev, page,
-   size >> PAGE_SHIFT);
+   if (!dma_release_from_contiguous(dev, page, count))
+   __free_pages(page, get_order(size));
return NULL;
}
addr = dma_common_contiguous_remap(page, size, VM_USERMAP,
@@ -178,8 +181,8 @@ static void *__iommu_alloc_attrs(struct device *dev, size_t size,
memset(addr, 0, size);
} else {
iommu_dma_unmap_page(dev, *handle, iosize, 0, attrs);
-   dma_release_from_contiguous(dev, page,
-   size >> PAGE_SHIFT);
+   if (!dma_release_from_contiguous(dev, page, count))
+   __free_pages(page, get_order(size));
}
} else {
pgprot_t prot = arch_dma_mmap_pgprot(dev, PAGE_KERNEL, attrs);
@@ -201,6 +204,7 @@ static void *__iommu_alloc_attrs(struct device *dev, size_t size,
 static void __iommu_free_attrs(struct device *dev, size_t size, void *cpu_addr,
   dma_addr_t handle, unsigned long attrs)
 {
+   unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT;
size_t iosize = size;
 
size = PAGE_ALIGN(size);
@@ -222,7 +226,8 @@ static void __iommu_free_attrs(struct device *dev, size_t size, void *cpu_addr,
struct page *page = vmalloc_to_page(cpu_addr);
 
iommu_dma_unmap_page(dev, handle, iosize, 0, attrs);
-   dma_release_from_contiguous(dev, page, size >> PAGE_SHIFT);
+   if (!dma_release_from_contiguous(dev, page, count))
+   __free_pages(page, get_order(size));
dma_common_free_remap(cpu_addr, size, VM_USERMAP);
} else if (is_vmalloc_addr(cpu_addr)){
struct vm_struct *area = find_vm_area(cpu_addr);
-- 
2.17.1



[PATCH RFC/RFT 3/5] iommu: amd_iommu: Add fallback normal page allocations

2019-03-26 Thread Nicolin Chen
The CMA allocation will skip allocations of single pages to save CMA
resources. This requires its callers to fall back to normal page
allocations on their own. So this patch adds the fallback routines.

Note: the amd_iommu driver uses dma_alloc_from_contiguous() as a
  fallback allocation and uses alloc_pages() as its first-round
  allocation, which is the reverse order compared with the other
  callers. So the alloc_pages() added by this change becomes a second
  fallback, though it is unlikely to succeed since the first
  alloc_pages() has already failed.

Signed-off-by: Nicolin Chen 
---
 drivers/iommu/amd_iommu.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 21cb088d6687..2aa4818f5249 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -2701,6 +2701,9 @@ static void *alloc_coherent(struct device *dev, size_t size,
 
page = dma_alloc_from_contiguous(dev, size >> PAGE_SHIFT,
get_order(size), flag & __GFP_NOWARN);
+   if (!page)
+   page = alloc_pages(flag | __GFP_NOWARN,
+  get_order(size));
if (!page)
return NULL;
}
-- 
2.17.1



[PATCH RFC/RFT 5/5] dma-contiguous: Do not allocate a single page from CMA area

2019-03-26 Thread Nicolin Chen
The addresses within a single page are always contiguous, so it is
not really necessary to allocate one single page from the CMA area.
Since the CMA area has a limited, predefined size, it may run out of
space in heavy use cases, where there might be quite a lot of CMA
pages being allocated just for single pages.

However, there is also a concern that a device might care where a
page comes from -- it might expect the page to come from the CMA area
and act differently if it does not.

This patch tries to skip one-page allocations and returns NULL so as
to let callers allocate normal pages, unless the device has its own
CMA area. This would save resources in the CMA area for further CMA
allocations, and it would also reduce the CMA fragmentation that
results from trivial allocations.

Signed-off-by: Nicolin Chen 
---
 kernel/dma/contiguous.c | 22 +++++++++++++++++++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c
index b2a87905846d..09074bd04793 100644
--- a/kernel/dma/contiguous.c
+++ b/kernel/dma/contiguous.c
@@ -186,16 +186,32 @@ int __init dma_contiguous_reserve_area(phys_addr_t size, phys_addr_t base,
  *
  * This function allocates memory buffer for specified device. It uses
  * device specific contiguous memory area if available or the default
- * global one. Requires architecture specific dev_get_cma_area() helper
- * function.
+ * global one.
+ *
+ * However, it skips one-page allocations from the global area. Since
+ * the addresses within one page are always contiguous, there is no
+ * need to waste CMA pages on them; this also helps reduce CMA area
+ * fragmentation. A caller should therefore fall back to allocating a
+ * normal page upon a NULL return value.
+ *
+ * Requires architecture specific dev_get_cma_area() helper function.
  */
 struct page *dma_alloc_from_contiguous(struct device *dev, size_t count,
   unsigned int align, bool no_warn)
 {
+   struct cma *cma;
+
if (align > CONFIG_CMA_ALIGNMENT)
align = CONFIG_CMA_ALIGNMENT;
 
-   return cma_alloc(dev_get_cma_area(dev), count, align, no_warn);
+   if (dev && dev->cma_area)
+   cma = dev->cma_area;
+   else if (count > 1)
+   cma = dma_contiguous_default_area;
+   else
+   return NULL;
+
+   return cma_alloc(cma, count, align, no_warn);
 }
 
 /**
-- 
2.17.1



[PATCH RFC/RFT 1/5] ARM: dma-mapping: Add fallback normal page allocations

2019-03-26 Thread Nicolin Chen
The CMA allocation will skip allocations of single pages to save CMA
resources. This requires its callers to fall back to normal page
allocations on their own. So this patch adds the fallback routines.

Signed-off-by: Nicolin Chen 
---
 arch/arm/mm/dma-mapping.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 8a90f298af96..febaf637a25b 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -589,6 +589,8 @@ static void *__alloc_from_contiguous(struct device *dev, size_t size,
void *ptr = NULL;
 
page = dma_alloc_from_contiguous(dev, count, order, gfp & __GFP_NOWARN);
+   if (!page)
+   page = alloc_pages(gfp, order);
if (!page)
return NULL;
 
@@ -600,7 +602,8 @@ static void *__alloc_from_contiguous(struct device *dev, size_t size,
if (PageHighMem(page)) {
ptr = __dma_alloc_remap(page, size, GFP_KERNEL, prot, caller);
if (!ptr) {
-   dma_release_from_contiguous(dev, page, count);
+   if (!dma_release_from_contiguous(dev, page, count))
+   __free_pages(page, get_order(size));
return NULL;
}
} else {
@@ -622,7 +625,8 @@ static void __free_from_contiguous(struct device *dev, struct page *page,
else
__dma_remap(page, size, PAGE_KERNEL);
}
-   dma_release_from_contiguous(dev, page, size >> PAGE_SHIFT);
+   if (!dma_release_from_contiguous(dev, page, size >> PAGE_SHIFT))
+   __free_pages(page, get_order(size));
 }
 
 static inline pgprot_t __get_dma_pgprot(unsigned long attrs, pgprot_t prot)
@@ -1295,6 +1299,8 @@ static struct page **__iommu_alloc_buffer(struct device *dev, size_t size,
 
page = dma_alloc_from_contiguous(dev, count, order,
 gfp & __GFP_NOWARN);
+   if (!page)
+   page = alloc_pages(gfp, order);
if (!page)
goto error;
 
@@ -1369,7 +1375,8 @@ static int __iommu_free_buffer(struct device *dev, struct page **pages,
int i;
 
if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
-   dma_release_from_contiguous(dev, pages[0], count);
+   if (!dma_release_from_contiguous(dev, pages[0], count))
+   __free_pages(pages[0], get_order(size));
} else {
for (i = 0; i < count; i++)
if (pages[i])
-- 
2.17.1



[PATCH RFC/RFT 2/5] dma-remap: Run alloc_pages() on failure

2019-03-26 Thread Nicolin Chen
The CMA allocation will skip allocations of single pages to save CMA
resources. This requires its callers to fall back to normal page
allocations on their own.

So this patch moves the alloc_pages() call to the fallback routines.

Signed-off-by: Nicolin Chen 
---
 kernel/dma/remap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/dma/remap.c b/kernel/dma/remap.c
index 2b750f13bc8f..daab2f3186a2 100644
--- a/kernel/dma/remap.c
+++ b/kernel/dma/remap.c
@@ -116,7 +116,7 @@ int __init dma_atomic_pool_init(gfp_t gfp, pgprot_t prot)
if (dev_get_cma_area(NULL))
page = dma_alloc_from_contiguous(NULL, nr_pages,
 pool_size_order, false);
-   else
+   if (!page)
page = alloc_pages(gfp, pool_size_order);
if (!page)
goto out;
-- 
2.17.1



[PATCH RFC/RFT 0/5] Save single pages from CMA area

2019-03-26 Thread Nicolin Chen
This series of patches tries to save single pages in the CMA area by
bypassing all single-page CMA allocations and allocating normal pages
instead, as all addresses within one single page are contiguous.

We once applied PATCH-5 but had to revert it, as not all of the
callers actually handled the fallback allocations. Per Robin's
suggestion, let's stuff alloc_pages()/free_page() fallbacks into those
callers before having PATCH-5.

Nicolin Chen (5):
  ARM: dma-mapping: Add fallback normal page allocations
  dma-remap: Run alloc_pages() on failure
  iommu: amd_iommu: Add fallback normal page allocations
  arm64: dma-mapping: Add fallback normal page allocations
  dma-contiguous: Do not allocate a single page from CMA area

 arch/arm/mm/dma-mapping.c   | 13 ++++++++++---
 arch/arm64/mm/dma-mapping.c | 19 ++++++++++++-------
 drivers/iommu/amd_iommu.c   |  3 +++
 kernel/dma/contiguous.c     | 22 +++++++++++++++++++---
 kernel/dma/remap.c          |  2 +-
 5 files changed, 45 insertions(+), 14 deletions(-)

-- 
2.17.1



Re: [PATCH v8 9/9] vfio/type1: Handle different mdev isolation type

2019-03-26 Thread Alex Williamson
On Mon, 25 Mar 2019 09:30:36 +0800
Lu Baolu  wrote:

> This adds the support to determine the isolation type
> of a mediated device group by checking whether it has
> an iommu device. If an iommu device exists, an iommu
> domain will be allocated and then attached to the iommu
> device. Otherwise, keep the same behavior as it is.
> 
> Cc: Ashok Raj 
> Cc: Jacob Pan 
> Cc: Kevin Tian 
> Signed-off-by: Sanjay Kumar 
> Signed-off-by: Liu Yi L 
> Signed-off-by: Lu Baolu 
> Reviewed-by: Jean-Philippe Brucker 
> ---
>  drivers/vfio/vfio_iommu_type1.c | 55 +
>  1 file changed, 42 insertions(+), 13 deletions(-)


Acked-by: Alex Williamson 


> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index ccc4165474aa..b91cafcd5181 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -559,7 +559,7 @@ static int vfio_iommu_type1_pin_pages(void *iommu_data,
>   mutex_lock(&iommu->lock);
>  
>   /* Fail if notifier list is empty */
> - if ((!iommu->external_domain) || (!iommu->notifier.head)) {
> + if (!iommu->notifier.head) {
>   ret = -EINVAL;
>   goto pin_done;
>   }
> @@ -641,11 +641,6 @@ static int vfio_iommu_type1_unpin_pages(void *iommu_data,
>  
>   mutex_lock(>lock);
>  
> - if (!iommu->external_domain) {
> - mutex_unlock(>lock);
> - return -EINVAL;
> - }
> -
>   do_accounting = !IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu);
>   for (i = 0; i < npage; i++) {
>   struct vfio_dma *dma;
> @@ -1368,13 +1363,40 @@ static void vfio_iommu_detach_group(struct 
> vfio_domain *domain,
>   iommu_detach_group(domain->domain, group->iommu_group);
>  }
>  
> +static bool vfio_bus_is_mdev(struct bus_type *bus)
> +{
> + struct bus_type *mdev_bus;
> + bool ret = false;
> +
> + mdev_bus = symbol_get(mdev_bus_type);
> + if (mdev_bus) {
> + ret = (bus == mdev_bus);
> + symbol_put(mdev_bus_type);
> + }
> +
> + return ret;
> +}
> +
> +static int vfio_mdev_iommu_device(struct device *dev, void *data)
> +{
> + struct device **old = data, *new;
> +
> + new = vfio_mdev_get_iommu_device(dev);
> + if (!new || (*old && *old != new))
> + return -EINVAL;
> +
> + *old = new;
> +
> + return 0;
> +}
> +
>  static int vfio_iommu_type1_attach_group(void *iommu_data,
>struct iommu_group *iommu_group)
>  {
>   struct vfio_iommu *iommu = iommu_data;
>   struct vfio_group *group;
>   struct vfio_domain *domain, *d;
> - struct bus_type *bus = NULL, *mdev_bus;
> + struct bus_type *bus = NULL;
>   int ret;
>   bool resv_msi, msi_remap;
>   phys_addr_t resv_msi_base;
> @@ -1409,23 +1431,30 @@ static int vfio_iommu_type1_attach_group(void 
> *iommu_data,
>   if (ret)
>   goto out_free;
>  
> - mdev_bus = symbol_get(mdev_bus_type);
> + if (vfio_bus_is_mdev(bus)) {
> + struct device *iommu_device = NULL;
>  
> - if (mdev_bus) {
> - if ((bus == mdev_bus) && !iommu_present(bus)) {
> - symbol_put(mdev_bus_type);
> + group->mdev_group = true;
> +
> + /* Determine the isolation type */
> + ret = iommu_group_for_each_dev(iommu_group, &iommu_device,
> +vfio_mdev_iommu_device);
> + if (ret || !iommu_device) {
>   if (!iommu->external_domain) {
>   INIT_LIST_HEAD(&domain->group_list);
>   iommu->external_domain = domain;
> - } else
> + } else {
>   kfree(domain);
> + }
>  
>   list_add(&group->next,
>    &iommu->external_domain->group_list);
>   mutex_unlock(&iommu->lock);
> +
>   return 0;
>   }
> - symbol_put(mdev_bus_type);
> +
> + bus = iommu_device->bus;
>   }
>  
>   domain->domain = iommu_domain_alloc(bus);



Re: [PATCH v8 8/9] vfio/type1: Add domain at(de)taching group helpers

2019-03-26 Thread Alex Williamson
On Mon, 25 Mar 2019 09:30:35 +0800
Lu Baolu  wrote:

> This adds helpers to attach or detach a domain to a
> group. This will replace iommu_attach_group() which
> only works for non-mdev devices.
> 
> If a domain is attaching to a group which includes the
> mediated devices, it should attach to the iommu device
> (a pci device which represents the mdev in iommu scope)
> instead. The added helper supports attaching domain to
> groups for both pci and mdev devices.
> 
> Cc: Ashok Raj 
> Cc: Jacob Pan 
> Cc: Kevin Tian 
> Signed-off-by: Sanjay Kumar 
> Signed-off-by: Liu Yi L 
> Signed-off-by: Lu Baolu 
> Reviewed-by: Jean-Philippe Brucker 
> ---
>  drivers/vfio/vfio_iommu_type1.c | 84 ++---
>  1 file changed, 77 insertions(+), 7 deletions(-)


Acked-by: Alex Williamson 


> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index 73652e21efec..ccc4165474aa 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -91,6 +91,7 @@ struct vfio_dma {
>  struct vfio_group {
>   struct iommu_group  *iommu_group;
>   struct list_headnext;
> + boolmdev_group; /* An mdev group */
>  };
>  
>  /*
> @@ -1298,6 +1299,75 @@ static bool vfio_iommu_has_sw_msi(struct iommu_group 
> *group, phys_addr_t *base)
>   return ret;
>  }
>  
> +static struct device *vfio_mdev_get_iommu_device(struct device *dev)
> +{
> + struct device *(*fn)(struct device *dev);
> + struct device *iommu_device;
> +
> + fn = symbol_get(mdev_get_iommu_device);
> + if (fn) {
> + iommu_device = fn(dev);
> + symbol_put(mdev_get_iommu_device);
> +
> + return iommu_device;
> + }
> +
> + return NULL;
> +}
> +
> +static int vfio_mdev_attach_domain(struct device *dev, void *data)
> +{
> + struct iommu_domain *domain = data;
> + struct device *iommu_device;
> +
> + iommu_device = vfio_mdev_get_iommu_device(dev);
> + if (iommu_device) {
> + if (iommu_dev_feature_enabled(iommu_device, IOMMU_DEV_FEAT_AUX))
> + return iommu_aux_attach_device(domain, iommu_device);
> + else
> + return iommu_attach_device(domain, iommu_device);
> + }
> +
> + return -EINVAL;
> +}
> +
> +static int vfio_mdev_detach_domain(struct device *dev, void *data)
> +{
> + struct iommu_domain *domain = data;
> + struct device *iommu_device;
> +
> + iommu_device = vfio_mdev_get_iommu_device(dev);
> + if (iommu_device) {
> + if (iommu_dev_feature_enabled(iommu_device, IOMMU_DEV_FEAT_AUX))
> + iommu_aux_detach_device(domain, iommu_device);
> + else
> + iommu_detach_device(domain, iommu_device);
> + }
> +
> + return 0;
> +}
> +
> +static int vfio_iommu_attach_group(struct vfio_domain *domain,
> +struct vfio_group *group)
> +{
> + if (group->mdev_group)
> + return iommu_group_for_each_dev(group->iommu_group,
> + domain->domain,
> + vfio_mdev_attach_domain);
> + else
> + return iommu_attach_group(domain->domain, group->iommu_group);
> +}
> +
> +static void vfio_iommu_detach_group(struct vfio_domain *domain,
> + struct vfio_group *group)
> +{
> + if (group->mdev_group)
> + iommu_group_for_each_dev(group->iommu_group, domain->domain,
> +  vfio_mdev_detach_domain);
> + else
> + iommu_detach_group(domain->domain, group->iommu_group);
> +}
> +
>  static int vfio_iommu_type1_attach_group(void *iommu_data,
>struct iommu_group *iommu_group)
>  {
> @@ -1373,7 +1443,7 @@ static int vfio_iommu_type1_attach_group(void 
> *iommu_data,
>   goto out_domain;
>   }
>  
> - ret = iommu_attach_group(domain->domain, iommu_group);
> + ret = vfio_iommu_attach_group(domain, group);
>   if (ret)
>   goto out_domain;
>  
> @@ -1405,8 +1475,8 @@ static int vfio_iommu_type1_attach_group(void 
> *iommu_data,
>   list_for_each_entry(d, &iommu->domain_list, next) {
>   if (d->domain->ops == domain->domain->ops &&
>   d->prot == domain->prot) {
> - iommu_detach_group(domain->domain, iommu_group);
> - if (!iommu_attach_group(d->domain, iommu_group)) {
> + vfio_iommu_detach_group(domain, group);
> + if (!vfio_iommu_attach_group(d, group)) {
>   list_add(&group->next, &d->group_list);
>   iommu_domain_free(domain->domain);
>   kfree(domain);
> @@ -1414,7 +1484,7 @@ static int vfio_iommu_type1_attach_group(void 
> *iommu_data,
>   

Re: [PATCH v8 7/9] vfio/mdev: Add iommu related member in mdev_device

2019-03-26 Thread Alex Williamson
On Mon, 25 Mar 2019 09:30:34 +0800
Lu Baolu  wrote:

> A parent device might create different types of mediated
> devices. For example, a mediated device could be created
> by the parent device with full isolation and protection
> provided by the IOMMU. One usage case could be found on
> Intel platforms where a mediated device is an assignable
> subset of a PCI, the DMA requests on behalf of it are all
> tagged with a PASID. Since IOMMU supports PASID-granular
> translations (scalable mode in VT-d 3.0), this mediated
> device could be individually protected and isolated by an
> IOMMU.
> 
> This patch adds a new member in the struct mdev_device to
> indicate that the mediated device represented by mdev could
> be isolated and protected by attaching a domain to a device
> represented by mdev->iommu_device. It also adds a helper to
> add or set the iommu device.
> 
> * mdev_device->iommu_device
>   - This, if set, indicates that the mediated device could
> be fully isolated and protected by IOMMU via attaching
> an iommu domain to this device. If empty, it indicates
> using vendor defined isolation, hence bypass IOMMU.
> 
> * mdev_set/get_iommu_device(dev, iommu_device)
>   - Set or get the iommu device which represents this mdev
> in IOMMU's device scope. Drivers don't need to set the
> iommu device if it uses vendor defined isolation.
> 
> Cc: Ashok Raj 
> Cc: Jacob Pan 
> Cc: Kevin Tian 
> Cc: Liu Yi L 
> Suggested-by: Kevin Tian 
> Suggested-by: Alex Williamson 
> Signed-off-by: Lu Baolu 
> Reviewed-by: Jean-Philippe Brucker 
> ---
>  drivers/vfio/mdev/mdev_core.c| 18 ++
>  drivers/vfio/mdev/mdev_private.h |  1 +
>  include/linux/mdev.h | 14 ++
>  3 files changed, 33 insertions(+)


Acked-by: Alex Williamson 


> diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c
> index b96fedc77ee5..1b6435529166 100644
> --- a/drivers/vfio/mdev/mdev_core.c
> +++ b/drivers/vfio/mdev/mdev_core.c
> @@ -390,6 +390,24 @@ int mdev_device_remove(struct device *dev, bool 
> force_remove)
>   return 0;
>  }
>  
> +int mdev_set_iommu_device(struct device *dev, struct device *iommu_device)
> +{
> + struct mdev_device *mdev = to_mdev_device(dev);
> +
> + mdev->iommu_device = iommu_device;
> +
> + return 0;
> +}
> +EXPORT_SYMBOL(mdev_set_iommu_device);
> +
> +struct device *mdev_get_iommu_device(struct device *dev)
> +{
> + struct mdev_device *mdev = to_mdev_device(dev);
> +
> + return mdev->iommu_device;
> +}
> +EXPORT_SYMBOL(mdev_get_iommu_device);
> +
>  static int __init mdev_init(void)
>  {
>   return mdev_bus_register();
> diff --git a/drivers/vfio/mdev/mdev_private.h 
> b/drivers/vfio/mdev/mdev_private.h
> index 379758c52b1b..bfb7b22a7cb6 100644
> --- a/drivers/vfio/mdev/mdev_private.h
> +++ b/drivers/vfio/mdev/mdev_private.h
> @@ -34,6 +34,7 @@ struct mdev_device {
>   struct list_head next;
>   struct kobject *type_kobj;
>   bool active;
> + struct device *iommu_device;
>  };
>  
>  #define to_mdev_device(dev)  container_of(dev, struct mdev_device, dev)
> diff --git a/include/linux/mdev.h b/include/linux/mdev.h
> index d7aee90e5da5..df2ea39f47ee 100644
> --- a/include/linux/mdev.h
> +++ b/include/linux/mdev.h
> @@ -15,6 +15,20 @@
>  
>  struct mdev_device;
>  
> +/*
> + * Called by the parent device driver to set the device which represents
> + * this mdev in iommu protection scope. By default, the iommu device is
> + * NULL, that indicates using vendor defined isolation.
> + *
> + * @dev: the mediated device that iommu will isolate.
> + * @iommu_device: a pci device which represents the iommu for @dev.
> + *
> + * Return 0 for success, otherwise negative error value.
> + */
> +int mdev_set_iommu_device(struct device *dev, struct device *iommu_device);
> +
> +struct device *mdev_get_iommu_device(struct device *dev);
> +
>  /**
>   * struct mdev_parent_ops - Structure to be registered for each parent 
> device to
>   * register the device to mdev module.



Re: [PATCH/RFC] driver core: Postpone DMA tear-down until after devres release

2019-03-26 Thread John Garry

On 26/03/2019 12:31, Geert Uytterhoeven wrote:

Hi John,

CC robh

On Tue, Mar 26, 2019 at 12:42 PM John Garry  wrote:

Memory is incorrectly freed using the direct ops, as dma_map_ops = NULL.
Oops...

After reversing the order of the calls to arch_teardown_dma_ops() and
devres_release_all(), dma_map_ops is still valid, and the DMA memory is
now released using __iommu_free_attrs():

+sata_rcar ee30.sata: dmam_release:32: size 2048 vaddr ff8012145000 
dma_handle 0x0xf000 attrs 0x0
+sata_rcar ee30.sata: dma_free_attrs:289: size 2048, ops = iommu_dma_ops
+sata_rcar ee30.sata: dma_free_attrs:311: calling __iommu_free_attrs()
---
 drivers/base/dd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index 8ac10af17c0043a3..d62487d024559620 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -968,9 +968,9 @@ static void __device_release_driver(struct device *dev, struct device *parent)
  drv->remove(dev);

  device_links_driver_cleanup(dev);
- arch_teardown_dma_ops(dev);

  devres_release_all(dev);
+ arch_teardown_dma_ops(dev);
  dev->driver = NULL;


Hi guys,

Could there still be the same problem in the error path of really_probe():

static int really_probe(struct device *dev, struct device_driver *drv)
{

[...]

goto done;

probe_failed:
arch_teardown_dma_ops(dev);
dma_failed:
if (dev->bus)
blocking_notifier_call_chain(&dev->bus->p->bus_notifier,
 BUS_NOTIFY_DRIVER_NOT_BOUND, dev);
pinctrl_bind_failed:
device_links_no_driver(dev);
devres_release_all(dev);
driver_sysfs_remove(dev);
dev->driver = NULL;
dev_set_drvdata(dev, NULL);

We seem to be able to call arch_teardown_dma_ops() prior to
devres_release_all() if we reach the probe_failed label.


Yes, this looks like another instance of the same problem.
And test_remove doesn't expose this, as it doesn't exercise the full cycle.



Hi Geert,

OK, thanks.

There is another devres_release_all() call in device_release(). Not sure 
if there is another potential problem there also...


As for this issue, I'll look to fix it unless anyone else is doing it.

Cheers,
John


Gr{oetje,eeting}s,

Geert






Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted

2019-03-26 Thread Michael S. Tsirkin
On Wed, Jan 30, 2019 at 08:44:27AM +0100, Christoph Hellwig wrote:
> On Tue, Jan 29, 2019 at 09:36:08PM -0500, Michael S. Tsirkin wrote:
> > This has been discussed ad nauseum. virtio is all about compatibility.
> > Losing a couple of lines of code isn't worth breaking working setups.
> > People that want "just use DMA API no tricks" now have the option.
> > Setting a flag in a feature bit map is literally a single line
> > of code in the hypervisor. So stop pushing for breaking working
> > legacy setups and just fix it in the right place.
> 
> I agree with the legacy aspect.  What I am missing is an extremely
> strong wording that says you SHOULD always set this flag for new
> hosts, including an explanation why.


So as far as power is concerned, IIUC the issue they are struggling with is
that some platforms do not support pass-through mode in the emulated IOMMU.
Disabling PLATFORM_ACCESS is so far a way around that, unfortunately just for
virtio devices.  I would like virtio-iommu to be able to address that need as
well.

-- 
MST


Re: [PATCH/RFC] driver core: Postpone DMA tear-down until after devres release

2019-03-26 Thread Geert Uytterhoeven
Hi John,

CC robh

On Tue, Mar 26, 2019 at 12:42 PM John Garry  wrote:
> > Memory is incorrectly freed using the direct ops, as dma_map_ops = NULL.
> > Oops...
> >
> > After reversing the order of the calls to arch_teardown_dma_ops() and
> > devres_release_all(), dma_map_ops is still valid, and the DMA memory is
> > now released using __iommu_free_attrs():
> >
> > +sata_rcar ee30.sata: dmam_release:32: size 2048 vaddr 
> > ff8012145000 dma_handle 0x0xf000 attrs 0x0
> > +sata_rcar ee30.sata: dma_free_attrs:289: size 2048, ops = 
> > iommu_dma_ops
> > +sata_rcar ee30.sata: dma_free_attrs:311: calling 
> > __iommu_free_attrs()
> > ---
> >  drivers/base/dd.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> > index 8ac10af17c0043a3..d62487d024559620 100644
> > --- a/drivers/base/dd.c
> > +++ b/drivers/base/dd.c
> > @@ -968,9 +968,9 @@ static void __device_release_driver(struct device *dev, 
> > struct device *parent)
> >   drv->remove(dev);
> >
> >   device_links_driver_cleanup(dev);
> > - arch_teardown_dma_ops(dev);
> >
> >   devres_release_all(dev);
> > + arch_teardown_dma_ops(dev);
> >   dev->driver = NULL;
>
> Hi guys,
>
> Could there still be the same problem in the error path of really_probe():
>
> static int really_probe(struct device *dev, struct device_driver *drv)
> {
>
> [...]
>
> goto done;
>
> probe_failed:
> arch_teardown_dma_ops(dev);
> dma_failed:
> if (dev->bus)
> blocking_notifier_call_chain(&dev->bus->p->bus_notifier,
>  BUS_NOTIFY_DRIVER_NOT_BOUND, dev);
> pinctrl_bind_failed:
> device_links_no_driver(dev);
> devres_release_all(dev);
> driver_sysfs_remove(dev);
> dev->driver = NULL;
> dev_set_drvdata(dev, NULL);
>
> We seem to be able to call arch_teardown_dma_ops() prior to
> devres_release_all() if we reach probe_failed label.

Yes, this looks like another instance of the same problem.
And test_remove doesn't expose this, as it doesn't exercise the full cycle.

Gr{oetje,eeting}s,

Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: [PATCH/RFC] driver core: Postpone DMA tear-down until after devres release

2019-03-26 Thread John Garry


Memory is incorrectly freed using the direct ops, as dma_map_ops = NULL.
Oops...

After reversing the order of the calls to arch_teardown_dma_ops() and
devres_release_all(), dma_map_ops is still valid, and the DMA memory is
now released using __iommu_free_attrs():

+sata_rcar ee30.sata: dmam_release:32: size 2048 vaddr ff8012145000 
dma_handle 0x0xf000 attrs 0x0
+sata_rcar ee30.sata: dma_free_attrs:289: size 2048, ops = iommu_dma_ops
+sata_rcar ee30.sata: dma_free_attrs:311: calling __iommu_free_attrs()
---
 drivers/base/dd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index 8ac10af17c0043a3..d62487d024559620 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -968,9 +968,9 @@ static void __device_release_driver(struct device *dev, struct device *parent)
drv->remove(dev);

device_links_driver_cleanup(dev);
-   arch_teardown_dma_ops(dev);

devres_release_all(dev);
+   arch_teardown_dma_ops(dev);
dev->driver = NULL;


Hi guys,

Could there still be the same problem in the error path of really_probe():

static int really_probe(struct device *dev, struct device_driver *drv)
{

[...]

goto done;

probe_failed:
arch_teardown_dma_ops(dev);
dma_failed:
if (dev->bus)
blocking_notifier_call_chain(&dev->bus->p->bus_notifier,
 BUS_NOTIFY_DRIVER_NOT_BOUND, dev);
pinctrl_bind_failed:
device_links_no_driver(dev);
devres_release_all(dev);
driver_sysfs_remove(dev);
dev->driver = NULL;
dev_set_drvdata(dev, NULL);

We seem to be able to call arch_teardown_dma_ops() prior to
devres_release_all() if we reach the probe_failed label.
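
One conceivable fix, mirroring the __device_release_driver() change,
would be to move the teardown after devres_release_all() in the error
path as well. A rough, untested sketch (not a patch that has been
posted here):

	probe_failed:
		if (dev->bus)
			blocking_notifier_call_chain(&dev->bus->p->bus_notifier,
						     BUS_NOTIFY_DRIVER_NOT_BOUND, dev);
	pinctrl_bind_failed:
		device_links_no_driver(dev);
		devres_release_all(dev);
		arch_teardown_dma_ops(dev);	/* now after devres_release_all() */
		driver_sysfs_remove(dev);
		dev->driver = NULL;
		dev_set_drvdata(dev, NULL);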


We have seen this crash when our driver probe fails on a dev branch 
based on v5.1-rc1:


[   87.896707] hisi_sas_v3_hw :74:02.0: Adding to iommu group 2
[   87.909765] scsi host1: hisi_sas_v3_hw
[   89.127958] hisi_sas_v3_hw :74:02.0: evaluate _DSM failed
[   89.134043] BUG: Bad page state in process swapper/0  pfn:313f5
[   89.139965] page:7ec4fd40 count:1 mapcount:0 
mapping: index:0x0

[   89.147960] flags: 0xfffe0001000(reserved)
[   89.152398] raw: 0fffe0001000 7ec4fd48 7ec4fd48 

[   89.160130] raw:   0001 


[   89.167861] page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
[   89.174290] bad because of flags: 0x1000(reserved)
[   89.179070] Modules linked in:
[   89.182117] CPU: 49 PID: 1 Comm: swapper/0 Not tainted 
5.1.0-rc1-43081-g22d97fd-dirty #1433
[   89.190453] Hardware name: Huawei D06/D06, BIOS Hisilicon D06 UEFI 
RC0 - V1.12.01 01/29/2019

[   89.198876] Call trace:
[   89.201316]  dump_backtrace+0x0/0x118
[   89.204965]  show_stack+0x14/0x1c
[   89.208272]  dump_stack+0xa4/0xc8
[   89.211576]  bad_page+0xe4/0x13c
[   89.214791]  free_pages_check_bad+0x4c/0xc0
[   89.218961]  __free_pages_ok+0x30c/0x340
[   89.222871]  __free_pages+0x30/0x44
[   89.226347]  __dma_direct_free_pages+0x30/0x38
[   89.230777]  dma_direct_free+0x24/0x38
[   89.234513]  dma_free_attrs+0x9c/0xd8
[   89.238161]  dmam_release+0x20/0x28
[   89.241640]  release_nodes+0x17c/0x220
[   89.245375]  devres_release_all+0x34/0x54
[   89.249371]  really_probe+0xc4/0x2c8
[   89.252933]  driver_probe_device+0x58/0xfc
[   89.257016]  device_driver_attach+0x68/0x70
[   89.261185]  __driver_attach+0x94/0xdc
[   89.264921]  bus_for_each_dev+0x5c/0xb4
[   89.268744]  driver_attach+0x20/0x28
[   89.272306]  bus_add_driver+0x14c/0x200
[   89.276128]  driver_register+0x6c/0x124
[   89.279953]  __pci_register_driver+0x48/0x50
[   89.284213]  sas_v3_pci_driver_init+0x20/0x28
[   89.288557]  do_one_initcall+0x40/0x25c
[   89.292381]  kernel_init_freeable+0x2b8/0x3c0
[   89.296727]  kernel_init+0x10/0x100
[   89.300202]  ret_from_fork+0x10/0x18
[   89.303773] Disabling lock debugging due to kernel taint
[   89.309076] BUG: Bad page state in process swapper/0  pfn:313f6
[   89.314988] page:7ec4fd80 count:1 mapcount:0 
mapping: index:0x0

[   89.322983] flags: 0xfffe0001000(reserved)
[   89.327417] raw: 0fffe0001000 7ec4fd88 7ec4fd88 

[   89.335149] raw:   0001 



Thanks,
John



dev_set_drvdata(dev, NULL);
if (dev->pm_domain && dev->pm_domain->dismiss)






Re: [PATCH v7 8/9] vfio/type1: Add domain at(de)taching group helpers

2019-03-26 Thread Kirti Wankhede



On 2/22/2019 7:49 AM, Lu Baolu wrote:
> This adds helpers to attach or detach a domain to a
> group. This will replace iommu_attach_group() which
> only works for non-mdev devices.
> 
> If a domain is attaching to a group which includes the
> mediated devices, it should attach to the iommu device
> (a pci device which represents the mdev in iommu scope)
> instead. The added helper supports attaching domain to
> groups for both pci and mdev devices.
> 
> Cc: Ashok Raj 
> Cc: Jacob Pan 
> Cc: Kevin Tian 
> Signed-off-by: Sanjay Kumar 
> Signed-off-by: Liu Yi L 
> Signed-off-by: Lu Baolu 
> Reviewed-by: Jean-Philippe Brucker 

Reviewed-by: Kirti Wankhede 

Thanks,
Kirti


> ---
>  drivers/vfio/vfio_iommu_type1.c | 84 ++---
>  1 file changed, 77 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index 73652e21efec..ccc4165474aa 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -91,6 +91,7 @@ struct vfio_dma {
>  struct vfio_group {
>   struct iommu_group  *iommu_group;
>   struct list_headnext;
> + boolmdev_group; /* An mdev group */
>  };
>  
>  /*
> @@ -1298,6 +1299,75 @@ static bool vfio_iommu_has_sw_msi(struct iommu_group 
> *group, phys_addr_t *base)
>   return ret;
>  }
>  
> +static struct device *vfio_mdev_get_iommu_device(struct device *dev)
> +{
> + struct device *(*fn)(struct device *dev);
> + struct device *iommu_device;
> +
> + fn = symbol_get(mdev_get_iommu_device);
> + if (fn) {
> + iommu_device = fn(dev);
> + symbol_put(mdev_get_iommu_device);
> +
> + return iommu_device;
> + }
> +
> + return NULL;
> +}
> +
> +static int vfio_mdev_attach_domain(struct device *dev, void *data)
> +{
> + struct iommu_domain *domain = data;
> + struct device *iommu_device;
> +
> + iommu_device = vfio_mdev_get_iommu_device(dev);
> + if (iommu_device) {
> + if (iommu_dev_feature_enabled(iommu_device, IOMMU_DEV_FEAT_AUX))
> + return iommu_aux_attach_device(domain, iommu_device);
> + else
> + return iommu_attach_device(domain, iommu_device);
> + }
> +
> + return -EINVAL;
> +}
> +
> +static int vfio_mdev_detach_domain(struct device *dev, void *data)
> +{
> + struct iommu_domain *domain = data;
> + struct device *iommu_device;
> +
> + iommu_device = vfio_mdev_get_iommu_device(dev);
> + if (iommu_device) {
> + if (iommu_dev_feature_enabled(iommu_device, IOMMU_DEV_FEAT_AUX))
> + iommu_aux_detach_device(domain, iommu_device);
> + else
> + iommu_detach_device(domain, iommu_device);
> + }
> +
> + return 0;
> +}
> +
> +static int vfio_iommu_attach_group(struct vfio_domain *domain,
> +struct vfio_group *group)
> +{
> + if (group->mdev_group)
> + return iommu_group_for_each_dev(group->iommu_group,
> + domain->domain,
> + vfio_mdev_attach_domain);
> + else
> + return iommu_attach_group(domain->domain, group->iommu_group);
> +}
> +
> +static void vfio_iommu_detach_group(struct vfio_domain *domain,
> + struct vfio_group *group)
> +{
> + if (group->mdev_group)
> + iommu_group_for_each_dev(group->iommu_group, domain->domain,
> +  vfio_mdev_detach_domain);
> + else
> + iommu_detach_group(domain->domain, group->iommu_group);
> +}
> +
>  static int vfio_iommu_type1_attach_group(void *iommu_data,
>struct iommu_group *iommu_group)
>  {
> @@ -1373,7 +1443,7 @@ static int vfio_iommu_type1_attach_group(void 
> *iommu_data,
>   goto out_domain;
>   }
>  
> - ret = iommu_attach_group(domain->domain, iommu_group);
> + ret = vfio_iommu_attach_group(domain, group);
>   if (ret)
>   goto out_domain;
>  
> @@ -1405,8 +1475,8 @@ static int vfio_iommu_type1_attach_group(void 
> *iommu_data,
>   list_for_each_entry(d, &iommu->domain_list, next) {
>   if (d->domain->ops == domain->domain->ops &&
>   d->prot == domain->prot) {
> - iommu_detach_group(domain->domain, iommu_group);
> - if (!iommu_attach_group(d->domain, iommu_group)) {
> + vfio_iommu_detach_group(domain, group);
> + if (!vfio_iommu_attach_group(d, group)) {
>   list_add(&group->next, &d->group_list);
>   iommu_domain_free(domain->domain);
>   kfree(domain);
> @@ -1414,7 +1484,7 @@ static int vfio_iommu_type1_attach_group(void 
> *iommu_data,
>

Re: [PATCH v8 7/9] vfio/mdev: Add iommu related member in mdev_device

2019-03-26 Thread Kirti Wankhede



On 3/25/2019 7:00 AM, Lu Baolu wrote:
> A parent device might create different types of mediated
> devices. For example, a mediated device could be created
> by the parent device with full isolation and protection
> provided by the IOMMU. One usage case could be found on
> Intel platforms where a mediated device is an assignable
> subset of a PCI, the DMA requests on behalf of it are all
> tagged with a PASID. Since IOMMU supports PASID-granular
> translations (scalable mode in VT-d 3.0), this mediated
> device could be individually protected and isolated by an
> IOMMU.
> 
> This patch adds a new member in the struct mdev_device to
> indicate that the mediated device represented by mdev could
> be isolated and protected by attaching a domain to a device
> represented by mdev->iommu_device. It also adds a helper to
> add or set the iommu device.
> 
> * mdev_device->iommu_device
>   - This, if set, indicates that the mediated device could
> be fully isolated and protected by IOMMU via attaching
> an iommu domain to this device. If empty, it indicates
> using vendor defined isolation, hence bypass IOMMU.
> 
> * mdev_set/get_iommu_device(dev, iommu_device)
>   - Set or get the iommu device which represents this mdev
> in IOMMU's device scope. Drivers don't need to set the
> iommu device if it uses vendor defined isolation.
> 
> Cc: Ashok Raj 
> Cc: Jacob Pan 
> Cc: Kevin Tian 
> Cc: Liu Yi L 
> Suggested-by: Kevin Tian 
> Suggested-by: Alex Williamson 
> Signed-off-by: Lu Baolu 
> Reviewed-by: Jean-Philippe Brucker 
> ---
>  drivers/vfio/mdev/mdev_core.c| 18 ++
>  drivers/vfio/mdev/mdev_private.h |  1 +
>  include/linux/mdev.h | 14 ++
>  3 files changed, 33 insertions(+)
> 
> diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c
> index b96fedc77ee5..1b6435529166 100644
> --- a/drivers/vfio/mdev/mdev_core.c
> +++ b/drivers/vfio/mdev/mdev_core.c
> @@ -390,6 +390,24 @@ int mdev_device_remove(struct device *dev, bool 
> force_remove)
>   return 0;
>  }
>  
> +int mdev_set_iommu_device(struct device *dev, struct device *iommu_device)
> +{
> + struct mdev_device *mdev = to_mdev_device(dev);
> +
> + mdev->iommu_device = iommu_device;
> +
> + return 0;
> +}
> +EXPORT_SYMBOL(mdev_set_iommu_device);
> +
> +struct device *mdev_get_iommu_device(struct device *dev)
> +{
> + struct mdev_device *mdev = to_mdev_device(dev);
> +
> + return mdev->iommu_device;
> +}
> +EXPORT_SYMBOL(mdev_get_iommu_device);
> +
>  static int __init mdev_init(void)
>  {
>   return mdev_bus_register();
> diff --git a/drivers/vfio/mdev/mdev_private.h 
> b/drivers/vfio/mdev/mdev_private.h
> index 379758c52b1b..bfb7b22a7cb6 100644
> --- a/drivers/vfio/mdev/mdev_private.h
> +++ b/drivers/vfio/mdev/mdev_private.h
> @@ -34,6 +34,7 @@ struct mdev_device {
>   struct list_head next;
>   struct kobject *type_kobj;
>   bool active;
> + struct device *iommu_device;
>  };
>  
>  #define to_mdev_device(dev)  container_of(dev, struct mdev_device, dev)
> diff --git a/include/linux/mdev.h b/include/linux/mdev.h
> index d7aee90e5da5..df2ea39f47ee 100644
> --- a/include/linux/mdev.h
> +++ b/include/linux/mdev.h
> @@ -15,6 +15,20 @@
>  
>  struct mdev_device;
>  
> +/*
> + * Called by the parent device driver to set the device which represents
> + * this mdev in iommu protection scope. By default, the iommu device is
> + * NULL, that indicates using vendor defined isolation.
> + *
> + * @dev: the mediated device that iommu will isolate.
> + * @iommu_device: a pci device which represents the iommu for @dev.
> + *
> + * Return 0 for success, otherwise negative error value.
> + */
> +int mdev_set_iommu_device(struct device *dev, struct device *iommu_device);
> +
> +struct device *mdev_get_iommu_device(struct device *dev);
> +
>  /**
>   * struct mdev_parent_ops - Structure to be registered for each parent 
> device to
>   * register the device to mdev module.
> 

Reviewed-by: Kirti Wankhede 

Thanks,
Kirti


Re: [PATCH v2 1/4] firmware: qcom_scm-64: Add atomic version of qcom_scm_call

2019-03-26 Thread Vivek Gautam



On 3/26/2019 2:39 AM, Bjorn Andersson wrote:

On Sun 09 Sep 23:25 PDT 2018, Vivek Gautam wrote:


There are scenarios where drivers are required to make an scm call
in atomic context, such as in one of qcom's arm-smmu-500 errata [1].

[1] ("https://source.codeaurora.org/quic/la/kernel/msm-4.9/
   tree/drivers/iommu/arm-smmu.c?h=msm-4.9#n4842")

Signed-off-by: Vivek Gautam 

Reviewed-by: Bjorn Andersson 



Thanks Bjorn for reviewing and testing this series.
I will repost the series on latest head.

Best regards
Vivek



Regards,
Bjorn


---
  drivers/firmware/qcom_scm-64.c | 136 -
  1 file changed, 92 insertions(+), 44 deletions(-)

diff --git a/drivers/firmware/qcom_scm-64.c b/drivers/firmware/qcom_scm-64.c
index 688525dd4aee..3a8c867cdf51 100644
--- a/drivers/firmware/qcom_scm-64.c
+++ b/drivers/firmware/qcom_scm-64.c
@@ -70,32 +70,71 @@ static DEFINE_MUTEX(qcom_scm_lock);
  #define FIRST_EXT_ARG_IDX 3
  #define N_REGISTER_ARGS (MAX_QCOM_SCM_ARGS - N_EXT_QCOM_SCM_ARGS + 1)
  
-/**

- * qcom_scm_call() - Invoke a syscall in the secure world
- * @dev:   device
- * @svc_id:service identifier
- * @cmd_id:command identifier
- * @desc:  Descriptor structure containing arguments and return values
- *
- * Sends a command to the SCM and waits for the command to finish processing.
- * This should *only* be called in pre-emptible context.
-*/
-static int qcom_scm_call(struct device *dev, u32 svc_id, u32 cmd_id,
-const struct qcom_scm_desc *desc,
-struct arm_smccc_res *res)
+static void __qcom_scm_call_do(const struct qcom_scm_desc *desc,
+  struct arm_smccc_res *res, u32 fn_id,
+  u64 x5, u32 type)
+{
+   u64 cmd;
+   struct arm_smccc_quirk quirk = {.id = ARM_SMCCC_QUIRK_QCOM_A6};
+
+   cmd = ARM_SMCCC_CALL_VAL(type, qcom_smccc_convention,
+ARM_SMCCC_OWNER_SIP, fn_id);
+
+   quirk.state.a6 = 0;
+
+   do {
+   arm_smccc_smc_quirk(cmd, desc->arginfo, desc->args[0],
+   desc->args[1], desc->args[2], x5,
+   quirk.state.a6, 0, res, &quirk);
+
+   if (res->a0 == QCOM_SCM_INTERRUPTED)
+   cmd = res->a0;
+
+   } while (res->a0 == QCOM_SCM_INTERRUPTED);
+}
+
+static void qcom_scm_call_do(const struct qcom_scm_desc *desc,
+struct arm_smccc_res *res, u32 fn_id,
+u64 x5, bool atomic)
+{
+   int retry_count = 0;
+
+   if (!atomic) {
+   do {
+   mutex_lock(&qcom_scm_lock);
+
+   __qcom_scm_call_do(desc, res, fn_id, x5,
+  ARM_SMCCC_STD_CALL);
+
+   mutex_unlock(&qcom_scm_lock);
+
+   if (res->a0 == QCOM_SCM_V2_EBUSY) {
+   if (retry_count++ > QCOM_SCM_EBUSY_MAX_RETRY)
+   break;
+   msleep(QCOM_SCM_EBUSY_WAIT_MS);
+   }
+   }  while (res->a0 == QCOM_SCM_V2_EBUSY);
+   } else {
+   __qcom_scm_call_do(desc, res, fn_id, x5, ARM_SMCCC_FAST_CALL);
+   }
+}
+
+static int ___qcom_scm_call(struct device *dev, u32 svc_id, u32 cmd_id,
+   const struct qcom_scm_desc *desc,
+   struct arm_smccc_res *res, bool atomic)
  {
int arglen = desc->arginfo & 0xf;
-   int retry_count = 0, i;
+   int i;
u32 fn_id = QCOM_SCM_FNID(svc_id, cmd_id);
-   u64 cmd, x5 = desc->args[FIRST_EXT_ARG_IDX];
+   u64 x5 = desc->args[FIRST_EXT_ARG_IDX];
dma_addr_t args_phys = 0;
void *args_virt = NULL;
size_t alloc_len;
-   struct arm_smccc_quirk quirk = {.id = ARM_SMCCC_QUIRK_QCOM_A6};
+   gfp_t flag = atomic ? GFP_ATOMIC : GFP_KERNEL;
  
  	if (unlikely(arglen > N_REGISTER_ARGS)) {

alloc_len = N_EXT_QCOM_SCM_ARGS * sizeof(u64);
-   args_virt = kzalloc(PAGE_ALIGN(alloc_len), GFP_KERNEL);
+   args_virt = kzalloc(PAGE_ALIGN(alloc_len), flag);
  
  		if (!args_virt)

return -ENOMEM;
@@ -125,33 +164,7 @@ static int qcom_scm_call(struct device *dev, u32 svc_id, u32 cmd_id,
x5 = args_phys;
}
  
-	do {

-   mutex_lock(&qcom_scm_lock);
-
-   cmd = ARM_SMCCC_CALL_VAL(ARM_SMCCC_STD_CALL,
-qcom_smccc_convention,
-ARM_SMCCC_OWNER_SIP, fn_id);
-
-   quirk.state.a6 = 0;
-
-   do {
-   arm_smccc_smc_quirk(cmd, desc->arginfo, desc->args[0],
- desc->args[1], desc->args[2], x5,
- quirk.state.a6, 0, res, &quirk);
-
-