Re: [RESEND][PATCH 0/7] Avoid overflow at boundary_size

2020-09-01 Thread Christoph Hellwig
On Tue, Sep 01, 2020 at 12:54:01AM -0700, Nicolin Chen wrote:
> Hi Christoph,
> 
> On Tue, Sep 01, 2020 at 09:36:23AM +0200, Christoph Hellwig wrote:
> > I really don't like all the open coded smarts in the various drivers.
> > What do you think about a helper like the one in the untested patch
> 
> A helper function will be actually better. I was thinking of
> one yet not very sure about the naming and where to put it.
> 
> > below (on top of your series).  Also please include the original
> > segment boundary patch with the next resend so that the series has
> > the full context.
> 
> I will use your change instead and resend with the ULONG_MAX
> change. But in that case, should I make separate changes for
> different files like this series, or just one single change
> like yours?
> 
> Asking this as I was expecting that those changes would get
> applied by different maintainers. But now it feels like you
> will merge it to your tree at once?

I guess one patch is fine.  I can queue it up in the dma-mapping
tree as a prep patch for the default boundary change.


Re: [RESEND][PATCH 0/7] Avoid overflow at boundary_size

2020-09-01 Thread Nicolin Chen
Hi Christoph,

On Tue, Sep 01, 2020 at 09:36:23AM +0200, Christoph Hellwig wrote:
> I really don't like all the open coded smarts in the various drivers.
> What do you think about a helper like the one in the untested patch

A helper function will be actually better. I was thinking of
one yet not very sure about the naming and where to put it.

> below (on top of your series).  Also please include the original
> segment boundary patch with the next resend so that the series has
> the full context.

I will use your change instead and resend with the ULONG_MAX
change. But in that case, should I make separate changes for
different files like this series, or just one single change
like yours?

Asking this as I was expecting that those changes would get
applied by different maintainers. But now it feels like you
will merge it to your tree at once?

Thanks
Nic

> diff --git a/arch/alpha/kernel/pci_iommu.c b/arch/alpha/kernel/pci_iommu.c
> index 1ef2c647bd3ec2..6f7de4f4e191e7 100644
> --- a/arch/alpha/kernel/pci_iommu.c
> +++ b/arch/alpha/kernel/pci_iommu.c
> @@ -141,10 +141,7 @@ iommu_arena_find_pages(struct device *dev, struct 
> pci_iommu_arena *arena,
>   unsigned long boundary_size;
>  
>   base = arena->dma_base >> PAGE_SHIFT;
> -
> - boundary_size = dev ? dma_get_seg_boundary(dev) : U32_MAX;
> - /* Overflow-free shortcut for: ALIGN(b + 1, 1 << s) >> s */
> - boundary_size = (boundary_size >> PAGE_SHIFT) + 1;
> + boundary_size = dma_get_seg_boundary_nr_pages(dev, PAGE_SHIFT);
>  
>   /* Search forward for the first mask-aligned sequence of N free ptes */
>   ptes = arena->ptes;
> diff --git a/arch/ia64/hp/common/sba_iommu.c b/arch/ia64/hp/common/sba_iommu.c
> index 945954903bb0ba..b49b73a95067d2 100644
> --- a/arch/ia64/hp/common/sba_iommu.c
> +++ b/arch/ia64/hp/common/sba_iommu.c
> @@ -485,8 +485,7 @@ sba_search_bitmap(struct ioc *ioc, struct device *dev,
>   ASSERT(((unsigned long) ioc->res_hint & (sizeof(unsigned long) - 1UL)) 
> == 0);
>   ASSERT(res_ptr < res_end);
>  
> - /* Overflow-free shortcut for: ALIGN(b + 1, 1 << s) >> s */
> - boundary_size = (dma_get_seg_boundary(dev) >> iovp_shift) + 1;
> + boundary_size = dma_get_seg_boundary_nr_pages(dev, iovp_shift);
>  
>   BUG_ON(ioc->ibase & ~iovp_mask);
>   shift = ioc->ibase >> iovp_shift;
> diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
> index c01ccbf8afdd42..cbc2e62db597cf 100644
> --- a/arch/powerpc/kernel/iommu.c
> +++ b/arch/powerpc/kernel/iommu.c
> @@ -236,11 +236,7 @@ static unsigned long iommu_range_alloc(struct device 
> *dev,
>   }
>   }
>  
> - /* 4GB boundary for iseries_hv_alloc and iseries_hv_map */
> - boundary_size = dev ? dma_get_seg_boundary(dev) : U32_MAX;
> -
> - /* Overflow-free shortcut for: ALIGN(b + 1, 1 << s) >> s */
> - boundary_size = (boundary_size >> tbl->it_page_shift) + 1;
> + boundary_size = dma_get_seg_boundary_nr_pages(dev, tbl->it_page_shift);
>  
>   n = iommu_area_alloc(tbl->it_map, limit, start, npages, tbl->it_offset,
>boundary_size, align_mask);
> diff --git a/arch/s390/pci/pci_dma.c b/arch/s390/pci/pci_dma.c
> index ecb067acc6d532..4a37d8f4de9d9d 100644
> --- a/arch/s390/pci/pci_dma.c
> +++ b/arch/s390/pci/pci_dma.c
> @@ -261,13 +261,11 @@ static unsigned long __dma_alloc_iommu(struct device 
> *dev,
>  unsigned long start, int size)
>  {
>   struct zpci_dev *zdev = to_zpci(to_pci_dev(dev));
> - unsigned long boundary_size;
>  
> - /* Overflow-free shortcut for: ALIGN(b + 1, 1 << s) >> s */
> - boundary_size = (dma_get_seg_boundary(dev) >> PAGE_SHIFT) + 1;
>   return iommu_area_alloc(zdev->iommu_bitmap, zdev->iommu_pages,
>   start, size, zdev->start_dma >> PAGE_SHIFT,
> - boundary_size, 0);
> + dma_get_seg_boundary_nr_pages(dev, PAGE_SHIFT),
> + 0);
>  }
>  
>  static dma_addr_t dma_alloc_address(struct device *dev, int size)
> diff --git a/arch/sparc/kernel/iommu-common.c 
> b/arch/sparc/kernel/iommu-common.c
> index 843e71894d0482..e6139c99762e11 100644
> --- a/arch/sparc/kernel/iommu-common.c
> +++ b/arch/sparc/kernel/iommu-common.c
> @@ -166,10 +166,6 @@ unsigned long iommu_tbl_range_alloc(struct device *dev,
>   }
>   }
>  
> - boundary_size = dev ? dma_get_seg_boundary(dev) : U32_MAX;
> -
> - /* Overflow-free shortcut for: ALIGN(b + 1, 1 << s) >> s */
> - boundary_size = (boundary_size >> iommu->table_shift) + 1;
>   /*
>* if the skip_span_boundary_check had been set during init, we set
>* things up so that iommu_is_span_boundary() merely checks if the
> @@ -178,7 +174,11 @@ unsigned long iommu_tbl_range_alloc(struct device *dev,
>   if ((iommu->flags & IOMMU_NO_SPAN_BOUND) != 0) {
>   shift = 0;
>   

Re: [RESEND][PATCH 0/7] Avoid overflow at boundary_size

2020-09-01 Thread Christoph Hellwig
I really don't like all the open coded smarts in the various drivers.
What do you think about a helper like the one in the untested patch
below (on top of your series).  Also please include the original
segment boundary patch with the next resend so that the series has
the full context.

diff --git a/arch/alpha/kernel/pci_iommu.c b/arch/alpha/kernel/pci_iommu.c
index 1ef2c647bd3ec2..6f7de4f4e191e7 100644
--- a/arch/alpha/kernel/pci_iommu.c
+++ b/arch/alpha/kernel/pci_iommu.c
@@ -141,10 +141,7 @@ iommu_arena_find_pages(struct device *dev, struct 
pci_iommu_arena *arena,
unsigned long boundary_size;
 
base = arena->dma_base >> PAGE_SHIFT;
-
-   boundary_size = dev ? dma_get_seg_boundary(dev) : U32_MAX;
-   /* Overflow-free shortcut for: ALIGN(b + 1, 1 << s) >> s */
-   boundary_size = (boundary_size >> PAGE_SHIFT) + 1;
+   boundary_size = dma_get_seg_boundary_nr_pages(dev, PAGE_SHIFT);
 
/* Search forward for the first mask-aligned sequence of N free ptes */
ptes = arena->ptes;
diff --git a/arch/ia64/hp/common/sba_iommu.c b/arch/ia64/hp/common/sba_iommu.c
index 945954903bb0ba..b49b73a95067d2 100644
--- a/arch/ia64/hp/common/sba_iommu.c
+++ b/arch/ia64/hp/common/sba_iommu.c
@@ -485,8 +485,7 @@ sba_search_bitmap(struct ioc *ioc, struct device *dev,
ASSERT(((unsigned long) ioc->res_hint & (sizeof(unsigned long) - 1UL)) 
== 0);
ASSERT(res_ptr < res_end);
 
-   /* Overflow-free shortcut for: ALIGN(b + 1, 1 << s) >> s */
-   boundary_size = (dma_get_seg_boundary(dev) >> iovp_shift) + 1;
+   boundary_size = dma_get_seg_boundary_nr_pages(dev, iovp_shift);
 
BUG_ON(ioc->ibase & ~iovp_mask);
shift = ioc->ibase >> iovp_shift;
diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
index c01ccbf8afdd42..cbc2e62db597cf 100644
--- a/arch/powerpc/kernel/iommu.c
+++ b/arch/powerpc/kernel/iommu.c
@@ -236,11 +236,7 @@ static unsigned long iommu_range_alloc(struct device *dev,
}
}
 
-   /* 4GB boundary for iseries_hv_alloc and iseries_hv_map */
-   boundary_size = dev ? dma_get_seg_boundary(dev) : U32_MAX;
-
-   /* Overflow-free shortcut for: ALIGN(b + 1, 1 << s) >> s */
-   boundary_size = (boundary_size >> tbl->it_page_shift) + 1;
+   boundary_size = dma_get_seg_boundary_nr_pages(dev, tbl->it_page_shift);
 
n = iommu_area_alloc(tbl->it_map, limit, start, npages, tbl->it_offset,
 boundary_size, align_mask);
diff --git a/arch/s390/pci/pci_dma.c b/arch/s390/pci/pci_dma.c
index ecb067acc6d532..4a37d8f4de9d9d 100644
--- a/arch/s390/pci/pci_dma.c
+++ b/arch/s390/pci/pci_dma.c
@@ -261,13 +261,11 @@ static unsigned long __dma_alloc_iommu(struct device *dev,
   unsigned long start, int size)
 {
struct zpci_dev *zdev = to_zpci(to_pci_dev(dev));
-   unsigned long boundary_size;
 
-   /* Overflow-free shortcut for: ALIGN(b + 1, 1 << s) >> s */
-   boundary_size = (dma_get_seg_boundary(dev) >> PAGE_SHIFT) + 1;
return iommu_area_alloc(zdev->iommu_bitmap, zdev->iommu_pages,
start, size, zdev->start_dma >> PAGE_SHIFT,
-   boundary_size, 0);
+   dma_get_seg_boundary_nr_pages(dev, PAGE_SHIFT),
+   0);
 }
 
 static dma_addr_t dma_alloc_address(struct device *dev, int size)
diff --git a/arch/sparc/kernel/iommu-common.c b/arch/sparc/kernel/iommu-common.c
index 843e71894d0482..e6139c99762e11 100644
--- a/arch/sparc/kernel/iommu-common.c
+++ b/arch/sparc/kernel/iommu-common.c
@@ -166,10 +166,6 @@ unsigned long iommu_tbl_range_alloc(struct device *dev,
}
}
 
-   boundary_size = dev ? dma_get_seg_boundary(dev) : U32_MAX;
-
-   /* Overflow-free shortcut for: ALIGN(b + 1, 1 << s) >> s */
-   boundary_size = (boundary_size >> iommu->table_shift) + 1;
/*
 * if the skip_span_boundary_check had been set during init, we set
 * things up so that iommu_is_span_boundary() merely checks if the
@@ -178,7 +174,11 @@ unsigned long iommu_tbl_range_alloc(struct device *dev,
if ((iommu->flags & IOMMU_NO_SPAN_BOUND) != 0) {
shift = 0;
boundary_size = iommu->poolsize * iommu->nr_pools;
+   } else {
+   boundary_size = dma_get_seg_boundary_nr_pages(dev,
+   iommu->table_shift);
}
+
n = iommu_area_alloc(iommu->map, limit, start, npages, shift,
 boundary_size, align_mask);
if (n == -1) {
diff --git a/arch/sparc/kernel/iommu.c b/arch/sparc/kernel/iommu.c
index d981c37305ae31..c3e4e2df26a8b8 100644
--- a/arch/sparc/kernel/iommu.c
+++ b/arch/sparc/kernel/iommu.c
@@ -472,8 +472,7 @@ static int dma_4u_map_sg(struct device *dev, struct 
scatterlist *sglist,
outs->dma_length = 0;
 
max_seg_size = 

[RESEND][PATCH 0/7] Avoid overflow at boundary_size

2020-08-31 Thread Nicolin Chen
 For this resend 
The original series have not been acked at any patch. So I am
resending them, being suggested by Niklas.

 Coverletter 
We are expending the default DMA segmentation boundary to its
possible maximum value (ULONG_MAX) to indicate that a device
doesn't specify a boundary limit. So all dma_get_seg_boundary
callers should take a precaution with the return values since
it would easily get overflowed.

I scanned the entire kernel tree for all the existing callers
and found that most of callers may get overflowed in two ways:
either "+ 1" or passing it to ALIGN() that does "+ mask".

According to kernel defines:
#define ALIGN_MASK(x, mask) (((x) + (mask)) & ~(mask))
#define ALIGN(x, a) ALIGN_MASK(x, (typeof(x))(a) - 1)

We can simplify the logic here:
  ALIGN(boundary + 1, 1 << shift) >> shift
= ALIGN_MASK(b + 1, (1 << s) - 1) >> s
= {[b + 1 + (1 << s) - 1] & ~[(1 << s) - 1]} >> s
= [b + 1 + (1 << s) - 1] >> s
= [b + (1 << s)] >> s
= (b >> s) + 1

So this series of patches fix the potential overflow with this
overflow-free shortcut.

As I don't have these platforms, testings/comments are welcome.

Thanks
Nic

Nicolin Chen (7):
  powerpc/iommu: Avoid overflow at boundary_size
  alpha: Avoid overflow at boundary_size
  ia64/sba_iommu: Avoid overflow at boundary_size
  s390/pci_dma: Avoid overflow at boundary_size
  sparc: Avoid overflow at boundary_size
  x86/amd_gart: Avoid overflow at boundary_size
  parisc: Avoid overflow at boundary_size

 arch/alpha/kernel/pci_iommu.c| 10 --
 arch/ia64/hp/common/sba_iommu.c  |  4 ++--
 arch/powerpc/kernel/iommu.c  | 11 +--
 arch/s390/pci/pci_dma.c  |  4 ++--
 arch/sparc/kernel/iommu-common.c |  9 +++--
 arch/sparc/kernel/iommu.c|  4 ++--
 arch/sparc/kernel/pci_sun4v.c|  4 ++--
 arch/x86/kernel/amd_gart_64.c|  4 ++--
 drivers/parisc/ccio-dma.c|  4 ++--
 drivers/parisc/sba_iommu.c   |  4 ++--
 10 files changed, 26 insertions(+), 32 deletions(-)

-- 
2.17.1