Re: dmar pte read access not set error messages on hp dl388 gen8 systems

2019-12-04 Thread Jerry Snitselaar

On Thu Dec 05 19, Lu Baolu wrote:

Hi,

On 12/5/19 10:25 AM, Jerry Snitselaar wrote:


It seems that the iommu PCI bus probe didn't enumerate devices [01:00.2] and
[02:00.0], so the corresponding context entries were not set up. Hence DMA
faults are generated when these devices access memory.

Do these two devices show up in "lspci" output? How do these devices get
enumerated by the system?

Best regards,
baolu



They are there in the output, but it seems out of order:



[   23.446201] pci 0000:01:00.0: Adding to iommu group 25
[   23.448949] pci 0000:01:00.0: Using iommu dma mapping
[   23.450807] pci 0000:01:00.1: Adding to iommu group 25
[   23.452666] pci 0000:01:00.1: DMAR: Device uses a private identity domain.
[   23.455063] pci 0000:01:00.2: Adding to iommu group 25
[   23.456881] pci 0000:01:00.4: Adding to iommu group 25
[   23.458693] pci 0000:01:00.4: DMAR: Device uses a private identity domain.


Oh, yes!

So devices 01:00.0, 01:00.1, 01:00.2 and 01:00.4 share a single group. The
default domain for this group has been set to DMA although iommu=pt has
been set. As a result, .0 and .2 use DMA, but .1 and .4 use IDENTITY. This
is not a valid configuration, since all devices in a group should use the
same domain.

Do you mind posting the "lspci -vvv" output of these devices? I want to
figure out why these devices request different domain types.

Best regards,
baolu



01:00.0 System peripheral: Hewlett-Packard Company Integrated Lights-Out Standard Slave Instrumentation & System Support (rev 05)
	Subsystem: Hewlett-Packard Company iLO4
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- [...]
[...]
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: dmar pte read access not set error messages on hp dl388 gen8 systems

2019-12-04 Thread Lu Baolu

Hi,

On 12/5/19 10:25 AM, Jerry Snitselaar wrote:


It seems that the iommu PCI bus probe didn't enumerate devices [01:00.2] and
[02:00.0], so the corresponding context entries were not set up. Hence DMA
faults are generated when these devices access memory.

Do these two devices show up in "lspci" output? How do these devices get
enumerated by the system?

Best regards,
baolu



They are there in the output, but it seems out of order:



[   23.446201] pci 0000:01:00.0: Adding to iommu group 25
[   23.448949] pci 0000:01:00.0: Using iommu dma mapping
[   23.450807] pci 0000:01:00.1: Adding to iommu group 25
[   23.452666] pci 0000:01:00.1: DMAR: Device uses a private identity domain.
[   23.455063] pci 0000:01:00.2: Adding to iommu group 25
[   23.456881] pci 0000:01:00.4: Adding to iommu group 25
[   23.458693] pci 0000:01:00.4: DMAR: Device uses a private identity domain.


Oh, yes!

So devices 01:00.0, 01:00.1, 01:00.2 and 01:00.4 share a single group. The
default domain for this group has been set to DMA although iommu=pt has
been set. As a result, .0 and .2 use DMA, but .1 and .4 use IDENTITY. This
is not a valid configuration, since all devices in a group should use the
same domain.

Do you mind posting the "lspci -vvv" output of these devices? I want to
figure out why these devices request different domain types.

Best regards,
baolu
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: dmar pte read access not set error messages on hp dl388 gen8 systems

2019-12-04 Thread Jerry Snitselaar

On Thu Dec 05 19, Lu Baolu wrote:

Hi,

On 12/5/19 4:53 AM, Jerry Snitselaar wrote:

Attaching console output (can't get to a point to actually log in) and
config that is used to build that kernel.


[...]
[   21.969477] pci 0000:00:00.0: Adding to iommu group 0
[   21.971390] pci 0000:00:01.0: Adding to iommu group 1
[   21.973173] pci 0000:00:01.1: Adding to iommu group 2
[   21.974930] pci 0000:00:02.0: Adding to iommu group 3
[   21.976672] pci 0000:00:02.1: Adding to iommu group 4
[   21.978446] pci 0000:00:02.2: Adding to iommu group 5
[   21.980224] pci 0000:00:02.3: Adding to iommu group 6
[   21.982096] pci 0000:00:03.0: Adding to iommu group 7
[   21.983868] pci 0000:00:03.1: Adding to iommu group 8
[   21.985644] pci 0000:00:03.2: Adding to iommu group 9
[   21.987484] pci 0000:00:03.3: Adding to iommu group 10
[   21.989830] pci 0000:00:04.0: Adding to iommu group 11
[   21.991738] pci 0000:00:04.1: Adding to iommu group 11
[   21.993557] pci 0000:00:04.2: Adding to iommu group 11
[   21.995360] pci 0000:00:04.3: Adding to iommu group 11
[   21.997145] pci 0000:00:04.4: Adding to iommu group 11
[   21.998915] pci 0000:00:04.5: Adding to iommu group 11
[   22.000694] pci 0000:00:04.6: Adding to iommu group 11
[   22.002569] pci 0000:00:04.7: Adding to iommu group 11
[   22.004556] pci 0000:00:05.0: Adding to iommu group 12
[   22.006388] pci 0000:00:05.2: Adding to iommu group 12
[   22.008186] pci 0000:00:05.4: Adding to iommu group 12
[   22.009968] pci 0000:00:11.0: Adding to iommu group 13
[   22.011815] pci 0000:00:1a.0: Adding to iommu group 14
[   22.013605] pci 0000:00:1c.0: Adding to iommu group 15
[   22.015408] pci 0000:00:1c.7: Adding to iommu group 16
[   22.017216] pci 0000:00:1d.0: Adding to iommu group 17
[   22.018991] pci 0000:00:1e.0: Adding to iommu group 18
[   22.021826] pci 0000:00:1e.0: Using iommu dma mapping
[   22.023783] pci 0000:00:1f.0: Adding to iommu group 19
[   22.025667] pci 0000:00:1f.2: Adding to iommu group 19
[   22.346001] pci 0000:03:00.0: Adding to iommu group 20
[   22.348727] pci 0000:03:00.0: Using iommu dma mapping
[   22.350644] pci 0000:03:00.1: Adding to iommu group 20
[   22.352833] pci 0000:03:00.2: Adding to iommu group 20
[...]

It seems that the iommu PCI bus probe didn't enumerate devices [01:00.2] and
[02:00.0], so the corresponding context entries were not set up. Hence DMA
faults are generated when these devices access memory.

Do these two devices show up in "lspci" output? How do these devices get
enumerated by the system?

Best regards,
baolu



They are there in the output, but it seems out of order:

[   22.025667] pci 0000:00:1f.2: Adding to iommu group 19
[   22.028569] pci 0000:00:1f.2: DMAR: Setting identity map [0xe8000 - 0xe8fff]
[   22.331183] pci 0000:00:1f.2: DMAR: Setting identity map [0xf4000 - 0xf4fff]
[   22.333546] pci 0000:00:1f.2: DMAR: Setting identity map [0xbdf6e000 - 0xbdf6efff]
[   22.336099] pci 0000:00:1f.2: DMAR: Setting identity map [0xbdf6f000 - 0xbdf7efff]
[   22.338604] pci 0000:00:1f.2: DMAR: Setting identity map [0xbdf7f000 - 0xbdf82fff]
[   22.341189] pci 0000:00:1f.2: DMAR: Setting identity map [0xbdf83000 - 0xbdf84fff]
[   22.343700] pci 0000:00:1f.2: DMAR: Device uses a private dma domain.
[   22.346001] pci 0000:03:00.0: Adding to iommu group 20
[   22.348727] pci 0000:03:00.0: Using iommu dma mapping
[   22.350644] pci 0000:03:00.1: Adding to iommu group 20
[   22.352833] pci 0000:03:00.2: Adding to iommu group 20
[   22.354619] pci 0000:03:00.3: Adding to iommu group 20

[   22.356423] pci 0000:02:00.0: Adding to iommu group 21
[   22.358999] pci 0000:02:00.0: Using iommu dma mapping

[   22.360785] pci 0000:04:00.0: Adding to iommu group 22
[   22.362623] pci 0000:05:02.0: Adding to iommu group 23
[   22.364412] pci 0000:05:04.0: Adding to iommu group 24
[   22.366172] pci 0000:06:00.0: Adding to iommu group 23
[   22.368762] pci 0000:06:00.0: DMAR: Setting identity map [0xe8000 - 0xe8fff]
[   22.371290] pci 0000:06:00.0: DMAR: Setting identity map [0xf4000 - 0xf4fff]
[   22.373646] pci 0000:06:00.0: DMAR: Setting identity map [0xbdf6e000 - 0xbdf6efff]
[   22.876042] pci 0000:06:00.0: DMAR: Setting identity map [0xbdf6f000 - 0xbdf7efff]
[   22.878572] pci 0000:06:00.0: DMAR: Setting identity map [0xbdf7f000 - 0xbdf82fff]
[   22.881167] pci 0000:06:00.0: DMAR: Setting identity map [0xbdf83000 - 0xbdf84fff]
[   22.883729] pci 0000:06:00.0: DMAR: Device uses a private dma domain.
[   22.885899] pci 0000:06:00.1: Adding to iommu group 23
[   22.888675] pci 0000:06:00.1: DMAR: Setting identity map [0xe8000 - 0xe8fff]
[   22.891216] pci 0000:06:00.1: DMAR: Setting identity map [0xf4000 - 0xf4fff]
[   22.893576] pci 0000:06:00.1: DMAR: Setting identity map [0xbdf6e000 - 0xbdf6efff]
[   22.896119] pci 0000:06:00.1: DMAR: Setting identity map [0xbdf6f000 - 0xbdf7efff]
[   22.898620] pci 0000:06:00.1: DMAR: Setting identity map [0xbdf7f000 - 0xbdf82fff]
[   22.901232] pci 0000:06:00.1: DMAR: Setting identity map [0xbdf83000 - 0xbdf84fff]

Re: dmar pte read access not set error messages on hp dl388 gen8 systems

2019-12-04 Thread Lu Baolu

Hi,

On 12/5/19 4:53 AM, Jerry Snitselaar wrote:

Attaching console output (can't get to a point to actually log in) and
config that is used to build that kernel.


[...]
[   21.969477] pci 0000:00:00.0: Adding to iommu group 0
[   21.971390] pci 0000:00:01.0: Adding to iommu group 1
[   21.973173] pci 0000:00:01.1: Adding to iommu group 2
[   21.974930] pci 0000:00:02.0: Adding to iommu group 3
[   21.976672] pci 0000:00:02.1: Adding to iommu group 4
[   21.978446] pci 0000:00:02.2: Adding to iommu group 5
[   21.980224] pci 0000:00:02.3: Adding to iommu group 6
[   21.982096] pci 0000:00:03.0: Adding to iommu group 7
[   21.983868] pci 0000:00:03.1: Adding to iommu group 8
[   21.985644] pci 0000:00:03.2: Adding to iommu group 9
[   21.987484] pci 0000:00:03.3: Adding to iommu group 10
[   21.989830] pci 0000:00:04.0: Adding to iommu group 11
[   21.991738] pci 0000:00:04.1: Adding to iommu group 11
[   21.993557] pci 0000:00:04.2: Adding to iommu group 11
[   21.995360] pci 0000:00:04.3: Adding to iommu group 11
[   21.997145] pci 0000:00:04.4: Adding to iommu group 11
[   21.998915] pci 0000:00:04.5: Adding to iommu group 11
[   22.000694] pci 0000:00:04.6: Adding to iommu group 11
[   22.002569] pci 0000:00:04.7: Adding to iommu group 11
[   22.004556] pci 0000:00:05.0: Adding to iommu group 12
[   22.006388] pci 0000:00:05.2: Adding to iommu group 12
[   22.008186] pci 0000:00:05.4: Adding to iommu group 12
[   22.009968] pci 0000:00:11.0: Adding to iommu group 13
[   22.011815] pci 0000:00:1a.0: Adding to iommu group 14
[   22.013605] pci 0000:00:1c.0: Adding to iommu group 15
[   22.015408] pci 0000:00:1c.7: Adding to iommu group 16
[   22.017216] pci 0000:00:1d.0: Adding to iommu group 17
[   22.018991] pci 0000:00:1e.0: Adding to iommu group 18
[   22.021826] pci 0000:00:1e.0: Using iommu dma mapping
[   22.023783] pci 0000:00:1f.0: Adding to iommu group 19
[   22.025667] pci 0000:00:1f.2: Adding to iommu group 19
[   22.346001] pci 0000:03:00.0: Adding to iommu group 20
[   22.348727] pci 0000:03:00.0: Using iommu dma mapping
[   22.350644] pci 0000:03:00.1: Adding to iommu group 20
[   22.352833] pci 0000:03:00.2: Adding to iommu group 20
[...]

It seems that the iommu PCI bus probe didn't enumerate devices [01:00.2] and
[02:00.0], so the corresponding context entries were not set up. Hence DMA
faults are generated when these devices access memory.

Do these two devices show up in "lspci" output? How do these devices get
enumerated by the system?

Best regards,
baolu
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v4 7/8] linux/log2.h: Fix 64bit calculations in roundup/down_pow_two()

2019-12-04 Thread Lu Baolu

Hi,

On 12/3/19 7:47 PM, Nicolas Saenz Julienne wrote:

Some users need to make sure their rounding function accepts and returns
64bit variables regardless of the architecture. Sadly,
roundup/rounddown_pow_of_two() take and return unsigned longs. It turns
out ilog2() already handles 32/64bit calculations properly and, being
the building block of the round functions, we can rework them as
wrappers around it.

Suggested-by: Robin Murphy 
Signed-off-by: Nicolas Saenz Julienne 
---
  drivers/clk/clk-divider.c|  8 ++--
  drivers/clk/sunxi/clk-sunxi.c|  2 +-
  drivers/infiniband/hw/hfi1/chip.c|  4 +-
  drivers/infiniband/hw/hfi1/init.c|  4 +-
  drivers/infiniband/hw/mlx4/srq.c |  2 +-
  drivers/infiniband/hw/mthca/mthca_srq.c  |  2 +-
  drivers/infiniband/sw/rxe/rxe_qp.c   |  4 +-
  drivers/iommu/intel-iommu.c  |  4 +-
  drivers/iommu/intel-svm.c|  4 +-
  drivers/iommu/intel_irq_remapping.c  |  2 +-


For changes in drivers/iommu/intel*.c,

Reviewed-by: Lu Baolu 

Best regards,
baolu


  drivers/net/ethernet/amd/xgbe/xgbe-ethtool.c |  4 +-
  drivers/net/ethernet/marvell/sky2.c  |  2 +-
  drivers/net/ethernet/rocker/rocker_hw.h  |  4 +-
  drivers/net/ethernet/sfc/ef10.c  |  2 +-
  drivers/net/ethernet/sfc/efx.h   |  2 +-
  drivers/net/ethernet/sfc/falcon/efx.h|  2 +-
  drivers/pci/msi.c|  2 +-
  include/linux/log2.h | 44 +---
  kernel/kexec_core.c  |  3 +-
  lib/rhashtable.c |  2 +-
  net/sunrpc/xprtrdma/verbs.c  |  2 +-
  21 files changed, 41 insertions(+), 64 deletions(-)

diff --git a/drivers/clk/clk-divider.c b/drivers/clk/clk-divider.c
index 098b2b01f0af..ba947e4c8193 100644
--- a/drivers/clk/clk-divider.c
+++ b/drivers/clk/clk-divider.c
@@ -222,7 +222,7 @@ static int _div_round_up(const struct clk_div_table *table,
int div = DIV_ROUND_UP_ULL((u64)parent_rate, rate);
  
  	if (flags & CLK_DIVIDER_POWER_OF_TWO)

-   div = __roundup_pow_of_two(div);
+   div = roundup_pow_of_two(div);
if (table)
div = _round_up_table(table, div);
  
@@ -240,8 +240,8 @@ static int _div_round_closest(const struct clk_div_table *table,

down = parent_rate / rate;
  
  	if (flags & CLK_DIVIDER_POWER_OF_TWO) {

-   up = __roundup_pow_of_two(up);
-   down = __rounddown_pow_of_two(down);
+   up = roundup_pow_of_two(up);
+   down = rounddown_pow_of_two(down);
} else if (table) {
up = _round_up_table(table, up);
down = _round_down_table(table, down);
@@ -278,7 +278,7 @@ static int _next_div(const struct clk_div_table *table, int 
div,
div++;
  
  	if (flags & CLK_DIVIDER_POWER_OF_TWO)

-   return __roundup_pow_of_two(div);
+   return roundup_pow_of_two(div);
if (table)
return _round_up_table(table, div);
  
diff --git a/drivers/clk/sunxi/clk-sunxi.c b/drivers/clk/sunxi/clk-sunxi.c

index 27201fd26e44..faec99dc09c0 100644
--- a/drivers/clk/sunxi/clk-sunxi.c
+++ b/drivers/clk/sunxi/clk-sunxi.c
@@ -311,7 +311,7 @@ static void sun6i_get_ahb1_factors(struct factors_request 
*req)
  
  		calcm = DIV_ROUND_UP(div, 1 << calcp);

} else {
-   calcp = __roundup_pow_of_two(div);
+   calcp = roundup_pow_of_two(div);
calcp = calcp > 3 ? 3 : calcp;
}
  
diff --git a/drivers/infiniband/hw/hfi1/chip.c b/drivers/infiniband/hw/hfi1/chip.c

index 9b1fb84a3d45..96b1d343c32f 100644
--- a/drivers/infiniband/hw/hfi1/chip.c
+++ b/drivers/infiniband/hw/hfi1/chip.c
@@ -14199,10 +14199,10 @@ static int qos_rmt_entries(struct hfi1_devdata *dd, 
unsigned int *mp,
max_by_vl = krcvqs[i];
if (max_by_vl > 32)
goto no_qos;
-   m = ilog2(__roundup_pow_of_two(max_by_vl));
+   m = ilog2(roundup_pow_of_two(max_by_vl));
  
  	/* determine bits for vl */

-   n = ilog2(__roundup_pow_of_two(num_vls));
+   n = ilog2(roundup_pow_of_two(num_vls));
  
  	/* reject if too much is used */

if ((m + n) > 7)
diff --git a/drivers/infiniband/hw/hfi1/init.c 
b/drivers/infiniband/hw/hfi1/init.c
index 26b792bb1027..838c789c7cce 100644
--- a/drivers/infiniband/hw/hfi1/init.c
+++ b/drivers/infiniband/hw/hfi1/init.c
@@ -467,7 +467,7 @@ int hfi1_create_ctxtdata(struct hfi1_pportdata *ppd, int 
numa,
 * MTU supported.
 */
if (rcd->egrbufs.size < hfi1_max_mtu) {
-   rcd->egrbufs.size = __roundup_pow_of_two(hfi1_max_mtu);
+   rcd->egrbufs.size = roundup_pow_of_two(hfi1_max_mtu);
hfi1_cdbg(PROC,
  "ctxt%u: eager bufs 

Re: [PATCH v2 1/8] dt-bindings: arm-smmu: Add Adreno GPU variant

2019-12-04 Thread Rob Clark
On Wed, Dec 4, 2019 at 7:56 AM Robin Murphy  wrote:
>
> On 22/11/2019 11:31 pm, Jordan Crouse wrote:
> > Add a compatible string to identify SMMUs that are attached
> > to Adreno GPU devices that wish to support split pagetables.
>
> A software policy decision is not, in itself, a good justification for a
> DT property. Is the GPU SMMU fundamentally different in hardware* from
> the other SMMU(s) in any given SoC?

The GPU CP has some sort of mechanism to switch pagetables.. although
I guess under the firmware it is all the same.  Jordan should know
better..

BR,
-R

> (* where "hardware" may encompass hypervisor shenanigans)
>
> > Signed-off-by: Jordan Crouse 
> > ---
> >
> >   Documentation/devicetree/bindings/iommu/arm,smmu.yaml | 6 ++
> >   1 file changed, 6 insertions(+)
> >
> > diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml 
> > b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
> > index 6515dbe..db9f826 100644
> > --- a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
> > +++ b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
> > @@ -31,6 +31,12 @@ properties:
> > - qcom,sdm845-smmu-v2
> > - const: qcom,smmu-v2
> >
> > +  - description: Qcom Adreno GPU SMMU implementing split pagetables
> > +items:
> > +  - enum:
> > +  - qcom,adreno-smmu-v2
> > +  - const: qcom,smmu-v2
>
> Given that we already have per-SoC compatibles for Qcom SMMUs in
> general, this seems suspiciously vague.
>
> Robin.
>
> > +
> > - description: Qcom SoCs implementing "arm,mmu-500"
> >   items:
> > - enum:
> >
> ___
> iommu mailing list
> iommu@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/iommu
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 0/5] iommu/vt-d: Consolidate various cache flush ops

2019-12-04 Thread Jacob Pan
On Wed, 4 Dec 2019 08:32:17 +0800
Lu Baolu  wrote:

> Hi Jacob,
> 
> On 12/4/19 12:50 AM, Jacob Pan wrote:
> > On Tue, 3 Dec 2019 10:44:45 +0800
> > Lu Baolu  wrote:
> >   
> >> Hi Jacob,
> >>
> >> On 12/3/19 4:02 AM, Jacob Pan wrote:  
> >>> On Fri, 22 Nov 2019 11:04:44 +0800
> >>> Lu Baolu  wrote:
> >>>  
>  Intel VT-d 3.0 introduces more caches and interfaces for software
>  to flush when it runs in the scalable mode. Currently various
>  cache flush helpers are scattered around. This consolidates them
>  by putting them in the existing iommu_flush structure.
> 
>  /* struct iommu_flush - Intel IOMMU cache invalidation ops
> *
> * @cc_inv: invalidate context cache
> * @iotlb_inv: Invalidate IOTLB and paging structure caches
>  when software
> * has changed second-level tables.
> * @p_iotlb_inv: Invalidate IOTLB and paging structure caches
>  when software
> *   has changed first-level tables.
> * @pc_inv: invalidate pasid cache
> * @dev_tlb_inv: invalidate cached mappings used by
>  requests-without-PASID
> *   from the Device-TLB on an endpoint device.
> * @p_dev_tlb_inv: invalidate cached mappings used by
>  requests-with-PASID
> * from the Device-TLB on an endpoint device
> */
>  struct iommu_flush {
>    void (*cc_inv)(struct intel_iommu *iommu, u16 did,
>   u16 sid, u8 fm, u64 type);
>    void (*iotlb_inv)(struct intel_iommu *iommu, u16 did,
>  u64 addr, unsigned int size_order, u64 type);
>    void (*p_iotlb_inv)(struct intel_iommu *iommu, u16 did,
>  u32 pasid, u64 addr, unsigned long npages, bool ih);
>    void (*pc_inv)(struct intel_iommu *iommu, u16 did, u32
>  pasid, u64 granu);
>    void (*dev_tlb_inv)(struct intel_iommu *iommu, u16 sid,
>  u16 pfsid, u16 qdep, u64 addr, unsigned int mask);
>    void (*p_dev_tlb_inv)(struct intel_iommu *iommu, u16
>  sid, u16 pfsid, u32 pasid, u16 qdep, u64 addr,
>  unsigned long npages);
>  };
> 
>  The name of each cache flush op is defined according to the spec
>  section 6.5 so that people can easily look them up in the spec.
>  
> >>> Nice consolidation. For nested SVM, I also introduced cache
> >>> flushed helpers as needed.
> >>> https://lkml.org/lkml/2019/10/24/857
> >>>
> >>> Should I wait for yours to be merged or you want to extend the
> >>> this consolidation after SVA/SVM cache flush? I expect to send my
> >>> v8 shortly.  
> >>
> >> Please base your v8 patch on this series. So it could get more
> >> chances for test.
> >>  
> > Sounds good.  
> 
> I am sorry I need to spend more time on this patch series. Please go
> ahead without it.
> 
NP, let me know when you need testing.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH] powerpc: ensure that swiotlb buffer is allocated from low memory

2019-12-04 Thread Mike Rapoport
From: Mike Rapoport 

Some powerpc platforms (e.g. 85xx) limit DMA-able memory way below 4G. If a
system has more physical memory than this limit, the swiotlb buffer is not
addressable because it is allocated from memblock using top-down mode.

Force memblock to bottom-up mode before calling swiotlb_init() to ensure
that the swiotlb buffer is DMA-able.

Link: https://lkml.kernel.org/r/f1ebb706-73df-430e-9020-c214ec8ed...@xenosoft.de
Reported-by: Christian Zigotzky 
Signed-off-by: Mike Rapoport 
Cc: Benjamin Herrenschmidt 
Cc: Christoph Hellwig 
Cc: Darren Stevens 
Cc: mad skateman 
Cc: Michael Ellerman 
Cc: Nicolas Saenz Julienne 
Cc: Paul Mackerras 
Cc: Robin Murphy 
Cc: Rob Herring 
---
 arch/powerpc/mm/mem.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index be941d382c8d..14c2c53e3f9e 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -260,6 +260,14 @@ void __init mem_init(void)
BUILD_BUG_ON(MMU_PAGE_COUNT > 16);
 
 #ifdef CONFIG_SWIOTLB
+   /*
+* Some platforms (e.g. 85xx) limit DMA-able memory way below
+* 4G. We force memblock to bottom-up mode to ensure that the
+* memory allocated in swiotlb_init() is DMA-able.
+* As it's the last memblock allocation, no need to reset it
+* back to top-down.
+*/
+   memblock_set_bottom_up(true);
swiotlb_init(0);
 #endif
 
-- 
2.24.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v4 7/8] linux/log2.h: Fix 64bit calculations in roundup/down_pow_two()

2019-12-04 Thread Martin Habets
For the changes under drivers/net/ethernet/sfc:
Reviewed-by: Martin Habets 

On 03/12/2019 11:47, Nicolas Saenz Julienne wrote:
> Some users need to make sure their rounding function accepts and returns
> 64bit variables regardless of the architecture. Sadly,
> roundup/rounddown_pow_of_two() take and return unsigned longs. It turns
> out ilog2() already handles 32/64bit calculations properly and, being
> the building block of the round functions, we can rework them as
> wrappers around it.
> 
> Suggested-by: Robin Murphy 
> Signed-off-by: Nicolas Saenz Julienne 
> ---
>  drivers/clk/clk-divider.c|  8 ++--
>  drivers/clk/sunxi/clk-sunxi.c|  2 +-
>  drivers/infiniband/hw/hfi1/chip.c|  4 +-
>  drivers/infiniband/hw/hfi1/init.c|  4 +-
>  drivers/infiniband/hw/mlx4/srq.c |  2 +-
>  drivers/infiniband/hw/mthca/mthca_srq.c  |  2 +-
>  drivers/infiniband/sw/rxe/rxe_qp.c   |  4 +-
>  drivers/iommu/intel-iommu.c  |  4 +-
>  drivers/iommu/intel-svm.c|  4 +-
>  drivers/iommu/intel_irq_remapping.c  |  2 +-
>  drivers/net/ethernet/amd/xgbe/xgbe-ethtool.c |  4 +-
>  drivers/net/ethernet/marvell/sky2.c  |  2 +-
>  drivers/net/ethernet/rocker/rocker_hw.h  |  4 +-
>  drivers/net/ethernet/sfc/ef10.c  |  2 +-
>  drivers/net/ethernet/sfc/efx.h   |  2 +-
>  drivers/net/ethernet/sfc/falcon/efx.h|  2 +-
>  drivers/pci/msi.c|  2 +-
>  include/linux/log2.h | 44 +---
>  kernel/kexec_core.c  |  3 +-
>  lib/rhashtable.c |  2 +-
>  net/sunrpc/xprtrdma/verbs.c  |  2 +-
>  21 files changed, 41 insertions(+), 64 deletions(-)
> 
> diff --git a/drivers/clk/clk-divider.c b/drivers/clk/clk-divider.c
> index 098b2b01f0af..ba947e4c8193 100644
> --- a/drivers/clk/clk-divider.c
> +++ b/drivers/clk/clk-divider.c
> @@ -222,7 +222,7 @@ static int _div_round_up(const struct clk_div_table 
> *table,
>   int div = DIV_ROUND_UP_ULL((u64)parent_rate, rate);
>  
>   if (flags & CLK_DIVIDER_POWER_OF_TWO)
> - div = __roundup_pow_of_two(div);
> + div = roundup_pow_of_two(div);
>   if (table)
>   div = _round_up_table(table, div);
>  
> @@ -240,8 +240,8 @@ static int _div_round_closest(const struct clk_div_table 
> *table,
>   down = parent_rate / rate;
>  
>   if (flags & CLK_DIVIDER_POWER_OF_TWO) {
> - up = __roundup_pow_of_two(up);
> - down = __rounddown_pow_of_two(down);
> + up = roundup_pow_of_two(up);
> + down = rounddown_pow_of_two(down);
>   } else if (table) {
>   up = _round_up_table(table, up);
>   down = _round_down_table(table, down);
> @@ -278,7 +278,7 @@ static int _next_div(const struct clk_div_table *table, 
> int div,
>   div++;
>  
>   if (flags & CLK_DIVIDER_POWER_OF_TWO)
> - return __roundup_pow_of_two(div);
> + return roundup_pow_of_two(div);
>   if (table)
>   return _round_up_table(table, div);
>  
> diff --git a/drivers/clk/sunxi/clk-sunxi.c b/drivers/clk/sunxi/clk-sunxi.c
> index 27201fd26e44..faec99dc09c0 100644
> --- a/drivers/clk/sunxi/clk-sunxi.c
> +++ b/drivers/clk/sunxi/clk-sunxi.c
> @@ -311,7 +311,7 @@ static void sun6i_get_ahb1_factors(struct factors_request 
> *req)
>  
>   calcm = DIV_ROUND_UP(div, 1 << calcp);
>   } else {
> - calcp = __roundup_pow_of_two(div);
> + calcp = roundup_pow_of_two(div);
>   calcp = calcp > 3 ? 3 : calcp;
>   }
>  
> diff --git a/drivers/infiniband/hw/hfi1/chip.c 
> b/drivers/infiniband/hw/hfi1/chip.c
> index 9b1fb84a3d45..96b1d343c32f 100644
> --- a/drivers/infiniband/hw/hfi1/chip.c
> +++ b/drivers/infiniband/hw/hfi1/chip.c
> @@ -14199,10 +14199,10 @@ static int qos_rmt_entries(struct hfi1_devdata *dd, 
> unsigned int *mp,
>   max_by_vl = krcvqs[i];
>   if (max_by_vl > 32)
>   goto no_qos;
> - m = ilog2(__roundup_pow_of_two(max_by_vl));
> + m = ilog2(roundup_pow_of_two(max_by_vl));
>  
>   /* determine bits for vl */
> - n = ilog2(__roundup_pow_of_two(num_vls));
> + n = ilog2(roundup_pow_of_two(num_vls));
>  
>   /* reject if too much is used */
>   if ((m + n) > 7)
> diff --git a/drivers/infiniband/hw/hfi1/init.c 
> b/drivers/infiniband/hw/hfi1/init.c
> index 26b792bb1027..838c789c7cce 100644
> --- a/drivers/infiniband/hw/hfi1/init.c
> +++ b/drivers/infiniband/hw/hfi1/init.c
> @@ -467,7 +467,7 @@ int hfi1_create_ctxtdata(struct hfi1_pportdata *ppd, int 
> numa,
>* MTU supported.
>*/
>   if (rcd->egrbufs.size < hfi1_max_mtu) {
> - rcd->egrbufs.size = __roundup_pow_of_two(hfi1_max_mtu);
> + rcd->egrbufs.size = 

Re: [PATCH v2 4/8] iommu/arm-smmu: Add split pagetables for Adreno IOMMU implementations

2019-12-04 Thread Robin Murphy

On 22/11/2019 11:31 pm, Jordan Crouse wrote:

Add implementation specific support to enable split pagetables for
SMMU implementations attached to Adreno GPUs on Qualcomm targets.

To enable split pagetables the driver will set an attribute on the domain.
If conditions are correct, it sets up the hardware to support equally sized
TTBR0 and TTBR1 regions and programs the domain pagetable into TTBR1 to make
it available for global buffers, while allowing the GPU the chance to
switch TTBR0 at runtime for per-context pagetables.

After programming the context, the value of the domain attribute can be
queried to see if split pagetables were successfully programmed. The
domain geometry will be updated so that the caller can determine the
start of the region to generate correct virtual addresses.


Why is any of this in impl? It all looks like perfectly generic 
architectural TTBR1 setup to me. As long as DOMAIN_ATTR_SPLIT_TABLES is 
explicitly an opt-in for callers, I'm OK with them having to trust that 
SEP_UPSTREAM is good enough. Or, even better, make the value of 
DOMAIN_ATTR_SPLIT_TABLES not a boolean but the actual split point, where 
the default of 0 would logically mean "no split".



Signed-off-by: Jordan Crouse 
---

  drivers/iommu/arm-smmu-impl.c |  3 ++
  drivers/iommu/arm-smmu-qcom.c | 96 +++
  drivers/iommu/arm-smmu.c  | 41 ++
  drivers/iommu/arm-smmu.h  | 11 +
  4 files changed, 143 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/arm-smmu-impl.c b/drivers/iommu/arm-smmu-impl.c
index 33ed682..1e91231 100644
--- a/drivers/iommu/arm-smmu-impl.c
+++ b/drivers/iommu/arm-smmu-impl.c
@@ -174,5 +174,8 @@ struct arm_smmu_device *arm_smmu_impl_init(struct 
arm_smmu_device *smmu)
if (of_device_is_compatible(smmu->dev->of_node, "qcom,sdm845-smmu-500"))
return qcom_smmu_impl_init(smmu);
  
+	if (of_device_is_compatible(smmu->dev->of_node, "qcom,adreno-smmu-v2"))

+   return adreno_smmu_impl_init(smmu);
+
return smmu;
  }
diff --git a/drivers/iommu/arm-smmu-qcom.c b/drivers/iommu/arm-smmu-qcom.c
index 24c071c..6591e49 100644
--- a/drivers/iommu/arm-smmu-qcom.c
+++ b/drivers/iommu/arm-smmu-qcom.c
@@ -11,6 +11,102 @@ struct qcom_smmu {
struct arm_smmu_device smmu;
  };
  
+#define TG0_4K  0

+#define TG0_64K 1
+#define TG0_16K 2
+
+#define TG1_16K 1
+#define TG1_4K  2
+#define TG1_64K 3
+
+/*
+ * Set up split pagetables for Adreno SMMUs that will keep a static TTBR1 for
+ * global buffers and dynamically switch TTBR0 from the GPU for context 
specific
+ * pagetables.
+ */
+static int adreno_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain,
+   struct io_pgtable_cfg *pgtbl_cfg)
+{
+   struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
+   struct arm_smmu_cb *cb = &smmu_domain->smmu->cbs[cfg->cbndx];
+   u32 tcr, tg0;
+
+   /*
+* Return error if split pagetables are not enabled so that arm-smmu
+* do the default configuration
+*/
+   if (!(pgtbl_cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1))
+   return -EINVAL;
+
+   /* Get the bank configuration from the pagetable config */
+   tcr = arm_smmu_lpae_tcr(pgtbl_cfg) & 0xffff;


The intent is that arm_smmu_lpae_tcr() should inherently return the 
appropriate half of the TCR based on pgtable_cfg. It seems like a lot of 
this complexity stems from missing that; sorry if it was unclear.


Robin.


+
+   /*
+* The TCR configuration for TTBR0 and TTBR1 is (almost) identical so
+* just duplicate the T0 configuration and shift it
+*/
+   cb->tcr[0] = (tcr << 16) | tcr;
+
+   /*
+* The (almost) above refers to the granule size field which is
+* different for TTBR0 and TTBR1. With the TTBR1 quirk enabled,
+* io-pgtable-arm will write the T1 appropriate granule size for tg.
+* Translate the configuration from the T1 field to get the right value
+* for T0
+*/
+   if (pgtbl_cfg->arm_lpae_s1_cfg.tcr.tg == TG1_4K)
+   tg0 = TG0_4K;
+   else if (pgtbl_cfg->arm_lpae_s1_cfg.tcr.tg == TG1_16K)
+   tg0 = TG0_16K;
+   else
+   tg0 = TG0_64K;
+
+   /* clear and set the correct value for TG0  */
+   cb->tcr[0] &= ~TCR_TG0;
+   cb->tcr[0] |= FIELD_PREP(TCR_TG0, tg0);
+
+   /*
+* arm_smmu_lpae_tcr2 sets SEP_UPSTREAM which is always the appropriate
+* SEP for Adreno IOMMU
+*/
+   cb->tcr[1] = arm_smmu_lpae_tcr2(pgtbl_cfg);
+   cb->tcr[1] |= TCR2_AS;
+
+   /* TTBRs */
+   cb->ttbr[0] = FIELD_PREP(TTBRn_ASID, cfg->asid);
+   cb->ttbr[1] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr;
+   cb->ttbr[1] |= FIELD_PREP(TTBRn_ASID, cfg->asid);
+
+   /* MAIRs */
+   cb->mair[0] = pgtbl_cfg->arm_lpae_s1_cfg.mair;
+   cb->mair[1] = pgtbl_cfg->arm_lpae_s1_cfg.mair >> 32;
+
+   return 0;
+}
+
+static int 

RE: [PATCH v2] iommu/amd: Disable IOMMU on Stoney Ridge systems

2019-12-04 Thread Deucher, Alexander
> -Original Message-
> From: Deucher, Alexander
> Sent: Monday, December 2, 2019 11:37 AM
> To: Lucas Stach ; Kai-Heng Feng
> ; j...@8bytes.org; Koenig, Christian
> (christian.koe...@amd.com) 
> Cc: iommu@lists.linux-foundation.org; linux-ker...@vger.kernel.org
> Subject: RE: [PATCH v2] iommu/amd: Disable IOMMU on Stoney Ridge
> systems
> 
> > -Original Message-
> > From: Lucas Stach 
> > Sent: Sunday, December 1, 2019 7:43 AM
> > To: Kai-Heng Feng ; j...@8bytes.org
> > Cc: Deucher, Alexander ;
> > iommu@lists.linux-foundation.org; linux-ker...@vger.kernel.org
> > Subject: Re: [PATCH v2] iommu/amd: Disable IOMMU on Stoney Ridge
> > systems
> >
> > Am Freitag, den 29.11.2019, 22:21 +0800 schrieb Kai-Heng Feng:
> > > Serious screen flickering when Stoney Ridge outputs to a 4K monitor.
> > >
> > > According to Alex Deucher, IOMMU isn't enabled on Windows, so let's
> > > do the same here to avoid screen flickering on 4K monitor.
> >
> > This doesn't seem like a good solution, especially if there isn't a
> > method for the user to opt-out.  Some users might prefer having the
> > IOMMU support to 4K display output.
> >
> > But before using the big hammer of disabling or breaking one of those
> > features, we should take a look at what's the issue here. Screen
> > flickering caused by the IOMMU being active hints to the IOMMU not
> > being able to sustain the translation bandwidth required by the high-
> > bandwidth isochronous transfers caused by 4K scanout, most likely due
> > to insufficient TLB space.
> >
> > As far as I know the framebuffer memory for the display buffers is
> > located in stolen RAM, and thus contiguous in memory. I don't know the
> > details of the GPU integration on those APUs, but maybe there even is
> > a way to bypass the IOMMU for the stolen VRAM regions?
> >
> > If there isn't and all GPU traffic passes through the IOMMU when
> > active, we should check if the stolen RAM is mapped with hugepages on
> > the IOMMU side. All the stolen RAM can most likely be mapped with a
> > few hugepage mappings, which should reduce IOMMU TLB demand by a
> large margin.
> 
> There is no issue when we scan out of the carve-out region.  The issue occurs
> when we scan out of regular system memory (scatter/gather).  Many newer
> laptops have very small carve-out regions (e.g., 32 MB), so we have to use
> regular system pages to support multiple high resolution displays.  The
> problem is, the latency gets too high at some point when the IOMMU is
> involved.  Huge pages would probably help in this case, but I'm not sure if
> there is any way to guarantee that we get huge pages for system memory.  I
> guess we could use CMA or something like that.

Thomas recently sent out a patch set to add huge page support to ttm:
https://patchwork.freedesktop.org/series/70090/
We'd still need a way to guarantee huge pages for the display buffer.

Alex

> 
> Alex
> 
> >
> > Regards,
> > Lucas
> >
> > > Cc: Alex Deucher 
> > > Bug: https://gitlab.freedesktop.org/drm/amd/issues/961
> > > Signed-off-by: Kai-Heng Feng 
> > > ---
> > > v2:
> > > - Find Stoney graphics instead of host bridge.
> > >
> > >  drivers/iommu/amd_iommu_init.c | 13 -
> > >  1 file changed, 12 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/iommu/amd_iommu_init.c
> > > b/drivers/iommu/amd_iommu_init.c index 568c52317757..139aa6fdadda
> > > 100644
> > > --- a/drivers/iommu/amd_iommu_init.c
> > > +++ b/drivers/iommu/amd_iommu_init.c
> > > @@ -2516,6 +2516,7 @@ static int __init early_amd_iommu_init(void)
> > >   struct acpi_table_header *ivrs_base;
> > >   acpi_status status;
> > >   int i, remap_cache_sz, ret = 0;
> > > + u32 pci_id;
> > >
> > >   if (!amd_iommu_detected)
> > >   return -ENODEV;
> > > @@ -2603,6 +2604,16 @@ static int __init early_amd_iommu_init(void)
> > >   if (ret)
> > >   goto out;
> > >
> > > + /* Disable IOMMU if there's Stoney Ridge graphics */
> > > + for (i = 0; i < 32; i++) {
> > > + pci_id = read_pci_config(0, i, 0, 0);
> > > + if ((pci_id & 0xffff) == 0x1002 && (pci_id >> 16) == 0x98e4) {
> > > + pr_info("Disable IOMMU on Stoney Ridge\n");
> > > + amd_iommu_disabled = true;
> > > + break;
> > > + }
> > > + }
> > > +
> > >   /* Disable any previously enabled IOMMUs */
> > >   if (!is_kdump_kernel() || amd_iommu_disabled)
> > >   disable_iommus();
> > > @@ -2711,7 +2722,7 @@ static int __init state_next(void)
> > >   ret = early_amd_iommu_init();
> > >   init_state = ret ? IOMMU_INIT_ERROR :
> > IOMMU_ACPI_FINISHED;
> > >   if 

Re: [PATCH v2 1/8] dt-bindings: arm-smmu: Add Adreno GPU variant

2019-12-04 Thread Robin Murphy

On 22/11/2019 11:31 pm, Jordan Crouse wrote:

Add a compatible string to identify SMMUs that are attached
to Adreno GPU devices that wish to support split pagetables.


A software policy decision is not, in itself, a good justification for a 
DT property. Is the GPU SMMU fundamentally different in hardware* from 
the other SMMU(s) in any given SoC?


(* where "hardware" may encompass hypervisor shenanigans)


Signed-off-by: Jordan Crouse 
---

  Documentation/devicetree/bindings/iommu/arm,smmu.yaml | 6 ++
  1 file changed, 6 insertions(+)

diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml 
b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
index 6515dbe..db9f826 100644
--- a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
+++ b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
@@ -31,6 +31,12 @@ properties:
- qcom,sdm845-smmu-v2
- const: qcom,smmu-v2
  
+  - description: Qcom Adreno GPU SMMU implementing split pagetables

+items:
+  - enum:
+  - qcom,adreno-smmu-v2
+  - const: qcom,smmu-v2


Given that we already have per-SoC compatibles for Qcom SMMUs in 
general, this seems suspiciously vague.


Robin.


+
- description: Qcom SoCs implementing "arm,mmu-500"
  items:
- enum:


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] powerpc: ensure that swiotlb buffer is allocated from low memory

2019-12-04 Thread Christoph Hellwig
On Wed, Dec 04, 2019 at 02:35:24PM +0200, Mike Rapoport wrote:
> From: Mike Rapoport 
> 
> Some powerpc platforms (e.g. 85xx) limit DMA-able memory way below 4G. If a
> system has more physical memory than this limit, the swiotlb buffer is not
> addressable because it is allocated from memblock using top-down mode.
> 
> Force memblock to bottom-up mode before calling swiotlb_init() to ensure
> that the swiotlb buffer is DMA-able.
> 
> Link: 
> https://lkml.kernel.org/r/f1ebb706-73df-430e-9020-c214ec8ed...@xenosoft.de
> Reported-by: Christian Zigotzky 
> Signed-off-by: Mike Rapoport 

Looks good:

Reviewed-by: Christoph Hellwig 


[PATCH 2/2] dma-mapping: force unencrypted devices to always be addressing limited

2019-12-04 Thread Christoph Hellwig
Devices that are forced to DMA through swiotlb need to be treated as if
they are addressing limited.

Signed-off-by: Christoph Hellwig 
---
 include/linux/dma-direct.h | 1 +
 kernel/dma/direct.c| 8 ++--
 kernel/dma/mapping.c   | 3 +++
 3 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h
index 24b8684aa21d..83aac21434c6 100644
--- a/include/linux/dma-direct.h
+++ b/include/linux/dma-direct.h
@@ -85,4 +85,5 @@ int dma_direct_mmap(struct device *dev, struct vm_area_struct 
*vma,
void *cpu_addr, dma_addr_t dma_addr, size_t size,
unsigned long attrs);
 int dma_direct_supported(struct device *dev, u64 mask);
+bool dma_direct_addressing_limited(struct device *dev);
 #endif /* _LINUX_DMA_DIRECT_H */
diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index 6af7ae83c4ad..450f3abe5cb5 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -497,11 +497,15 @@ int dma_direct_supported(struct device *dev, u64 mask)
return mask >= __phys_to_dma(dev, min_mask);
 }
 
+bool dma_direct_addressing_limited(struct device *dev)
+{
+   return force_dma_unencrypted(dev) || swiotlb_force == SWIOTLB_FORCE;
+}
+
 size_t dma_direct_max_mapping_size(struct device *dev)
 {
/* If SWIOTLB is active, use its maximum mapping size */
-   if (is_swiotlb_active() &&
-   (dma_addressing_limited(dev) || swiotlb_force == SWIOTLB_FORCE))
+   if (is_swiotlb_active() && dma_addressing_limited(dev))
return swiotlb_max_mapping_size(dev);
return SIZE_MAX;
 }
diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
index 1dbe6d725962..ebc60633d89a 100644
--- a/kernel/dma/mapping.c
+++ b/kernel/dma/mapping.c
@@ -416,6 +416,9 @@ EXPORT_SYMBOL_GPL(dma_get_merge_boundary);
  */
 bool dma_addressing_limited(struct device *dev)
 {
+   if (dma_is_direct(get_dma_ops(dev)) &&
+   dma_direct_addressing_limited(dev))
+   return true;
return min_not_zero(dma_get_mask(dev), dev->bus_dma_limit) <
dma_get_required_mask(dev);
 }
-- 
2.20.1



make dma_addressing_limited work for memory encryption setups v2

2019-12-04 Thread Christoph Hellwig
Hi all,

this little series fixes dma_addressing_limited to return true for
systems that use bounce buffers due to memory encryption.

Changes since v1:
 - take SWIOTLB_FORCE into account


[PATCH 1/2] dma-mapping: move dma_addressing_limited out of line

2019-12-04 Thread Christoph Hellwig
This function isn't used in the fast path, and moving it out of line
will reduce include clutter with the next change.

Signed-off-by: Christoph Hellwig 
---
 include/linux/dma-mapping.h | 14 +-
 kernel/dma/mapping.c| 15 +++
 2 files changed, 16 insertions(+), 13 deletions(-)

diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 330ad58fbf4d..be0421e570b8 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -684,19 +684,7 @@ static inline int dma_coerce_mask_and_coherent(struct 
device *dev, u64 mask)
return dma_set_mask_and_coherent(dev, mask);
 }
 
-/**
- * dma_addressing_limited - return if the device is addressing limited
- * @dev:   device to check
- *
- * Return %true if the devices DMA mask is too small to address all memory in
- * the system, else %false.  Lack of addressing bits is the prime reason for
- * bounce buffering, but might not be the only one.
- */
-static inline bool dma_addressing_limited(struct device *dev)
-{
-   return min_not_zero(dma_get_mask(dev), dev->bus_dma_limit) <
-   dma_get_required_mask(dev);
-}
+bool dma_addressing_limited(struct device *dev);
 
 #ifdef CONFIG_ARCH_HAS_SETUP_DMA_OPS
 void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
index 12ff766ec1fa..1dbe6d725962 100644
--- a/kernel/dma/mapping.c
+++ b/kernel/dma/mapping.c
@@ -405,3 +405,18 @@ unsigned long dma_get_merge_boundary(struct device *dev)
return ops->get_merge_boundary(dev);
 }
 EXPORT_SYMBOL_GPL(dma_get_merge_boundary);
+
+/**
+ * dma_addressing_limited - return if the device is addressing limited
+ * @dev:   device to check
+ *
+ * Return %true if the devices DMA mask is too small to address all memory in
+ * the system, else %false.  Lack of addressing bits is the prime reason for
+ * bounce buffering, but might not be the only one.
+ */
+bool dma_addressing_limited(struct device *dev)
+{
+   return min_not_zero(dma_get_mask(dev), dev->bus_dma_limit) <
+   dma_get_required_mask(dev);
+}
+EXPORT_SYMBOL_GPL(dma_addressing_limited);
-- 
2.20.1



Re: Bug 205201 - Booting halts if Dawicontrol DC-2976 UW SCSI board installed, unless RAM size limited to 3500M

2019-12-04 Thread Christian Zigotzky
I think we have to wait for Roland’s test results with his SCSI PCI card.

Christian

Sent from my iPhone

> On 4. Dec 2019, at 09:56, Christoph Hellwig  wrote:
> 
>> On Wed, Nov 27, 2019 at 08:56:25AM +0200, Mike Rapoport wrote:
>>> On Tue, Nov 26, 2019 at 05:40:26PM +0100, Christoph Hellwig wrote:
>>>> On Tue, Nov 26, 2019 at 12:26:38PM +0100, Christian Zigotzky wrote:
>>>>> Hello Christoph,
>>>>>
>>>>> The PCI TV card works with your patch! I was able to patch your Git kernel
>>>>> with the patch above.
>>>>>
>>>>> I haven't found any error messages in the dmesg yet.
>>> 
>>> Thanks.  Unfortunately this is a bit of a hack as we need to set
>>> the mask based on runtime information like the magic FSL PCIe window.
>>> Let me try to draft something better up, and thanks already for testing
>>> this one!
>> 
>> Maybe we'll simply force bottom up allocation before calling
>> swiotlb_init()? Anyway, it's the last memblock allocation.
> 
> So I think we should go with this fix (plus a source code comment) for
> now.  Revamping the whole memory initialization is going to take a
> while, and this fix also is easily backportable.

Re: Bug 205201 - Booting halts if Dawicontrol DC-2976 UW SCSI board installed, unless RAM size limited to 3500M

2019-12-04 Thread Christoph Hellwig
On Wed, Nov 27, 2019 at 08:56:25AM +0200, Mike Rapoport wrote:
> On Tue, Nov 26, 2019 at 05:40:26PM +0100, Christoph Hellwig wrote:
> > On Tue, Nov 26, 2019 at 12:26:38PM +0100, Christian Zigotzky wrote:
> > > Hello Christoph,
> > >
> > > The PCI TV card works with your patch! I was able to patch your Git 
> > > kernel 
> > > with the patch above.
> > >
> > > I haven't found any error messages in the dmesg yet.
> > 
> > Thanks.  Unfortunately this is a bit of a hack as we need to set
> > the mask based on runtime information like the magic FSL PCIe window.
> > Let me try to draft something better up, and thanks already for testing
> > this one!
> 
> Maybe we'll simply force bottom up allocation before calling
> swiotlb_init()? Anyway, it's the last memblock allocation.

So I think we should go with this fix (plus a source code comment) for
now.  Revamping the whole memory initialization is going to take a
while, and this fix also is easily backportable.