Re: [PATCH v4 1/3] dt-bindings: memory: mediatek: Convert SMI to DT

2020-11-01 Thread Krzysztof Kozlowski
On Mon, 2 Nov 2020 at 06:31, Yong Wu  wrote:
>
> On Sat, 2020-10-31 at 12:36 +0100, Krzysztof Kozlowski wrote:
> > On Fri, Oct 30, 2020 at 05:12:52PM +0800, Yong Wu wrote:
> > > Convert MediaTek SMI to DT schema.
> > >
> > > CC: Fabien Parent 
> > > CC: Ming-Fan Chen 
> > > CC: Matthias Brugger 
> > > Signed-off-by: Yong Wu 
> > > ---
> > >  .../mediatek,smi-common.txt   |  50 ---
> > >  .../mediatek,smi-common.yaml  | 140 ++
> > >  .../memory-controllers/mediatek,smi-larb.txt  |  50 ---
> > >  .../memory-controllers/mediatek,smi-larb.yaml | 129 
> > >  4 files changed, 269 insertions(+), 100 deletions(-)
> > >  delete mode 100644 Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.txt
> >
> > +Cc Honghui Zhang,
>
> As I commented in [1], Honghui's address is no longer valid. I will act for him.
>
> >
> > Your Ack is needed as you contributed descriptions to the bindings and
> > work is being relicensed to GPL-2.0-only OR BSD-2-Clause.
>
> "GPL-2.0-only OR BSD-2-Clause" is required when we run check-patch.
>
> If I still use "GPL-2.0-only", then the contributors' Ack/SoB is not
> needed, right?

That would be one solution, but I was thinking of proceeding with only
your agreement. You were the main contributor to these files. Honghui
added a few descriptions. Other developers added only compatibles.
Since we cannot reach Honghui, I would assume that your agreement
(Sign-off) is enough.

Best regards,
Krzysztof
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v5 2/2] firmware: QCOM_SCM: Allow qcom_scm driver to be loadable as a permenent module

2020-11-01 Thread Kalle Valo
+ ath10k list

John Stultz  writes:

> Allow the qcom_scm driver to be loadable as a permanent module.
>
> This still uses the "depends on QCOM_SCM || !QCOM_SCM" bit to
> ensure that drivers that call into the qcom_scm driver are
> also built as modules. While not ideal in some cases, it's the
> only safe way I can find to avoid build errors without having
> those drivers select QCOM_SCM and have to force it on (as
> QCOM_SCM=n can be valid for those drivers).
>
> Cc: Catalin Marinas 
> Cc: Will Deacon 
> Cc: Andy Gross 
> Cc: Bjorn Andersson 
> Cc: Joerg Roedel 
> Cc: Thomas Gleixner 
> Cc: Jason Cooper 
> Cc: Marc Zyngier 
> Cc: Linus Walleij 
> Cc: Vinod Koul 
> Cc: Kalle Valo 
> Cc: Maulik Shah 
> Cc: Lina Iyer 
> Cc: Saravana Kannan 
> Cc: Todd Kjos 
> Cc: Greg Kroah-Hartman 
> Cc: linux-arm-...@vger.kernel.org
> Cc: iommu@lists.linux-foundation.org
> Cc: linux-g...@vger.kernel.org
> Acked-by: Greg Kroah-Hartman 
> Signed-off-by: John Stultz 
> ---
> v3:
> * Fix __arm_smccc_smc build issue reported by
>   kernel test robot 
> v4:
> * Add "depends on QCOM_SCM || !QCOM_SCM" bit to ath10k
>   config that requires it.
> v5:
> * Fix QCOM_QCM typo in Kconfig, it should be QCOM_SCM
> ---
>  drivers/firmware/Kconfig| 4 ++--
>  drivers/firmware/Makefile   | 3 ++-
>  drivers/firmware/qcom_scm.c | 4 
>  drivers/iommu/Kconfig   | 2 ++
>  drivers/net/wireless/ath/ath10k/Kconfig | 1 +
>  5 files changed, 11 insertions(+), 3 deletions(-)

For ath10k part:

Acked-by: Kalle Valo 

-- 
https://patchwork.kernel.org/project/linux-wireless/list/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches


Re: [GIT PULL] dma-mapping fix for 5.10

2020-11-01 Thread Christoph Hellwig
On Sat, Oct 31, 2020 at 12:50:44PM -0700, Linus Torvalds wrote:
> So this is just a stylistic nit, and has no impact on this pull (which
> I've done). But looking at the patch, it triggers one of my "this is
> wrong" patterns.

Adding the author and maintainer of that code so that they can sort it
out.

> 
> In particular, this:
> 
> u64 dma_start = 0;
> ...
> for (dma_start = ~0ULL; r->size; r++) {
> 
> is actually completely bogus in theory, and it's a horribly horribly
> bad pattern to have.
> 
> The thing that I hate about that pattern is "~0ULL", which is simply wrong.
> 
> The correct pattern for "all bits set" is ~0. NOTHING ELSE. No extra
> letters at the end.
> 
> Why? Because using an unsigned type is wrong, and will not extend the
> bits up to a potentially bigger size.
> 
> So adding that "ULL" is not just three extra characters to type, it
> actually _detracts_ from the code and makes it more fragile and
> potentially wrong.
> 
> It so happens, that yes, in the kernel, "ull" is 64-bit, and you get
> the right results. So the code _works_. But it's wrong, and it now
> requires that the types match exactly (ie it would now be broken if
> somebody ever were to say "I want to use 128-bit dma addresses and
> u128").
> 
> Another example is using "~0ul", which would give different results on
> a 32-bit kernel and a 64-bit kernel. Again: DON'T DO THAT.
> 
> I repeat: the right thing to do for "all bits set" is just a plain ~0
> or -1. Either of those are fine (technically assumes a 2's complement
> machine, but let's just be honest: that's a perfectly fine assumption,
> and -1 might be preferred by some because it makes that sign extension
> behavior of the integer constant more obvious).
> 
> Don't try to do anything clever or anything else, because it's going
> to be strictly worse.
> 
> The old code that that patch removed was "technically correct", but
> just pointless, and actually shows the problem:
> 
> for (dma_start = ~(dma_addr_t)0; r->size; r++) {
> 
> the above is indeed a correct way to say "I want all bits set in a
> dma_addr_t", but while correct, it is - once again - strictly inferior
> to just using "~0".
> 
> Why? Because "~0" works regardless of type. IOW, exactly *because*
> people used the wrong pattern for "all bits set", that patch was now
> (a) bigger than necessary and (b) much more likely to cause bugs (ie I
> could have imagined people changing just the type of the variable
> without changing the initialization).
> 
> So in that tiny three-line patch there were actually several examples
> of why "~0" is the right pattern to use for "all bits set". Because it
> JustWorks(tm) in ways other patterns do not.
> 
> And if you have a compiler that complains about assigning -1 or ~0 to
> an unsigned variable, get rid of that piece of garbage. You're almost
> certainly either using some warning flag that you shouldn't be using,
> or the compiler writer didn't know what they were doing.
> 
> Linus
---end quoted text---
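
The sign-extension argument in the quoted text can be checked with a small user-space sketch. This is not kernel code, and the function names are made up for illustration. It assumes a platform where unsigned int is narrower than 64 bits, and it uses ~0U rather than ~0ULL so that the truncation is visible with a standard 64-bit destination type.

```c
#include <limits.h>	/* UINT_MAX, used when checking the results */
#include <stdint.h>

/*
 * "~0" has type int and value -1. Converting -1 to a wider unsigned
 * type wraps modulo 2^N, so every bit of the destination ends up set,
 * whatever the destination width is.
 */
uint64_t all_ones_from_int(void)
{
	uint64_t v = ~0;

	return v;
}

/*
 * "~0U" has type unsigned int. Converting it to a wider type
 * zero-extends, so only the low bits are set and the result is
 * UINT_MAX, not "all bits set".
 */
uint64_t all_ones_from_unsigned(void)
{
	uint64_t v = ~0U;

	return v;
}
```

Here all_ones_from_int() equals UINT64_MAX, while all_ones_from_unsigned() equals UINT_MAX. The same effect would apply to "~0ULL" if the variable were ever widened beyond 64 bits.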


Re: [PATCH v4 1/3] dt-bindings: memory: mediatek: Convert SMI to DT

2020-11-01 Thread Yong Wu
On Sat, 2020-10-31 at 12:36 +0100, Krzysztof Kozlowski wrote:
> On Fri, Oct 30, 2020 at 05:12:52PM +0800, Yong Wu wrote:
> > Convert MediaTek SMI to DT schema.
> > 
> > CC: Fabien Parent 
> > CC: Ming-Fan Chen 
> > CC: Matthias Brugger 
> > Signed-off-by: Yong Wu 
> > ---
> >  .../mediatek,smi-common.txt   |  50 ---
> >  .../mediatek,smi-common.yaml  | 140 ++
> >  .../memory-controllers/mediatek,smi-larb.txt  |  50 ---
> >  .../memory-controllers/mediatek,smi-larb.yaml | 129 
> >  4 files changed, 269 insertions(+), 100 deletions(-)
> >  delete mode 100644 Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.txt
> 
> +Cc Honghui Zhang,

As I commented in [1], Honghui's address is no longer valid. I will act for him.

> 
> Your Ack is needed as you contributed descriptions to the bindings and
> work is being relicensed to GPL-2.0-only OR BSD-2-Clause.

"GPL-2.0-only OR BSD-2-Clause" is required when we run check-patch.

If I still use "GPL-2.0-only", then the contributors' Ack/SoB is not
needed, right?

[1]
https://lore.kernel.org/linux-iommu/1604051256.26323.100.camel@mhfsdcap03/T/#u

> 
> 
> Best regards,
> Krzysztof
> 
> 
> 
> 
> > >  create mode 100644 Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.yaml
> > >  delete mode 100644 Documentation/devicetree/bindings/memory-controllers/mediatek,smi-larb.txt
> > >  create mode 100644 Documentation/devicetree/bindings/memory-controllers/mediatek,smi-larb.yaml
> > > 
> > > diff --git a/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.txt b/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.txt
> > > deleted file mode 100644
> > > index dbafffe3f41e..
> > > --- a/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.txt
> > +++ /dev/null
> > @@ -1,50 +0,0 @@
> > -SMI (Smart Multimedia Interface) Common
> > -
> > -The hardware block diagram please check bindings/iommu/mediatek,iommu.txt
> > -
> > -Mediatek SMI have two generations of HW architecture, here is the list
> > -which generation the SoCs use:
> > -generation 1: mt2701 and mt7623.
> > -generation 2: mt2712, mt6779, mt8167, mt8173 and mt8183.
> > -
> > -There's slight differences between the two SMI, for generation 2, the
> > -register which control the iommu port is at each larb's register base. But
> > -for generation 1, the register is at smi ao base(smi always on register
> > -base). Besides that, the smi async clock should be prepared and enabled for
> > > -SMI generation 1 to transform the smi clock into emi clock domain, but that is
> > -not needed for SMI generation 2.
> > -
> > -Required properties:
> > -- compatible : must be one of :
> > -   "mediatek,mt2701-smi-common"
> > -   "mediatek,mt2712-smi-common"
> > -   "mediatek,mt6779-smi-common"
> > -   "mediatek,mt7623-smi-common", "mediatek,mt2701-smi-common"
> > -   "mediatek,mt8167-smi-common"
> > -   "mediatek,mt8173-smi-common"
> > -   "mediatek,mt8183-smi-common"
> > -- reg : the register and size of the SMI block.
> > -- power-domains : a phandle to the power domain of this local arbiter.
> > -- clocks : Must contain an entry for each entry in clock-names.
> > > -- clock-names : must contain 3 entries for generation 1 smi HW and 2 entries
> > -  for generation 2 smi HW as follows:
> > -  - "apb" : Advanced Peripheral Bus clock, It's the clock for setting
> > -   the register.
> > -  - "smi" : It's the clock for transfer data and command.
> > -   They may be the same if both source clocks are the same.
> > -  - "async" : asynchronous clock, it help transform the smi clock into the 
> > emi
> > - clock domain, this clock is only needed by generation 1 smi HW.
> > -  and these 2 option clocks for generation 2 smi HW:
> > -  - "gals0": the path0 clock of GALS(Global Async Local Sync).
> > -  - "gals1": the path1 clock of GALS(Global Async Local Sync).
> > -  Here is the list which has this GALS: mt6779 and mt8183.
> > -
> > -Example:
> > -   smi_common: smi@14022000 {
> > -   compatible = "mediatek,mt8173-smi-common";
> > -   reg = <0 0x14022000 0 0x1000>;
> > -   power-domains = < MT8173_POWER_DOMAIN_MM>;
> > -   clocks = < CLK_MM_SMI_COMMON>,
> > -< CLK_MM_SMI_COMMON>;
> > -   clock-names = "apb", "smi";
> > -   };
> > > diff --git a/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.yaml b/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.yaml
> > > new file mode 100644
> > > index ..e050a0c2aed6
> > > --- /dev/null
> > > +++ b/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.yaml
> > @@ -0,0 +1,140 @@
> > +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> > +# Copyright (c) 2020 MediaTek Inc.
> > +%YAML 1.2
> > +---
> > +$id: 
> > 

Re: [PATCH v3 00/14] iommu/amd: Add Generic IO Page Table Framework Support

2020-11-01 Thread Suravee Suthikulpanit

Joerg,

You asked me to remind you to pull this into linux-next.

Thanks,
Suravee

On 10/4/20 8:45 AM, Suravee Suthikulpanit wrote:

The framework allows callable implementation of IO page table.
This allows AMD IOMMU driver to switch between different types
of AMD IOMMU page tables (e.g. v1 vs. v2).

This series refactors the current implementation of AMD IOMMU v1 page table
to adopt the framework. There should be no functional change.
Subsequent series will introduce support for the AMD IOMMU v2 page table.

Thanks,
Suravee

Change from V2 (https://lore.kernel.org/lkml/835c0d46-ed96-9fbe-856a-777dcffac...@amd.com/T/#t)
   - Patch 2/14: Introduce helper function io_pgtable_cfg_to_data.
   - Patch 13/14: Put back the struct iommu_flush_ops since patch v2 would run into
     NULL pointer bug when calling free_io_pgtable_ops if not defined.

Change from V1 (https://lkml.org/lkml/2020/9/23/251)
   - Do not specify struct io_pgtable_cfg.coherent_walk, since it is
 not currently used. (per Robin)
   - Remove unused struct iommu_flush_ops.  (patch 2/13)
   - Move amd_iommu_setup_io_pgtable_ops to iommu.c instead of io_pgtable.c
     (patch 13/13)

Suravee Suthikulpanit (14):
   iommu/amd: Re-define amd_iommu_domain_encode_pgtable as inline
   iommu/amd: Prepare for generic IO page table framework
   iommu/amd: Move pt_root to struct amd_io_pgtable
   iommu/amd: Convert to using amd_io_pgtable
   iommu/amd: Declare functions as extern
   iommu/amd: Move IO page table related functions
   iommu/amd: Restructure code for freeing page table
   iommu/amd: Remove amd_iommu_domain_get_pgtable
   iommu/amd: Rename variables to be consistent with struct
 io_pgtable_ops
   iommu/amd: Refactor fetch_pte to use struct amd_io_pgtable
   iommu/amd: Introduce iommu_v1_iova_to_phys
   iommu/amd: Introduce iommu_v1_map_page and iommu_v1_unmap_page
   iommu/amd: Introduce IOMMU flush callbacks
   iommu/amd: Adopt IO page table framework

  drivers/iommu/amd/Kconfig   |   1 +
  drivers/iommu/amd/Makefile  |   2 +-
  drivers/iommu/amd/amd_iommu.h   |  22 +
  drivers/iommu/amd/amd_iommu_types.h |  43 +-
  drivers/iommu/amd/io_pgtable.c  | 564 
  drivers/iommu/amd/iommu.c   | 646 +++-
  drivers/iommu/io-pgtable.c  |   3 +
  include/linux/io-pgtable.h  |   2 +
  8 files changed, 691 insertions(+), 592 deletions(-)
  create mode 100644 drivers/iommu/amd/io_pgtable.c




Re: [PATCH v4 0/7] Convert the intel iommu driver to the dma-iommu api

2020-11-01 Thread Lu Baolu

Hi Tvrtko,

On 10/12/20 4:44 PM, Tvrtko Ursulin wrote:
>
> On 29/09/2020 01:11, Lu Baolu wrote:
> > Hi Tvrtko,
> >
> > On 9/28/20 5:44 PM, Tvrtko Ursulin wrote:
> > >
> > > On 27/09/2020 07:34, Lu Baolu wrote:
> > > > Hi,
> > > >
> > > > The previous post of this series could be found here.
> > > >
> > > > https://lore.kernel.org/linux-iommu/20200912032200.11489-1-baolu...@linux.intel.com/
> > > >
> > > > This version introduces a new patch [4/7] to fix an issue reported here.
> > > >
> > > > https://lore.kernel.org/linux-iommu/51a1baec-48d1-c0ac-181b-1fba92aa4...@linux.intel.com/
> > > >
> > > > There aren't any other changes.
> > > >
> > > > Please help to test and review.
> > > >
> > > > Best regards,
> > > > baolu
> > > >
> > > > Lu Baolu (3):
> > > >    iommu: Add quirk for Intel graphic devices in map_sg
> > >
> > > Since I do have patches to fix i915 to handle this, do we want to
> > > co-ordinate the two and avoid having to add this quirk and then later
> > > remove it? Or you want to go the staged approach?
> >
> > I have no preference. It depends on which patch goes first. Let the
> > maintainers help here.
>
> FYI we have merged the required i915 patches to our tree last week or
> so. I *think* this means they will go into 5.11. So the i915 specific
> workaround patch will not be needed in Intel IOMMU.

Do you mind telling me what's the status of this fix patch? I tried this
series on v5.10-rc1 with the graphic quirk patch dropped. I am still
seeing dma faults from graphic device.

Best regards,
baolu

> Regards,
>
> Tvrtko


Re: [PATCH v2 1/2] dma-mapping: add benchmark support for streaming DMA APIs

2020-11-01 Thread kernel test robot
Hi Barry,

I love your patch! Yet something to improve:

[auto build test ERROR on kselftest/next]
[also build test ERROR on linus/master v5.10-rc1]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Barry-Song/dma-mapping-provide-a-benchmark-for-streaming-DMA-mapping/20201101-182009
base:   https://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git next
config: h8300-allyesconfig (attached as .config)
compiler: h8300-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/b9abda38be7f32b9420c27b6c24eff2e69defa87
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Barry-Song/dma-mapping-provide-a-benchmark-for-streaming-DMA-mapping/20201101-182009
        git checkout b9abda38be7f32b9420c27b6c24eff2e69defa87
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=h8300

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

   h8300-linux-ld: kernel/dma/map_benchmark.o: in function `.L28':
   map_benchmark.c:(.text+0x283): undefined reference to `__udivdi3'
>> h8300-linux-ld: map_benchmark.c:(.text+0x2c1): undefined reference to `__udivdi3'
   h8300-linux-ld: map_benchmark.c:(.text+0x327): undefined reference to `__udivdi3'
   h8300-linux-ld: kernel/dma/map_benchmark.o: in function `.L26':
   map_benchmark.c:(.text+0x3d7): undefined reference to `__udivdi3'
   h8300-linux-ld: kernel/dma/map_benchmark.o: in function `.L44':
   map_benchmark.c:(.text+0x799): undefined reference to `__divdi3'
   h8300-linux-ld: kernel/dma/map_benchmark.o: in function `.L45':
   map_benchmark.c:(.text+0x7f5): undefined reference to `__divdi3'
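
For context: `__udivdi3` and `__divdi3` are libgcc helpers that GCC calls for 64-bit division on targets without a native 64-bit divide. The kernel is generally not linked against libgcc, so a plain `/` on 64-bit operands fails to link on such 32-bit targets. The usual remedy in kernel code is a helper from <linux/math64.h> such as div_u64(). The sketch below is only a user-space illustration of the kind of work such a helper does, using shift-and-subtract long division. The name div_u64_sketch and its remainder parameter are hypothetical, and this is not the kernel's implementation.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Divide a 64-bit value by a 32-bit divisor without using the 64-bit
 * "/" or "%" operators. Processes the dividend one bit at a time,
 * most significant bit first.
 */
uint64_t div_u64_sketch(uint64_t dividend, uint32_t divisor,
			uint32_t *remainder)
{
	uint64_t quotient = 0;
	uint64_t rem = 0;	/* always < divisor after each step */
	int i;

	assert(divisor != 0);

	for (i = 63; i >= 0; i--) {
		/* Bring down the next bit of the dividend. */
		rem = (rem << 1) | ((dividend >> i) & 1);
		if (rem >= divisor) {
			rem -= divisor;
			quotient |= (uint64_t)1 << i;
		}
	}

	*remainder = (uint32_t)rem;
	return quotient;
}
```

In the benchmark itself, the fix would presumably be to route the latency divisions through the kernel's own 64-bit division helpers rather than to open-code anything like this.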

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org



Re: [PATCH v2 1/2] dma-mapping: add benchmark support for streaming DMA APIs

2020-11-01 Thread kernel test robot
Hi Barry,

I love your patch! Yet something to improve:

[auto build test ERROR on kselftest/next]
[also build test ERROR on linus/master v5.10-rc1 next-20201030]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Barry-Song/dma-mapping-provide-a-benchmark-for-streaming-DMA-mapping/20201101-182009
base:   https://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git next
config: mips-allyesconfig (attached as .config)
compiler: mips-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/b9abda38be7f32b9420c27b6c24eff2e69defa87
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Barry-Song/dma-mapping-provide-a-benchmark-for-streaming-DMA-mapping/20201101-182009
        git checkout b9abda38be7f32b9420c27b6c24eff2e69defa87
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=mips

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

   arch/mips/kernel/head.o: in function `dtb_found':
   (.ref.text+0xe0): relocation truncated to fit: R_MIPS_26 against `start_kernel'
   init/main.o: in function `set_reset_devices':
   main.c:(.init.text+0x20): relocation truncated to fit: R_MIPS_26 against `_mcount'
   main.c:(.init.text+0x30): relocation truncated to fit: R_MIPS_26 against `__sanitizer_cov_trace_pc'
   init/main.o: in function `debug_kernel':
   main.c:(.init.text+0x9c): relocation truncated to fit: R_MIPS_26 against `_mcount'
   main.c:(.init.text+0xac): relocation truncated to fit: R_MIPS_26 against `__sanitizer_cov_trace_pc'
   init/main.o: in function `quiet_kernel':
   main.c:(.init.text+0x118): relocation truncated to fit: R_MIPS_26 against `_mcount'
   main.c:(.init.text+0x128): relocation truncated to fit: R_MIPS_26 against `__sanitizer_cov_trace_pc'
   init/main.o: in function `init_setup':
   main.c:(.init.text+0x1a4): relocation truncated to fit: R_MIPS_26 against `_mcount'
   main.c:(.init.text+0x1c8): relocation truncated to fit: R_MIPS_26 against `__sanitizer_cov_trace_pc'
   main.c:(.init.text+0x1e8): relocation truncated to fit: R_MIPS_26 against `__sanitizer_cov_trace_pc'
   main.c:(.init.text+0x1fc): additional relocation overflows omitted from the output
   mips-linux-ld: kernel/dma/map_benchmark.o: in function `map_benchmark_thread':
>> map_benchmark.c:(.text.map_benchmark_thread+0x1f4): undefined reference to `__divdi3'
>> mips-linux-ld: map_benchmark.c:(.text.map_benchmark_thread+0x218): undefined reference to `__divdi3'
   mips-linux-ld: kernel/dma/map_benchmark.o: in function `do_map_benchmark':
>> map_benchmark.c:(.text.do_map_benchmark+0x260): undefined reference to `__udivdi3'
>> mips-linux-ld: map_benchmark.c:(.text.do_map_benchmark+0x284): undefined reference to `__udivdi3'
   mips-linux-ld: map_benchmark.c:(.text.do_map_benchmark+0x2b4): undefined reference to `__udivdi3'
   mips-linux-ld: map_benchmark.c:(.text.do_map_benchmark+0x300): undefined reference to `__udivdi3'

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org



[PATCH v2 1/2] dma-mapping: add benchmark support for streaming DMA APIs

2020-11-01 Thread Barry Song
Nowadays, there are increasing requirements to benchmark the performance
of dma_map and dma_unmap, particularly while the device is attached to an
IOMMU.

This patch enables the support. Users can run specified number of threads
to do dma_map_page and dma_unmap_page on a specific NUMA node with the
specified duration. Then dma_map_benchmark will calculate the average
latency for map and unmap.
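
The averaging step can be sketched in user space as follows. This is a rough illustration, not the code from the patch: it assumes the benchmark accumulates a running sum of latencies, a running sum of squared latencies and a loop count, and derives the mean and the standard deviation from them. The names latency_summary, isqrt_u64 and summarize are hypothetical, and the exact arithmetic used in the patch may differ.

```c
#include <assert.h>
#include <stdint.h>

struct latency_summary {
	uint64_t avg;		/* mean latency, same unit as the samples */
	uint64_t stddev;	/* population standard deviation, rounded down */
};

/* Integer square root (rounded down), digit-by-digit method. */
static uint64_t isqrt_u64(uint64_t x)
{
	uint64_t result = 0;
	uint64_t bit = (uint64_t)1 << 62;

	while (bit > x)
		bit >>= 2;

	while (bit) {
		if (x >= result + bit) {
			x -= result + bit;
			result = (result >> 1) + bit;
		} else {
			result >>= 1;
		}
		bit >>= 2;
	}
	return result;
}

/* variance = E[x^2] - (E[x])^2, computed from the running sums */
struct latency_summary summarize(uint64_t sum, uint64_t sum_sq,
				 uint64_t loops)
{
	struct latency_summary s;
	uint64_t mean_sq, avg_sq;

	assert(loops > 0);

	s.avg = sum / loops;
	mean_sq = sum_sq / loops;
	avg_sq = s.avg * s.avg;
	/* Guard against inconsistent inputs that would underflow. */
	s.stddev = mean_sq > avg_sq ? isqrt_u64(mean_sq - avg_sq) : 0;
	return s;
}
```

For the samples 2, 4, 4, 4, 5, 5, 7, 9 (sum 40, sum of squares 232, 8 loops), summarize() yields a mean of 5 and a standard deviation of 2. Keeping only the three accumulators means that no per-sample storage is needed while the threads run.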

A difficulty for this benchmark is that dma_map/unmap APIs must run on
a particular device. Each device might have a different IOMMU or
non-IOMMU backend.

So we use the driver_override to bind dma_map_benchmark to a particular
device by:
For platform devices:
echo dma_map_benchmark > /sys/bus/platform/devices/xxx/driver_override
echo xxx > /sys/bus/platform/drivers/xxx/unbind
echo xxx > /sys/bus/platform/drivers/dma_map_benchmark/bind

For PCI devices:
echo dma_map_benchmark > /sys/bus/pci/devices/:00:01.0/driver_override
echo :00:01.0 > /sys/bus/pci/drivers/xxx/unbind
echo :00:01.0 > /sys/bus/pci/drivers/dma_map_benchmark/bind

Cc: Joerg Roedel 
Cc: Will Deacon 
Cc: Shuah Khan 
Cc: Christoph Hellwig 
Cc: Marek Szyprowski 
Cc: Robin Murphy 
Signed-off-by: Barry Song 
---
-v2:
  * add PCI support; v1 supported platform devices only
  * replace ssleep by msleep_interruptible() to permit users to exit
benchmark before it is completed
  * many changes according to Robin's suggestions, thanks! Robin
- add standard deviation output to reflect the worst case
- check users' parameters strictly like the number of threads
- make cache dirty before dma_map
- fix unpaired dma_map_page and dma_unmap_single;
- remove redundant "long long" before ktime_to_ns();
- use devm_add_action();
- wakeup all threads together after they are ready

 kernel/dma/Kconfig |   8 +
 kernel/dma/Makefile|   1 +
 kernel/dma/map_benchmark.c | 295 +
 3 files changed, 304 insertions(+)
 create mode 100644 kernel/dma/map_benchmark.c

diff --git a/kernel/dma/Kconfig b/kernel/dma/Kconfig
index c99de4a21458..949c53da5991 100644
--- a/kernel/dma/Kconfig
+++ b/kernel/dma/Kconfig
@@ -225,3 +225,11 @@ config DMA_API_DEBUG_SG
  is technically out-of-spec.
 
  If unsure, say N.
+
+config DMA_MAP_BENCHMARK
+   bool "Enable benchmarking of streaming DMA mapping"
+   help
+ Provides /sys/kernel/debug/dma_map_benchmark that helps with testing
+ performance of dma_(un)map_page.
+
+ See tools/testing/selftests/dma/dma_map_benchmark.c
diff --git a/kernel/dma/Makefile b/kernel/dma/Makefile
index dc755ab68aab..7aa6b26b1348 100644
--- a/kernel/dma/Makefile
+++ b/kernel/dma/Makefile
@@ -10,3 +10,4 @@ obj-$(CONFIG_DMA_API_DEBUG)   += debug.o
 obj-$(CONFIG_SWIOTLB)  += swiotlb.o
 obj-$(CONFIG_DMA_COHERENT_POOL)+= pool.o
 obj-$(CONFIG_DMA_REMAP)+= remap.o
+obj-$(CONFIG_DMA_MAP_BENCHMARK)+= map_benchmark.o
diff --git a/kernel/dma/map_benchmark.c b/kernel/dma/map_benchmark.c
new file mode 100644
index ..ac397758087b
--- /dev/null
+++ b/kernel/dma/map_benchmark.c
@@ -0,0 +1,295 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2020 Hisilicon Limited.
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define DMA_MAP_BENCHMARK   _IOWR('d', 1, struct map_benchmark)
+#define DMA_MAP_MAX_THREADS 1024
+#define DMA_MAP_MAX_SECONDS 300
+
+struct map_benchmark {
+   __u64 avg_map_100ns; /* average map latency in 100ns */
+   __u64 map_stddev; /* standard deviation of map latency */
+   __u64 avg_unmap_100ns; /* as above */
+   __u64 unmap_stddev;
+   __u32 threads; /* how many threads will do map/unmap in parallel */
+   __u32 seconds; /* how long the test will last */
+   int node; /* which numa node this benchmark will run on */
+   __u64 expansion[10];/* For future use */
+};
+
+struct map_benchmark_data {
+   struct map_benchmark bparam;
+   struct device *dev;
+   struct dentry  *debugfs;
+   atomic64_t sum_map_100ns;
+   atomic64_t sum_unmap_100ns;
+   atomic64_t sum_square_map;
+   atomic64_t sum_square_unmap;
+   atomic64_t loops;
+};
+
+static int map_benchmark_thread(void *data)
+{
+   void *buf;
+   dma_addr_t dma_addr;
+   struct map_benchmark_data *map = data;
+   int ret = 0;
+
+   buf = (void *)__get_free_page(GFP_KERNEL);
+   if (!buf)
+   return -ENOMEM;
+
+   while (!kthread_should_stop())  {
+   __u64 map_100ns, unmap_100ns, map_square, unmap_square;
+   ktime_t map_stime, map_etime, unmap_stime, unmap_etime;
+
+   /*
+* for a non-coherent device, if we don't stain them in the 
cache,
+* this will give an 

[PATCH v2 0/2] dma-mapping: provide a benchmark for streaming DMA mapping

2020-11-01 Thread Barry Song
Nowadays, there are increasing requirements to benchmark the performance
of dma_map and dma_unmap, particularly while the device is attached to an
IOMMU.

This patchset provides the benchmark infrastructure for streaming DMA
mapping. The architecture of the code is much like that of the GUP
benchmark:
* mm/gup_benchmark.c provides the kernel interface;
* tools/testing/selftests/vm/gup_benchmark.c provides a user program to
call the interface provided by mm/gup_benchmark.c.

In our case, kernel/dma/map_benchmark.c is like mm/gup_benchmark.c;
tools/testing/selftests/dma/dma_map_benchmark.c is like
tools/testing/selftests/vm/gup_benchmark.c.

A major difference from the GUP benchmark is that the DMA_MAP benchmark
needs to run on a device. Consider one board with the devices and IOMMUs below:
device A  --- IOMMU 1
device B  --- IOMMU 2
device C  --- non-IOMMU

Different devices might attach to different IOMMUs or to no IOMMU. To make
the benchmark run, we can either
* create a virtual device and hack the kernel code to attach the virtual
device to IOMMU1, IOMMU2 or non-IOMMU.
* use the existing driver_override mechanism: unbind device A, B, or C from
its original driver and bind it to the dma_map_benchmark platform driver or
pci driver for benchmarking.

In this patchset, I prefer to use the driver_override and avoid the ugly
hack in kernel. We can dynamically switch between devices behind different
IOMMUs to get the performance of IOMMU or non-IOMMU.

-v2:
  * add PCI support; v1 supported platform devices only
  * replace ssleep by msleep_interruptible() to permit users to exit
benchmark before it is completed
  * many changes according to Robin's suggestions, thanks! Robin
- add standard deviation output to reflect the worst case
- check users' parameters strictly like the number of threads
- make cache dirty before dma_map
- fix unpaired dma_map_page and dma_unmap_single;
- remove redundant "long long" before ktime_to_ns();
- use devm_add_action();
- wakeup all threads together after they are ready

Barry Song (2):
  dma-mapping: add benchmark support for streaming DMA APIs
  selftests/dma: add test application for DMA_MAP_BENCHMARK

 MAINTAINERS   |   6 +
 kernel/dma/Kconfig|   8 +
 kernel/dma/Makefile   |   1 +
 kernel/dma/map_benchmark.c| 295 ++
 tools/testing/selftests/dma/Makefile  |   6 +
 tools/testing/selftests/dma/config|   1 +
 .../testing/selftests/dma/dma_map_benchmark.c |  87 ++
 7 files changed, 404 insertions(+)
 create mode 100644 kernel/dma/map_benchmark.c
 create mode 100644 tools/testing/selftests/dma/Makefile
 create mode 100644 tools/testing/selftests/dma/config
 create mode 100644 tools/testing/selftests/dma/dma_map_benchmark.c

-- 
2.25.1



[PATCH v2 2/2] selftests/dma: add test application for DMA_MAP_BENCHMARK

2020-11-01 Thread Barry Song
This patch provides the test application for DMA_MAP_BENCHMARK.

Before running the test application, we need to bind a device to the
dma_map_benchmark driver. For example, unbind "xxx" from its original
driver and bind it to dma_map_benchmark:

echo dma_map_benchmark > /sys/bus/platform/devices/xxx/driver_override
echo xxx > /sys/bus/platform/drivers/xxx/unbind
echo xxx > /sys/bus/platform/drivers/dma_map_benchmark/bind

Another example for PCI devices:
echo dma_map_benchmark > /sys/bus/pci/devices/:00:01.0/driver_override
echo :00:01.0 > /sys/bus/pci/drivers/xxx/unbind
echo :00:01.0 > /sys/bus/pci/drivers/dma_map_benchmark/bind

The below command will run 16 threads on numa node 0 for 10 seconds on
the device bound to dma_map_benchmark platform_driver or pci_driver:
./dma_map_benchmark -t 16 -s 10 -n 0
dma mapping benchmark: threads:16 seconds:10
average map latency(us):1.1 standard deviation:1.9
average unmap latency(us):0.5 standard deviation:0.8

Cc: Joerg Roedel 
Cc: Will Deacon 
Cc: Shuah Khan 
Cc: Christoph Hellwig 
Cc: Marek Szyprowski 
Cc: Robin Murphy 
Signed-off-by: Barry Song 
---
 -v2:
 * check parameters like threads, seconds strictly
 * print standard deviation for latencies

 MAINTAINERS   |  6 ++
 tools/testing/selftests/dma/Makefile  |  6 ++
 tools/testing/selftests/dma/config|  1 +
 .../testing/selftests/dma/dma_map_benchmark.c | 87 +++
 4 files changed, 100 insertions(+)
 create mode 100644 tools/testing/selftests/dma/Makefile
 create mode 100644 tools/testing/selftests/dma/config
 create mode 100644 tools/testing/selftests/dma/dma_map_benchmark.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 608fc8484c02..a1e38d5e14f6 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5247,6 +5247,12 @@ F:   include/linux/dma-mapping.h
 F: include/linux/dma-map-ops.h
 F: kernel/dma/
 
+DMA MAPPING BENCHMARK
+M: Barry Song 
+L: iommu@lists.linux-foundation.org
+F: kernel/dma/map_benchmark.c
+F: tools/testing/selftests/dma/
+
 DMA-BUF HEAPS FRAMEWORK
 M: Sumit Semwal 
 R: Benjamin Gaignard 
diff --git a/tools/testing/selftests/dma/Makefile b/tools/testing/selftests/dma/Makefile
new file mode 100644
index ..aa8e8b5b3864
--- /dev/null
+++ b/tools/testing/selftests/dma/Makefile
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: GPL-2.0
+CFLAGS += -I../../../../usr/include/
+
+TEST_GEN_PROGS := dma_map_benchmark
+
+include ../lib.mk
diff --git a/tools/testing/selftests/dma/config b/tools/testing/selftests/dma/config
new file mode 100644
index ..6102ee3c43cd
--- /dev/null
+++ b/tools/testing/selftests/dma/config
@@ -0,0 +1 @@
+CONFIG_DMA_MAP_BENCHMARK=y
diff --git a/tools/testing/selftests/dma/dma_map_benchmark.c b/tools/testing/selftests/dma/dma_map_benchmark.c
new file mode 100644
index ..4778df0c458f
--- /dev/null
+++ b/tools/testing/selftests/dma/dma_map_benchmark.c
@@ -0,0 +1,87 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2020 Hisilicon Limited.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define DMA_MAP_BENCHMARK   _IOWR('d', 1, struct map_benchmark)
+#define DMA_MAP_MAX_THREADS 1024
+#define DMA_MAP_MAX_SECONDS 300
+
+struct map_benchmark {
+   __u64 avg_map_100ns; /* average map latency in 100ns */
+   __u64 map_stddev; /* standard deviation of map latency */
+   __u64 avg_unmap_100ns; /* as above */
+   __u64 unmap_stddev;
+   __u32 threads; /* how many threads will do map/unmap in parallel */
+   __u32 seconds; /* how long the test will last */
+   int node; /* which numa node this benchmark will run on */
+   __u64 expansion[10];/* For future use */
+};
+
+int main(int argc, char **argv)
+{
+   struct map_benchmark map;
+   int fd, opt;
+   /* default single thread, run 20 seconds on NUMA_NO_NODE */
+   int threads = 1, seconds = 20, node = -1;
+   int cmd = DMA_MAP_BENCHMARK;
+   char *p;
+
+   while ((opt = getopt(argc, argv, "t:s:n:")) != -1) {
+   switch (opt) {
+   case 't':
+   threads = atoi(optarg);
+   break;
+   case 's':
+   seconds = atoi(optarg);
+   break;
+   case 'n':
+   node = atoi(optarg);
+   break;
+   default:
+   return -1;
+   }
+   }
+
+   if (threads <= 0 || threads > DMA_MAP_MAX_THREADS) {
+   fprintf(stderr, "invalid number of threads, must be in 1-%d\n",
+   DMA_MAP_MAX_THREADS);
+   exit(1);
+   }
+
+   if (seconds <= 0 || seconds > DMA_MAP_MAX_SECONDS) {
+   fprintf(stderr, "invalid number of seconds, must be in 1-%d\n",
+   DMA_MAP_MAX_SECONDS);
+   exit(1);
+   }
+
+   fd =