RE: [PATCH v3 1/2] dma-mapping: add benchmark support for streaming DMA APIs

2020-11-15 Thread Song Bao Hua (Barry Song)



> -Original Message-
> From: Christoph Hellwig [mailto:h...@lst.de]
> Sent: Sunday, November 15, 2020 9:45 PM
> To: Song Bao Hua (Barry Song) 
> Cc: Christoph Hellwig ; iommu@lists.linux-foundation.org;
> robin.mur...@arm.com; m.szyprow...@samsung.com; Linuxarm
> ; linux-kselft...@vger.kernel.org; xuwei (O)
> ; Joerg Roedel ; Will Deacon
> ; Shuah Khan 
> Subject: Re: [PATCH v3 1/2] dma-mapping: add benchmark support for
> streaming DMA APIs
> 
> On Sun, Nov 15, 2020 at 12:11:15AM +, Song Bao Hua (Barry Song)
> wrote:
> >
> > Checkpatch has changed 80 to 100. That's probably why my local checkpatch
> didn't report any warning:
> >
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=
> bdc48fa11e46f867ea4d
> >
> > I am happy to change them to be less than 80 if you like.
> 
> Don't rely on checkpath, is is broken.  Look at the codingstyle document.
> 
> > > I think this needs to set a dma mask as behavior for unlimited dma
> > > mask vs the default 32-bit one can be very different.
> >
> > I actually prefer users bind real devices with real dma_mask to test rather
> than force to change
> > the dma_mask in this benchmark.
> 
> The mask is set by the driver, not the device.  So you need to set when
> when you bind, real device or not.

Yep while it is a little bit tricky.

Sometimes, it is done by "device" in architectures, e.g. there are lots of
dma_mask configuration code in arch/arm/mach-xxx.
arch/arm/mach-davinci/da850.c
static u64 da850_vpif_dma_mask = DMA_BIT_MASK(32);
static struct platform_device da850_vpif_dev = {
.name   = "vpif",
.id = -1,
.dev= {
.dma_mask   = &da850_vpif_dma_mask,
.coherent_dma_mask  = DMA_BIT_MASK(32),
},
.resource   = da850_vpif_resource,
.num_resources  = ARRAY_SIZE(da850_vpif_resource),
};

Sometimes, it is done by "of" or "acpi", for example:
drivers/acpi/arm64/iort.c
void iort_dma_setup(struct device *dev, u64 *dma_addr, u64 *dma_size)
{
u64 end, mask, dmaaddr = 0, size = 0, offset = 0;
int ret;

...

ret = acpi_dma_get_range(dev, &dmaaddr, &offset, &size);
if (!ret) {
/*
 * Limit coherent and dma mask based on size retrieved from
 * firmware.
 */
end = dmaaddr + size - 1;
mask = DMA_BIT_MASK(ilog2(end) + 1);
dev->bus_dma_limit = end;
dev->coherent_dma_mask = mask;
*dev->dma_mask = mask;
}
...
}

Sometimes, it is done by "bus", for example, ISA:
isa_dev->dev.coherent_dma_mask = DMA_BIT_MASK(24);
isa_dev->dev.dma_mask = &isa_dev->dev.coherent_dma_mask;

error = device_register(&isa_dev->dev);
if (error) {
put_device(&isa_dev->dev);
break;
}

And in many cases, it is done by driver. On the ARM64 server platform I am 
testing,
actually rarely drivers set dma_mask.

So to make the dma benchmark work on all platforms, it seems it is worth
to add a dma_mask_bit parameter. But, in order to avoid breaking the
dma_mask of those devices whose dma_mask are set by architectures, 
acpi and bus, it seems we need to do the below in dma_benchmark:

u64 old_mask;

old_mask = dma_get_mask(dev);

dma_set_mask(dev, &new_mask);

do_map_benchmark();

/* restore old dma_mask so that the dma_mask of the device is not changed due to
benchmark when it is bound back to its original driver */
dma_set_mask(dev, &old_mask);

Thanks
Barry

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v3 1/2] dma-mapping: add benchmark support for streaming DMA APIs

2020-11-15 Thread Christoph Hellwig
On Sun, Nov 15, 2020 at 12:11:15AM +, Song Bao Hua (Barry Song) wrote:
> 
> Checkpatch has changed 80 to 100. That's probably why my local checkpatch 
> didn't report any warning:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=bdc48fa11e46f867ea4d
> 
> I am happy to change them to be less than 80 if you like.

Don't rely on checkpath, is is broken.  Look at the codingstyle document.

> > I think this needs to set a dma mask as behavior for unlimited dma
> > mask vs the default 32-bit one can be very different. 
> 
> I actually prefer users bind real devices with real dma_mask to test rather 
> than force to change
> the dma_mask in this benchmark.

The mask is set by the driver, not the device.  So you need to set when
when you bind, real device or not.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


RE: [PATCH v3 1/2] dma-mapping: add benchmark support for streaming DMA APIs

2020-11-14 Thread Song Bao Hua (Barry Song)



> -Original Message-
> From: Christoph Hellwig [mailto:h...@lst.de]
> Sent: Sunday, November 15, 2020 5:54 AM
> To: Song Bao Hua (Barry Song) 
> Cc: iommu@lists.linux-foundation.org; h...@lst.de; robin.mur...@arm.com;
> m.szyprow...@samsung.com; Linuxarm ;
> linux-kselft...@vger.kernel.org; xuwei (O) ; Joerg
> Roedel ; Will Deacon ; Shuah Khan
> 
> Subject: Re: [PATCH v3 1/2] dma-mapping: add benchmark support for
> streaming DMA APIs
> 
> Lots of > 80 char lines.  Please fix up the style.

Checkpatch has changed 80 to 100. That's probably why my local checkpatch 
didn't report any warning:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=bdc48fa11e46f867ea4d

I am happy to change them to be less than 80 if you like.

> 
> I think this needs to set a dma mask as behavior for unlimited dma
> mask vs the default 32-bit one can be very different. 

I actually prefer users bind real devices with real dma_mask to test rather 
than force to change
the dma_mask in this benchmark.

Some device might have 32bit dma_mask while some others might have unlimited. 
But both of
them can bind to this driver or unbind from it after the test is done. So users 
just need to bind
those different real devices with different real dma_mask to dma_benchmark.

This can reflect the real performance of the real device better, I think.

> I also think you need to be able to pass the direction or have different tests
> for directions.  bidirectional is not exactly heavily used and pays
> more cache management penality.

For this, I'd like to increase a direction option in the test app and pass the 
option to the benchmark
driver.

Thanks
Barry

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v3 1/2] dma-mapping: add benchmark support for streaming DMA APIs

2020-11-14 Thread Christoph Hellwig
Lots of > 80 char lines.  Please fix up the style.

I think this needs to set a dma mask as behavior for unlimited dma
mask vs the default 32-bit one can be very different.  I also think
you need to be able to pass the direction or have different tests
for directions.  bidirectional is not exactly heavily used and pays
more cache management penality.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


RE: [PATCH v3 1/2] dma-mapping: add benchmark support for streaming DMA APIs

2020-11-11 Thread Song Bao Hua (Barry Song)



> -Original Message-
> From: John Garry
> Sent: Wednesday, November 11, 2020 10:37 PM
> To: Song Bao Hua (Barry Song) ;
> iommu@lists.linux-foundation.org; h...@lst.de; robin.mur...@arm.com;
> m.szyprow...@samsung.com
> Cc: linux-kselft...@vger.kernel.org; Will Deacon ; Joerg
> Roedel ; Linuxarm ; xuwei (O)
> ; Shuah Khan 
> Subject: Re: [PATCH v3 1/2] dma-mapping: add benchmark support for
> streaming DMA APIs
> 
> On 11/11/2020 01:29, Song Bao Hua (Barry Song) wrote:
> > I'd like to think checking this here would be overdesign. We just give 
> > users the
> > freedom to bind any device they care about to the benchmark driver. Usually
> > that means a real hardware either behind an IOMMU or through a direct
> > mapping.
> >
> > if for any reason users put a wrong "device", that is the choice of users.
> 
> Right, but if the device simply has no DMA ops supported, it could be
> better to fail the probe rather than let them try the test at all.
> 
>   Anyhow,
> > the below code will still handle it properly and users will get a report in 
> > which
> > everything is zero.
> >
> > +static int map_benchmark_thread(void *data)
> > +{
> > ...
> > +   dma_addr = dma_map_single(map->dev, buf, PAGE_SIZE,
> DMA_BIDIRECTIONAL);
> > +   if (unlikely(dma_mapping_error(map->dev, dma_addr))) {
> 
> Doing this is proper, but I am not sure if this tells the user the real
> problem.

Telling users the real problem isn't the design intention of this test
benchmark. It is never the purpose of this benchmark.

> 
> > +   pr_err("dma_map_single failed on %s\n",
> dev_name(map->dev));
> 
> Not sure why use pr_err() over dev_err().

We are reporting errors in dma-benchmark driver rather than reporting errors
in the driver of the specific device. I think we should have "dma-benchmark"
as the prefix while printing the name of the device by dev_name().

> 
> > +   ret = -ENOMEM;
> > +   goto out;
> > +   }
> 
> Thanks,
> John

Thanks
Barry

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


RE: [PATCH v3 1/2] dma-mapping: add benchmark support for streaming DMA APIs

2020-11-10 Thread Song Bao Hua (Barry Song)



> -Original Message-
> From: John Garry
> Sent: Tuesday, November 10, 2020 9:39 PM
> To: Song Bao Hua (Barry Song) ;
> iommu@lists.linux-foundation.org; h...@lst.de; robin.mur...@arm.com;
> m.szyprow...@samsung.com
> Cc: linux-kselft...@vger.kernel.org; Will Deacon ; Joerg
> Roedel ; Linuxarm ; xuwei (O)
> ; Shuah Khan 
> Subject: Re: [PATCH v3 1/2] dma-mapping: add benchmark support for
> streaming DMA APIs
> 
> On 10/11/2020 08:10, Song Bao Hua (Barry Song) wrote:
> > Hello Robin, Christoph,
> > Any further comment? John suggested that "depends on DEBUG_FS" should
> be added in Kconfig.
> > I am collecting more comments to send v4 together with fixing this minor
> issue :-)
> >
> > Thanks
> > Barry
> >
> >> -Original Message-
> >> From: Song Bao Hua (Barry Song)
> >> Sent: Monday, November 2, 2020 9:07 PM
> >> To: iommu@lists.linux-foundation.org; h...@lst.de;
> robin.mur...@arm.com;
> >> m.szyprow...@samsung.com
> >> Cc: Linuxarm ; linux-kselft...@vger.kernel.org;
> xuwei
> >> (O) ; Song Bao Hua (Barry Song)
> >> ; Joerg Roedel ; Will
> Deacon
> >> ; Shuah Khan 
> >> Subject: [PATCH v3 1/2] dma-mapping: add benchmark support for
> streaming
> >> DMA APIs
> >>
> >> Nowadays, there are increasing requirements to benchmark the
> performance
> >> of dma_map and dma_unmap particually while the device is attached to an
> >> IOMMU.
> >>
> >> This patch enables the support. Users can run specified number of threads
> to
> >> do dma_map_page and dma_unmap_page on a specific NUMA node with
> the
> >> specified duration. Then dma_map_benchmark will calculate the average
> >> latency for map and unmap.
> >>
> >> A difficulity for this benchmark is that dma_map/unmap APIs must run on a
> >> particular device. Each device might have different backend of IOMMU or
> >> non-IOMMU.
> >>
> >> So we use the driver_override to bind dma_map_benchmark to a particual
> >> device by:
> >> For platform devices:
> >> echo dma_map_benchmark >
> /sys/bus/platform/devices/xxx/driver_override
> >> echo xxx > /sys/bus/platform/drivers/xxx/unbind
> >> echo xxx > /sys/bus/platform/drivers/dma_map_benchmark/bind
> >>
> 
> Hi Barry,
> 
> >> For PCI devices:
> >> echo dma_map_benchmark >
> >> /sys/bus/pci/devices/:00:01.0/driver_override
> >> echo :00:01.0 > /sys/bus/pci/drivers/xxx/unbind echo :00:01.0 >
> >> /sys/bus/pci/drivers/dma_map_benchmark/bind
> 
> Do we need to check if the device to which we attach actually has DMA
> mapping capability?

Hello John,

I'd like to think checking this here would be overdesign. We just give users the
freedom to bind any device they care about to the benchmark driver. Usually
that means a real hardware either behind an IOMMU or through a direct
mapping.

if for any reason users put a wrong "device", that is the choice of users. 
Anyhow,
the below code will still handle it properly and users will get a report in 
which
everything is zero.

+static int map_benchmark_thread(void *data)
+{
...
+   dma_addr = dma_map_single(map->dev, buf, PAGE_SIZE, 
DMA_BIDIRECTIONAL);
+   if (unlikely(dma_mapping_error(map->dev, dma_addr))) {
+   pr_err("dma_map_single failed on %s\n", 
dev_name(map->dev));
+   ret = -ENOMEM;
+   goto out;
+   }
...
+}

> 
> >>
> >> Cc: Joerg Roedel 
> >> Cc: Will Deacon 
> >> Cc: Shuah Khan 
> >> Cc: Christoph Hellwig 
> >> Cc: Marek Szyprowski 
> >> Cc: Robin Murphy 
> >> Signed-off-by: Barry Song 
> >> ---
> 
> Thanks,
> John

Thanks
Barry

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v3 1/2] dma-mapping: add benchmark support for streaming DMA APIs

2020-11-10 Thread John Garry

On 10/11/2020 08:10, Song Bao Hua (Barry Song) wrote:

Hello Robin, Christoph,
Any further comment? John suggested that "depends on DEBUG_FS" should be added 
in Kconfig.
I am collecting more comments to send v4 together with fixing this minor issue 
:-)

Thanks
Barry


-Original Message-
From: Song Bao Hua (Barry Song)
Sent: Monday, November 2, 2020 9:07 PM
To: iommu@lists.linux-foundation.org; h...@lst.de; robin.mur...@arm.com;
m.szyprow...@samsung.com
Cc: Linuxarm ; linux-kselft...@vger.kernel.org; xuwei
(O) ; Song Bao Hua (Barry Song)
; Joerg Roedel ; Will Deacon
; Shuah Khan 
Subject: [PATCH v3 1/2] dma-mapping: add benchmark support for streaming
DMA APIs

Nowadays, there are increasing requirements to benchmark the performance
of dma_map and dma_unmap particually while the device is attached to an
IOMMU.

This patch enables the support. Users can run specified number of threads to
do dma_map_page and dma_unmap_page on a specific NUMA node with the
specified duration. Then dma_map_benchmark will calculate the average
latency for map and unmap.

A difficulity for this benchmark is that dma_map/unmap APIs must run on a
particular device. Each device might have different backend of IOMMU or
non-IOMMU.

So we use the driver_override to bind dma_map_benchmark to a particual
device by:
For platform devices:
echo dma_map_benchmark > /sys/bus/platform/devices/xxx/driver_override
echo xxx > /sys/bus/platform/drivers/xxx/unbind
echo xxx > /sys/bus/platform/drivers/dma_map_benchmark/bind



Hi Barry,


For PCI devices:
echo dma_map_benchmark >
/sys/bus/pci/devices/:00:01.0/driver_override
echo :00:01.0 > /sys/bus/pci/drivers/xxx/unbind echo :00:01.0 >
/sys/bus/pci/drivers/dma_map_benchmark/bind


Do we need to check if the device to which we attach actually has DMA 
mapping capability?




Cc: Joerg Roedel 
Cc: Will Deacon 
Cc: Shuah Khan 
Cc: Christoph Hellwig 
Cc: Marek Szyprowski 
Cc: Robin Murphy 
Signed-off-by: Barry Song 
---


Thanks,
John
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


RE: [PATCH v3 1/2] dma-mapping: add benchmark support for streaming DMA APIs

2020-11-10 Thread Song Bao Hua (Barry Song)
Hello Robin, Christoph,
Any further comment? John suggested that "depends on DEBUG_FS" should be added 
in Kconfig.
I am collecting more comments to send v4 together with fixing this minor issue 
:-)

Thanks
Barry

> -Original Message-
> From: Song Bao Hua (Barry Song)
> Sent: Monday, November 2, 2020 9:07 PM
> To: iommu@lists.linux-foundation.org; h...@lst.de; robin.mur...@arm.com;
> m.szyprow...@samsung.com
> Cc: Linuxarm ; linux-kselft...@vger.kernel.org; xuwei
> (O) ; Song Bao Hua (Barry Song)
> ; Joerg Roedel ; Will Deacon
> ; Shuah Khan 
> Subject: [PATCH v3 1/2] dma-mapping: add benchmark support for streaming
> DMA APIs
> 
> Nowadays, there are increasing requirements to benchmark the performance
> of dma_map and dma_unmap particually while the device is attached to an
> IOMMU.
> 
> This patch enables the support. Users can run specified number of threads to
> do dma_map_page and dma_unmap_page on a specific NUMA node with the
> specified duration. Then dma_map_benchmark will calculate the average
> latency for map and unmap.
> 
> A difficulity for this benchmark is that dma_map/unmap APIs must run on a
> particular device. Each device might have different backend of IOMMU or
> non-IOMMU.
> 
> So we use the driver_override to bind dma_map_benchmark to a particual
> device by:
> For platform devices:
> echo dma_map_benchmark > /sys/bus/platform/devices/xxx/driver_override
> echo xxx > /sys/bus/platform/drivers/xxx/unbind
> echo xxx > /sys/bus/platform/drivers/dma_map_benchmark/bind
> 
> For PCI devices:
> echo dma_map_benchmark >
> /sys/bus/pci/devices/:00:01.0/driver_override
> echo :00:01.0 > /sys/bus/pci/drivers/xxx/unbind echo :00:01.0 >
> /sys/bus/pci/drivers/dma_map_benchmark/bind
> 
> Cc: Joerg Roedel 
> Cc: Will Deacon 
> Cc: Shuah Khan 
> Cc: Christoph Hellwig 
> Cc: Marek Szyprowski 
> Cc: Robin Murphy 
> Signed-off-by: Barry Song 
> ---
> -v3:
>   * fix build issues reported by 0day kernel test robot
> -v2:
>   * add PCI support; v1 supported platform devices only
>   * replace ssleep by msleep_interruptible() to permit users to exit
> benchmark before it is completed
>   * many changes according to Robin's suggestions, thanks! Robin
> - add standard deviation output to reflect the worst case
> - check users' parameters strictly like the number of threads
> - make cache dirty before dma_map
> - fix unpaired dma_map_page and dma_unmap_single;
> - remove redundant "long long" before ktime_to_ns();
> - use devm_add_action()
> 
>  kernel/dma/Kconfig |   8 +
>  kernel/dma/Makefile|   1 +
>  kernel/dma/map_benchmark.c | 296
> +
>  3 files changed, 305 insertions(+)
>  create mode 100644 kernel/dma/map_benchmark.c
> 
> diff --git a/kernel/dma/Kconfig b/kernel/dma/Kconfig index
> c99de4a21458..949c53da5991 100644
> --- a/kernel/dma/Kconfig
> +++ b/kernel/dma/Kconfig
> @@ -225,3 +225,11 @@ config DMA_API_DEBUG_SG
> is technically out-of-spec.
> 
> If unsure, say N.
> +
> +config DMA_MAP_BENCHMARK
> + bool "Enable benchmarking of streaming DMA mapping"
> + help
> +   Provides /sys/kernel/debug/dma_map_benchmark that helps with
> testing
> +   performance of dma_(un)map_page.
> +
> +   See tools/testing/selftests/dma/dma_map_benchmark.c
> diff --git a/kernel/dma/Makefile b/kernel/dma/Makefile index
> dc755ab68aab..7aa6b26b1348 100644
> --- a/kernel/dma/Makefile
> +++ b/kernel/dma/Makefile
> @@ -10,3 +10,4 @@ obj-$(CONFIG_DMA_API_DEBUG) += debug.o
>  obj-$(CONFIG_SWIOTLB)+= swiotlb.o
>  obj-$(CONFIG_DMA_COHERENT_POOL)  += pool.o
>  obj-$(CONFIG_DMA_REMAP)  += remap.o
> +obj-$(CONFIG_DMA_MAP_BENCHMARK)  += map_benchmark.o
> diff --git a/kernel/dma/map_benchmark.c b/kernel/dma/map_benchmark.c
> new file mode 100644 index ..dc4e5ff48a2d
> --- /dev/null
> +++ b/kernel/dma/map_benchmark.c
> @@ -0,0 +1,296 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (C) 2020 Hisilicon Limited.
> + */
> +
> +#define pr_fmt(fmt)  KBUILD_MODNAME ": " fmt
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#define DMA_MAP_BENCHMARK_IOWR('d', 1, struct map_benchmark)
> +#define DMA_MAP_MAX_THREADS  1024
> +#define DMA_MAP_MAX_SECONDS  300
> +
> +struct map_benchmark {
> + __u64 avg_map_100ns; /* average map latency in 100ns */
> + __u64 map_stddev; /* standard deviation of map latency */
> + __u64 avg_unmap_100ns; /* as above */
> + __u64 unmap_stddev;
> + __u32 threads; /* how many threads will do map/unmap in parallel */
> + __u32 seconds; /* how long the test will last */
> + int node; /* which numa node this benchmark will run on */
> + __u64 expansion[10];/* For future use */
> +};
> +
> +struct map_benchmark_data {
> + st

RE: [PATCH v3 1/2] dma-mapping: add benchmark support for streaming DMA APIs

2020-11-02 Thread Song Bao Hua (Barry Song)



> -Original Message-
> From: John Garry
> Sent: Monday, November 2, 2020 10:19 PM
> To: Song Bao Hua (Barry Song) ;
> iommu@lists.linux-foundation.org; h...@lst.de; robin.mur...@arm.com;
> m.szyprow...@samsung.com
> Cc: linux-kselft...@vger.kernel.org; Shuah Khan ; Joerg
> Roedel ; Linuxarm ; xuwei (O)
> ; Will Deacon 
> Subject: Re: [PATCH v3 1/2] dma-mapping: add benchmark support for
> streaming DMA APIs
> 
> On 02/11/2020 08:06, Barry Song wrote:
> > Nowadays, there are increasing requirements to benchmark the performance
> > of dma_map and dma_unmap particually while the device is attached to an
> > IOMMU.
> >
> > This patch enables the support. Users can run specified number of threads
> > to do dma_map_page and dma_unmap_page on a specific NUMA node with
> the
> > specified duration. Then dma_map_benchmark will calculate the average
> > latency for map and unmap.
> >
> > A difficulity for this benchmark is that dma_map/unmap APIs must run on
> > a particular device. Each device might have different backend of IOMMU or
> > non-IOMMU.
> >
> > So we use the driver_override to bind dma_map_benchmark to a particual
> > device by:
> > For platform devices:
> > echo dma_map_benchmark > /sys/bus/platform/devices/xxx/driver_override
> > echo xxx > /sys/bus/platform/drivers/xxx/unbind
> > echo xxx > /sys/bus/platform/drivers/dma_map_benchmark/bind
> >
> > For PCI devices:
> > echo dma_map_benchmark >
> /sys/bus/pci/devices/:00:01.0/driver_override
> > echo :00:01.0 > /sys/bus/pci/drivers/xxx/unbind
> > echo :00:01.0 > /sys/bus/pci/drivers/dma_map_benchmark/bind
> >
> > Cc: Joerg Roedel 
> > Cc: Will Deacon 
> > Cc: Shuah Khan 
> > Cc: Christoph Hellwig 
> > Cc: Marek Szyprowski 
> > Cc: Robin Murphy 
> > Signed-off-by: Barry Song 
> > ---
> > -v3:
> >* fix build issues reported by 0day kernel test robot
> > -v2:
> >* add PCI support; v1 supported platform devices only
> >* replace ssleep by msleep_interruptible() to permit users to exit
> >  benchmark before it is completed
> >* many changes according to Robin's suggestions, thanks! Robin
> >  - add standard deviation output to reflect the worst case
> >  - check users' parameters strictly like the number of threads
> >  - make cache dirty before dma_map
> >  - fix unpaired dma_map_page and dma_unmap_single;
> >  - remove redundant "long long" before ktime_to_ns();
> >  - use devm_add_action()
> >
> >   kernel/dma/Kconfig |   8 +
> >   kernel/dma/Makefile|   1 +
> >   kernel/dma/map_benchmark.c | 296
> +
> >   3 files changed, 305 insertions(+)
> >   create mode 100644 kernel/dma/map_benchmark.c
> >
> > diff --git a/kernel/dma/Kconfig b/kernel/dma/Kconfig
> > index c99de4a21458..949c53da5991 100644
> > --- a/kernel/dma/Kconfig
> > +++ b/kernel/dma/Kconfig
> > @@ -225,3 +225,11 @@ config DMA_API_DEBUG_SG
> >   is technically out-of-spec.
> >
> >   If unsure, say N.
> > +
> > +config DMA_MAP_BENCHMARK
> > +   bool "Enable benchmarking of streaming DMA mapping"
> > +   help
> > + Provides /sys/kernel/debug/dma_map_benchmark that helps with
> testing
> > + performance of dma_(un)map_page.
> 
> Since this is a driver, any reason for which it cannot be loadable? If
> so, it seems any functionality would depend on DEBUG FS, I figure that's
> just how we work for debugfs.

We depend on kthread_bind_mask which isn't an export_symbol.
Maybe worth to send a patch to export it?

> 
> Thanks,
> John
> 
> > +
> > + See tools/testing/selftests/dma/dma_map_benchmark.c
> > diff --git a/kernel/dma/Makefile b/kernel/dma/Makefile
> > index dc755ab68aab..7aa6b26b1348 100644
> > --- a/kernel/dma/Makefile
> > +++ b/kernel/dma/Makefile

Thanks
Barry

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v3 1/2] dma-mapping: add benchmark support for streaming DMA APIs

2020-11-02 Thread John Garry

On 02/11/2020 08:06, Barry Song wrote:

Nowadays, there are increasing requirements to benchmark the performance
of dma_map and dma_unmap particually while the device is attached to an
IOMMU.

This patch enables the support. Users can run specified number of threads
to do dma_map_page and dma_unmap_page on a specific NUMA node with the
specified duration. Then dma_map_benchmark will calculate the average
latency for map and unmap.

A difficulity for this benchmark is that dma_map/unmap APIs must run on
a particular device. Each device might have different backend of IOMMU or
non-IOMMU.

So we use the driver_override to bind dma_map_benchmark to a particual
device by:
For platform devices:
echo dma_map_benchmark > /sys/bus/platform/devices/xxx/driver_override
echo xxx > /sys/bus/platform/drivers/xxx/unbind
echo xxx > /sys/bus/platform/drivers/dma_map_benchmark/bind

For PCI devices:
echo dma_map_benchmark > /sys/bus/pci/devices/:00:01.0/driver_override
echo :00:01.0 > /sys/bus/pci/drivers/xxx/unbind
echo :00:01.0 > /sys/bus/pci/drivers/dma_map_benchmark/bind

Cc: Joerg Roedel 
Cc: Will Deacon 
Cc: Shuah Khan 
Cc: Christoph Hellwig 
Cc: Marek Szyprowski 
Cc: Robin Murphy 
Signed-off-by: Barry Song 
---
-v3:
   * fix build issues reported by 0day kernel test robot
-v2:
   * add PCI support; v1 supported platform devices only
   * replace ssleep by msleep_interruptible() to permit users to exit
 benchmark before it is completed
   * many changes according to Robin's suggestions, thanks! Robin
 - add standard deviation output to reflect the worst case
 - check users' parameters strictly like the number of threads
 - make cache dirty before dma_map
 - fix unpaired dma_map_page and dma_unmap_single;
 - remove redundant "long long" before ktime_to_ns();
 - use devm_add_action()

  kernel/dma/Kconfig |   8 +
  kernel/dma/Makefile|   1 +
  kernel/dma/map_benchmark.c | 296 +
  3 files changed, 305 insertions(+)
  create mode 100644 kernel/dma/map_benchmark.c

diff --git a/kernel/dma/Kconfig b/kernel/dma/Kconfig
index c99de4a21458..949c53da5991 100644
--- a/kernel/dma/Kconfig
+++ b/kernel/dma/Kconfig
@@ -225,3 +225,11 @@ config DMA_API_DEBUG_SG
  is technically out-of-spec.
  
  	  If unsure, say N.

+
+config DMA_MAP_BENCHMARK
+   bool "Enable benchmarking of streaming DMA mapping"
+   help
+ Provides /sys/kernel/debug/dma_map_benchmark that helps with testing
+ performance of dma_(un)map_page.


Since this is a driver, any reason for which it cannot be loadable? If 
so, it seems any functionality would depend on DEBUG FS, I figure that's 
just how we work for debugfs.


Thanks,
John


+
+ See tools/testing/selftests/dma/dma_map_benchmark.c
diff --git a/kernel/dma/Makefile b/kernel/dma/Makefile
index dc755ab68aab..7aa6b26b1348 100644
--- a/kernel/dma/Makefile
+++ b/kernel/dma/Makefile

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu