RE: [PATCH v3 1/2] dma-mapping: add benchmark support for streaming DMA APIs
> -----Original Message-----
> From: Christoph Hellwig [mailto:h...@lst.de]
> Sent: Sunday, November 15, 2020 9:45 PM
> To: Song Bao Hua (Barry Song)
> Cc: Christoph Hellwig; iommu@lists.linux-foundation.org;
> robin.mur...@arm.com; m.szyprow...@samsung.com; Linuxarm;
> linux-kselft...@vger.kernel.org; xuwei (O); Joerg Roedel; Will Deacon;
> Shuah Khan
> Subject: Re: [PATCH v3 1/2] dma-mapping: add benchmark support for
> streaming DMA APIs
>
> On Sun, Nov 15, 2020 at 12:11:15AM +, Song Bao Hua (Barry Song) wrote:
> > Checkpatch has changed 80 to 100. That's probably why my local checkpatch
> > didn't report any warning:
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=bdc48fa11e46f867ea4d
> >
> > I am happy to change them to be less than 80 if you like.
>
> Don't rely on checkpatch, it is broken. Look at the codingstyle document.
>
> > > I think this needs to set a dma mask as behavior for unlimited dma
> > > mask vs the default 32-bit one can be very different.
> >
> > I actually prefer users bind real devices with real dma_mask to test rather
> > than force to change the dma_mask in this benchmark.
>
> The mask is set by the driver, not the device. So you need to set it
> when you bind, real device or not.

Yes, although it is a little bit tricky. Sometimes the mask is set by the
"device" in architecture code, e.g. there is a lot of dma_mask configuration
code in arch/arm/mach-xxx.

arch/arm/mach-davinci/da850.c:

	static u64 da850_vpif_dma_mask = DMA_BIT_MASK(32);

	static struct platform_device da850_vpif_dev = {
		.name		= "vpif",
		.id		= -1,
		.dev		= {
			.dma_mask		= &da850_vpif_dma_mask,
			.coherent_dma_mask	= DMA_BIT_MASK(32),
		},
		.resource	= da850_vpif_resource,
		.num_resources	= ARRAY_SIZE(da850_vpif_resource),
	};

Sometimes it is done by "of" or "acpi", for example:

drivers/acpi/arm64/iort.c:

	void iort_dma_setup(struct device *dev, u64 *dma_addr, u64 *dma_size)
	{
		u64 end, mask, dmaaddr = 0, size = 0, offset = 0;
		int ret;
		...
		ret = acpi_dma_get_range(dev, &dmaaddr, &offset, &size);
		if (!ret) {
			/*
			 * Limit coherent and dma mask based on size retrieved from
			 * firmware.
			 */
			end = dmaaddr + size - 1;
			mask = DMA_BIT_MASK(ilog2(end) + 1);
			dev->bus_dma_limit = end;
			dev->coherent_dma_mask = mask;
			*dev->dma_mask = mask;
		}
		...
	}

Sometimes it is done by the "bus", for example ISA:

	isa_dev->dev.coherent_dma_mask = DMA_BIT_MASK(24);
	isa_dev->dev.dma_mask = &isa_dev->dev.coherent_dma_mask;

	error = device_register(&isa_dev->dev);
	if (error) {
		put_device(&isa_dev->dev);
		break;
	}

And in many cases it is done by the driver. On the ARM64 server platform I am
testing, drivers actually rarely set dma_mask.

So to make the dma benchmark work on all platforms, it seems worth adding a
dma_mask_bit parameter. But, in order to avoid breaking the dma_mask of those
devices whose dma_mask is set by the architecture, ACPI or the bus, it seems
we need to do the below in dma_benchmark:

	u64 old_mask;

	old_mask = dma_get_mask(dev);
	dma_set_mask(dev, new_mask);

	do_map_benchmark();

	/*
	 * Restore the old dma_mask so that the dma_mask of the device is not
	 * changed by the benchmark when it is bound back to its original
	 * driver.
	 */
	dma_set_mask(dev, old_mask);

Thanks
Barry
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
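The mask arithmetic quoted from iort_dma_setup() and the save/restore idea can be tried out in user space. The following is only a sketch under stated assumptions: DMA_BIT_MASK() is written the way the kernel defines it, while ilog2_u64(), mask_for_range(), struct fake_dev and run_with_mask() are hypothetical stand-ins invented for illustration, not kernel APIs.

```c
#include <stdint.h>

/* Same shape as the kernel's DMA_BIT_MASK(): the n low bits set. */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Hypothetical user-space stand-in for ilog2(): index of the highest set bit. */
static unsigned int ilog2_u64(uint64_t v)
{
	unsigned int bit = 0;

	while (v >>= 1)
		bit++;
	return bit;
}

/* Mask covering [0, end], following the arithmetic quoted from iort_dma_setup(). */
static uint64_t mask_for_range(uint64_t dmaaddr, uint64_t size)
{
	uint64_t end = dmaaddr + size - 1;

	return DMA_BIT_MASK(ilog2_u64(end) + 1);
}

/* Hypothetical stand-in for struct device; only the streaming mask is modelled. */
struct fake_dev {
	uint64_t dma_mask;
};

/* Apply a temporary mask, "run" the benchmark, then restore the old mask. */
static uint64_t run_with_mask(struct fake_dev *dev, uint64_t new_mask)
{
	uint64_t old_mask = dev->dma_mask;	/* dma_get_mask() in the kernel */
	uint64_t seen;

	dev->dma_mask = new_mask;		/* dma_set_mask() in the kernel */
	seen = dev->dma_mask;			/* the benchmark would run here */
	dev->dma_mask = old_mask;		/* restore before rebinding */
	return seen;
}
```

With a 4 GiB range starting at 0, mask_for_range() yields the same value as DMA_BIT_MASK(32), and run_with_mask() leaves the device's original mask untouched afterwards, which is the property the restore step is meant to guarantee.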
Re: [PATCH v3 1/2] dma-mapping: add benchmark support for streaming DMA APIs
On Sun, Nov 15, 2020 at 12:11:15AM +, Song Bao Hua (Barry Song) wrote:
> Checkpatch has changed 80 to 100. That's probably why my local checkpatch
> didn't report any warning:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=bdc48fa11e46f867ea4d
>
> I am happy to change them to be less than 80 if you like.

Don't rely on checkpatch, it is broken. Look at the codingstyle document.

> > I think this needs to set a dma mask as behavior for unlimited dma
> > mask vs the default 32-bit one can be very different.
>
> I actually prefer users bind real devices with real dma_mask to test rather
> than force to change the dma_mask in this benchmark.

The mask is set by the driver, not the device. So you need to set it
when you bind, real device or not.
RE: [PATCH v3 1/2] dma-mapping: add benchmark support for streaming DMA APIs
> -----Original Message-----
> From: Christoph Hellwig [mailto:h...@lst.de]
> Sent: Sunday, November 15, 2020 5:54 AM
> To: Song Bao Hua (Barry Song)
> Cc: iommu@lists.linux-foundation.org; h...@lst.de; robin.mur...@arm.com;
> m.szyprow...@samsung.com; Linuxarm; linux-kselft...@vger.kernel.org;
> xuwei (O); Joerg Roedel; Will Deacon; Shuah Khan
> Subject: Re: [PATCH v3 1/2] dma-mapping: add benchmark support for
> streaming DMA APIs
>
> Lots of > 80 char lines. Please fix up the style.

Checkpatch has changed 80 to 100. That's probably why my local checkpatch
didn't report any warning:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=bdc48fa11e46f867ea4d

I am happy to change them to be less than 80 if you like.

> I think this needs to set a dma mask as behavior for unlimited dma
> mask vs the default 32-bit one can be very different.

I actually prefer that users bind real devices with their real dma_mask to
test, rather than forcing a change of the dma_mask in this benchmark.

Some devices might have a 32-bit dma_mask while others might have an
unlimited one. But both kinds can bind to this driver and unbind from it
after the test is done. So users just need to bind those different real
devices, with their different real dma_masks, to dma_benchmark.

This can reflect the real performance of the real device better, I think.

> I also think you need to be able to pass the direction or have different tests
> for directions. bidirectional is not exactly heavily used and pays
> more cache management penalty.

For this, I'd like to add a direction option to the test app and pass the
option to the benchmark driver.

Thanks
Barry
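One way the direction option mentioned above could be translated on the driver side is sketched below. This is purely illustrative: the enum dma_data_direction values mirror the kernel's, but enum bench_dir_opt and bench_dir_to_dma() are hypothetical names that do not appear in the posted patch.

```c
/* Mirrors the values of the kernel's enum dma_data_direction. */
enum dma_data_direction {
	DMA_BIDIRECTIONAL = 0,
	DMA_TO_DEVICE = 1,
	DMA_FROM_DEVICE = 2,
	DMA_NONE = 3,
};

/* Hypothetical option values a test app could pass; not taken from the patch. */
enum bench_dir_opt {
	BENCH_DIR_BIDIRECTIONAL = 0,
	BENCH_DIR_TO_DEVICE = 1,
	BENCH_DIR_FROM_DEVICE = 2,
};

/* Translate the user option; returns 0 on success, -1 on an unknown value. */
static int bench_dir_to_dma(unsigned int opt, enum dma_data_direction *dir)
{
	switch (opt) {
	case BENCH_DIR_BIDIRECTIONAL:
		*dir = DMA_BIDIRECTIONAL;
		return 0;
	case BENCH_DIR_TO_DEVICE:
		*dir = DMA_TO_DEVICE;
		return 0;
	case BENCH_DIR_FROM_DEVICE:
		*dir = DMA_FROM_DEVICE;
		return 0;
	default:
		return -1;	/* reject anything else, including DMA_NONE */
	}
}
```

Rejecting unknown values up front keeps an invalid direction from ever reaching the mapping call, in the same spirit as the strict parameter checking listed in the v2 changelog.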
Re: [PATCH v3 1/2] dma-mapping: add benchmark support for streaming DMA APIs
Lots of > 80 char lines. Please fix up the style.

I think this needs to set a dma mask as behavior for unlimited dma
mask vs the default 32-bit one can be very different.

I also think you need to be able to pass the direction or have different tests
for directions. bidirectional is not exactly heavily used and pays
more cache management penalty.
RE: [PATCH v3 1/2] dma-mapping: add benchmark support for streaming DMA APIs
> -----Original Message-----
> From: John Garry
> Sent: Wednesday, November 11, 2020 10:37 PM
> To: Song Bao Hua (Barry Song); iommu@lists.linux-foundation.org;
> h...@lst.de; robin.mur...@arm.com; m.szyprow...@samsung.com
> Cc: linux-kselft...@vger.kernel.org; Will Deacon; Joerg Roedel; Linuxarm;
> xuwei (O); Shuah Khan
> Subject: Re: [PATCH v3 1/2] dma-mapping: add benchmark support for
> streaming DMA APIs
>
> On 11/11/2020 01:29, Song Bao Hua (Barry Song) wrote:
> > I'd like to think checking this here would be overdesign. We just give
> > users the freedom to bind any device they care about to the benchmark
> > driver. Usually that means a real hardware either behind an IOMMU or
> > through a direct mapping.
> >
> > if for any reason users put a wrong "device", that is the choice of users.
>
> Right, but if the device simply has no DMA ops supported, it could be
> better to fail the probe rather than let them try the test at all.
>
> > Anyhow, the below code will still handle it properly and users will get
> > a report in which everything is zero.
> >
> > +static int map_benchmark_thread(void *data)
> > +{
> > ...
> > +	dma_addr = dma_map_single(map->dev, buf, PAGE_SIZE, DMA_BIDIRECTIONAL);
> > +	if (unlikely(dma_mapping_error(map->dev, dma_addr))) {
>
> Doing this is proper, but I am not sure if this tells the user the real
> problem.

Telling users the real problem isn't the design intention of this benchmark.
That has never been its purpose.

> > +		pr_err("dma_map_single failed on %s\n", dev_name(map->dev));
>
> Not sure why use pr_err() over dev_err().

We are reporting errors in the dma-benchmark driver rather than in the driver
of the specific device. I think we should have "dma-benchmark" as the prefix
while printing the name of the device with dev_name().

> > +		ret = -ENOMEM;
> > +		goto out;
> > +	}
>
> Thanks,
> John

Thanks
Barry
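How the pr_fmt() prefix ends up in front of the message can be seen in a small user-space model. This is a sketch only: the pr_fmt() definition is the one from the posted patch, but the KBUILD_MODNAME value is an assumption (Kbuild normally supplies it) and format_map_error() is a hypothetical helper that formats into a buffer instead of the kernel log.

```c
#include <stdio.h>

/* Assumed value for illustration; Kbuild derives the real one at build time. */
#define KBUILD_MODNAME "map_benchmark"

/* As in the patch: every pr_*() format string gets the module name prefixed. */
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt

/*
 * Hypothetical stand-in for the pr_err() call in map_benchmark_thread():
 * adjacent string literals are concatenated at compile time, so the prefix
 * becomes part of the format string.
 */
static int format_map_error(char *buf, size_t size, const char *devname)
{
	return snprintf(buf, size, pr_fmt("dma_map_single failed on %s\n"),
			devname);
}
```

The message therefore carries the benchmark driver's name as its prefix and the bound device's name in the body, whereas dev_err() would prefix it with the device and its driver instead.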
RE: [PATCH v3 1/2] dma-mapping: add benchmark support for streaming DMA APIs
> -----Original Message-----
> From: John Garry
> Sent: Tuesday, November 10, 2020 9:39 PM
> To: Song Bao Hua (Barry Song); iommu@lists.linux-foundation.org;
> h...@lst.de; robin.mur...@arm.com; m.szyprow...@samsung.com
> Cc: linux-kselft...@vger.kernel.org; Will Deacon; Joerg Roedel; Linuxarm;
> xuwei (O); Shuah Khan
> Subject: Re: [PATCH v3 1/2] dma-mapping: add benchmark support for
> streaming DMA APIs
>
> On 10/11/2020 08:10, Song Bao Hua (Barry Song) wrote:
> > Hello Robin, Christoph,
> > Any further comment? John suggested that "depends on DEBUG_FS" should
> > be added in Kconfig.
> > I am collecting more comments to send v4 together with fixing this minor
> > issue :-)
> >
> > Thanks
> > Barry
> >
> >> -----Original Message-----
> >> From: Song Bao Hua (Barry Song)
> >> Sent: Monday, November 2, 2020 9:07 PM
> >> To: iommu@lists.linux-foundation.org; h...@lst.de; robin.mur...@arm.com;
> >> m.szyprow...@samsung.com
> >> Cc: Linuxarm; linux-kselft...@vger.kernel.org; xuwei (O);
> >> Song Bao Hua (Barry Song); Joerg Roedel; Will Deacon; Shuah Khan
> >> Subject: [PATCH v3 1/2] dma-mapping: add benchmark support for streaming
> >> DMA APIs
> >>
> >> Nowadays, there are increasing requirements to benchmark the performance
> >> of dma_map and dma_unmap, particularly while the device is attached to an
> >> IOMMU.
> >>
> >> This patch enables the support. Users can run a specified number of
> >> threads to do dma_map_page and dma_unmap_page on a specific NUMA node
> >> with the specified duration. Then dma_map_benchmark will calculate the
> >> average latency for map and unmap.
> >>
> >> A difficulty for this benchmark is that dma_map/unmap APIs must run on a
> >> particular device. Each device might have different backend of IOMMU or
> >> non-IOMMU.
> >>
> >> So we use the driver_override to bind dma_map_benchmark to a particular
> >> device by:
> >> For platform devices:
> >> echo dma_map_benchmark > /sys/bus/platform/devices/xxx/driver_override
> >> echo xxx > /sys/bus/platform/drivers/xxx/unbind
> >> echo xxx > /sys/bus/platform/drivers/dma_map_benchmark/bind
> >>
>
> Hi Barry,
>
> >> For PCI devices:
> >> echo dma_map_benchmark > /sys/bus/pci/devices/:00:01.0/driver_override
> >> echo :00:01.0 > /sys/bus/pci/drivers/xxx/unbind
> >> echo :00:01.0 > /sys/bus/pci/drivers/dma_map_benchmark/bind
>
> Do we need to check if the device to which we attach actually has DMA
> mapping capability?

Hello John,

I'd like to think checking this here would be overdesign. We just give users
the freedom to bind any device they care about to the benchmark driver.
Usually that means real hardware, either behind an IOMMU or using a direct
mapping.

If for any reason users bind a wrong "device", that is their choice. Anyhow,
the code below will still handle it properly and users will get a report in
which everything is zero.

+static int map_benchmark_thread(void *data)
+{
	...
+	dma_addr = dma_map_single(map->dev, buf, PAGE_SIZE, DMA_BIDIRECTIONAL);
+	if (unlikely(dma_mapping_error(map->dev, dma_addr))) {
+		pr_err("dma_map_single failed on %s\n", dev_name(map->dev));
+		ret = -ENOMEM;
+		goto out;
+	}
	...
+}

> >>
> >> Cc: Joerg Roedel
> >> Cc: Will Deacon
> >> Cc: Shuah Khan
> >> Cc: Christoph Hellwig
> >> Cc: Marek Szyprowski
> >> Cc: Robin Murphy
> >> Signed-off-by: Barry Song
> >> ---
>
> Thanks,
> John

Thanks
Barry
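The three echo commands from the patch description can also be issued from a small C helper, which some test harnesses may find more convenient than a shell script. This is a minimal sketch: the sysfs paths follow the patch description, while override_path(), driver_attr_path(), write_attr() and bind_to_benchmark() are hypothetical helper names, and the bus, device and driver names are supplied by the caller.

```c
#include <stdio.h>

/* Build "/sys/bus/<bus>/devices/<dev>/driver_override" into buf. */
static int override_path(char *buf, size_t size, const char *bus,
			 const char *dev)
{
	return snprintf(buf, size, "/sys/bus/%s/devices/%s/driver_override",
			bus, dev);
}

/* Build "/sys/bus/<bus>/drivers/<drv>/<attr>" (attr is "bind" or "unbind"). */
static int driver_attr_path(char *buf, size_t size, const char *bus,
			    const char *drv, const char *attr)
{
	return snprintf(buf, size, "/sys/bus/%s/drivers/%s/%s", bus, drv, attr);
}

/* Write a string to a sysfs attribute; returns 0 on success, -1 on failure. */
static int write_attr(const char *path, const char *value)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;
	ok = fputs(value, f) >= 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}

/* Override the driver, unbind from the old driver, bind to the benchmark. */
static int bind_to_benchmark(const char *bus, const char *dev,
			     const char *old_drv)
{
	char path[256];

	override_path(path, sizeof(path), bus, dev);
	if (write_attr(path, "dma_map_benchmark"))
		return -1;
	driver_attr_path(path, sizeof(path), bus, old_drv, "unbind");
	if (write_attr(path, dev))
		return -1;
	driver_attr_path(path, sizeof(path), bus, "dma_map_benchmark", "bind");
	return write_attr(path, dev);
}
```

bind_to_benchmark() needs root and an existing device, and it stops at the first failed write so that a device is not unbound when the override could not be set.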
Re: [PATCH v3 1/2] dma-mapping: add benchmark support for streaming DMA APIs
On 10/11/2020 08:10, Song Bao Hua (Barry Song) wrote:
> Hello Robin, Christoph,
> Any further comment? John suggested that "depends on DEBUG_FS" should be
> added in Kconfig.
> I am collecting more comments to send v4 together with fixing this minor
> issue :-)
>
> Thanks
> Barry
>
>> -----Original Message-----
>> From: Song Bao Hua (Barry Song)
>> Sent: Monday, November 2, 2020 9:07 PM
>> To: iommu@lists.linux-foundation.org; h...@lst.de; robin.mur...@arm.com;
>> m.szyprow...@samsung.com
>> Cc: Linuxarm; linux-kselft...@vger.kernel.org; xuwei (O);
>> Song Bao Hua (Barry Song); Joerg Roedel; Will Deacon; Shuah Khan
>> Subject: [PATCH v3 1/2] dma-mapping: add benchmark support for streaming
>> DMA APIs
>>
>> Nowadays, there are increasing requirements to benchmark the performance
>> of dma_map and dma_unmap, particularly while the device is attached to an
>> IOMMU.
>>
>> This patch enables the support. Users can run a specified number of
>> threads to do dma_map_page and dma_unmap_page on a specific NUMA node
>> with the specified duration. Then dma_map_benchmark will calculate the
>> average latency for map and unmap.
>>
>> A difficulty for this benchmark is that dma_map/unmap APIs must run on a
>> particular device. Each device might have different backend of IOMMU or
>> non-IOMMU.
>>
>> So we use the driver_override to bind dma_map_benchmark to a particular
>> device by:
>> For platform devices:
>> echo dma_map_benchmark > /sys/bus/platform/devices/xxx/driver_override
>> echo xxx > /sys/bus/platform/drivers/xxx/unbind
>> echo xxx > /sys/bus/platform/drivers/dma_map_benchmark/bind
>>

Hi Barry,

>> For PCI devices:
>> echo dma_map_benchmark > /sys/bus/pci/devices/:00:01.0/driver_override
>> echo :00:01.0 > /sys/bus/pci/drivers/xxx/unbind
>> echo :00:01.0 > /sys/bus/pci/drivers/dma_map_benchmark/bind

Do we need to check if the device to which we attach actually has DMA
mapping capability?

>>
>> Cc: Joerg Roedel
>> Cc: Will Deacon
>> Cc: Shuah Khan
>> Cc: Christoph Hellwig
>> Cc: Marek Szyprowski
>> Cc: Robin Murphy
>> Signed-off-by: Barry Song
>> ---

Thanks,
John
RE: [PATCH v3 1/2] dma-mapping: add benchmark support for streaming DMA APIs
Hello Robin, Christoph,
Any further comment? John suggested that "depends on DEBUG_FS" should be
added in Kconfig.
I am collecting more comments to send v4 together with fixing this minor
issue :-)

Thanks
Barry

> -----Original Message-----
> From: Song Bao Hua (Barry Song)
> Sent: Monday, November 2, 2020 9:07 PM
> To: iommu@lists.linux-foundation.org; h...@lst.de; robin.mur...@arm.com;
> m.szyprow...@samsung.com
> Cc: Linuxarm; linux-kselft...@vger.kernel.org; xuwei (O);
> Song Bao Hua (Barry Song); Joerg Roedel; Will Deacon; Shuah Khan
> Subject: [PATCH v3 1/2] dma-mapping: add benchmark support for streaming
> DMA APIs
>
> Nowadays, there are increasing requirements to benchmark the performance
> of dma_map and dma_unmap, particularly while the device is attached to an
> IOMMU.
>
> This patch enables the support. Users can run a specified number of threads
> to do dma_map_page and dma_unmap_page on a specific NUMA node with the
> specified duration. Then dma_map_benchmark will calculate the average
> latency for map and unmap.
>
> A difficulty for this benchmark is that dma_map/unmap APIs must run on a
> particular device. Each device might have different backend of IOMMU or
> non-IOMMU.
>
> So we use the driver_override to bind dma_map_benchmark to a particular
> device by:
> For platform devices:
> echo dma_map_benchmark > /sys/bus/platform/devices/xxx/driver_override
> echo xxx > /sys/bus/platform/drivers/xxx/unbind
> echo xxx > /sys/bus/platform/drivers/dma_map_benchmark/bind
>
> For PCI devices:
> echo dma_map_benchmark > /sys/bus/pci/devices/:00:01.0/driver_override
> echo :00:01.0 > /sys/bus/pci/drivers/xxx/unbind
> echo :00:01.0 > /sys/bus/pci/drivers/dma_map_benchmark/bind
>
> Cc: Joerg Roedel
> Cc: Will Deacon
> Cc: Shuah Khan
> Cc: Christoph Hellwig
> Cc: Marek Szyprowski
> Cc: Robin Murphy
> Signed-off-by: Barry Song
> ---
> -v3:
>   * fix build issues reported by 0day kernel test robot
> -v2:
>   * add PCI support; v1 supported platform devices only
>   * replace ssleep by msleep_interruptible() to permit users to exit
>     benchmark before it is completed
>   * many changes according to Robin's suggestions, thanks! Robin
>     - add standard deviation output to reflect the worst case
>     - check users' parameters strictly like the number of threads
>     - make cache dirty before dma_map
>     - fix unpaired dma_map_page and dma_unmap_single;
>     - remove redundant "long long" before ktime_to_ns();
>     - use devm_add_action()
>
>  kernel/dma/Kconfig         |   8 +
>  kernel/dma/Makefile        |   1 +
>  kernel/dma/map_benchmark.c | 296 +
>  3 files changed, 305 insertions(+)
>  create mode 100644 kernel/dma/map_benchmark.c
>
> diff --git a/kernel/dma/Kconfig b/kernel/dma/Kconfig
> index c99de4a21458..949c53da5991 100644
> --- a/kernel/dma/Kconfig
> +++ b/kernel/dma/Kconfig
> @@ -225,3 +225,11 @@ config DMA_API_DEBUG_SG
>  	  is technically out-of-spec.
>
>  	  If unsure, say N.
> +
> +config DMA_MAP_BENCHMARK
> +	bool "Enable benchmarking of streaming DMA mapping"
> +	help
> +	  Provides /sys/kernel/debug/dma_map_benchmark that helps with testing
> +	  performance of dma_(un)map_page.
> +
> +	  See tools/testing/selftests/dma/dma_map_benchmark.c
> diff --git a/kernel/dma/Makefile b/kernel/dma/Makefile
> index dc755ab68aab..7aa6b26b1348 100644
> --- a/kernel/dma/Makefile
> +++ b/kernel/dma/Makefile
> @@ -10,3 +10,4 @@ obj-$(CONFIG_DMA_API_DEBUG)	+= debug.o
>  obj-$(CONFIG_SWIOTLB)			+= swiotlb.o
>  obj-$(CONFIG_DMA_COHERENT_POOL)	+= pool.o
>  obj-$(CONFIG_DMA_REMAP)		+= remap.o
> +obj-$(CONFIG_DMA_MAP_BENCHMARK)	+= map_benchmark.o
> diff --git a/kernel/dma/map_benchmark.c b/kernel/dma/map_benchmark.c
> new file mode 100644
> index ..dc4e5ff48a2d
> --- /dev/null
> +++ b/kernel/dma/map_benchmark.c
> @@ -0,0 +1,296 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (C) 2020 Hisilicon Limited.
> + */
> +
> +#define pr_fmt(fmt)	KBUILD_MODNAME ": " fmt
> +
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +
> +#define DMA_MAP_BENCHMARK	_IOWR('d', 1, struct map_benchmark)
> +#define DMA_MAP_MAX_THREADS	1024
> +#define DMA_MAP_MAX_SECONDS	300
> +
> +struct map_benchmark {
> +	__u64 avg_map_100ns; /* average map latency in 100ns */
> +	__u64 map_stddev; /* standard deviation of map latency */
> +	__u64 avg_unmap_100ns; /* as above */
> +	__u64 unmap_stddev;
> +	__u32 threads; /* how many threads will do map/unmap in parallel */
> +	__u32 seconds; /* how long the test will last */
> +	int node; /* which numa node this benchmark will run on */
> +	__u64 expansion[10];	/* For future use */
> +};
> +
> +struct map_benchmark_data {
> +	st
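For orientation, here is how a user-space caller might drive the interface defined above. It is a sketch under stated assumptions, not the selftest referenced in the Kconfig help: the struct layout, ioctl number, limits and debugfs path are taken from the quoted patch, while params_ok() and run_benchmark() are hypothetical helpers, and treating zero threads or zero seconds as invalid is an assumption because the kernel-side check is not visible in the truncated diff.

```c
#include <fcntl.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <unistd.h>

#define DMA_MAP_BENCHMARK	_IOWR('d', 1, struct map_benchmark)
#define DMA_MAP_MAX_THREADS	1024
#define DMA_MAP_MAX_SECONDS	300

/* User-space copy of the structure from the patch, using <stdint.h> types. */
struct map_benchmark {
	uint64_t avg_map_100ns;		/* average map latency in 100ns */
	uint64_t map_stddev;		/* standard deviation of map latency */
	uint64_t avg_unmap_100ns;	/* average unmap latency in 100ns */
	uint64_t unmap_stddev;		/* standard deviation of unmap latency */
	uint32_t threads;		/* threads doing map/unmap in parallel */
	uint32_t seconds;		/* how long the test will last */
	int node;			/* NUMA node the benchmark runs on */
	uint64_t expansion[10];		/* for future use */
};

/* Hypothetical pre-check mirroring the limits defined in the patch. */
static int params_ok(const struct map_benchmark *p)
{
	return p->threads >= 1 && p->threads <= DMA_MAP_MAX_THREADS &&
	       p->seconds >= 1 && p->seconds <= DMA_MAP_MAX_SECONDS;
}

/* Run one benchmark; results are written back into *p on success. */
static int run_benchmark(struct map_benchmark *p)
{
	int fd, ret;

	if (!params_ok(p))
		return -1;
	fd = open("/sys/kernel/debug/dma_map_benchmark", O_RDWR);
	if (fd < 0)
		return -1;
	ret = ioctl(fd, DMA_MAP_BENCHMARK, p);
	close(fd);
	return ret;
}
```

A caller would fill in threads, seconds and node, call run_benchmark(), and then read the four latency fields; this requires debugfs to be mounted and a device bound to dma_map_benchmark.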
RE: [PATCH v3 1/2] dma-mapping: add benchmark support for streaming DMA APIs
> -----Original Message-----
> From: John Garry
> Sent: Monday, November 2, 2020 10:19 PM
> To: Song Bao Hua (Barry Song); iommu@lists.linux-foundation.org;
> h...@lst.de; robin.mur...@arm.com; m.szyprow...@samsung.com
> Cc: linux-kselft...@vger.kernel.org; Shuah Khan; Joerg Roedel; Linuxarm;
> xuwei (O); Will Deacon
> Subject: Re: [PATCH v3 1/2] dma-mapping: add benchmark support for
> streaming DMA APIs
>
> On 02/11/2020 08:06, Barry Song wrote:
> > Nowadays, there are increasing requirements to benchmark the performance
> > of dma_map and dma_unmap, particularly while the device is attached to an
> > IOMMU.
> >
> > This patch enables the support. Users can run a specified number of
> > threads to do dma_map_page and dma_unmap_page on a specific NUMA node
> > with the specified duration. Then dma_map_benchmark will calculate the
> > average latency for map and unmap.
> >
> > A difficulty for this benchmark is that dma_map/unmap APIs must run on
> > a particular device. Each device might have different backend of IOMMU or
> > non-IOMMU.
> >
> > So we use the driver_override to bind dma_map_benchmark to a particular
> > device by:
> > For platform devices:
> > echo dma_map_benchmark > /sys/bus/platform/devices/xxx/driver_override
> > echo xxx > /sys/bus/platform/drivers/xxx/unbind
> > echo xxx > /sys/bus/platform/drivers/dma_map_benchmark/bind
> >
> > For PCI devices:
> > echo dma_map_benchmark > /sys/bus/pci/devices/:00:01.0/driver_override
> > echo :00:01.0 > /sys/bus/pci/drivers/xxx/unbind
> > echo :00:01.0 > /sys/bus/pci/drivers/dma_map_benchmark/bind
> >
> > Cc: Joerg Roedel
> > Cc: Will Deacon
> > Cc: Shuah Khan
> > Cc: Christoph Hellwig
> > Cc: Marek Szyprowski
> > Cc: Robin Murphy
> > Signed-off-by: Barry Song
> > ---
> > -v3:
> >   * fix build issues reported by 0day kernel test robot
> > -v2:
> >   * add PCI support; v1 supported platform devices only
> >   * replace ssleep by msleep_interruptible() to permit users to exit
> >     benchmark before it is completed
> >   * many changes according to Robin's suggestions, thanks! Robin
> >     - add standard deviation output to reflect the worst case
> >     - check users' parameters strictly like the number of threads
> >     - make cache dirty before dma_map
> >     - fix unpaired dma_map_page and dma_unmap_single;
> >     - remove redundant "long long" before ktime_to_ns();
> >     - use devm_add_action()
> >
> >  kernel/dma/Kconfig         |   8 +
> >  kernel/dma/Makefile        |   1 +
> >  kernel/dma/map_benchmark.c | 296 +
> >  3 files changed, 305 insertions(+)
> >  create mode 100644 kernel/dma/map_benchmark.c
> >
> > diff --git a/kernel/dma/Kconfig b/kernel/dma/Kconfig
> > index c99de4a21458..949c53da5991 100644
> > --- a/kernel/dma/Kconfig
> > +++ b/kernel/dma/Kconfig
> > @@ -225,3 +225,11 @@ config DMA_API_DEBUG_SG
> >  	  is technically out-of-spec.
> >
> >  	  If unsure, say N.
> > +
> > +config DMA_MAP_BENCHMARK
> > +	bool "Enable benchmarking of streaming DMA mapping"
> > +	help
> > +	  Provides /sys/kernel/debug/dma_map_benchmark that helps with testing
> > +	  performance of dma_(un)map_page.
>
> Since this is a driver, any reason for which it cannot be loadable? If
> so, it seems any functionality would depend on DEBUG FS, I figure that's
> just how we work for debugfs.

We depend on kthread_bind_mask(), which isn't an exported symbol.
Maybe it is worth sending a patch to export it?

>
> Thanks,
> John
>
> > +
> > +	  See tools/testing/selftests/dma/dma_map_benchmark.c
> > diff --git a/kernel/dma/Makefile b/kernel/dma/Makefile
> > index dc755ab68aab..7aa6b26b1348 100644
> > --- a/kernel/dma/Makefile
> > +++ b/kernel/dma/Makefile

Thanks
Barry
Re: [PATCH v3 1/2] dma-mapping: add benchmark support for streaming DMA APIs
On 02/11/2020 08:06, Barry Song wrote:
> Nowadays, there are increasing requirements to benchmark the performance
> of dma_map and dma_unmap, particularly while the device is attached to an
> IOMMU.
>
> This patch enables the support. Users can run a specified number of threads
> to do dma_map_page and dma_unmap_page on a specific NUMA node with the
> specified duration. Then dma_map_benchmark will calculate the average
> latency for map and unmap.
>
> A difficulty for this benchmark is that dma_map/unmap APIs must run on
> a particular device. Each device might have different backend of IOMMU or
> non-IOMMU.
>
> So we use the driver_override to bind dma_map_benchmark to a particular
> device by:
> For platform devices:
> echo dma_map_benchmark > /sys/bus/platform/devices/xxx/driver_override
> echo xxx > /sys/bus/platform/drivers/xxx/unbind
> echo xxx > /sys/bus/platform/drivers/dma_map_benchmark/bind
>
> For PCI devices:
> echo dma_map_benchmark > /sys/bus/pci/devices/:00:01.0/driver_override
> echo :00:01.0 > /sys/bus/pci/drivers/xxx/unbind
> echo :00:01.0 > /sys/bus/pci/drivers/dma_map_benchmark/bind
>
> Cc: Joerg Roedel
> Cc: Will Deacon
> Cc: Shuah Khan
> Cc: Christoph Hellwig
> Cc: Marek Szyprowski
> Cc: Robin Murphy
> Signed-off-by: Barry Song
> ---
> -v3:
>   * fix build issues reported by 0day kernel test robot
> -v2:
>   * add PCI support; v1 supported platform devices only
>   * replace ssleep by msleep_interruptible() to permit users to exit
>     benchmark before it is completed
>   * many changes according to Robin's suggestions, thanks! Robin
>     - add standard deviation output to reflect the worst case
>     - check users' parameters strictly like the number of threads
>     - make cache dirty before dma_map
>     - fix unpaired dma_map_page and dma_unmap_single;
>     - remove redundant "long long" before ktime_to_ns();
>     - use devm_add_action()
>
>  kernel/dma/Kconfig         |   8 +
>  kernel/dma/Makefile        |   1 +
>  kernel/dma/map_benchmark.c | 296 +
>  3 files changed, 305 insertions(+)
>  create mode 100644 kernel/dma/map_benchmark.c
>
> diff --git a/kernel/dma/Kconfig b/kernel/dma/Kconfig
> index c99de4a21458..949c53da5991 100644
> --- a/kernel/dma/Kconfig
> +++ b/kernel/dma/Kconfig
> @@ -225,3 +225,11 @@ config DMA_API_DEBUG_SG
>  	  is technically out-of-spec.
>
>  	  If unsure, say N.
> +
> +config DMA_MAP_BENCHMARK
> +	bool "Enable benchmarking of streaming DMA mapping"
> +	help
> +	  Provides /sys/kernel/debug/dma_map_benchmark that helps with testing
> +	  performance of dma_(un)map_page.

Since this is a driver, any reason for which it cannot be loadable? If
so, it seems any functionality would depend on DEBUG FS, I figure that's
just how we work for debugfs.

Thanks,
John

> +
> +	  See tools/testing/selftests/dma/dma_map_benchmark.c
> diff --git a/kernel/dma/Makefile b/kernel/dma/Makefile
> index dc755ab68aab..7aa6b26b1348 100644
> --- a/kernel/dma/Makefile
> +++ b/kernel/dma/Makefile
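The changelog above mentions reporting an average latency plus a standard deviation. One common way to compute both without storing every sample is to accumulate a count, a sum and a sum of squares, and derive the variance as the mean of the squares minus the square of the mean. The sketch below illustrates that approach in user space; it is not claimed to be the code in the patch, and struct latency_acc together with all helper names is hypothetical.

```c
#include <stdint.h>

/* Hypothetical accumulator; samples are latencies in units of 100ns. */
struct latency_acc {
	uint64_t n;		/* number of samples */
	uint64_t sum;		/* sum of samples */
	uint64_t sum_sq;	/* sum of squared samples */
};

static void latency_add(struct latency_acc *a, uint64_t sample)
{
	a->n++;
	a->sum += sample;
	a->sum_sq += sample * sample;
}

/* Integer square root (rounds down), avoiding floating point. */
static uint64_t isqrt_u64(uint64_t x)
{
	uint64_t r = 0, bit = 1ULL << 62;

	while (bit > x)
		bit >>= 2;
	while (bit) {
		if (x >= r + bit) {
			x -= r + bit;
			r = (r >> 1) + bit;
		} else {
			r >>= 1;
		}
		bit >>= 2;
	}
	return r;
}

static uint64_t latency_mean(const struct latency_acc *a)
{
	return a->n ? a->sum / a->n : 0;
}

/* Population standard deviation from the running sums. */
static uint64_t latency_stddev(const struct latency_acc *a)
{
	uint64_t mean, variance;

	if (!a->n)
		return 0;
	mean = a->sum / a->n;
	variance = a->sum_sq / a->n - mean * mean;
	return isqrt_u64(variance);
}
```

Integer arithmetic keeps the sketch close to what kernel code can do, at the cost of rounding down; very long runs would also need to consider overflow of the squared sum.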