Re: [PATCH] uio: Replace mutex info_lock with percpu_ref to improve performance
On Tue, Feb 08, 2022 at 03:19:20PM +0800, Guixin Liu wrote: > This patch includes a modification to repace mutex info_lock with > percpu_ref, in order to improve uio performance. What performance critical use case do you have for uio? Everyone really should be using vfio these days due to the large amount of shortcomings in the uio interface. > > Reviewed-by: Xiaoguang Wang > Signed-off-by: Guixin Liu > --- > drivers/uio/uio.c | 95 > ++ > include/linux/uio_driver.h | 5 ++- > 2 files changed, 75 insertions(+), 25 deletions(-) > > diff --git a/drivers/uio/uio.c b/drivers/uio/uio.c > index 43afbb7..0cc0655 100644 > --- a/drivers/uio/uio.c > +++ b/drivers/uio/uio.c > @@ -24,6 +24,8 @@ > #include > #include > #include > +#include > +#include > > #define UIO_MAX_DEVICES (1U << MINORBITS) > > @@ -218,7 +220,9 @@ static ssize_t name_show(struct device *dev, > struct uio_device *idev = dev_get_drvdata(dev); > int ret; > > - mutex_lock(>info_lock); > + if (!percpu_ref_tryget_live(>info_ref)) > + return -EINVAL; > + > if (!idev->info) { > ret = -EINVAL; > dev_err(dev, "the device has been unregistered\n"); > @@ -228,7 +232,7 @@ static ssize_t name_show(struct device *dev, > ret = sprintf(buf, "%s\n", idev->info->name); > > out: > - mutex_unlock(>info_lock); > + percpu_ref_put(>info_ref); > return ret; > } > static DEVICE_ATTR_RO(name); > @@ -239,7 +243,9 @@ static ssize_t version_show(struct device *dev, > struct uio_device *idev = dev_get_drvdata(dev); > int ret; > > - mutex_lock(>info_lock); > + if (!percpu_ref_tryget_live(>info_ref)) > + return -EINVAL; > + > if (!idev->info) { > ret = -EINVAL; > dev_err(dev, "the device has been unregistered\n"); > @@ -249,7 +255,7 @@ static ssize_t version_show(struct device *dev, > ret = sprintf(buf, "%s\n", idev->info->version); > > out: > - mutex_unlock(>info_lock); > + percpu_ref_put(>info_ref); > return ret; > } > static DEVICE_ATTR_RO(version); > @@ -489,16 +495,20 @@ static int uio_open(struct inode *inode, struct file > *filep) > listener->event_count = atomic_read(>event); > filep->private_data = listener; > > - mutex_lock(>info_lock); > + if (!percpu_ref_tryget_live(>info_ref)) { > + ret = -EINVAL; > + goto err_infoopen; > + } > + > if (!idev->info) { > - mutex_unlock(>info_lock); > + percpu_ref_put(>info_ref); > ret = -EINVAL; > goto err_infoopen; > } > > if (idev->info->open) > ret = idev->info->open(idev->info, inode); > - mutex_unlock(>info_lock); > + percpu_ref_put(>info_ref); > if (ret) > goto err_infoopen; > > @@ -531,10 +541,12 @@ static int uio_release(struct inode *inode, struct file > *filep) > struct uio_listener *listener = filep->private_data; > struct uio_device *idev = listener->dev; > > - mutex_lock(>info_lock); > + if (!percpu_ref_tryget_live(>info_ref)) > + return -EINVAL; > + > if (idev->info && idev->info->release) > ret = idev->info->release(idev->info, inode); > - mutex_unlock(>info_lock); > + percpu_ref_put(>info_ref); > > module_put(idev->owner); > kfree(listener); > @@ -548,10 +560,12 @@ static __poll_t uio_poll(struct file *filep, poll_table > *wait) > struct uio_device *idev = listener->dev; > __poll_t ret = 0; > > - mutex_lock(>info_lock); > + if (!percpu_ref_tryget_live(>info_ref)) > + return -EINVAL; > + > if (!idev->info || !idev->info->irq) > ret = -EIO; > - mutex_unlock(>info_lock); > + percpu_ref_put(>info_ref); > > if (ret) > return ret; > @@ -577,13 +591,17 @@ static ssize_t uio_read(struct file *filep, char __user > *buf, > add_wait_queue(>wait, ); > > do { > - mutex_lock(>info_lock); > + if (!percpu_ref_tryget_live(>info_ref)) { > + retval = -EINVAL; > + break; > + } > + > if (!idev->info || !idev->info->irq) { > retval = -EIO; > - mutex_unlock(>info_lock); > + percpu_ref_put(>info_ref); > break; > } > - mutex_unlock(>info_lock); > + percpu_ref_put(>info_ref); > > set_current_state(TASK_INTERRUPTIBLE); > > @@ -631,7 +649,9 @@ static ssize_t uio_write(struct file *filep, const char > __user *buf, > if (copy_from_user(_on, buf, count)) > return -EFAULT; > > - mutex_lock(>info_lock); > + if (!percpu_ref_tryget_live(>info_ref)) > + return -EINVAL; > + >
Re: [PATCH v2 1/1] iommu/dma: Use DMA ops setter instead of direct assignment
On Mon, Feb 07, 2022 at 03:55:32PM +, Robin Murphy wrote: > On 2022-02-07 14:13, Andy Shevchenko wrote: > > Use DMA ops setter instead of direct assignment. Even we know that > > this module doesn't perform access to the dma_ops member of struct device, > > it's better to use setter to avoid potential problems in the future. > > What potential problems are you imagining? This whole file is a DMA ops > implementation, not driver code (and definitely not a module); if anyone > removes the "select DMA_OPS" from CONFIG_IOMMU_DMA they deserve whatever > breakage they get. > > I concur that there's no major harm in using the helper here, but I also see > no point in pretending that there's any value to abstracting the operation > in this particular context. Yeah. Killing off the the wrapper is actually on my todo list, mostly because it leads to people doing completely broken things like the VDPA private dma ops that should not exist. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 10/10] iommu: Split struct iommu_ops
On Tue, Feb 08, 2022 at 09:25:59AM +0800, Lu Baolu wrote: > Move the domain specific operations out of struct iommu_ops into a new > structure that only has domain specific operations. This solves the > problem of needing to know if the method vector for a given operation > needs to be retrieved from the device or the domain. Logically the domain > ops are the ones that make sense for external subsystems and endpoint > drivers to use, while device ops, with the sole exception of domain_alloc, > are IOMMU API internals. I can't say I like the default_domain_ops concept all that much, but the split itself looks like a good idea and done nicely. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 09/10] iommu: Use dev_iommu_ops() helper
On Tue, Feb 08, 2022 at 09:25:58AM +0800, Lu Baolu wrote: > Convert all the feasible instances of dev->bus->iommu_ops to > dev_iommu_ops() in order to making the operation of obtaining > iommu_ops from a device consistent. I'm not a native speaker, but I think this should read ".. in order to make .." > void iommu_get_resv_regions(struct device *dev, struct list_head *list) > { > - const struct iommu_ops *ops = dev->bus->iommu_ops; > + const struct iommu_ops *ops = dev_iommu_ops(dev); > > if (ops && ops->get_resv_regions) dev_iommu_ops warns on a NULL ops, so we either don'tneed the ops check here or have another problem. Same in a few more spots. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 08/10] iommu: Remove unused argument in is_attach_deferred
Looks good, Reviewed-by: Christoph Hellwig ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 07/10] iommu: Use right way to retrieve iommu_ops
Looks good, Reviewed-by: Christoph Hellwig ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 05/10] iommu: Remove apply_resv_region
Looks good, Reviewed-by: Christoph Hellwig ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] dt-bindings: iommu: renesas,ipmmu-vmsa: Reformat renesas,ipmmu-main description
On Wed, Jan 26, 2022 at 01:24:32PM +0100, Geert Uytterhoeven wrote: > Remove trailing whitespace and break overly long lines. > > Signed-off-by: Geert Uytterhoeven > --- > .../devicetree/bindings/iommu/renesas,ipmmu-vmsa.yaml | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) I fixed this and other whitespace in my tree from commit 'dt-bindings: Improve phandle-array schemas'. Rob ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 1/2] dt-bindings: mediatek: mt8186: Add binding for MM iommu
On Tue, 25 Jan 2022 17:32:43 +0800, Yong Wu wrote: > Add mt8186 iommu binding. "-mm" means the iommu is for Multimedia. > > Signed-off-by: Yong Wu > --- > .../bindings/iommu/mediatek,iommu.yaml| 4 + > .../dt-bindings/memory/mt8186-memory-port.h | 217 ++ > 2 files changed, 221 insertions(+) > create mode 100644 include/dt-bindings/memory/mt8186-memory-port.h > Reviewed-by: Rob Herring ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Error when running fio against nvme-of rdma target (mlx5 driver)
Hello, We have been hitting an error when running IO over our nvme-of setup, using the mlx5 driver and we are wondering if anyone has seen anything similar/has any suggestions. Both initiator and target are AMD EPYC 7502 machines connected over RDMA using a Mellanox MT28908. Target has 12 NVMe SSDs which are exposed as a single NVMe fabrics device, one physical SSD per namespace. When running an fio job targeting directly the fabrics devices (no filesystem, see script at the end), within a minute or so we start seeing errors like this: [ 408.368677] mlx5_core :c1:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x002f address=0x24d08000 flags=0x] [ 408.372201] infiniband mlx5_0: mlx5_handle_error_cqe:332:(pid 0): WC error: 4, Message: local protection error [ 408.380181] infiniband mlx5_0: dump_cqe:272:(pid 0): dump error cqe [ 408.380187] : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 408.380189] 0010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 408.380191] 0020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 408.380192] 0030: 00 00 00 00 a9 00 56 04 00 00 01 e9 00 54 e8 e2 [ 408.380230] nvme nvme15: RECV for CQE 0xce392ed9 failed with status local protection error (4) [ 408.380235] nvme nvme15: starting error recovery [ 408.380238] nvme_ns_head_submit_bio: 726 callbacks suppressed [ 408.380246] block nvme15n2: no usable path - requeuing I/O [ 408.380284] block nvme15n5: no usable path - requeuing I/O [ 408.380298] block nvme15n1: no usable path - requeuing I/O [ 408.380304] block nvme15n11: no usable path - requeuing I/O [ 408.380304] block nvme15n11: no usable path - requeuing I/O [ 408.380330] block nvme15n1: no usable path - requeuing I/O [ 408.380350] block nvme15n2: no usable path - requeuing I/O [ 408.380371] block nvme15n6: no usable path - requeuing I/O [ 408.380377] block nvme15n6: no usable path - requeuing I/O [ 408.380382] block nvme15n4: no usable path - requeuing I/O [ 408.380472] mlx5_core :c1:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x002f address=0x24d09000 flags=0x] [ 408.391265] mlx5_core :c1:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x002f address=0x24d0a000 flags=0x] [ 415.125967] nvmet: ctrl 1 keep-alive timer (5 seconds) expired! [ 415.131898] nvmet: ctrl 1 fatal error occurred! Occasionally, we've seen the following stack trace: [ 1158.152464] kernel BUG at drivers/iommu/amd/io_pgtable.c:485! [ 1158.427696] invalid opcode: [#1] SMP NOPTI [ 1158.432228] CPU: 51 PID: 796 Comm: kworker/51:1H Tainted: P OE 5.13.0-eid-athena-g6fb4e704d11c-dirty #14 [ 1158.443867] Hardware name: GIGABYTE R272-Z32-00/MZ32-AR0-00, BIOS R21 10/08/2020 [ 1158.451252] Workqueue: ib-comp-wq ib_cq_poll_work [ib_core] [ 1158.456884] RIP: 0010:iommu_v1_unmap_page+0xed/0x100 [ 1158.461849] Code: 48 8b 45 d0 65 48 33 04 25 28 00 00 00 75 1d 48 83 c4 10 4c 89 f0 5b 41 5c 41 5d 41 5e 41 5f 5d c3 49 8d 46 ff 4c 85 f0 74 d6 <0f> 0b e8 1c 38 46 00 66 66 2e 0f 1f 84 00 00 00 00 00 90 0f 1f 44 [ 1158.480589] RSP: 0018:abb520587bd0 EFLAGS: 00010206 [ 1158.485812] RAX: 000100061fff RBX: 0010 RCX: 0027 [ 1158.492938] RDX: 30562000 RSI: RDI: [ 1158.500071] RBP: abb520587c08 R08: abb520587bd0 R09: [ 1158.507202] R10: 0001 R11: 000ff000 R12: 9984abd9e318 [ 1158.514326] R13: 9984abd9e310 R14: 000100062000 R15: 0001 [ 1158.521452] FS: () GS:99a36c8c() knlGS: [ 1158.529540] CS: 0010 DS: ES: CR0: 80050033 [ 1158.535286] CR2: 7f75b04f1000 CR3: 0001eddd8000 CR4: 00350ee0 [ 1158.542419] Call Trace: [ 1158.544877] amd_iommu_unmap+0x2c/0x40 [ 1158.548653] __iommu_unmap+0xc4/0x170 [ 1158.552344] iommu_unmap_fast+0xe/0x10 [ 1158.556100] __iommu_dma_unmap+0x85/0x120 [ 1158.560115] iommu_dma_unmap_sg+0x95/0x110 [ 1158.564213] dma_unmap_sg_attrs+0x42/0x50 [ 1158.568225] rdma_rw_ctx_destroy+0x6e/0xc0 [ib_core] [ 1158.573201] nvmet_rdma_rw_ctx_destroy+0xa7/0xc0 [nvmet_rdma] [ 1158.578944] nvmet_rdma_read_data_done+0x5c/0xf0 [nvmet_rdma] [ 1158.584683] __ib_process_cq+0x8e/0x150 [ib_core] [ 1158.589398] ib_cq_poll_work+0x2b/0x80 [ib_core] [ 1158.594027] process_one_work+0x220/0x3c0 [ 1158.598038] worker_thread+0x4d/0x3f0 [ 1158.601696] kthread+0x114/0x150 [ 1158.604928] ? process_one_work+0x3c0/0x3c0 [ 1158.609114] ? kthread_park+0x90/0x90 [ 1158.612783] ret_from_fork+0x22/0x30 We first saw this on a 5.13 kernel but could reproduce with 5.17-rc2. We found a possibly related bug report [1] that suggested disabling the IOMMU could help, but even after I disabled it (amd_iommu=off iommu=off) I still get errors (nvme IO timeouts). Another thread from 2016[2] suggested that disabling some kernel debug options could workaround the "local protection error" but
Re: iommu/amd: bug report: page table memory leak
Hi Daniel, On 1/19/2022 2:47 AM, Daniel Jordan wrote: Hi, I've hit a memory leak while testing qemu v6.2.0-rc4 on an AMD EPYC 7J13 (Milan) system. Starting an almost 1T guest, the leak is over 1.5G per qemu invocation. I haven't checked whether the leak is proportional to guest size. It happens with a vfio device, and only when the guest's memory is preallocated using qemu prealloc (this latter part is kinda strange). It happens when the guest memory uses THP but not hugetlb. Bisection: # bad: [df0cc57e057f18e44dac8e6c18aba47ab53202f9] Linux 5.16 # good: [f40ddce88593482919761f74910f42f4b84c004b] Linux 5.11 git bisect start 'df0cc57e057f1' 'f40ddce885934' '--' 'drivers/vfio' 'drivers/iommu' 'include/linux/amd-iommu.h' 'include/linux/dma-iommu.h' 'include/linux/intel-iommu.h' 'include/linux/iommu-helper.h' 'include/linux/of_iommu.h' 'include/ linux/omap-iommu.h' 'include/linux/platform_data/iommu-omap.h' 'include/linux/iommu.h' 'include/trace/events/intel_iommu.h' 'include/trace/events/iommu.h' 'include/uapi/linux/iommu.h' 'include/uapi/linux/virtio_iommu.h' 'arch/x86/events/a md/iommu.h' 'arch/x86/events/amd/iommu.c' 'arch/x86/include/asm/iommu.h' 'arch/x86/include/asm/iommu_table.h' 'arch/x86/kernel/pci-iommu_table.c' # bad: [cee57d4fe74e82e784f6566bad3e3bb1ca51a211] iommu/vt-d: Remove unnecessary braces git bisect bad cee57d4fe74e82e784f6566bad3e3bb1ca51a211 # bad: [9fb5fad562fa0a41c84691714d99c23f54168a9e] iommu: remove DOMAIN_ATTR_PAGING git bisect bad 9fb5fad562fa0a41c84691714d99c23f54168a9e # bad: [45e606f2726926b04094e1c9bf809bca4884c57f] Merge branches 'arm/renesas', 'arm/smmu', 'x86/amd', 'x86/vt-d' and 'core' into next git bisect bad 45e606f2726926b04094e1c9bf809bca4884c57f # good: [7060377ce06f9cd3ed6274c0f2310463feb5baec] Merge branch 'for-joerg/mtk' into for-joerg/arm-smmu/updates git bisect good 7060377ce06f9cd3ed6274c0f2310463feb5baec # bad: [6778ff5b21bd8e78c8bd547fd66437cf2657fd9b] iommu/amd: Fix performance counter initialization git bisect bad 6778ff5b21bd8e78c8bd547fd66437cf2657fd9b # good: [f9b4df790aa4372bfa11b7d212e537b763295429] iommu/amd: Declare functions as extern git bisect good f9b4df790aa4372bfa11b7d212e537b763295429 # bad: [33aef9786046d9a5744cd1e8d5d0ce800d611fdc] iommu/amd: Rename variables to be consistent with struct io_pgtable_ops git bisect bad 33aef9786046d9a5744cd1e8d5d0ce800d611fdc # bad: [e42ba0633064ef23eb1c8c21edf96bac1541bd4b] iommu/amd: Restructure code for freeing page table git bisect bad e42ba0633064ef23eb1c8c21edf96bac1541bd4b # good: [18954252a1d0b12e1b77087b55c37fb43b09e12a] iommu/amd: Move IO page table related functions git bisect good 18954252a1d0b12e1b77087b55c37fb43b09e12a # first bad commit: [e42ba0633064ef23eb1c8c21edf96bac1541bd4b] iommu/amd: Restructure code for freeing page table commit e42ba0633064ef23eb1c8c21edf96bac1541bd4b Author: Suravee Suthikulpanit Date: Tue Dec 15 01:36:59 2020 -0600 iommu/amd: Restructure code for freeing page table By consolidate logic into v1_free_pgtable helper function, which is called from IO page table framework. Signed-off-by: Suravee Suthikulpanit Link: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fr%2F20201215073705.123786-8-suravee.suthikulpanit%40amd.comdata=04%7C01%7Csuravee.suthikulpanit%40amd.com%7C143de50116b0431302ce08d9dabb5dab%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637781323390114388%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000sdata=RK%2F8spS7L5qdvaBYx0OE9T75TOfb9QiA8%2BKk4C00Jqo%3Dreserved=0 Signed-off-by: Joerg Roedel drivers/iommu/amd/amd_iommu.h | 1 - drivers/iommu/amd/io_pgtable.c | 41 - drivers/iommu/amd/iommu.c | 21 - 3 files changed, 28 insertions(+), 35 deletions(-) Qemu command line: numactl -m 1 -N 1 "$QEMU" \ -name vmol74 \ -machine q35,accel=kvm,usb=off,dump-guest-core=off,memory-backend=pc.ram \ -cpu host,host-phys-bits=true \ -smp cpus=32 \ -no-user-config \ -nodefaults \ -rtc base=utc,driftfix=slew \ -global kvm-pit.lost_tick_policy=delay \ -no-hpet \ -no-shutdown \ -boot strict=on \ -drive file=${vm_image},format=raw,if=none,id=drive-ide0-0-0 \ -device ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=2 \ -device vfio-pci,host=${pci_addr},id=net2,bus=pcie.0 \ -msg timestamp=on \ -nographic \ -object memory-backend-ram,id=pc.ram,size=980g,prealloc=on,prealloc-threads=16 -m 980g \ -daemonize Kernel config
Re: [PATCH] iommu/arm-smmu-v3: fix event handling soft lockup
On Wed, 19 Jan 2022 07:07:54 +, Zhou Guanghui wrote: > During event processing, events are read from the event queue one > by one until the queue is empty.If the master device continuously > requests address access at the same time and the SMMU generates > events, the cyclic processing of the event takes a long time and > softlockup warnings may be reported. > > arm-smmu-v3 arm-smmu-v3.34.auto: event 0x0a received: > arm-smmu-v3 arm-smmu-v3.34.auto: 0x7f22280a > arm-smmu-v3 arm-smmu-v3.34.auto: 0x107e > arm-smmu-v3 arm-smmu-v3.34.auto: 0x034e8670 > watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [irq/268-arm-smm:247] > Call trace: > _dev_info+0x7c/0xa0 > arm_smmu_evtq_thread+0x1c0/0x230 > irq_thread_fn+0x30/0x80 > irq_thread+0x128/0x210 > kthread+0x134/0x138 > ret_from_fork+0x10/0x1c > Kernel panic - not syncing: softlockup: hung tasks > > [...] Applied to will (for-joerg/arm-smmu/updates), thanks! [1/1] iommu/arm-smmu-v3: fix event handling soft lockup https://git.kernel.org/will/c/30de2b541af9 Cheers, -- Will https://fixes.arm64.dev https://next.arm64.dev https://will.arm64.dev ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] iommu/arm-smmu: Add missing pm_runtime_disable() in qcom_iommu_device_probe
On Wed, 5 Jan 2022 10:16:19 +, Miaoqian Lin wrote: > If the probe fails, we should use pm_runtime_disable() to balance > pm_runtime_enable(). > Add missing pm_runtime_disable() for error handling. > > Applied to will (for-joerg/arm-smmu/updates), thanks! [1/1] iommu/arm-smmu: Add missing pm_runtime_disable() in qcom_iommu_device_probe https://git.kernel.org/will/c/93665e0275a2 Cheers, -- Will https://fixes.arm64.dev https://next.arm64.dev https://will.arm64.dev ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 1/2] iommu/arm-smmu-v3: Avoid open coded arithmetic in memory allocation
On Mon, 7 Feb 2022 23:50:48 +0100, Christophe JAILLET wrote: > kmalloc_array()/kcalloc() should be used to avoid potential overflow when > a multiplication is needed to compute the size of the requested memory. > > So turn a devm_kzalloc()+explicit size computation into an equivalent > devm_kcalloc(). > > > [...] Applied to will (for-joerg/arm-smmu/updates), thanks! [1/2] iommu/arm-smmu-v3: Avoid open coded arithmetic in memory allocation https://git.kernel.org/will/c/98b64741d611 [2/2] iommu/arm-smmu-v3: Simplify memory allocation https://git.kernel.org/will/c/fcdeb8c34043 Cheers, -- Will https://fixes.arm64.dev https://next.arm64.dev https://will.arm64.dev ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 1/2] iommu/arm-smmu: Use platform_irq_count() to get the interrupt count
On Tue, Feb 08, 2022 at 03:28:50PM +, Robin Murphy wrote: > On 2022-02-08 15:19, Will Deacon wrote: > > On Thu, Dec 23, 2021 at 02:14:35PM +, Robin Murphy wrote: > > > On 2021-12-23 13:00, Lad Prabhakar wrote: > > > > platform_get_resource(pdev, IORESOURCE_IRQ, ..) relies on static > > > > allocation of IRQ resources in DT core code, this causes an issue > > > > when using hierarchical interrupt domains using "interrupts" property > > > > in the node as this bypasses the hierarchical setup and messes up the > > > > irq chaining. > > > > > > > > In preparation for removal of static setup of IRQ resource from DT core > > > > code use platform_get_irq_count(). > > > > > > Nit: platform_irq_count() > > > > > > > Signed-off-by: Lad Prabhakar > > > > --- > > > >drivers/iommu/arm/arm-smmu/arm-smmu.c | 12 ++-- > > > >1 file changed, 6 insertions(+), 6 deletions(-) > > > > > > > > diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c > > > > b/drivers/iommu/arm/arm-smmu/arm-smmu.c > > > > index 4bc75c4ce402..4844cd075644 100644 > > > > --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c > > > > +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c > > > > @@ -2105,12 +2105,12 @@ static int arm_smmu_device_probe(struct > > > > platform_device *pdev) > > > > if (IS_ERR(smmu)) > > > > return PTR_ERR(smmu); > > > > - num_irqs = 0; > > > > - while ((res = platform_get_resource(pdev, IORESOURCE_IRQ, > > > > num_irqs))) { > > > > - num_irqs++; > > > > - if (num_irqs > smmu->num_global_irqs) > > > > - smmu->num_context_irqs++; > > > > - } > > > > + num_irqs = platform_irq_count(pdev); > > > > + if (num_irqs < 0) > > > > + return num_irqs; > > > > + > > > > + if (num_irqs > smmu->num_global_irqs) > > > > + smmu->num_context_irqs += (num_irqs - > > > > smmu->num_global_irqs); > > > > > > This seems a bit overcomplicated. I reckon: > > > > > > smmu->num_context_irqs = num_irqs - smmu->num_global_irqs; > > > if (num_irqs <= smmu->num_global_irqs) { > > > dev_err(... > > > > > > should do it. > > > > > > However, FYI I have some patches refactoring most of the IRQ stuff here > > > that > > > I plan to post next cycle (didn't quite have time to get them done for > > > 5.17 > > > as I'd hoped...), so unless this needs to go in right now as an urgent > > > fix, > > > I'm happy to take care of removing platform_get_resource() as part of that > > > if it's easier. > > > > Did you get anywhere with this? December 23rd is long forgotten by now ;) > > Yup: > https://gitlab.arm.com/linux-arm/linux-rm/-/commit/b2a40caaf1622eb35c555074a0d72f4f0513cff9 > > I'm still cleaning up the PMU driver that justifies the whole thing, but I > can send that patch now if you reckon it's not complete nonsense out of > context. Otherwise, I'll aim to get the whole lot out next week. Next week is fine, no rush. I was just trying to work out what to do with Lad's patches in this thread and it sounds like I should ignore them and wait for your series instead. Will ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 1/2] iommu/arm-smmu: Use platform_irq_count() to get the interrupt count
On 2022-02-08 15:19, Will Deacon wrote: On Thu, Dec 23, 2021 at 02:14:35PM +, Robin Murphy wrote: On 2021-12-23 13:00, Lad Prabhakar wrote: platform_get_resource(pdev, IORESOURCE_IRQ, ..) relies on static allocation of IRQ resources in DT core code, this causes an issue when using hierarchical interrupt domains using "interrupts" property in the node as this bypasses the hierarchical setup and messes up the irq chaining. In preparation for removal of static setup of IRQ resource from DT core code use platform_get_irq_count(). Nit: platform_irq_count() Signed-off-by: Lad Prabhakar --- drivers/iommu/arm/arm-smmu/arm-smmu.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c index 4bc75c4ce402..4844cd075644 100644 --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c @@ -2105,12 +2105,12 @@ static int arm_smmu_device_probe(struct platform_device *pdev) if (IS_ERR(smmu)) return PTR_ERR(smmu); - num_irqs = 0; - while ((res = platform_get_resource(pdev, IORESOURCE_IRQ, num_irqs))) { - num_irqs++; - if (num_irqs > smmu->num_global_irqs) - smmu->num_context_irqs++; - } + num_irqs = platform_irq_count(pdev); + if (num_irqs < 0) + return num_irqs; + + if (num_irqs > smmu->num_global_irqs) + smmu->num_context_irqs += (num_irqs - smmu->num_global_irqs); This seems a bit overcomplicated. I reckon: smmu->num_context_irqs = num_irqs - smmu->num_global_irqs; if (num_irqs <= smmu->num_global_irqs) { dev_err(... should do it. However, FYI I have some patches refactoring most of the IRQ stuff here that I plan to post next cycle (didn't quite have time to get them done for 5.17 as I'd hoped...), so unless this needs to go in right now as an urgent fix, I'm happy to take care of removing platform_get_resource() as part of that if it's easier. Did you get anywhere with this? December 23rd is long forgotten by now ;) Yup: https://gitlab.arm.com/linux-arm/linux-rm/-/commit/b2a40caaf1622eb35c555074a0d72f4f0513cff9 I'm still cleaning up the PMU driver that justifies the whole thing, but I can send that patch now if you reckon it's not complete nonsense out of context. Otherwise, I'll aim to get the whole lot out next week. Cheers, Robin. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 1/2] iommu/arm-smmu: Use platform_irq_count() to get the interrupt count
On Thu, Dec 23, 2021 at 02:14:35PM +, Robin Murphy wrote: > On 2021-12-23 13:00, Lad Prabhakar wrote: > > platform_get_resource(pdev, IORESOURCE_IRQ, ..) relies on static > > allocation of IRQ resources in DT core code, this causes an issue > > when using hierarchical interrupt domains using "interrupts" property > > in the node as this bypasses the hierarchical setup and messes up the > > irq chaining. > > > > In preparation for removal of static setup of IRQ resource from DT core > > code use platform_get_irq_count(). > > Nit: platform_irq_count() > > > Signed-off-by: Lad Prabhakar > > --- > > drivers/iommu/arm/arm-smmu/arm-smmu.c | 12 ++-- > > 1 file changed, 6 insertions(+), 6 deletions(-) > > > > diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c > > b/drivers/iommu/arm/arm-smmu/arm-smmu.c > > index 4bc75c4ce402..4844cd075644 100644 > > --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c > > +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c > > @@ -2105,12 +2105,12 @@ static int arm_smmu_device_probe(struct > > platform_device *pdev) > > if (IS_ERR(smmu)) > > return PTR_ERR(smmu); > > - num_irqs = 0; > > - while ((res = platform_get_resource(pdev, IORESOURCE_IRQ, num_irqs))) { > > - num_irqs++; > > - if (num_irqs > smmu->num_global_irqs) > > - smmu->num_context_irqs++; > > - } > > + num_irqs = platform_irq_count(pdev); > > + if (num_irqs < 0) > > + return num_irqs; > > + > > + if (num_irqs > smmu->num_global_irqs) > > + smmu->num_context_irqs += (num_irqs - smmu->num_global_irqs); > > This seems a bit overcomplicated. I reckon: > > smmu->num_context_irqs = num_irqs - smmu->num_global_irqs; > if (num_irqs <= smmu->num_global_irqs) { > dev_err(... > > should do it. > > However, FYI I have some patches refactoring most of the IRQ stuff here that > I plan to post next cycle (didn't quite have time to get them done for 5.17 > as I'd hoped...), so unless this needs to go in right now as an urgent fix, > I'm happy to take care of removing platform_get_resource() as part of that > if it's easier. Did you get anywhere with this? December 23rd is long forgotten by now ;) Will ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v4 05/11] iommu/sva: Assign a PASID to mm on PASID allocation and free it on mm exit
On Tue, Feb 08, 2022 at 10:41:39AM +0800, Lu Baolu wrote: > On 2/8/22 7:02 AM, Fenghua Yu wrote: > > PASIDs are process wide. It was attempted to use refcounted PASIDs to > > free them when the last thread drops the refcount. This turned out to > > be complex and error prone. Given the fact that the PASID space is 20 > > bits, which allows up to 1M processes to have a PASID associated > > concurrently, PASID resource exhaustion is not a realistic concern. > > > > Therefore it was decided to simplify the approach and stick with lazy > > on demand PASID allocation, but drop the eager free approach and make > > a allocated PASID lifetime bound to the life time of the process. > > > > Get rid of the refcounting mechanisms and replace/rename the interfaces > > to reflect this new approach. > > > > Suggested-by: Dave Hansen > > Signed-off-by: Fenghua Yu > > Reviewed-by: Tony Luck > > --- > > v4: > > - Update the commit message (Thomas). > > Reviewed-by: Lu Baolu Thank you very much for your review, Baolu! -Fenghua ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] Netvsc: Call hv_unmap_memory() in the netvsc_device_remove()
On 2/3/2022 1:05 AM, Michael Kelley (LINUX) wrote: From: Tianyu Lan Sent: Tuesday, February 1, 2022 8:32 AM netvsc_device_remove() calls vunmap() inside which should not be called in the interrupt context. Current code calls hv_unmap_memory() in the free_netvsc_device() which is rcu callback and maybe called in the interrupt context. This will trigger BUG_ON(in_interrupt()) in the vunmap(). Fix it via moving hv_unmap_memory() to netvsc_device_ remove(). I think this change can fail to call hv_unmap_memory() in an error case. If netvsc_init_buf() fails after hv_map_memory() succeeds for the receive buffer or for the send buffer, no corresponding hv_unmap_memory() will be done. The failure in netvsc_init_buf() will cause netvsc_connect_vsp() to fail, so netvsc_add_device() will "goto close" where free_netvsc_device() will be called. But free_netvsc_device() no longer calls hv_unmap_memory(), so it won't ever happen. netvsc_device_remove() is never called in this case because netvsc_add_device() failed. Hi Michael: Thanks for your review. Nice catch and will fix in the next version. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [syzbot] WARNING in __dma_map_sg_attrs
On Sat, Feb 05, 2022 at 12:18:23PM -0800, syzbot wrote: > syzbot has found a reproducer for the following issue on: > > HEAD commit:0457e5153e0e Merge tag 'for-linus' of git://git.kernel.org.. > git tree: upstream > console output: https://syzkaller.appspot.com/x/log.txt?x=11b2637c70 > kernel config: https://syzkaller.appspot.com/x/.config?x=6f043113811433a5 > dashboard link: https://syzkaller.appspot.com/bug?extid=10e27961f4da37c443b2 > compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils > for Debian) 2.35.2 > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=11c6554270 > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1163f48070 > > IMPORTANT: if you fix the issue, please add the following tag to the commit: > Reported-by: syzbot+10e27961f4da37c44...@syzkaller.appspotmail.com Adding Gerd, since this seems to blow up in udmabuf. I wonder why syzbot didn't figure this out, since it seems to have correctly added both dma-api and dma-buf people. Just not the maintainer for the begin_cpu_udmabuf function in the middle of the backtrace? -Daniel > > [ cut here ] > WARNING: CPU: 1 PID: 3595 at kernel/dma/mapping.c:188 > __dma_map_sg_attrs+0x181/0x1f0 kernel/dma/mapping.c:188 > Modules linked in: > CPU: 0 PID: 3595 Comm: syz-executor249 Not tainted > 5.17.0-rc2-syzkaller-00316-g0457e5153e0e #0 > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS > Google 01/01/2011 > RIP: 0010:__dma_map_sg_attrs+0x181/0x1f0 kernel/dma/mapping.c:188 > Code: 00 00 00 00 00 fc ff df 48 c1 e8 03 80 3c 10 00 75 71 4c 8b 3d c0 83 b5 > 0d e9 db fe ff ff e8 b6 0f 13 00 0f 0b e8 af 0f 13 00 <0f> 0b 45 31 e4 e9 54 > ff ff ff e8 a0 0f 13 00 49 8d 7f 50 48 b8 00 > RSP: 0018:c90002a07d68 EFLAGS: 00010293 > RAX: RBX: RCX: > RDX: 88807e25e2c0 RSI: 81649e91 RDI: 88801b848408 > RBP: 88801b848000 R08: 0002 R09: 88801d86c74f > R10: 81649d72 R11: 0001 R12: 0002 > R13: 88801d86c680 R14: 0001 R15: > FS: 56e30300() GS:8880b9d0() knlGS: > CS: 0010 DS: ES: CR0: 80050033 > CR2: 20cc CR3: 1d74a000 CR4: 003506e0 > DR0: DR1: DR2: > DR3: DR6: fffe0ff0 DR7: 0400 > Call Trace: > > dma_map_sgtable+0x70/0xf0 kernel/dma/mapping.c:264 > get_sg_table.isra.0+0xe0/0x160 drivers/dma-buf/udmabuf.c:72 > begin_cpu_udmabuf+0x130/0x1d0 drivers/dma-buf/udmabuf.c:126 > dma_buf_begin_cpu_access+0xfd/0x1d0 drivers/dma-buf/dma-buf.c:1164 > dma_buf_ioctl+0x259/0x2b0 drivers/dma-buf/dma-buf.c:363 > vfs_ioctl fs/ioctl.c:51 [inline] > __do_sys_ioctl fs/ioctl.c:874 [inline] > __se_sys_ioctl fs/ioctl.c:860 [inline] > __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:860 > do_syscall_x64 arch/x86/entry/common.c:50 [inline] > do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 > entry_SYSCALL_64_after_hwframe+0x44/0xae > RIP: 0033:0x7f62fcf530f9 > Code: 28 c3 e8 2a 14 00 00 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 48 89 f7 48 > 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 > 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48 > RSP: 002b:7ffe3edab9b8 EFLAGS: 0246 ORIG_RAX: 0010 > RAX: ffda RBX: RCX: 7f62fcf530f9 > RDX: 2200 RSI: 40086200 RDI: 0006 > RBP: 7f62fcf170e0 R08: R09: > R10: R11: 0246 R12: 7f62fcf17170 > R13: R14: R15: > > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3 8/8] iommu/arm-smmu-v3: Make default domain type of HiSilicon PTT device to identity
On 2022/2/8 19:56, John Garry wrote: > On 08/02/2022 11:21, Yicong Yang wrote: >>> This patch should be earlier in the series, before the PTT driver, and the >>> comment on hisi_ptt_check_iommu_mapping() should mention what is going on >>> here. >>> >> ok I'll reorder the serives and modify the comments of >> hisi_ptt_check_iommu_mapping() like: >> >> /* >> * The DMA of PTT trace can only use direct mapping, due to some >> * hardware restriction. Check whether there is an iommu or the >> * policy of the iommu domain is passthrough, otherwise the trace >> * cannot work. > > IOMMU, capitalize acronyms > ok. >> * >> * The PTT device is supposed to behind the arm smmu v3, which >> * should have passthrough the device by a quirk. Otherwise user >> * should manually set the iommu domain type to identity through >> * sysfs. > > Sorry, but I don't really understand your meaning here. > > I did not think that if we have a default domain then we can change via sysfs > to anything else. > ok I think the last sentence maybe misleading and better drop it. >> */ >> Signed-off-by: Yicong Yang --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 16 1 file changed, 16 insertions(+) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 6dc6d8b6b368..6f67a2b1dd27 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -2838,6 +2838,21 @@ static int arm_smmu_dev_disable_feature(struct device *dev, } } +#define IS_HISI_PTT_DEVICE(pdev) ((pdev)->vendor == PCI_VENDOR_ID_HUAWEI && \ + (pdev)->device == 0xa12e) >>> I assume that not all revisions will require this check, right? > > So if you are very confident that the next revision will be fixed then I > would add a check for this current broken revision. > >>> >> For current revisions it's necessary. >> > > . ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3 8/8] iommu/arm-smmu-v3: Make default domain type of HiSilicon PTT device to identity
On 08/02/2022 11:21, Yicong Yang wrote: This patch should be earlier in the series, before the PTT driver, and the comment on hisi_ptt_check_iommu_mapping() should mention what is going on here. ok I'll reorder the serives and modify the comments of hisi_ptt_check_iommu_mapping() like: /* * The DMA of PTT trace can only use direct mapping, due to some * hardware restriction. Check whether there is an iommu or the * policy of the iommu domain is passthrough, otherwise the trace * cannot work. IOMMU, capitalize acronyms * * The PTT device is supposed to behind the arm smmu v3, which * should have passthrough the device by a quirk. Otherwise user * should manually set the iommu domain type to identity through * sysfs. Sorry, but I don't really understand your meaning here. I did not think that if we have a default domain then we can change via sysfs to anything else. */ Signed-off-by: Yicong Yang --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 16 1 file changed, 16 insertions(+) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 6dc6d8b6b368..6f67a2b1dd27 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -2838,6 +2838,21 @@ static int arm_smmu_dev_disable_feature(struct device *dev, } } +#define IS_HISI_PTT_DEVICE(pdev) ((pdev)->vendor == PCI_VENDOR_ID_HUAWEI && \ + (pdev)->device == 0xa12e) I assume that not all revisions will require this check, right? So if you are very confident that the next revision will be fixed then I would add a check for this current broken revision. For current revisions it's necessary. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v5 02/14] driver core: Add dma_cleanup callback in bus_type
On Tue, Feb 08, 2022 at 01:55:29PM +0800, Lu Baolu wrote: > Hi Greg, > > On 1/4/22 9:04 PM, Greg Kroah-Hartman wrote: > > On Tue, Jan 04, 2022 at 08:39:11AM -0400, Jason Gunthorpe wrote: > > > On Tue, Jan 04, 2022 at 02:08:36AM -0800, Christoph Hellwig wrote: > > > > All these bus callouts still looks horrible and just create tons of > > > > boilerplate code. > > > > > > Yes, Lu - Greg asked questions then didn't respond to their answers > > > meaning he accepts them, you should stick with the v4 version. > > > > Trying to catch up on emails from the break, that was way down my list > > of things to get back to as it's messy and non-obvious. I'll revisit it > > again after 5.17-rc1 is out, this is too late for that merge window > > anyway. > > In this series we want to add calls into the iommu subsystem during > device driver binding/unbinding, so that the device DMA ownership > conflict (kernel driver vs. user-space) could be detected and avoided > before calling into device driver's .probe(). > > In this v5 series, we implemented this in the affected buses (amba/ > platform/fsl-mc/pci) which are known to support assigning devices to > user space through the vfio framework currently. And more buses are > possible to be affected in the future if they also want to support > device assignment. Christoph commented that this will create boilerplate > code in various bus drivers. > > Back to v4 of this series (please refer to below link [1]), we added > this call in the driver core if buses have provided the dma_configure() > callback (please refer to below link [2]). > > Which would you prefer, or any other suggestions? We need your guide to > move this series ahead. Please help to suggest. > > [1] > https://lore.kernel.org/linux-iommu/20211217063708.1740334-1-baolu...@linux.intel.com/ > [2] > https://lore.kernel.org/linux-iommu/20211217063708.1740334-3-baolu...@linux.intel.com/ Let me look over the series again this afternooon. thanks, greg k-h ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3 8/8] iommu/arm-smmu-v3: Make default domain type of HiSilicon PTT device to identity
On 2022/2/8 16:05, John Garry wrote: > On 24/01/2022 13:11, Yicong Yang wrote: >> The DMA of HiSilicon PTT device can only work with identical >> mapping. So add a quirk for the device to force the domain >> passthrough. > > This patch should be earlier in the series, before the PTT driver, and the > comment on hisi_ptt_check_iommu_mapping() should mention what is going on > here. > ok I'll reorder the serives and modify the comments of hisi_ptt_check_iommu_mapping() like: /* * The DMA of PTT trace can only use direct mapping, due to some * hardware restriction. Check whether there is an iommu or the * policy of the iommu domain is passthrough, otherwise the trace * cannot work. * * The PTT device is supposed to behind the arm smmu v3, which * should have passthrough the device by a quirk. Otherwise user * should manually set the iommu domain type to identity through * sysfs. */ >> >> Signed-off-by: Yicong Yang >> --- >> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 16 >> 1 file changed, 16 insertions(+) >> >> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c >> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c >> index 6dc6d8b6b368..6f67a2b1dd27 100644 >> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c >> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c >> @@ -2838,6 +2838,21 @@ static int arm_smmu_dev_disable_feature(struct device >> *dev, >> } >> } >> +#define IS_HISI_PTT_DEVICE(pdev) ((pdev)->vendor == >> PCI_VENDOR_ID_HUAWEI && \ >> + (pdev)->device == 0xa12e) > > I assume that not all revisions will require this check, right? > For current revisions it's necessary. >> + >> +static int arm_smmu_def_domain_type(struct device *dev) >> +{ >> + if (dev_is_pci(dev)) { >> + struct pci_dev *pdev = to_pci_dev(dev); >> + >> + if (IS_HISI_PTT_DEVICE(pdev)) >> + return IOMMU_DOMAIN_IDENTITY; >> + } >> + >> + return 0; >> +} >> + >> static struct iommu_ops arm_smmu_ops = { >> .capable = arm_smmu_capable, >> .domain_alloc = arm_smmu_domain_alloc, >> @@ -2863,6 +2878,7 @@ static struct iommu_ops arm_smmu_ops = { >> .sva_unbind = arm_smmu_sva_unbind, >> .sva_get_pasid = arm_smmu_sva_get_pasid, >> .page_response = arm_smmu_page_response, >> + .def_domain_type = arm_smmu_def_domain_type, >> .pgsize_bitmap = -1UL, /* Restricted during device attach */ >> .owner = THIS_MODULE, >> }; > > . ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3 6/8] docs: Add HiSilicon PTT device driver documentation
On 2022/2/7 20:12, Jonathan Cameron wrote: > On Mon, 24 Jan 2022 21:11:16 +0800 > Yicong Yang wrote: > >> Document the introduction and usage of HiSilicon PTT device driver. >> >> Signed-off-by: Yicong Yang > Nice document. A few trivial typos inline. > I would give a RB except I've suggested you change a part of the > sysfs interface which will affect the relevant documentation. > Thanks. I'll get these fixed and update the documentation with sysfs interface updated. > Thanks, > > Jonathan > >> --- >> Documentation/trace/hisi-ptt.rst | 304 +++ >> 1 file changed, 304 insertions(+) >> create mode 100644 Documentation/trace/hisi-ptt.rst >> >> diff --git a/Documentation/trace/hisi-ptt.rst >> b/Documentation/trace/hisi-ptt.rst >> new file mode 100644 >> index ..f3269b11a2f6 >> --- /dev/null >> +++ b/Documentation/trace/hisi-ptt.rst >> @@ -0,0 +1,304 @@ >> +.. SPDX-License-Identifier: GPL-2.0 >> + >> +== >> +HiSilicon PCIe Tune and Trace device >> +== >> + >> +Introduction >> + >> + >> +HiSilicon PCIe tune and trace device (PTT) is a PCIe Root Complex >> +integrated Endpoint (RCiEP) device, providing the capability >> +to dynamically monitor and tune the PCIe link's events (tune), >> +and trace the TLP headers (trace). The two functions are independent, >> +but is recommended to use them together to analyze and enhance the >> +PCIe link's performance. >> + >> +On Kunpeng 930 SoC, the PCIe Root Complex is composed of several >> +PCIe cores. Each PCIe core includes several Root Ports and a PTT >> +RCiEP, like below. The PTT device is capable of tuning and >> +tracing the link of the PCIe core. > > links > >> +:: >> + +--Core 0---+ >> + | | [ PTT ] | >> + | | [Root Port]---[Endpoint] >> + | | [Root Port]---[Endpoint] >> + | | [Root Port]---[Endpoint] >> +Root Complex |--Core 1---+ >> + | | [ PTT ] | >> + | | [Root Port]---[ Switch ]---[Endpoint] >> + | | [Root Port]---[Endpoint] `-[Endpoint] >> + | | [Root Port]---[Endpoint] >> + +---+ >> + >> +The PTT device driver registers PMU device for each PTT device. > > registers one PMU device .. > >> +The name of each PTT device is composed of 'hisi_ptt' prefix with >> +the id of the SICL and the Core where it locates. The Kunpeng 930 >> +SoC encapsulates multiple CPU dies (SCCL, Super CPU Cluster) and >> +IO dies (SICL, Super I/O Cluster), where there's one PCIe Root >> +Complex for each SICL. >> +:: >> +/sys/devices/hisi_ptt_ >> + >> +Tune >> + >> + >> +PTT tune is designed for monitoring and adjusting PCIe link parameters >> (events). >> +Currently we support events in 4 classes. The scope of the events >> +covers the PCIe core to which the PTT device belongs. >> + >> +Each event is presented as a file under $(PTT PMU dir)/tune, and >> +mostly a simple open/read/write/close cycle will be used to tune > > drop "mostly" as it doesn't add anything other than potential confusion. > >> +the event. >> +:: >> +$ cd /sys/devices/hisi_ptt_/tune >> +$ ls >> +qos_tx_cplqos_tx_npqos_tx_p >> +tx_path_rx_req_alloc_buf_level >> +tx_path_tx_req_alloc_buf_level >> +$ cat qos_tx_dp >> +1 >> +$ echo 2 > qos_tx_dp >> +$ cat qos_tx_dp >> +2 >> + >> +Current value (numerical value) of the event can be simply read >> +from the file, and the desired value written to the file to tune. >> + >> +1. Tx path QoS control >> + >> + >> +The following files are provided to tune the QoS of the tx path of >> +the PCIe core. >> + >> +- qos_tx_cpl: weight of Tx completion TLPs >> +- qos_tx_np: weight of Tx non-posted TLPs >> +- qos_tx_p: weight of Tx posted TLPs >> + >> +The weight influences the proportion of certain packets on the PCIe link. >> +For example, for the storage scenario, increase the proportion >> +of the completion packets on the link to enhance the performance as >> +more completions are consumed. >> + >> +The available tune data of these events is [0, 1, 2]. >> +Writing a negative value will return an error, and out of range >> +values will be converted to 2. Note that the event value just >> +indicates a probable level, but is not precise. >> + >> +2. Tx path buffer control >> +- >> + >> +Following files are provided to tune the buffer of tx path of the PCIe core. >> + >> +- tx_path_rx_req_alloc_buf_level: watermark of Rx requested >> +- tx_path_tx_req_alloc_buf_level: watermark of Tx requested >> + >> +These events influence the watermark of the buffer allocated for each >> +type. Rx means the inbound while Tx means outbound. The packets will >> +be stored in the buffer first and then posted either
Re: [PATCH v3 1/8] hwtracing: Add trace function support for HiSilicon PCIe Tune and Trace device
On 2022/2/7 19:42, Jonathan Cameron wrote: > On Mon, 24 Jan 2022 21:11:11 +0800 > Yicong Yang wrote: > >> HiSilicon PCIe tune and trace device(PTT) is a PCIe Root Complex >> integrated Endpoint(RCiEP) device, providing the capability >> to dynamically monitor and tune the PCIe traffic, and trace >> the TLP headers. >> >> Add the driver for the device to enable the trace function. >> This patch adds basic function of trace, including the device's >> probe and initialization, functions for trace buffer allocation >> and trace enable/disable, register an interrupt handler to >> simply response to the DMA events. The user interface of trace >> will be added in the following patch. >> >> Signed-off-by: Yicong Yang > Hi Yicong, > > I've not been following all the earlier discussion on this driver closely > so I may well raise something that has already been addressed. If so > just ignore the comment. Thanks for the comments. It's ok for me to clarify it :). Part replies inline and I need to do some test on the others. > > Thanks, > > Jonathan > >> --- >> drivers/Makefile | 1 + >> drivers/hwtracing/Kconfig| 2 + >> drivers/hwtracing/ptt/Kconfig| 11 + >> drivers/hwtracing/ptt/Makefile | 2 + >> drivers/hwtracing/ptt/hisi_ptt.c | 398 +++ >> drivers/hwtracing/ptt/hisi_ptt.h | 159 >> 6 files changed, 573 insertions(+) >> create mode 100644 drivers/hwtracing/ptt/Kconfig >> create mode 100644 drivers/hwtracing/ptt/Makefile >> create mode 100644 drivers/hwtracing/ptt/hisi_ptt.c >> create mode 100644 drivers/hwtracing/ptt/hisi_ptt.h >> [...] >> + >> +static int hisi_ptt_alloc_trace_buf(struct hisi_ptt *hisi_ptt) >> +{ >> +struct hisi_ptt_trace_ctrl *ctrl = _ptt->trace_ctrl; >> +struct device *dev = _ptt->pdev->dev; >> +struct hisi_ptt_dma_buffer *buffer; >> +int i, ret; >> + >> +hisi_ptt->trace_ctrl.buf_index = 0; >> + >> +/* Make sure the trace buffer is empty before allocating */ > > This comment is misleading as it suggests it not being empty is > a bad thing but the code handles it as an acceptable path. > Perhaps: > /* >* If the trace buffer has already been allocated, zero the >* memory. >*/ > will make it less misleading. thanks. >> +if (!list_empty(>trace_buf)) { >> +list_for_each_entry(buffer, >trace_buf, list) >> +memset(buffer->addr, 0, buffer->size); >> +return 0; >> +} >> + >> +for (i = 0; i < HISI_PTT_TRACE_BUF_CNT; ++i) { >> +buffer = kzalloc(sizeof(*buffer), GFP_KERNEL); >> +if (!buffer) { >> +ret = -ENOMEM; >> +goto err; >> +} >> + >> +buffer->addr = dma_alloc_coherent(dev, ctrl->buffer_size, >> + >dma, GFP_KERNEL); >> +if (!buffer->addr) { >> +kfree(buffer); >> +ret = -ENOMEM; >> +goto err; >> +} >> + >> +memset(buffer->addr, 0, buffer->size); > See: > https://lore.kernel.org/lkml/20190108130701.14161-4-...@lst.de/ > dma_alloc_coherent() always zeros the memory for us hence there > is no longer a dma_kzalloc_coherent() > thanks for the information. Then the memset here is redundant and will drop it. >> + >> +buffer->index = i; > > Carrying an index inside a list which corresponds directly > to the position in the list is not particularly nice. > Why can't we compute this index on the fly where the list > is walked? Or am I misunderstanding and the order of the buffers > is changed in a later patch? > The index is fixed once allocated and I stored it to avoid later computing. But seems it's highly recommended to compute these sort of things on the fly when necessary. John recommends the same things on some other places so I think I can get these addressed. > As a side note, is a list actually appropriate when we always > have 4 of these buffers? Feels like an array of buffer > structures might be cheaper. > >> +buffer->size = ctrl->buffer_size; >> +list_add_tail(>list, >trace_buf); >> +} >> + >> +return 0; >> +err: >> +hisi_ptt_free_trace_buf(hisi_ptt); >> +return ret; >> +} >> + >> +static void hisi_ptt_trace_end(struct hisi_ptt *hisi_ptt) >> +{ >> +writel(0, hisi_ptt->iobase + HISI_PTT_TRACE_CTRL); >> +hisi_ptt->trace_ctrl.status = HISI_PTT_TRACE_STATUS_OFF; >> +} >> + >> +static int hisi_ptt_trace_start(struct hisi_ptt *hisi_ptt) >> +{ >> +struct hisi_ptt_trace_ctrl *ctrl = _ptt->trace_ctrl; >> +struct hisi_ptt_dma_buffer *cur; >> +u32 val; >> + >> +/* Check device idle before start trace */ >> +if (hisi_ptt_wait_trace_hw_idle(hisi_ptt)) { >> +pci_err(hisi_ptt->pdev, "Failed to start trace, the device is >> still busy.\n"); >> +return -EBUSY; >> +} >> +
RE: [PATCH] MAINTAINERS: Update maintainer list of DMA MAPPING BENCHMARK
> -Original Message- > From: chenxiang (M) > Sent: Tuesday, February 8, 2022 8:05 PM > To: Song Bao Hua (Barry Song) ; h...@lst.de; > m.szyprow...@samsung.com; robin.mur...@arm.com > Cc: linux...@openeuler.org; Linuxarm ; > iommu@lists.linux-foundation.org; linux-kselft...@vger.kernel.org; > chenxiang (M) > Subject: [PATCH] MAINTAINERS: Update maintainer list of DMA MAPPING BENCHMARK > > From: Xiang Chen > > Barry Song will not focus on this area, and Xiang Chen will continue his > work to maintain this module. > > Signed-off-by: Xiang Chen Acked-by: Barry Song Xiang has been an user of this module and has made substantial contributions not only to this module and but also to related modules such as iommu/arm-smmu-v3. My this email account will be unreachable in this month. And probably I will rarely work on this module afterwards. So I am happy Xiang will take care of it. Thanks! > --- > MAINTAINERS | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/MAINTAINERS b/MAINTAINERS > index ea3e6c914384..48335022b0e4 100644 > --- a/MAINTAINERS > +++ b/MAINTAINERS > @@ -5765,7 +5765,7 @@ F: include/linux/dma-map-ops.h > F: kernel/dma/ > > DMA MAPPING BENCHMARK > -M: Barry Song > +M: Xiang Chen > L: iommu@lists.linux-foundation.org > F: kernel/dma/map_benchmark.c > F: tools/testing/selftests/dma/ > -- > 2.33.0 Best Regards, Barry ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3 1/8] hwtracing: Add trace function support for HiSilicon PCIe Tune and Trace device
Hi John, Thanks for the comments. some replies inline. On 2022/2/8 2:11, John Garry wrote: > On 24/01/2022 13:11, Yicong Yang wrote: >> HiSilicon PCIe tune and trace device(PTT) is a PCIe Root Complex >> integrated Endpoint(RCiEP) device, providing the capability >> to dynamically monitor and tune the PCIe traffic, and trace >> the TLP headers. >> >> Add the driver for the device to enable the trace function. >> This patch adds basic function of trace, including the device's >> probe and initialization, functions for trace buffer allocation >> and trace enable/disable, register an interrupt handler to >> simply response to the DMA events. The user interface of trace >> will be added in the following patch. >> >> Signed-off-by: Yicong Yang >> --- >> drivers/Makefile | 1 + >> drivers/hwtracing/Kconfig | 2 + >> drivers/hwtracing/ptt/Kconfig | 11 + >> drivers/hwtracing/ptt/Makefile | 2 + >> drivers/hwtracing/ptt/hisi_ptt.c | 398 +++ >> drivers/hwtracing/ptt/hisi_ptt.h | 159 >> 6 files changed, 573 insertions(+) >> create mode 100644 drivers/hwtracing/ptt/Kconfig >> create mode 100644 drivers/hwtracing/ptt/Makefile >> create mode 100644 drivers/hwtracing/ptt/hisi_ptt.c >> create mode 100644 drivers/hwtracing/ptt/hisi_ptt.h >> >> diff --git a/drivers/Makefile b/drivers/Makefile >> index a110338c860c..ab3411e4eba5 100644 >> --- a/drivers/Makefile >> +++ b/drivers/Makefile >> @@ -175,6 +175,7 @@ obj-$(CONFIG_USB4) += thunderbolt/ >> obj-$(CONFIG_CORESIGHT) += hwtracing/coresight/ >> obj-y += hwtracing/intel_th/ >> obj-$(CONFIG_STM) += hwtracing/stm/ >> +obj-$(CONFIG_HISI_PTT) += hwtracing/ptt/ >> obj-$(CONFIG_ANDROID) += android/ >> obj-$(CONFIG_NVMEM) += nvmem/ >> obj-$(CONFIG_FPGA) += fpga/ >> diff --git a/drivers/hwtracing/Kconfig b/drivers/hwtracing/Kconfig >> index 13085835a636..911ee977103c 100644 >> --- a/drivers/hwtracing/Kconfig >> +++ b/drivers/hwtracing/Kconfig >> @@ -5,4 +5,6 @@ source "drivers/hwtracing/stm/Kconfig" >> source "drivers/hwtracing/intel_th/Kconfig" >> +source "drivers/hwtracing/ptt/Kconfig" >> + >> endmenu >> diff --git a/drivers/hwtracing/ptt/Kconfig b/drivers/hwtracing/ptt/Kconfig >> new file mode 100644 >> index ..4f4f2459ac47 >> --- /dev/null >> +++ b/drivers/hwtracing/ptt/Kconfig >> @@ -0,0 +1,11 @@ >> +# SPDX-License-Identifier: GPL-2.0-only >> +config HISI_PTT >> + tristate "HiSilicon PCIe Tune and Trace Device" >> + depends on ARM64 && PCI && HAS_DMA && HAS_IOMEM >> + help >> + HiSilicon PCIe Tune and Trace Device exist as a PCIe RCiEP > > exists > >> + device, provides support for PCIe traffic tuning and > > and it provides support... > will fix, thanks. >> + tracing TLP headers to the memory. >> + >> + This driver can also be built as a module. If so, the module >> + will be called hisi_ptt. >> diff --git a/drivers/hwtracing/ptt/Makefile b/drivers/hwtracing/ptt/Makefile >> new file mode 100644 >> index ..908c09a98161 >> --- /dev/null >> +++ b/drivers/hwtracing/ptt/Makefile >> @@ -0,0 +1,2 @@ >> +# SPDX-License-Identifier: GPL-2.0 >> +obj-$(CONFIG_HISI_PTT) += hisi_ptt.o >> diff --git a/drivers/hwtracing/ptt/hisi_ptt.c >> b/drivers/hwtracing/ptt/hisi_ptt.c >> new file mode 100644 >> index ..6d0a0ca5c0a9 >> --- /dev/null >> +++ b/drivers/hwtracing/ptt/hisi_ptt.c >> @@ -0,0 +1,398 @@ >> +// SPDX-License-Identifier: GPL-2.0 >> +/* >> + * Driver for HiSilicon PCIe tune and trace device >> + * >> + * Copyright (c) 2022 HiSilicon Technologies Co., Ltd. >> + * Author: Yicong Yang >> + */ >> + [...] >> +static int hisi_ptt_alloc_trace_buf(struct hisi_ptt *hisi_ptt) >> +{ >> + struct hisi_ptt_trace_ctrl *ctrl = _ptt->trace_ctrl; >> + struct device *dev = _ptt->pdev->dev; >> + struct hisi_ptt_dma_buffer *buffer; >> + int i, ret; >> + >> + hisi_ptt->trace_ctrl.buf_index = 0; >> + >> + /* Make sure the trace buffer is empty before allocating */ >> + if (!list_empty(>trace_buf)) { >> + list_for_each_entry(buffer, >trace_buf, list) >> + memset(buffer->addr, 0, buffer->size); >> + return 0; >> + } >> + >> + for (i = 0; i < HISI_PTT_TRACE_BUF_CNT; ++i) { >> + buffer = kzalloc(sizeof(*buffer), GFP_KERNEL); > > I may have asked this before: why no devm usage? > I remembered I was suggested for not using devm where we may need to manually free it as it intends to be freed automically after the driver detachment. >> + if (!buffer) { >> + ret = -ENOMEM; >> + goto err; >> + } >> + >> + buffer->addr = dma_alloc_coherent(dev, ctrl->buffer_size, >> + >dma, GFP_KERNEL); >> + if (!buffer->addr) { >> + kfree(buffer); >> + ret = -ENOMEM; >> + goto err; >> + } >> + >> +
Re: [PATCH v3 8/8] iommu/arm-smmu-v3: Make default domain type of HiSilicon PTT device to identity
On 24/01/2022 13:11, Yicong Yang wrote: The DMA of HiSilicon PTT device can only work with identical mapping. So add a quirk for the device to force the domain passthrough. This patch should be earlier in the series, before the PTT driver, and the comment on hisi_ptt_check_iommu_mapping() should mention what is going on here. Signed-off-by: Yicong Yang --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 16 1 file changed, 16 insertions(+) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 6dc6d8b6b368..6f67a2b1dd27 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -2838,6 +2838,21 @@ static int arm_smmu_dev_disable_feature(struct device *dev, } } +#define IS_HISI_PTT_DEVICE(pdev) ((pdev)->vendor == PCI_VENDOR_ID_HUAWEI && \ +(pdev)->device == 0xa12e) I assume that not all revisions will require this check, right? + +static int arm_smmu_def_domain_type(struct device *dev) +{ + if (dev_is_pci(dev)) { + struct pci_dev *pdev = to_pci_dev(dev); + + if (IS_HISI_PTT_DEVICE(pdev)) + return IOMMU_DOMAIN_IDENTITY; + } + + return 0; +} + static struct iommu_ops arm_smmu_ops = { .capable= arm_smmu_capable, .domain_alloc = arm_smmu_domain_alloc, @@ -2863,6 +2878,7 @@ static struct iommu_ops arm_smmu_ops = { .sva_unbind = arm_smmu_sva_unbind, .sva_get_pasid = arm_smmu_sva_get_pasid, .page_response = arm_smmu_page_response, + .def_domain_type= arm_smmu_def_domain_type, .pgsize_bitmap = -1UL, /* Restricted during device attach */ .owner = THIS_MODULE, }; ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu