Re: [PATCH 2/2] iommu: Introduce Interface for IOMMU TLB Flushing
Hi Will, On Fri, Sep 01, 2017 at 06:20:45PM +0100, Will Deacon wrote: > On Wed, Aug 23, 2017 at 03:50:04PM +0200, Joerg Roedel wrote: > > +EXPORT_SYMBOL_GPL(iommu_unmap_fast); > > Really minor nit, but I think that iommu_unmap_nosync is a more descriptive > name (who wouldn't want to use the _fast version?!). Yeah, but I wanted the name to motivate people to use it, or at least think about why they are not using the _fast variant. Therefore I decided for that name and not the more descriptive _sync version. > But anyway, this looks like a really welcome change and it addresses most > of my concerns on v1, so, modulo Robin's comments about map_sg: Thanks, I removed the stale map_sg_sync stuff before applying. Regards, Joerg ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH] mm/mmu_notifier: avoid double notification when it is useless
From: Jérôme Glisse(Note that this is 4.15 material or 4.14 if people are extra confident. I am posting now to get people to test. To that effect maybe it would be a good idea to have that patch sit in linux-next for a while for testing. Other motivation is that the problem is fresh in everyone's memory Thanks to Andrea for thinking of a problematic scenario for COW) When clearing a pte/pmd we are given a choice to notify the event through (notify version of *_clear_flush call mmu_notifier_invalidate_range) under the page table lock. But that notification is not necessary in all cases. This patches remove almost all the case where it is useless to have a call to mmu_notifier_invalidate_range() before mmu_notifier_invalidate_range_end(). It also adds documentation in all those case explaining why. Below is a more in depth analysis of why this is fine to do this: For secondary TLB (non CPU TLB) like IOMMU TLB or device TLB (when device use thing like ATS/PASID to get the IOMMU to walk the CPU page table to access a process virtual address space). There is only 2 cases when you need to notify those secondary TLB while holding page table lock when clearing a pte/pmd: A) page backing address is free before mmu_notifier_invalidate_range_end() B) a page table entry is updated to point to a new page (COW, write fault on zero page, __replace_page(), ...) Case A is obvious you do not want to take the risk for the device to write to a page that might now be use by some completely different task. Case B is more subtle. For correctness it requires the following sequence to happen: - take page table lock - clear page table entry and notify ([pmd/pte]p_huge_clear_flush_notify()) - set page table entry to point to new page If clearing the page table entry is not followed by a notify before setting the new pte/pmd value then you can break memory model like C11 or C++11 for the device. Consider the following scenario (device use a feature similar to ATS/PASID): Two address addrA and addrB such that |addrA - addrB| >= PAGE_SIZE we assume they are write protected for COW (other case of B apply too). [Time N] CPU-thread-0 {try to write to addrA} CPU-thread-1 {try to write to addrB} CPU-thread-2 {} CPU-thread-3 {} DEV-thread-0 {read addrA and populate device TLB} DEV-thread-2 {read addrB and populate device TLB} [Time N+1] -- CPU-thread-0 {COW_step0: {mmu_notifier_invalidate_range_start(addrA)}} CPU-thread-1 {COW_step0: {mmu_notifier_invalidate_range_start(addrB)}} CPU-thread-2 {} CPU-thread-3 {} DEV-thread-0 {} DEV-thread-2 {} [Time N+2] -- CPU-thread-0 {COW_step1: {update page table to point to new page for addrA}} CPU-thread-1 {COW_step1: {update page table to point to new page for addrB}} CPU-thread-2 {} CPU-thread-3 {} DEV-thread-0 {} DEV-thread-2 {} [Time N+3] -- CPU-thread-0 {preempted} CPU-thread-1 {preempted} CPU-thread-2 {write to addrA which is a write to new page} CPU-thread-3 {} DEV-thread-0 {} DEV-thread-2 {} [Time N+3] -- CPU-thread-0 {preempted} CPU-thread-1 {preempted} CPU-thread-2 {} CPU-thread-3 {write to addrB which is a write to new page} DEV-thread-0 {} DEV-thread-2 {} [Time N+4] -- CPU-thread-0 {preempted} CPU-thread-1 {COW_step3: {mmu_notifier_invalidate_range_end(addrB)}} CPU-thread-2 {} CPU-thread-3 {} DEV-thread-0 {} DEV-thread-2 {} [Time N+5] -- CPU-thread-0 {preempted} CPU-thread-1 {} CPU-thread-2 {} CPU-thread-3 {} DEV-thread-0 {read addrA from old page} DEV-thread-2 {read addrB from new page} So here because at time N+2 the clear page table entry was not pair with a notification to invalidate the secondary TLB, the device see the new value for addrB before seing the new value for addrA. This break total memory ordering for the device. When changing a pte to write protect or to point to a new write protected page with same content (KSM) it is fine to delay the mmu_notifier_invalidate_range call to mmu_notifier_invalidate_range_end() outside the page table lock. This is true ven if the thread doing the page table update is preempted right after releasing page table lock but before call mmu_notifier_invalidate_range_end(). Signed-off-by: Jérôme Glisse Cc: Andrea Arcangeli Cc: Nadav Amit Cc: Linus Torvalds Cc: Andrew Morton Cc: Joerg Roedel Cc: Suravee Suthikulpanit Cc: David Woodhouse
Re: [PATCH 2/2] iommu: Introduce Interface for IOMMU TLB Flushing
Hi Joerg, On Wed, Aug 23, 2017 at 03:50:04PM +0200, Joerg Roedel wrote: > From: Joerg Roedel> > With the current IOMMU-API the hardware TLBs have to be > flushed in every iommu_ops->unmap() call-back. > > For unmapping large amounts of address space, like it > happens when a KVM domain with assigned devices is > destroyed, this causes thousands of unnecessary TLB flushes > in the IOMMU hardware because the unmap call-back runs for > every unmapped physical page. > > With the TLB Flush Interface and the new iommu_unmap_fast() > function introduced here the need to clean the hardware TLBs > is removed from the unmapping code-path. Users of > iommu_unmap_fast() have to explicitly call the TLB-Flush > functions to sync the page-table changes to the hardware. > > Three functions for TLB-Flushes are introduced: > > * iommu_flush_tlb_all() - Flushes all TLB entries > associated with that > domain. TLBs entries are > flushed when this function > returns. > > * iommu_tlb_range_add() - This will add a given > range to the flush queue > for this domain. > > * iommu_tlb_sync() - Flushes all queued ranges from >the hardware TLBs. Returns when >the flush is finished. > > The semantic of this interface is intentionally similar to > the iommu_gather_ops from the io-pgtable code. > > Cc: Alex Williamson > Cc: Will Deacon > Cc: Robin Murphy > Signed-off-by: Joerg Roedel > --- > drivers/iommu/iommu.c | 32 --- > include/linux/iommu.h | 72 > ++- > 2 files changed, 99 insertions(+), 5 deletions(-) [...] > +size_t iommu_unmap(struct iommu_domain *domain, > +unsigned long iova, size_t size) > +{ > + return __iommu_unmap(domain, iova, size, true); > +} > EXPORT_SYMBOL_GPL(iommu_unmap); > > +size_t iommu_unmap_fast(struct iommu_domain *domain, > + unsigned long iova, size_t size) > +{ > + return __iommu_unmap(domain, iova, size, false); > +} > +EXPORT_SYMBOL_GPL(iommu_unmap_fast); Really minor nit, but I think that iommu_unmap_nosync is a more descriptive name (who wouldn't want to use the _fast version?!). But anyway, this looks like a really welcome change and it addresses most of my concerns on v1, so, modulo Robin's comments about map_sg: Reviewed-by: Will Deacon Will ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 2/2] iommu/omap: Add support to program multiple iommus
On 09/01/2017 11:33 AM, Joerg Roedel wrote: > Hi Suman, > On Fri, Sep 01, 2017 at 11:21:45AM -0500, Suman Anna wrote: >> It's primarily a question of whether each iommu platform device need to >> be represented as a unique iommu_device or not. If you still think that >> both these need to be presented to iommu core as one device, I would >> have to add some glue logic in probe to tie the two devices together. > > I think that you should only call iommu_device_register and friends for > the mmu-device you link the other devices against in .add_device. > Otherwise people will see two mmus in sysfs, one of them with no devices > behind it. That is inconsistent with the rest of the patch-set, which > basically handles bots mmus as one. But changing that should be easy, I > think. Thanks for the feedback and suggestions, let me look at adding that logic. regards Suman ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 2/2] iommu/omap: Add support to program multiple iommus
Hi Suman, On Fri, Sep 01, 2017 at 11:21:45AM -0500, Suman Anna wrote: > It's primarily a question of whether each iommu platform device need to > be represented as a unique iommu_device or not. If you still think that > both these need to be presented to iommu core as one device, I would > have to add some glue logic in probe to tie the two devices together. I think that you should only call iommu_device_register and friends for the mmu-device you link the other devices against in .add_device. Otherwise people will see two mmus in sysfs, one of them with no devices behind it. That is inconsistent with the rest of the patch-set, which basically handles bots mmus as one. But changing that should be easy, I think. Regards, Joerg ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 2/2] iommu/omap: Add support to program multiple iommus
Hi Joerg, On 09/01/2017 05:01 AM, Joerg Roedel wrote: > Hi Suman, > > On Thu, Aug 31, 2017 at 08:14:02AM -0500, Suman Anna wrote: >> The OMAP IOMMU driver has been enhanced to support allowing >> multiple IOMMUs to be programmed by a single client user. This >> support is being added mainly to handle the DSP subsystems on >> the DRA7xx SoCs, which have two MMUs within the same subsystem. >> These MMUs provide translations to a processor core port and >> an internal EDMA port. This support allows both the MMUs to >> be programmed together, but with each one retaining it's own >> internal state objects. The internal EDMA is managed by the >> software running on the DSPs, and this design provides on-par >> functionality with previous generation OMAP DSPs where the >> EDMA and the DSP core shared the same MMU. > > I didn't get that from the review, so a question here: Do both MMUs show > show up in sysfs, means, do you register both of them the the core code? Yes, both MMUs are identical instances of the IP, and both are represented as their own platform devices (unique DT nodes), and since we created the iommu_group in probe of each device, each MMU has its own iommu_group as well. Here's a bit more output, root@dra7xx-evm:~# dmesg | grep mmu [0.519087] omap-iommu 40d01000.mmu: 40d01000.mmu registered [0.519580] omap-iommu 40d02000.mmu: 40d02000.mmu registered [0.520050] omap-iommu 58882000.mmu: 58882000.mmu registered [0.520495] omap-iommu 55082000.mmu: 55082000.mmu registered [0.521145] omap-iommu 41501000.mmu: 41501000.mmu registered [0.521634] omap-iommu 41502000.mmu: 41502000.mmu registered [0.522100] iommu: Adding device 5882.ipu to group 2 [0.522260] iommu: Adding device 5502.ipu to group 3 [0.522520] iommu: Adding device 4080.dsp to group 0 [0.522955] iommu: Adding device 4100.dsp to group 4 root@dra7xx-evm:~# ls -l /sys/class/iommu/ lrwxrwxrwx1 root root 0 Aug 8 03:59 40d01000.mmu -> ../../devices/platform/4400.ocp/40d01000.mmu/iommu/40d01000.mmu lrwxrwxrwx1 root root 0 Aug 8 03:59 40d02000.mmu -> ../../devices/platform/4400.ocp/40d02000.mmu/iommu/40d02000.mmu lrwxrwxrwx1 root root 0 Aug 8 03:59 41501000.mmu -> ../../devices/platform/4400.ocp/41501000.mmu/iommu/41501000.mmu lrwxrwxrwx1 root root 0 Aug 8 03:59 41502000.mmu -> ../../devices/platform/4400.ocp/41502000.mmu/iommu/41502000.mmu lrwxrwxrwx1 root root 0 Aug 8 03:59 55082000.mmu -> ../../devices/platform/4400.ocp/55082000.mmu/iommu/55082000.mmu lrwxrwxrwx1 root root 0 Aug 8 03:59 58882000.mmu -> ../../devices/platform/4400.ocp/58882000.mmu/iommu/58882000.mmu > Because I think with the solution implemented here, only a single, > combined MMU should be visible to iommu core code and in sysfs. So, the way I have implemented atm is to do the device linking only to one MMU in .add_device ops, while most of the remaining ops like map, unmap, attach_device, detach_device loop through both MMUs to do the programming. It's primarily a question of whether each iommu platform device need to be represented as a unique iommu_device or not. If you still think that both these need to be presented to iommu core as one device, I would have to add some glue logic in probe to tie the two devices together. regards Suman ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 00/13] mmu_notifier kill invalidate_page callback
On Wed, Aug 30, 2017, at 10:57 AM, Adam Borowski wrote: > On Tue, Aug 29, 2017 at 08:56:15PM -0400, Jerome Glisse wrote: > > I will wait for people to test and for result of my own test before > > reposting if need be, otherwise i will post as separate patch. > > > > > But from a _very_ quick read-through this looks fine. But it obviously > > > needs testing. > > > > > > People - *especially* the people who saw issues under KVM - can you > > > try out Jérôme's patch-series? I aded some people to the cc, the full > > > series is on lkml. Jérôme - do you have a git branch for people to > > > test that they could easily pull and try out? > > > > https://cgit.freedesktop.org/~glisse/linux mmu-notifier branch > > git://people.freedesktop.org/~glisse/linux > > Tested your branch as of 10f07641, on a long list of guest VMs. > No earth-shattering kaboom. I've been using the mmu_notifier branch @ a3d944233bcf8c for the last 36 hours or so, also without incident. Unlike most other reporters, I experienced a similar splat on 4.12: Aug 03 15:02:47 kvm_master kernel: [ cut here ] Aug 03 15:02:47 kvm_master kernel: WARNING: CPU: 13 PID: 1653 at arch/x86/kvm/mmu.c:682 mmu_spte_clear_track_bits+0xfb/0x100 [kvm] Aug 03 15:02:47 kvm_master kernel: Modules linked in: vhost_net vhost tap xt_conntrack xt_CHECKSUM iptable_mangle ipt_REJECT nf_reject_ipv4 xt_tcpudp tun ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter msr nls_iso8859_1 nls_cp437 intel_rapl ipt_ MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel input_leds pcbc aesni_intel led_class aes_x86_6 4 mxm_wmi crypto_simd glue_helper uvcvideo cryptd videobuf2_vmalloc videobuf2_memops igb videobuf2_v4l2 videobuf2_core snd_usb_audio videodev media joydev ptp evdev mousedev intel_cstate pps_core mac_hid intel_rapl_perf snd_hda_intel snd_virtuoso snd_usbmidi_lib snd_hda_codec snd_oxygen_lib snd_hda_core Aug 03 15:02:47 kvm_master kernel: snd_mpu401_uart snd_rawmidi snd_hwdep snd_seq_device snd_pcm snd_timer snd soundcore i2c_algo_bit pcspkr i2c_i801 lpc_ich ioatdma shpchp dca wmi acpi_power_meter tpm_tis tpm_tis_core tpm button bridge stp llc sch_fq_codel virtio_pci virtio_blk virtio_balloon virtio_net virtio_ring virtio kvm_intel kvm sg ip_tables x_tables hid_logitech_hidpp hid_logitech_dj hid_generic hid_microsoft usbhid hid sr_mod cdrom sd_mod xhci_pci ahci libahci xhci_hcd libata usbcore scsi_mod usb_common zfs(PO) zunicode(PO) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm vfio_pci irqbypass vfio_virqfd vfio_iommu_type1 vfio vfat fat ext4 crc16 jbd2 fscrypto mbcache dm_thin_pool dm_cache dm_persistent_data dm_bio_prison dm_bufio dm_raid raid456 libcrc32c Aug 03 15:02:47 kvm_master kernel: crc32c_generic crc32c_intel async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq dm_mod dax raid1 md_mod Aug 03 15:02:47 kvm_master kernel: CPU: 13 PID: 1653 Comm: kworker/13:2 Tainted: PB D W O4.12.3-1-ARCH #1 Aug 03 15:02:47 kvm_master kernel: Hardware name: Supermicro SYS-7038A-I/X10DAI, BIOS 2.0a 11/09/2016 Aug 03 15:02:47 kvm_master kernel: Workqueue: events mmput_async_fn Aug 03 15:02:47 kvm_master kernel: task: 9fa89751b900 task.stack: c179880d8000 Aug 03 15:02:47 kvm_master kernel: RIP: 0010:mmu_spte_clear_track_bits+0xfb/0x100 [kvm] Aug 03 15:02:47 kvm_master kernel: RSP: 0018:c179880dbc20 EFLAGS: 00010246 Aug 03 15:02:47 kvm_master kernel: RAX: RBX: 0009c07cce77 RCX: dead00ff Aug 03 15:02:47 kvm_master kernel: RDX: RSI: 9fa82d6d6f08 RDI: f6e76701f300 Aug 03 15:02:47 kvm_master kernel: RBP: c179880dbc38 R08: 0010 R09: 000d Aug 03 15:02:47 kvm_master kernel: R10: 9fa0a56b0008 R11: 9fa0a56b R12: 009c07cc Aug 03 15:02:47 kvm_master kernel: R13: 9fa88b99 R14: 9f9e19dbb1b8 R15: Aug 03 15:02:47 kvm_master kernel: FS: () GS:9fac5f34() knlGS: Aug 03 15:02:47 kvm_master kernel: CS: 0010 DS: ES: CR0: 80050033 Aug 03 15:02:47 kvm_master kernel: CR2: d1b542d71000 CR3: 000570a09000 CR4: 003426e0 Aug 03 15:02:47 kvm_master kernel: DR0:
Re: [PATCH 00/13] mmu_notifier kill invalidate_page callback
taskboxtes...@gmail.com liked your message with Boxer for Android. On Sep 1, 2017 10:48 AM, Jeff Cookwrote: On Wed, Aug 30, 2017, at 10:57 AM, Adam Borowski wrote: > On Tue, Aug 29, 2017 at 08:56:15PM -0400, Jerome Glisse wrote: > > I will wait for people to test and for result of my own test before > > reposting if need be, otherwise i will post as separate patch. > > > > > But from a _very_ quick read-through this looks fine. But it obviously > > > needs testing. > > > > > > People - *especially* the people who saw issues under KVM - can you > > > try out Jérôme's patch-series? I aded some people to the cc, the full > > > series is on lkml. Jérôme - do you have a git branch for people to > > > test that they could easily pull and try out? > > > > https://cgit.freedesktop.org/~glisse/linux mmu-notifier branch > > git://people.freedesktop.org/~glisse/linux > > Tested your branch as of 10f07641, on a long list of guest VMs. > No earth-shattering kaboom. I've been using the mmu_notifier branch @ a3d944233bcf8c for the last 36 hours or so, also without incident. Unlike most other reporters, I experienced a similar splat on 4.12: Aug 03 15:02:47 kvm_master kernel: [ cut here ] Aug 03 15:02:47 kvm_master kernel: WARNING: CPU: 13 PID: 1653 at arch/x86/kvm/mmu.c:682 mmu_spte_clear_track_bits+0xfb/0x100 [kvm] Aug 03 15:02:47 kvm_master kernel: Modules linked in: vhost_net vhost tap xt_conntrack xt_CHECKSUM iptable_mangle ipt_REJECT nf_reject_ipv4 xt_tcpudp tun ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter msr nls_iso8859_1 nls_cp437 intel_rapl ipt_ MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel input_leds pcbc aesni_intel led_class aes_x86_6 4 mxm_wmi crypto_simd glue_helper uvcvideo cryptd videobuf2_vmalloc videobuf2_memops igb videobuf2_v4l2 videobuf2_core snd_usb_audio videodev media joydev ptp evdev mousedev intel_cstate pps_core mac_hid intel_rapl_perf snd_hda_intel snd_virtuoso snd_usbmidi_lib snd_hda_codec snd_oxygen_lib snd_hda_core Aug 03 15:02:47 kvm_master kernel: snd_mpu401_uart snd_rawmidi snd_hwdep snd_seq_device snd_pcm snd_timer snd soundcore i2c_algo_bit pcspkr i2c_i801 lpc_ich ioatdma shpchp dca wmi acpi_power_meter tpm_tis tpm_tis_core tpm button bridge stp llc sch_fq_codel virtio_pci virtio_blk virtio_balloon virtio_net virtio_ring virtio kvm_intel kvm sg ip_tables x_tables hid_logitech_hidpp hid_logitech_dj hid_generic hid_microsoft usbhid hid sr_mod cdrom sd_mod xhci_pci ahci libahci xhci_hcd libata usbcore scsi_mod usb_common zfs(PO) zunicode(PO) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm vfio_pci irqbypass vfio_virqfd vfio_iommu_type1 vfio vfat fat ext4 crc16 jbd2 fscrypto mbcache dm_thin_pool dm_cache dm_persistent_data dm_bio_prison dm_bufio dm_raid raid456 libcrc32c Aug 03 15:02:47 kvm_master kernel: crc32c_generic crc32c_intel async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq dm_mod dax raid1 md_mod Aug 03 15:02:47 kvm_master kernel: CPU: 13 PID: 1653 Comm: kworker/13:2 Tainted: PB D W O4.12.3-1-ARCH #1 Aug 03 15:02:47 kvm_master kernel: Hardware name: Supermicro SYS-7038A-I/X10DAI, BIOS 2.0a 11/09/2016 Aug 03 15:02:47 kvm_master kernel: Workqueue: events mmput_async_fn Aug 03 15:02:47 kvm_master kernel: task: 9fa89751b900 task.stack: c179880d8000 Aug 03 15:02:47 kvm_master kernel: RIP: 0010:mmu_spte_clear_track_bits+0xfb/0x100 [kvm] Aug 03 15:02:47 kvm_master kernel: RSP: 0018:c179880dbc20 EFLAGS: 00010246 Aug 03 15:02:47 kvm_master kernel: RAX: RBX: 0009c07cce77 RCX: dead00ff Aug 03 15:02:47 kvm_master kernel: RDX: RSI: 9fa82d6d6f08 RDI: f6e76701f300 Aug 03 15:02:47 kvm_master kernel: RBP: c179880dbc38 R08: 0010 R09: 000d Aug 03 15:02:47 kvm_master kernel: R10: 9fa0a56b0008 R11: 9fa0a56b R12: 009c07cc Aug 03 15:02:47 kvm_master kernel: R13: 9fa88b99 R14: 9f9e19dbb1b8 R15: Aug 03 15:02:47 kvm_master kernel: FS: () GS:9fac5f34() knlGS: Aug 03 15:02:47 kvm_master kernel: CS: 0010 DS: ES: CR0: 80050033 Aug 03 15:02:47 kvm_master kernel: CR2: d1b542d71000
Re: [PATCH 2/2] iommu/omap: Add support to program multiple iommus
Hi Suman, On Thu, Aug 31, 2017 at 08:14:02AM -0500, Suman Anna wrote: > The OMAP IOMMU driver has been enhanced to support allowing > multiple IOMMUs to be programmed by a single client user. This > support is being added mainly to handle the DSP subsystems on > the DRA7xx SoCs, which have two MMUs within the same subsystem. > These MMUs provide translations to a processor core port and > an internal EDMA port. This support allows both the MMUs to > be programmed together, but with each one retaining it's own > internal state objects. The internal EDMA is managed by the > software running on the DSPs, and this design provides on-par > functionality with previous generation OMAP DSPs where the > EDMA and the DSP core shared the same MMU. I didn't get that from the review, so a question here: Do both MMUs show show up in sysfs, means, do you register both of them the the core code? Because I think with the solution implemented here, only a single, combined MMU should be visible to iommu core code and in sysfs. Regards, Joerg ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] MAINTAINERS: use the iommu list for the dma-mapping subsystem
On Fri, Sep 01, 2017 at 10:48:50AM +0100, Robin Murphy wrote: > On 01/09/17 08:56, Marek Szyprowski wrote: > > Hi Christoph, > > > > On 2017-09-01 09:50, Christoph Hellwig wrote: > >> Any comments? I'd like to add this to the 4.14 pull request. > >> > >> On Sat, Aug 26, 2017 at 11:27:34AM +0200, Christoph Hellwig wrote: > >>> Maintaining a subsystem with linux-kernel as the main list is painful > >>> as it has way to much traffic. On the other hand the dma-mapping > >>> subsystem is small enough that a list on its own would be silly. > >>> So use the list for the closes subsystem instead instead. > >>> > >>> Signed-off-by: Christoph Hellwig> > > > Like I already said, its fine for me. > > Acked-by: Marek Szyprowski > > Fine by me too. > > Joerg, Alex, are you OK with the odd bit of extra ML traffic? Fine by me, of course. Acked-by: Joerg Roedel ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] MAINTAINERS: use the iommu list for the dma-mapping subsystem
On 01/09/17 08:56, Marek Szyprowski wrote: > Hi Christoph, > > On 2017-09-01 09:50, Christoph Hellwig wrote: >> Any comments? I'd like to add this to the 4.14 pull request. >> >> On Sat, Aug 26, 2017 at 11:27:34AM +0200, Christoph Hellwig wrote: >>> Maintaining a subsystem with linux-kernel as the main list is painful >>> as it has way to much traffic. On the other hand the dma-mapping >>> subsystem is small enough that a list on its own would be silly. >>> So use the list for the closes subsystem instead instead. >>> >>> Signed-off-by: Christoph Hellwig> > Like I already said, its fine for me. > Acked-by: Marek Szyprowski Fine by me too. Joerg, Alex, are you OK with the odd bit of extra ML traffic? Thanks, Robin. >>> --- >>> MAINTAINERS | 2 +- >>> 1 file changed, 1 insertion(+), 1 deletion(-) >>> >>> diff --git a/MAINTAINERS b/MAINTAINERS >>> index 6f7721d1634c..1df11ed346a7 100644 >>> --- a/MAINTAINERS >>> +++ b/MAINTAINERS >>> @@ -4175,7 +4175,7 @@ DMA MAPPING HELPERS >>> M:Christoph Hellwig >>> M:Marek Szyprowski >>> R:Robin Murphy >>> -L:linux-ker...@vger.kernel.org >>> +L:iommu@lists.linux-foundation.org >>> T:git git://git.infradead.org/users/hch/dma-mapping.git >>> W:http://git.infradead.org/users/hch/dma-mapping.git >>> S:Supported >>> -- >>> 2.11.0 > > Best regards ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] iommu/dma: limit the IOVA allocated to dma-ranges region
On 01/09/17 10:26, Joerg Roedel wrote: > Adding Robin for review. > > On Thu, Aug 31, 2017 at 03:08:21PM -0700, Krishna Reddy wrote: >> Limit the IOVA allocated to dma-ranges specified for the device. >> This is necessary to ensure that IOVA allocated is addressable >> by device. Why? IOVA allocation is already constrained as much as it should be - if the device's DMA mask is wrong that's another problem, and this isn't the right place to fix it. dma_32bit_pfn means nothing more than an internal detail of IOVA allocator caching, which is subject to change[1]. As-is, on some platforms this patch will effectively force all allocations to fail already. Robin. [1]:https://www.mail-archive.com/iommu@lists.linux-foundation.org/msg19694.html >> Signed-off-by: Krishna Reddy>> --- >> drivers/iommu/dma-iommu.c | 5 + >> 1 file changed, 5 insertions(+) >> >> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c >> index 9d1cebe7f6cb..e8a8320b571b 100644 >> --- a/drivers/iommu/dma-iommu.c >> +++ b/drivers/iommu/dma-iommu.c >> @@ -364,6 +364,7 @@ static dma_addr_t iommu_dma_alloc_iova(struct >> iommu_domain *domain, >> struct iommu_dma_cookie *cookie = domain->iova_cookie; >> struct iova_domain *iovad = >iovad; >> unsigned long shift, iova_len, iova = 0; >> +dma_addr_t dma_end_addr; >> >> if (cookie->type == IOMMU_DMA_MSI_COOKIE) { >> cookie->msi_iova += size; >> @@ -381,6 +382,10 @@ static dma_addr_t iommu_dma_alloc_iova(struct >> iommu_domain *domain, >> if (iova_len < (1 << (IOVA_RANGE_CACHE_MAX_SIZE - 1))) >> iova_len = roundup_pow_of_two(iova_len); >> >> +/* Limit IOVA allocated to device addressable dma-ranges region. */ >> +dma_end_addr = (dma_addr_t)iovad->dma_32bit_pfn << shift; >> +dma_limit = dma_limit > dma_end_addr ? dma_end_addr : dma_limit; > > This looks like a good use-case for min(). > >> + >> if (domain->geometry.force_aperture) >> dma_limit = min(dma_limit, domain->geometry.aperture_end); >> >> -- >> 2.1.4 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2] iommu/vt-d: Don't be too aggressive when clearing one context entry
On Thu, Aug 31, 2017 at 10:58:11AM +0200, Filippo Sironi wrote: > Previously, we were invalidating context cache and IOTLB globally when > clearing one context entry. This is a tad too aggressive. > Invalidate the context cache and IOTLB for the interested device only. > > Signed-off-by: Filippo Sironi> Cc: David Woodhouse > Cc: David Woodhouse > Cc: Joerg Roedel > Cc: Jacob Pan > Cc: iommu@lists.linux-foundation.org > Cc: linux-ker...@vger.kernel.org > --- > drivers/iommu/intel-iommu.c | 42 -- > 1 file changed, 24 insertions(+), 18 deletions(-) Applied, thanks. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] iommu/dma: limit the IOVA allocated to dma-ranges region
Adding Robin for review. On Thu, Aug 31, 2017 at 03:08:21PM -0700, Krishna Reddy wrote: > Limit the IOVA allocated to dma-ranges specified for the device. > This is necessary to ensure that IOVA allocated is addressable > by device. > > Signed-off-by: Krishna Reddy> --- > drivers/iommu/dma-iommu.c | 5 + > 1 file changed, 5 insertions(+) > > diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c > index 9d1cebe7f6cb..e8a8320b571b 100644 > --- a/drivers/iommu/dma-iommu.c > +++ b/drivers/iommu/dma-iommu.c > @@ -364,6 +364,7 @@ static dma_addr_t iommu_dma_alloc_iova(struct > iommu_domain *domain, > struct iommu_dma_cookie *cookie = domain->iova_cookie; > struct iova_domain *iovad = >iovad; > unsigned long shift, iova_len, iova = 0; > + dma_addr_t dma_end_addr; > > if (cookie->type == IOMMU_DMA_MSI_COOKIE) { > cookie->msi_iova += size; > @@ -381,6 +382,10 @@ static dma_addr_t iommu_dma_alloc_iova(struct > iommu_domain *domain, > if (iova_len < (1 << (IOVA_RANGE_CACHE_MAX_SIZE - 1))) > iova_len = roundup_pow_of_two(iova_len); > > + /* Limit IOVA allocated to device addressable dma-ranges region. */ > + dma_end_addr = (dma_addr_t)iovad->dma_32bit_pfn << shift; > + dma_limit = dma_limit > dma_end_addr ? dma_end_addr : dma_limit; This looks like a good use-case for min(). > + > if (domain->geometry.force_aperture) > dma_limit = min(dma_limit, domain->geometry.aperture_end); > > -- > 2.1.4 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v6 3/3] iommu/arm-smmu-v3:Enable ACPI based HiSilicon erratum 161010801
On 10/08/2017 18:27, Will Deacon wrote: On Wed, Aug 09, 2017 at 11:07:15AM +0100, Shameer Kolothum wrote: The HiSilicon erratum 161010801 describes the limitation of HiSilicon platforms Hip06/Hip07 to support the SMMU mappings for MSI transactions. On these platforms GICv3 ITS translator is presented with the deviceID by extending the MSI payload data to 64 bits to include the deviceID. Hence, the PCIe controller on this platforms has to differentiate the MSI payload against other DMA payload and has to modify the MSI payload. This basically makes it difficult for this platforms to have a SMMU translation for MSI. This patch implements a ACPI table based quirk to reserve the hw msi regions in the smmu-v3 driver which means these address regions will not be translated and will be excluded from iova allocations. Signed-off-by: Shameer Kolothum--- drivers/iommu/arm-smmu-v3.c | 27 ++- 1 file changed, 22 insertions(+), 5 deletions(-) Please can you also add a devicetree binding with corresponding documentation to enable this workaround on non-ACPI based systems too? It should be straightforward if you update the arm_smmu_options table. Hi Will, Lorenzo, Robin, I have created the patch to add DT support for this erratum. However, currently I have only added support for pci-based devices. I'm a bit stumped on how to add platform device support, or if we should also add support at all. And I would rather ask before sending the patches. The specific issue is that I know of no platform device with an ITS msi-parent which we would want reserve, i.e. do not translate msi. And, if there were, how to do it. The only platform devices I know off with msi ITS parents are SMMUv3 and mbigen. For mbigen, the msi are not translated. Actually, as I see under IORT solution, for mbigen we don't reserve the hw msi region as the SMMUv3 does not have an ID mapping. And we have no equivalent ID mapping property for DT SMMUv3, so cannot create an equivalent check. So here is how the DT iommu get reserved region function with only pci device support looks: /** * of_iommu_its_get_resv_regions - Reserved region driver helper * @dev: Device from iommu_get_resv_regions() * @list: Reserved region list from iommu_get_resv_regions() * * Returns: Number of reserved regions on success(0 if no associated ITS), * appropriate error value otherwise. */ int of_iommu_its_get_resv_regions(struct device *dev, struct list_head *head) { struct device_node *of_node = NULL; struct device *parent_dev; const __be32 *msi_map; int num_mappings, i, err, len, resv = 0; /* walk up to find the parent with a msi-map property */ for (parent_dev = dev; parent_dev; parent_dev = parent_dev->parent) { if (!parent_dev->of_node) continue; /* * Current logic to reserve ITS regions for only PCI root * complex. */ msi_map = of_get_property(parent_dev->of_node, "msi-map", ); if (msi_map) { of_node = parent_dev->of_node; break; } } if (!of_node) return 0; num_mappings = of_count_phandle_with_args(of_node, "msi-map", NULL) / 4; for (i = 0; i < num_mappings; i++) { int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO; struct iommu_resv_region *region; struct device_node *msi_node; struct resource resource; u32 phandle; phandle = be32_to_cpup(msi_map + 1); msi_node = of_find_node_by_phandle(phandle); if (!msi_node) return -ENODEV; /* * There may be different types of msi-controller, so check * for ITS. */ if (!of_device_is_compatible(msi_node, "arm,gic-v3-its")) { of_node_put(msi_node); continue; } err = of_address_to_resource(msi_node, 0, ); of_node_put(msi_node); if (err) return err; region = iommu_alloc_resv_region(resource.start, SZ_128K, prot, IOMMU_RESV_MSI); if (region) { list_add_tail(>list, head); resv++; } } return (resv == num_mappings) ? resv : -ENODEV; } Any feedback is appreciated. Thanks, John Thanks, Will . ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] MAINTAINERS: use the iommu list for the dma-mapping subsystem
Hi Christoph, On 2017-09-01 09:50, Christoph Hellwig wrote: Any comments? I'd like to add this to the 4.14 pull request. On Sat, Aug 26, 2017 at 11:27:34AM +0200, Christoph Hellwig wrote: Maintaining a subsystem with linux-kernel as the main list is painful as it has way to much traffic. On the other hand the dma-mapping subsystem is small enough that a list on its own would be silly. So use the list for the closes subsystem instead instead. Signed-off-by: Christoph HellwigLike I already said, its fine for me. Acked-by: Marek Szyprowski --- MAINTAINERS | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/MAINTAINERS b/MAINTAINERS index 6f7721d1634c..1df11ed346a7 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -4175,7 +4175,7 @@ DMA MAPPING HELPERS M:Christoph Hellwig M:Marek Szyprowski R:Robin Murphy -L: linux-ker...@vger.kernel.org +L: iommu@lists.linux-foundation.org T:git git://git.infradead.org/users/hch/dma-mapping.git W:http://git.infradead.org/users/hch/dma-mapping.git S:Supported -- 2.11.0 Best regards -- Marek Szyprowski, PhD Samsung R Institute Poland ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] MAINTAINERS: use the iommu list for the dma-mapping subsystem
Any comments? I'd like to add this to the 4.14 pull request. On Sat, Aug 26, 2017 at 11:27:34AM +0200, Christoph Hellwig wrote: > Maintaining a subsystem with linux-kernel as the main list is painful > as it has way to much traffic. On the other hand the dma-mapping > subsystem is small enough that a list on its own would be silly. > So use the list for the closes subsystem instead instead. > > Signed-off-by: Christoph Hellwig> --- > MAINTAINERS | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/MAINTAINERS b/MAINTAINERS > index 6f7721d1634c..1df11ed346a7 100644 > --- a/MAINTAINERS > +++ b/MAINTAINERS > @@ -4175,7 +4175,7 @@ DMA MAPPING HELPERS > M: Christoph Hellwig > M: Marek Szyprowski > R: Robin Murphy > -L: linux-ker...@vger.kernel.org > +L: iommu@lists.linux-foundation.org > T: git git://git.infradead.org/users/hch/dma-mapping.git > W: http://git.infradead.org/users/hch/dma-mapping.git > S: Supported > -- > 2.11.0 > > ___ > iommu mailing list > iommu@lists.linux-foundation.org > https://lists.linuxfoundation.org/mailman/listinfo/iommu ---end quoted text--- ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu