Re: [PATCH 2/2] iommu: Introduce Interface for IOMMU TLB Flushing

2017-09-01 Thread Joerg Roedel
Hi Will,

On Fri, Sep 01, 2017 at 06:20:45PM +0100, Will Deacon wrote:
> On Wed, Aug 23, 2017 at 03:50:04PM +0200, Joerg Roedel wrote:

> > +EXPORT_SYMBOL_GPL(iommu_unmap_fast);
> 
> Really minor nit, but I think that iommu_unmap_nosync is a more descriptive
> name (who wouldn't want to use the _fast version?!).

Yeah, but I wanted the name to motivate people to use it, or at least
think about why they are not using the _fast variant. Therefore I
decided for that name and not the more descriptive _sync version.

> But anyway, this looks like a really welcome change and it addresses most
> of my concerns on v1, so, modulo Robin's comments about map_sg:

Thanks, I removed the stale map_sg_sync stuff before applying.


Regards,

Joerg

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH] mm/mmu_notifier: avoid double notification when it is useless

2017-09-01 Thread jglisse
From: Jérôme Glisse 

(Note that this is 4.15 material or 4.14 if people are extra confident. I
 am posting now to get people to test. To that effect maybe it would be a
 good idea to have that patch sit in linux-next for a while for testing.

 Other motivation is that the problem is fresh in everyone's memory

 Thanks to Andrea for thinking of a problematic scenario for COW)

When clearing a pte/pmd we are given a choice to notify the event through
(notify version of *_clear_flush call mmu_notifier_invalidate_range) under
the page table lock. But that notification is not necessary in all cases.

This patches remove almost all the case where it is useless to have a call to
mmu_notifier_invalidate_range() before mmu_notifier_invalidate_range_end().
It also adds documentation in all those case explaining why.

Below is a more in depth analysis of why this is fine to do this:

For secondary TLB (non CPU TLB) like IOMMU TLB or device TLB (when device use
thing like ATS/PASID to get the IOMMU to walk the CPU page table to access a
process virtual address space). There is only 2 cases when you need to notify
those secondary TLB while holding page table lock when clearing a pte/pmd:

  A) page backing address is free before mmu_notifier_invalidate_range_end()
  B) a page table entry is updated to point to a new page (COW, write fault
 on zero page, __replace_page(), ...)

Case A is obvious you do not want to take the risk for the device to write to
a page that might now be use by some completely different task.

Case B is more subtle. For correctness it requires the following sequence to
happen:
  - take page table lock
  - clear page table entry and notify ([pmd/pte]p_huge_clear_flush_notify())
  - set page table entry to point to new page

If clearing the page table entry is not followed by a notify before setting
the new pte/pmd value then you can break memory model like C11 or C++11 for
the device.

Consider the following scenario (device use a feature similar to ATS/PASID):

Two address addrA and addrB such that |addrA - addrB| >= PAGE_SIZE we assume
they are write protected for COW (other case of B apply too).

[Time N] 
CPU-thread-0  {try to write to addrA}
CPU-thread-1  {try to write to addrB}
CPU-thread-2  {}
CPU-thread-3  {}
DEV-thread-0  {read addrA and populate device TLB}
DEV-thread-2  {read addrB and populate device TLB}
[Time N+1] --
CPU-thread-0  {COW_step0: {mmu_notifier_invalidate_range_start(addrA)}}
CPU-thread-1  {COW_step0: {mmu_notifier_invalidate_range_start(addrB)}}
CPU-thread-2  {}
CPU-thread-3  {}
DEV-thread-0  {}
DEV-thread-2  {}
[Time N+2] --
CPU-thread-0  {COW_step1: {update page table to point to new page for addrA}}
CPU-thread-1  {COW_step1: {update page table to point to new page for addrB}}
CPU-thread-2  {}
CPU-thread-3  {}
DEV-thread-0  {}
DEV-thread-2  {}
[Time N+3] --
CPU-thread-0  {preempted}
CPU-thread-1  {preempted}
CPU-thread-2  {write to addrA which is a write to new page}
CPU-thread-3  {}
DEV-thread-0  {}
DEV-thread-2  {}
[Time N+3] --
CPU-thread-0  {preempted}
CPU-thread-1  {preempted}
CPU-thread-2  {}
CPU-thread-3  {write to addrB which is a write to new page}
DEV-thread-0  {}
DEV-thread-2  {}
[Time N+4] --
CPU-thread-0  {preempted}
CPU-thread-1  {COW_step3: {mmu_notifier_invalidate_range_end(addrB)}}
CPU-thread-2  {}
CPU-thread-3  {}
DEV-thread-0  {}
DEV-thread-2  {}
[Time N+5] --
CPU-thread-0  {preempted}
CPU-thread-1  {}
CPU-thread-2  {}
CPU-thread-3  {}
DEV-thread-0  {read addrA from old page}
DEV-thread-2  {read addrB from new page}

So here because at time N+2 the clear page table entry was not pair with a
notification to invalidate the secondary TLB, the device see the new value for
addrB before seing the new value for addrA. This break total memory ordering
for the device.

When changing a pte to write protect or to point to a new write protected page
with same content (KSM) it is fine to delay the mmu_notifier_invalidate_range
call to mmu_notifier_invalidate_range_end() outside the page table lock. This
is true ven if the thread doing the page table update is preempted right after
releasing page table lock but before call mmu_notifier_invalidate_range_end().

Signed-off-by: Jérôme Glisse 
Cc: Andrea Arcangeli 
Cc: Nadav Amit 
Cc: Linus Torvalds 
Cc: Andrew Morton 
Cc: Joerg Roedel 
Cc: Suravee Suthikulpanit 
Cc: David Woodhouse 

Re: [PATCH 2/2] iommu: Introduce Interface for IOMMU TLB Flushing

2017-09-01 Thread Will Deacon
Hi Joerg,

On Wed, Aug 23, 2017 at 03:50:04PM +0200, Joerg Roedel wrote:
> From: Joerg Roedel 
> 
> With the current IOMMU-API the hardware TLBs have to be
> flushed in every iommu_ops->unmap() call-back.
> 
> For unmapping large amounts of address space, like it
> happens when a KVM domain with assigned devices is
> destroyed, this causes thousands of unnecessary TLB flushes
> in the IOMMU hardware because the unmap call-back runs for
> every unmapped physical page.
> 
> With the TLB Flush Interface and the new iommu_unmap_fast()
> function introduced here the need to clean the hardware TLBs
> is removed from the unmapping code-path. Users of
> iommu_unmap_fast() have to explicitly call the TLB-Flush
> functions to sync the page-table changes to the hardware.
> 
> Three functions for TLB-Flushes are introduced:
> 
>   * iommu_flush_tlb_all() - Flushes all TLB entries
> associated with that
> domain. TLBs entries are
> flushed when this function
> returns.
> 
>   * iommu_tlb_range_add() - This will add a given
> range to the flush queue
> for this domain.
> 
>   * iommu_tlb_sync() - Flushes all queued ranges from
>the hardware TLBs. Returns when
>the flush is finished.
> 
> The semantic of this interface is intentionally similar to
> the iommu_gather_ops from the io-pgtable code.
> 
> Cc: Alex Williamson 
> Cc: Will Deacon 
> Cc: Robin Murphy 
> Signed-off-by: Joerg Roedel 
> ---
>  drivers/iommu/iommu.c | 32 ---
>  include/linux/iommu.h | 72 
> ++-
>  2 files changed, 99 insertions(+), 5 deletions(-)

[...]

> +size_t iommu_unmap(struct iommu_domain *domain,
> +unsigned long iova, size_t size)
> +{
> + return __iommu_unmap(domain, iova, size, true);
> +}
>  EXPORT_SYMBOL_GPL(iommu_unmap);
>  
> +size_t iommu_unmap_fast(struct iommu_domain *domain,
> + unsigned long iova, size_t size)
> +{
> + return __iommu_unmap(domain, iova, size, false);
> +}
> +EXPORT_SYMBOL_GPL(iommu_unmap_fast);

Really minor nit, but I think that iommu_unmap_nosync is a more descriptive
name (who wouldn't want to use the _fast version?!).

But anyway, this looks like a really welcome change and it addresses most
of my concerns on v1, so, modulo Robin's comments about map_sg:

Reviewed-by: Will Deacon 

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 2/2] iommu/omap: Add support to program multiple iommus

2017-09-01 Thread Suman Anna via iommu
On 09/01/2017 11:33 AM, Joerg Roedel wrote:
> Hi Suman,
> On Fri, Sep 01, 2017 at 11:21:45AM -0500, Suman Anna wrote:
>> It's primarily a question of whether each iommu platform device need to
>> be represented as a unique iommu_device or not. If you still think that
>> both these need to be presented to iommu core as one device, I would
>> have to add some glue logic in probe to tie the two devices together.
> 
> I think that you should only call iommu_device_register and friends for
> the mmu-device you link the other devices against in .add_device.
> Otherwise people will see two mmus in sysfs, one of them with no devices
> behind it. That is inconsistent with the rest of the patch-set, which
> basically handles bots mmus as one. But changing that should be easy, I
> think.

Thanks for the feedback and suggestions, let me look at adding that logic.

regards
Suman

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 2/2] iommu/omap: Add support to program multiple iommus

2017-09-01 Thread Joerg Roedel
Hi Suman,
On Fri, Sep 01, 2017 at 11:21:45AM -0500, Suman Anna wrote:
> It's primarily a question of whether each iommu platform device need to
> be represented as a unique iommu_device or not. If you still think that
> both these need to be presented to iommu core as one device, I would
> have to add some glue logic in probe to tie the two devices together.

I think that you should only call iommu_device_register and friends for
the mmu-device you link the other devices against in .add_device.
Otherwise people will see two mmus in sysfs, one of them with no devices
behind it. That is inconsistent with the rest of the patch-set, which
basically handles bots mmus as one. But changing that should be easy, I
think.

Regards,

Joerg

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 2/2] iommu/omap: Add support to program multiple iommus

2017-09-01 Thread Suman Anna via iommu
Hi Joerg,

On 09/01/2017 05:01 AM, Joerg Roedel wrote:
> Hi Suman,
> 
> On Thu, Aug 31, 2017 at 08:14:02AM -0500, Suman Anna wrote:
>> The OMAP IOMMU driver has been enhanced to support allowing
>> multiple IOMMUs to be programmed by a single client user. This
>> support is being added mainly to handle the DSP subsystems on
>> the DRA7xx SoCs, which have two MMUs within the same subsystem.
>> These MMUs provide translations to a processor core port and
>> an internal EDMA port. This support allows both the MMUs to
>> be programmed together, but with each one retaining it's own
>> internal state objects. The internal EDMA is managed by the
>> software running on the DSPs, and this design provides on-par
>> functionality with previous generation OMAP DSPs where the
>> EDMA and the DSP core shared the same MMU.
> 
> I didn't get that from the review, so a question here: Do both MMUs show
> show up in sysfs, means, do you register both of them the the core code?

Yes, both MMUs are identical instances of the IP, and both are
represented as their own platform devices (unique DT nodes), and since
we created the iommu_group in probe of each device, each MMU has its own
iommu_group as well.

Here's a bit more output,

root@dra7xx-evm:~# dmesg | grep mmu
[0.519087] omap-iommu 40d01000.mmu: 40d01000.mmu registered
[0.519580] omap-iommu 40d02000.mmu: 40d02000.mmu registered
[0.520050] omap-iommu 58882000.mmu: 58882000.mmu registered
[0.520495] omap-iommu 55082000.mmu: 55082000.mmu registered
[0.521145] omap-iommu 41501000.mmu: 41501000.mmu registered
[0.521634] omap-iommu 41502000.mmu: 41502000.mmu registered
[0.522100] iommu: Adding device 5882.ipu to group 2
[0.522260] iommu: Adding device 5502.ipu to group 3
[0.522520] iommu: Adding device 4080.dsp to group 0
[0.522955] iommu: Adding device 4100.dsp to group 4
root@dra7xx-evm:~# ls -l /sys/class/iommu/
lrwxrwxrwx1 root root 0 Aug  8 03:59 40d01000.mmu ->
../../devices/platform/4400.ocp/40d01000.mmu/iommu/40d01000.mmu
lrwxrwxrwx1 root root 0 Aug  8 03:59 40d02000.mmu ->
../../devices/platform/4400.ocp/40d02000.mmu/iommu/40d02000.mmu
lrwxrwxrwx1 root root 0 Aug  8 03:59 41501000.mmu ->
../../devices/platform/4400.ocp/41501000.mmu/iommu/41501000.mmu
lrwxrwxrwx1 root root 0 Aug  8 03:59 41502000.mmu ->
../../devices/platform/4400.ocp/41502000.mmu/iommu/41502000.mmu
lrwxrwxrwx1 root root 0 Aug  8 03:59 55082000.mmu ->
../../devices/platform/4400.ocp/55082000.mmu/iommu/55082000.mmu
lrwxrwxrwx1 root root 0 Aug  8 03:59 58882000.mmu ->
../../devices/platform/4400.ocp/58882000.mmu/iommu/58882000.mmu

> Because I think with the solution implemented here, only a single,
> combined MMU should be visible to iommu core code and in sysfs.

So, the way I have implemented atm is to do the device linking only to
one MMU in .add_device ops, while most of the remaining ops like map,
unmap, attach_device, detach_device loop through both MMUs to do the
programming.

It's primarily a question of whether each iommu platform device need to
be represented as a unique iommu_device or not. If you still think that
both these need to be presented to iommu core as one device, I would
have to add some glue logic in probe to tie the two devices together.

regards
Suman
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 00/13] mmu_notifier kill invalidate_page callback

2017-09-01 Thread Jeff Cook
On Wed, Aug 30, 2017, at 10:57 AM, Adam Borowski wrote:
> On Tue, Aug 29, 2017 at 08:56:15PM -0400, Jerome Glisse wrote:
> > I will wait for people to test and for result of my own test before
> > reposting if need be, otherwise i will post as separate patch.
> >
> > > But from a _very_ quick read-through this looks fine. But it obviously
> > > needs testing.
> > > 
> > > People - *especially* the people who saw issues under KVM - can you
> > > try out Jérôme's patch-series? I aded some people to the cc, the full
> > > series is on lkml. Jérôme - do you have a git branch for people to
> > > test that they could easily pull and try out?
> > 
> > https://cgit.freedesktop.org/~glisse/linux mmu-notifier branch
> > git://people.freedesktop.org/~glisse/linux
> 
> Tested your branch as of 10f07641, on a long list of guest VMs.
> No earth-shattering kaboom.

I've been using the mmu_notifier branch @ a3d944233bcf8c for the last 36
hours or so, also without incident.

Unlike most other reporters, I experienced a similar splat on 4.12:

Aug 03 15:02:47 kvm_master kernel: [ cut here ]
Aug 03 15:02:47 kvm_master kernel: WARNING: CPU: 13 PID: 1653 at
arch/x86/kvm/mmu.c:682 mmu_spte_clear_track_bits+0xfb/0x100 [kvm]
Aug 03 15:02:47 kvm_master kernel: Modules linked in: vhost_net vhost
tap xt_conntrack xt_CHECKSUM iptable_mangle ipt_REJECT nf_reject_ipv4
xt_tcpudp tun ebtable_filter ebtables ip6table_filter ip6_tables
iptable_filter msr nls_iso8859_1 nls_cp437 intel_rapl ipt_
MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4
nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack sb_edac
x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul
crc32_pclmul ghash_clmulni_intel input_leds pcbc aesni_intel led_class
aes_x86_6
4 mxm_wmi crypto_simd glue_helper uvcvideo cryptd videobuf2_vmalloc
videobuf2_memops igb videobuf2_v4l2 videobuf2_core snd_usb_audio
videodev media joydev ptp evdev mousedev intel_cstate pps_core mac_hid
intel_rapl_perf snd_hda_intel snd_virtuoso snd_usbmidi_lib snd_hda_codec
snd_oxygen_lib snd_hda_core
Aug 03 15:02:47 kvm_master kernel:  snd_mpu401_uart snd_rawmidi
snd_hwdep snd_seq_device snd_pcm snd_timer snd soundcore i2c_algo_bit
pcspkr i2c_i801 lpc_ich ioatdma shpchp dca wmi acpi_power_meter tpm_tis
tpm_tis_core tpm button bridge stp llc sch_fq_codel virtio_pci
virtio_blk virtio_balloon virtio_net virtio_ring virtio kvm_intel kvm sg
ip_tables x_tables hid_logitech_hidpp hid_logitech_dj hid_generic
hid_microsoft usbhid hid sr_mod cdrom sd_mod xhci_pci ahci libahci
xhci_hcd libata usbcore scsi_mod usb_common zfs(PO) zunicode(PO)
zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) drm_kms_helper
syscopyarea sysfillrect sysimgblt fb_sys_fops drm vfio_pci irqbypass
vfio_virqfd vfio_iommu_type1 vfio vfat fat ext4 crc16 jbd2 fscrypto
mbcache dm_thin_pool dm_cache dm_persistent_data dm_bio_prison dm_bufio
dm_raid raid456 libcrc32c 
Aug 03 15:02:47 kvm_master kernel:  crc32c_generic crc32c_intel
async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq
dm_mod dax raid1 md_mod  
Aug 03 15:02:47 kvm_master kernel: CPU: 13 PID: 1653 Comm: kworker/13:2
Tainted: PB D W  O4.12.3-1-ARCH #1 
Aug 03 15:02:47 kvm_master kernel: Hardware name: Supermicro
SYS-7038A-I/X10DAI, BIOS 2.0a 11/09/2016  
Aug 03 15:02:47 kvm_master kernel: Workqueue: events mmput_async_fn  
Aug 03 15:02:47 kvm_master kernel: task: 9fa89751b900 task.stack:
c179880d8000 
Aug 03 15:02:47 kvm_master kernel: RIP:
0010:mmu_spte_clear_track_bits+0xfb/0x100 [kvm]  
Aug 03 15:02:47 kvm_master kernel: RSP: 0018:c179880dbc20 EFLAGS:
00010246 
Aug 03 15:02:47 kvm_master kernel: RAX:  RBX:
0009c07cce77 RCX: dead00ff   
Aug 03 15:02:47 kvm_master kernel: RDX:  RSI:
9fa82d6d6f08 RDI: f6e76701f300   
Aug 03 15:02:47 kvm_master kernel: RBP: c179880dbc38 R08:
0010 R09: 000d   
Aug 03 15:02:47 kvm_master kernel: R10: 9fa0a56b0008 R11:
9fa0a56b R12: 009c07cc   
Aug 03 15:02:47 kvm_master kernel: R13: 9fa88b99 R14:
9f9e19dbb1b8 R15:    
Aug 03 15:02:47 kvm_master kernel: FS:  ()
GS:9fac5f34() knlGS:
Aug 03 15:02:47 kvm_master kernel: CS:  0010 DS:  ES:  CR0:
80050033   
Aug 03 15:02:47 kvm_master kernel: CR2: d1b542d71000 CR3:
000570a09000 CR4: 003426e0   
Aug 03 15:02:47 kvm_master kernel: DR0:  

Re: [PATCH 00/13] mmu_notifier kill invalidate_page callback

2017-09-01 Thread taskboxtester
taskboxtes...@gmail.com liked your message with Boxer for Android.


On Sep 1, 2017 10:48 AM, Jeff Cook  wrote:

On Wed, Aug 30, 2017, at 10:57 AM, Adam Borowski wrote:
> On Tue, Aug 29, 2017 at 08:56:15PM -0400, Jerome Glisse wrote:
> > I will wait for people to test and for result of my own test before
> > reposting if need be, otherwise i will post as separate patch.
> >
> > > But from a _very_ quick read-through this looks fine. But it obviously
> > > needs testing.
> > > 
> > > People - *especially* the people who saw issues under KVM - can you
> > > try out Jérôme's patch-series? I aded some people to the cc, the full
> > > series is on lkml. Jérôme - do you have a git branch for people to
> > > test that they could easily pull and try out?
> > 
> > https://cgit.freedesktop.org/~glisse/linux mmu-notifier branch
> > git://people.freedesktop.org/~glisse/linux
> 
> Tested your branch as of 10f07641, on a long list of guest VMs.
> No earth-shattering kaboom.

I've been using the mmu_notifier branch @ a3d944233bcf8c for the last 36
hours or so, also without incident.

Unlike most other reporters, I experienced a similar splat on 4.12:

Aug 03 15:02:47 kvm_master kernel: [ cut here ]
Aug 03 15:02:47 kvm_master kernel: WARNING: CPU: 13 PID: 1653 at
arch/x86/kvm/mmu.c:682 mmu_spte_clear_track_bits+0xfb/0x100 [kvm]
Aug 03 15:02:47 kvm_master kernel: Modules linked in: vhost_net vhost
tap xt_conntrack xt_CHECKSUM iptable_mangle ipt_REJECT nf_reject_ipv4
xt_tcpudp tun ebtable_filter ebtables ip6table_filter ip6_tables
iptable_filter msr nls_iso8859_1 nls_cp437 intel_rapl ipt_
MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4
nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack sb_edac
x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul
crc32_pclmul ghash_clmulni_intel input_leds pcbc aesni_intel led_class
aes_x86_6
4 mxm_wmi crypto_simd glue_helper uvcvideo cryptd videobuf2_vmalloc
videobuf2_memops igb videobuf2_v4l2 videobuf2_core snd_usb_audio
videodev media joydev ptp evdev mousedev intel_cstate pps_core mac_hid
intel_rapl_perf snd_hda_intel snd_virtuoso snd_usbmidi_lib snd_hda_codec
snd_oxygen_lib snd_hda_core
Aug 03 15:02:47 kvm_master kernel:  snd_mpu401_uart snd_rawmidi
snd_hwdep snd_seq_device snd_pcm snd_timer snd soundcore i2c_algo_bit
pcspkr i2c_i801 lpc_ich ioatdma shpchp dca wmi acpi_power_meter tpm_tis
tpm_tis_core tpm button bridge stp llc sch_fq_codel virtio_pci
virtio_blk virtio_balloon virtio_net virtio_ring virtio kvm_intel kvm sg
ip_tables x_tables hid_logitech_hidpp hid_logitech_dj hid_generic
hid_microsoft usbhid hid sr_mod cdrom sd_mod xhci_pci ahci libahci
xhci_hcd libata usbcore scsi_mod usb_common zfs(PO) zunicode(PO)
zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) drm_kms_helper
syscopyarea sysfillrect sysimgblt fb_sys_fops drm vfio_pci irqbypass
vfio_virqfd vfio_iommu_type1 vfio vfat fat ext4 crc16 jbd2 fscrypto
mbcache dm_thin_pool dm_cache dm_persistent_data dm_bio_prison dm_bufio
dm_raid raid456 libcrc32c 
Aug 03 15:02:47 kvm_master kernel:  crc32c_generic crc32c_intel
async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq
dm_mod dax raid1 md_mod  
Aug 03 15:02:47 kvm_master kernel: CPU: 13 PID: 1653 Comm: kworker/13:2
Tainted: PB D W  O4.12.3-1-ARCH #1 
Aug 03 15:02:47 kvm_master kernel: Hardware name: Supermicro
SYS-7038A-I/X10DAI, BIOS 2.0a 11/09/2016  
Aug 03 15:02:47 kvm_master kernel: Workqueue: events mmput_async_fn  
Aug 03 15:02:47 kvm_master kernel: task: 9fa89751b900 task.stack:
c179880d8000 
Aug 03 15:02:47 kvm_master kernel: RIP:
0010:mmu_spte_clear_track_bits+0xfb/0x100 [kvm]  
Aug 03 15:02:47 kvm_master kernel: RSP: 0018:c179880dbc20 EFLAGS:
00010246 
Aug 03 15:02:47 kvm_master kernel: RAX:  RBX:
0009c07cce77 RCX: dead00ff   
Aug 03 15:02:47 kvm_master kernel: RDX:  RSI:
9fa82d6d6f08 RDI: f6e76701f300   
Aug 03 15:02:47 kvm_master kernel: RBP: c179880dbc38 R08:
0010 R09: 000d   
Aug 03 15:02:47 kvm_master kernel: R10: 9fa0a56b0008 R11:
9fa0a56b R12: 009c07cc   
Aug 03 15:02:47 kvm_master kernel: R13: 9fa88b99 R14:
9f9e19dbb1b8 R15:    
Aug 03 15:02:47 kvm_master kernel: FS:  ()
GS:9fac5f34() knlGS:
Aug 03 15:02:47 kvm_master kernel: CS:  0010 DS:  ES:  CR0:
80050033   
Aug 03 15:02:47 kvm_master kernel: CR2: d1b542d71000 

Re: [PATCH 2/2] iommu/omap: Add support to program multiple iommus

2017-09-01 Thread Joerg Roedel
Hi Suman,

On Thu, Aug 31, 2017 at 08:14:02AM -0500, Suman Anna wrote:
> The OMAP IOMMU driver has been enhanced to support allowing
> multiple IOMMUs to be programmed by a single client user. This
> support is being added mainly to handle the DSP subsystems on
> the DRA7xx SoCs, which have two MMUs within the same subsystem.
> These MMUs provide translations to a processor core port and
> an internal EDMA port. This support allows both the MMUs to
> be programmed together, but with each one retaining it's own
> internal state objects. The internal EDMA is managed by the
> software running on the DSPs, and this design provides on-par
> functionality with previous generation OMAP DSPs where the
> EDMA and the DSP core shared the same MMU.

I didn't get that from the review, so a question here: Do both MMUs show
show up in sysfs, means, do you register both of them the the core code?

Because I think with the solution implemented here, only a single,
combined MMU should be visible to iommu core code and in sysfs.


Regards,

Joerg
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] MAINTAINERS: use the iommu list for the dma-mapping subsystem

2017-09-01 Thread Joerg Roedel
On Fri, Sep 01, 2017 at 10:48:50AM +0100, Robin Murphy wrote:
> On 01/09/17 08:56, Marek Szyprowski wrote:
> > Hi Christoph,
> > 
> > On 2017-09-01 09:50, Christoph Hellwig wrote:
> >> Any comments?  I'd like to add this to the 4.14 pull request.
> >>
> >> On Sat, Aug 26, 2017 at 11:27:34AM +0200, Christoph Hellwig wrote:
> >>> Maintaining a subsystem with linux-kernel as the main list is painful
> >>> as it has way to much traffic.  On the other hand the dma-mapping
> >>> subsystem is small enough that a list on its own would be silly.
> >>> So use the list for the closes subsystem instead instead.
> >>>
> >>> Signed-off-by: Christoph Hellwig 
> > 
> > Like I already said, its fine for me.
> > Acked-by: Marek Szyprowski 
> 
> Fine by me too.
> 
> Joerg, Alex, are you OK with the odd bit of extra ML traffic?

Fine by me, of course.

Acked-by: Joerg Roedel 

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] MAINTAINERS: use the iommu list for the dma-mapping subsystem

2017-09-01 Thread Robin Murphy
On 01/09/17 08:56, Marek Szyprowski wrote:
> Hi Christoph,
> 
> On 2017-09-01 09:50, Christoph Hellwig wrote:
>> Any comments?  I'd like to add this to the 4.14 pull request.
>>
>> On Sat, Aug 26, 2017 at 11:27:34AM +0200, Christoph Hellwig wrote:
>>> Maintaining a subsystem with linux-kernel as the main list is painful
>>> as it has way to much traffic.  On the other hand the dma-mapping
>>> subsystem is small enough that a list on its own would be silly.
>>> So use the list for the closes subsystem instead instead.
>>>
>>> Signed-off-by: Christoph Hellwig 
> 
> Like I already said, its fine for me.
> Acked-by: Marek Szyprowski 

Fine by me too.

Joerg, Alex, are you OK with the odd bit of extra ML traffic?

Thanks,
Robin.

>>> ---
>>>   MAINTAINERS | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/MAINTAINERS b/MAINTAINERS
>>> index 6f7721d1634c..1df11ed346a7 100644
>>> --- a/MAINTAINERS
>>> +++ b/MAINTAINERS
>>> @@ -4175,7 +4175,7 @@ DMA MAPPING HELPERS
>>>   M:Christoph Hellwig 
>>>   M:Marek Szyprowski 
>>>   R:Robin Murphy 
>>> -L:linux-ker...@vger.kernel.org
>>> +L:iommu@lists.linux-foundation.org
>>>   T:git git://git.infradead.org/users/hch/dma-mapping.git
>>>   W:http://git.infradead.org/users/hch/dma-mapping.git
>>>   S:Supported
>>> -- 
>>> 2.11.0
> 
> Best regards

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu/dma: limit the IOVA allocated to dma-ranges region

2017-09-01 Thread Robin Murphy
On 01/09/17 10:26, Joerg Roedel wrote:
> Adding Robin for review.
> 
> On Thu, Aug 31, 2017 at 03:08:21PM -0700, Krishna Reddy wrote:
>> Limit the IOVA allocated to dma-ranges specified for the device.
>> This is necessary to ensure that IOVA allocated is addressable
>> by device.

Why? IOVA allocation is already constrained as much as it should be - if
the device's DMA mask is wrong that's another problem, and this isn't
the right place to fix it.

dma_32bit_pfn means nothing more than an internal detail of IOVA
allocator caching, which is subject to change[1]. As-is, on some
platforms this patch will effectively force all allocations to fail already.

Robin.

[1]:https://www.mail-archive.com/iommu@lists.linux-foundation.org/msg19694.html
>> Signed-off-by: Krishna Reddy 
>> ---
>>  drivers/iommu/dma-iommu.c | 5 +
>>  1 file changed, 5 insertions(+)
>>
>> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
>> index 9d1cebe7f6cb..e8a8320b571b 100644
>> --- a/drivers/iommu/dma-iommu.c
>> +++ b/drivers/iommu/dma-iommu.c
>> @@ -364,6 +364,7 @@ static dma_addr_t iommu_dma_alloc_iova(struct 
>> iommu_domain *domain,
>>  struct iommu_dma_cookie *cookie = domain->iova_cookie;
>>  struct iova_domain *iovad = >iovad;
>>  unsigned long shift, iova_len, iova = 0;
>> +dma_addr_t dma_end_addr;
>>  
>>  if (cookie->type == IOMMU_DMA_MSI_COOKIE) {
>>  cookie->msi_iova += size;
>> @@ -381,6 +382,10 @@ static dma_addr_t iommu_dma_alloc_iova(struct 
>> iommu_domain *domain,
>>  if (iova_len < (1 << (IOVA_RANGE_CACHE_MAX_SIZE - 1)))
>>  iova_len = roundup_pow_of_two(iova_len);
>>  
>> +/* Limit IOVA allocated to device addressable dma-ranges region. */
>> +dma_end_addr = (dma_addr_t)iovad->dma_32bit_pfn << shift;
>> +dma_limit = dma_limit > dma_end_addr ? dma_end_addr : dma_limit;
> 
> This looks like a good use-case for min().
> 
>> +
>>  if (domain->geometry.force_aperture)
>>  dma_limit = min(dma_limit, domain->geometry.aperture_end);
>>  
>> -- 
>> 2.1.4

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2] iommu/vt-d: Don't be too aggressive when clearing one context entry

2017-09-01 Thread Joerg Roedel
On Thu, Aug 31, 2017 at 10:58:11AM +0200, Filippo Sironi wrote:
> Previously, we were invalidating context cache and IOTLB globally when
> clearing one context entry.  This is a tad too aggressive.
> Invalidate the context cache and IOTLB for the interested device only.
> 
> Signed-off-by: Filippo Sironi 
> Cc: David Woodhouse 
> Cc: David Woodhouse 
> Cc: Joerg Roedel 
> Cc: Jacob Pan 
> Cc: iommu@lists.linux-foundation.org
> Cc: linux-ker...@vger.kernel.org
> ---
>  drivers/iommu/intel-iommu.c | 42 --
>  1 file changed, 24 insertions(+), 18 deletions(-)

Applied, thanks.

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu/dma: limit the IOVA allocated to dma-ranges region

2017-09-01 Thread Joerg Roedel
Adding Robin for review.

On Thu, Aug 31, 2017 at 03:08:21PM -0700, Krishna Reddy wrote:
> Limit the IOVA allocated to dma-ranges specified for the device.
> This is necessary to ensure that IOVA allocated is addressable
> by device.
> 
> Signed-off-by: Krishna Reddy 
> ---
>  drivers/iommu/dma-iommu.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index 9d1cebe7f6cb..e8a8320b571b 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -364,6 +364,7 @@ static dma_addr_t iommu_dma_alloc_iova(struct 
> iommu_domain *domain,
>   struct iommu_dma_cookie *cookie = domain->iova_cookie;
>   struct iova_domain *iovad = >iovad;
>   unsigned long shift, iova_len, iova = 0;
> + dma_addr_t dma_end_addr;
>  
>   if (cookie->type == IOMMU_DMA_MSI_COOKIE) {
>   cookie->msi_iova += size;
> @@ -381,6 +382,10 @@ static dma_addr_t iommu_dma_alloc_iova(struct 
> iommu_domain *domain,
>   if (iova_len < (1 << (IOVA_RANGE_CACHE_MAX_SIZE - 1)))
>   iova_len = roundup_pow_of_two(iova_len);
>  
> + /* Limit IOVA allocated to device addressable dma-ranges region. */
> + dma_end_addr = (dma_addr_t)iovad->dma_32bit_pfn << shift;
> + dma_limit = dma_limit > dma_end_addr ? dma_end_addr : dma_limit;

This looks like a good use-case for min().

> +
>   if (domain->geometry.force_aperture)
>   dma_limit = min(dma_limit, domain->geometry.aperture_end);
>  
> -- 
> 2.1.4
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v6 3/3] iommu/arm-smmu-v3:Enable ACPI based HiSilicon erratum 161010801

2017-09-01 Thread John Garry

On 10/08/2017 18:27, Will Deacon wrote:

On Wed, Aug 09, 2017 at 11:07:15AM +0100, Shameer Kolothum wrote:

The HiSilicon erratum 161010801 describes the limitation of HiSilicon
platforms Hip06/Hip07 to support the SMMU mappings for MSI transactions.

On these platforms GICv3 ITS translator is presented with the deviceID
by extending the MSI payload data to 64 bits to include the deviceID.
Hence, the PCIe controller on this platforms has to differentiate the
MSI payload against other DMA payload and has to modify the MSI payload.
This basically makes it difficult for this platforms to have a SMMU
translation for MSI.

This patch implements a ACPI table based quirk to reserve the hw msi
regions in the smmu-v3 driver which means these address regions will
not be translated and will be excluded from iova allocations.

Signed-off-by: Shameer Kolothum 
---
 drivers/iommu/arm-smmu-v3.c | 27 ++-
 1 file changed, 22 insertions(+), 5 deletions(-)


Please can you also add a devicetree binding with corresponding
documentation to enable this workaround on non-ACPI based systems too? It
should be straightforward if you update the arm_smmu_options table.



Hi Will, Lorenzo, Robin,

I have created the patch to add DT support for this erratum. However, 
currently I have only added support for pci-based devices. I'm a bit 
stumped on how to add platform device support, or if we should also add 
support at all. And I would rather ask before sending the patches.


The specific issue is that I know of no platform device with an ITS 
msi-parent which we would want reserve, i.e. do not translate msi. And, 
if there were, how to do it.


The only platform devices I know off with msi ITS parents are SMMUv3 and 
mbigen. For mbigen, the msi are not translated. Actually, as I see under 
IORT solution, for mbigen we don't reserve the hw msi region as the 
SMMUv3 does not have an ID mapping. And we have no equivalent ID mapping 
property for DT SMMUv3, so cannot create an equivalent check.


So here is how the DT iommu get reserved region function with only pci 
device support looks:


/**
 * of_iommu_its_get_resv_regions - Reserved region driver helper
 * @dev: Device from iommu_get_resv_regions()
 * @list: Reserved region list from iommu_get_resv_regions()
 *
 * Returns: Number of reserved regions on success(0 if no associated ITS),
 *  appropriate error value otherwise.
 */
int of_iommu_its_get_resv_regions(struct device *dev, struct list_head 
*head)

{
struct device_node *of_node = NULL;
struct device *parent_dev;
const __be32 *msi_map;
int num_mappings, i, err, len, resv = 0;

/* walk up to find the parent with a msi-map property */
for (parent_dev = dev; parent_dev; parent_dev = parent_dev->parent) {
if (!parent_dev->of_node)
continue;

/*
 * Current logic to reserve ITS regions for only PCI root
 * complex.
 */
msi_map = of_get_property(parent_dev->of_node, "msi-map", );
if (msi_map) {
of_node = parent_dev->of_node;
break;
}
}

if (!of_node)
return 0;

num_mappings = of_count_phandle_with_args(of_node, "msi-map", NULL) 
/ 4;


for (i = 0; i < num_mappings; i++) {
int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
struct iommu_resv_region *region;
struct device_node *msi_node;
struct resource resource;
u32 phandle;

phandle = be32_to_cpup(msi_map + 1);
msi_node = of_find_node_by_phandle(phandle);
if (!msi_node)
return -ENODEV;

/*
 * There may be different types of msi-controller, so check
 * for ITS.
 */
if (!of_device_is_compatible(msi_node, "arm,gic-v3-its")) {
of_node_put(msi_node);
continue;
}

err = of_address_to_resource(msi_node, 0, );

of_node_put(msi_node);
if (err)
return err;

region = iommu_alloc_resv_region(resource.start, SZ_128K, prot,
 IOMMU_RESV_MSI);
if (region) {
list_add_tail(>list, head);
resv++;
}
}

return (resv == num_mappings) ? resv : -ENODEV;
}

Any feedback is appreciated.

Thanks,
John



Thanks,

Will

.




___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] MAINTAINERS: use the iommu list for the dma-mapping subsystem

2017-09-01 Thread Marek Szyprowski

Hi Christoph,

On 2017-09-01 09:50, Christoph Hellwig wrote:

Any comments?  I'd like to add this to the 4.14 pull request.

On Sat, Aug 26, 2017 at 11:27:34AM +0200, Christoph Hellwig wrote:

Maintaining a subsystem with linux-kernel as the main list is painful
as it has way to much traffic.  On the other hand the dma-mapping
subsystem is small enough that a list on its own would be silly.
So use the list for the closes subsystem instead instead.

Signed-off-by: Christoph Hellwig 


Like I already said, its fine for me.
Acked-by: Marek Szyprowski 


---
  MAINTAINERS | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 6f7721d1634c..1df11ed346a7 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -4175,7 +4175,7 @@ DMA MAPPING HELPERS
  M:Christoph Hellwig 
  M:Marek Szyprowski 
  R:Robin Murphy 
-L: linux-ker...@vger.kernel.org
+L: iommu@lists.linux-foundation.org
  T:git git://git.infradead.org/users/hch/dma-mapping.git
  W:http://git.infradead.org/users/hch/dma-mapping.git
  S:Supported
--
2.11.0


Best regards
--
Marek Szyprowski, PhD
Samsung R Institute Poland

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] MAINTAINERS: use the iommu list for the dma-mapping subsystem

2017-09-01 Thread Christoph Hellwig
Any comments?  I'd like to add this to the 4.14 pull request.

On Sat, Aug 26, 2017 at 11:27:34AM +0200, Christoph Hellwig wrote:
> Maintaining a subsystem with linux-kernel as the main list is painful
> as it has way to much traffic.  On the other hand the dma-mapping
> subsystem is small enough that a list on its own would be silly.
> So use the list for the closes subsystem instead instead.
> 
> Signed-off-by: Christoph Hellwig 
> ---
>  MAINTAINERS | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 6f7721d1634c..1df11ed346a7 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -4175,7 +4175,7 @@ DMA MAPPING HELPERS
>  M:   Christoph Hellwig 
>  M:   Marek Szyprowski 
>  R:   Robin Murphy 
> -L:   linux-ker...@vger.kernel.org
> +L:   iommu@lists.linux-foundation.org
>  T:   git git://git.infradead.org/users/hch/dma-mapping.git
>  W:   http://git.infradead.org/users/hch/dma-mapping.git
>  S:   Supported
> -- 
> 2.11.0
> 
> ___
> iommu mailing list
> iommu@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/iommu
---end quoted text---
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu