Re: [PATCH] iommu/vt-d: Handle hotplug devices' default identity mapping setting

2019-02-22 Thread Jis Ben via iommu
An NVMe device configured with a static identity mapping should also
trigger this error when removed and rescanned. Essentially, any device
that performs a DMA when its driver initializes, or one from which you can
force a DMA.



On Fri, Feb 22, 2019 at 12:28 AM James Dong  wrote:

> Baolu:
>
> Reproduction depends on the device. HW passthrough PCIe devices with a
> default identity map can have this issue. Make sure that messages like
> the following appear in dmesg and that the mappings do not change after
> booting:
>   [   10.167809] DMAR: Hardware identity mapping for device :30:00.0
>   [   10.167823] DMAR: Hardware identity mapping for device :30:00.1
>
> Devices for which the following checks return true can also be used for the experiment:
>   > static int iommu_should_identity_map(struct device *dev, int startup)
>   > {
>   > if ((iommu_identity_mapping & IDENTMAP_AZALIA) &&
> IS_AZALIA(pdev))
>   > return 1;
>   > if ((iommu_identity_mapping & IDENTMAP_GFX) &&
> IS_GFX_DEVICE(pdev))
>   > return 1;
>
> Once they are up, remove one with the following command:
>   echo 1 > /sys/bus/pci/devices/\:03\:00.1/remove
>
> Then trigger the hotplug device rescanning:
>   echo 1 > /sys/bus/pci/rescan
>
> To name specific devices on the market, I would need to try some out.
> Alternatively, if it is fine with you, forcing a PCIe NIC card to default
> to hardware passthrough by modifying intel-iommu.c is another easy way to
> reproduce this issue.
>
> Best Regards,
> James
>
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

RE: [PATCH V5 2/3] HYPERV/IOMMU: Add Hyper-V stub IOMMU driver

2019-02-22 Thread Michael Kelley via iommu
From: tianyu@microsoft.com  Sent: Friday, 
February 22, 2019 4:12 AM
> 
> On bare metal, enabling X2APIC mode requires the interrupt remapping
> function, which helps deliver IRQs to CPUs with 32-bit APIC IDs.
> Hyper-V doesn't provide an interrupt remapping function so far, but the
> Hyper-V MSI protocol already supports delivering interrupts to CPUs whose
> virtual processor index is greater than 255. IO-APIC interrupts still
> have the 8-bit APIC ID limitation.
> 
> This patch adds a Hyper-V stub IOMMU driver in order to enable
> X2APIC mode successfully in Hyper-V Linux guests. The driver reports
> X2APIC interrupt remapping capability when X2APIC mode is available.
> Otherwise, it creates a Hyper-V irq domain to limit the affinity of
> IO-APIC interrupts and make sure CPUs assigned an IO-APIC interrupt have
> an 8-bit APIC ID.
> 
> Define 24 IO-APIC remapping entries, because Hyper-V exposes only a
> single IO-APIC and an IO-APIC has 24 pins according to the IO-APIC spec
> (https://pdos.csail.mit.edu/6.828/2016/readings/ia32/ioapic.pdf).
> 
> Signed-off-by: Lan Tianyu 
> ---
> Changes since v4:
>- Fix the loop scanning CPU APIC IDs
> 
> Changes since v3:
>- Make the Hyper-V IOMMU the default driver on Hyper-V
>- Fix the hypervisor_is_type() input parameter
>- Check the number of possible CPUs when scanning APIC IDs of CPUs 0~255
> 
> Changes since v2:
>- Improve the comment on why the IO-APIC entry is saved in the irq chip data
>- Some code improvements
>- Improve the statement in the IOMMU Kconfig
> 
> Changes since v1:
>   - Remove unused pr_fmt
>   - Make ioapic_ir_domain a static variable
>   - Remove unused variables cfg and entry in hyperv_irq_remapping_alloc()
>   - Fix comments

Reviewed-by: Michael Kelley 


RE: [PATCH V4 2/3] HYPERV/IOMMU: Add Hyper-V stub IOMMU driver

2019-02-22 Thread Tianyu Lan via iommu
Hi Michael:
   Thanks for your review.

-Original Message-
From: Michael Kelley  
Sent: Friday, February 22, 2019 1:28 AM
To: lantianyu1...@gmail.com
Cc: Tianyu Lan ; j...@8bytes.org; 
mchehab+sams...@kernel.org; da...@davemloft.net; gre...@linuxfoundation.org; 
nicolas.fe...@microchip.com; a...@arndb.de; linux-ker...@vger.kernel.org; 
iommu@lists.linux-foundation.org; KY Srinivasan ; vkuznets 
; alex.william...@redhat.com; sas...@kernel.org; 
dan.carpen...@oracle.com; linux-hyp...@vger.kernel.org
Subject: RE: [PATCH V4 2/3] HYPERV/IOMMU: Add Hyper-V stub IOMMU driver

From: lantianyu1...@gmail.com  Sent: Monday, February 
11, 2019 6:20 AM
> + /*
> +  * Hyper-V doesn't provide an irq remapping function for the
> +  * IO-APIC, so the IO-APIC only accepts 8-bit APIC IDs.
> +  * Each CPU's APIC ID is read from the ACPI MADT table, and the
> +  * APIC IDs in the MADT table on Hyper-V are sorted in monotonically
> +  * increasing order. APIC IDs reflect the CPU topology, so there may
> +  * be APIC ID gaps when the number of CPUs in a socket is not a power
> +  * of two. Prepare the maximum CPU affinity for IO-APIC irqs: scan
> +  * CPUs 0-255 and set each CPU in ioapic_max_cpumask if its APIC ID
> +  * is less than 256.
> +  */
> + for (i = min_t(unsigned int, num_possible_cpus(), 255); i >= 0; i--)

The above isn't quite right.  For example, if num_possible_cpus() is 8, then 
the loop will be executed 9 times, for values 8 down through 0.
It should be executed for values 7 down through 0.

Yes, I will fix this in V5. Thanks.

> + if (cpu_physical_id(i) < 256)
> + cpumask_set_cpu(i, &ioapic_max_cpumask);
> +
> + return 0;
> +}

Michael


Re: [PATCH v7 7/9] vfio/mdev: Add iommu related member in mdev_device

2019-02-22 Thread Alex Williamson
On Fri, 22 Feb 2019 06:34:38 -0800
Christoph Hellwig  wrote:

> On Fri, Feb 22, 2019 at 10:19:25AM +0800, Lu Baolu wrote:
> > A parent device might create different types of mediated
> > devices. For example, a mediated device could be created
> > by the parent device with full isolation and protection
> > provided by the IOMMU. One use case can be found on
> > Intel platforms, where a mediated device is an assignable
> > subset of a PCI device and the DMA requests on its behalf
> > are all tagged with a PASID. Since the IOMMU supports
> > PASID-granular translations (scalable mode in VT-d 3.0),
> > such a mediated device can be individually protected and
> > isolated by the IOMMU.
> > 
> > This patch adds a new member in the struct mdev_device to
> > indicate that the mediated device represented by mdev could
> > be isolated and protected by attaching a domain to a device
> > represented by mdev->iommu_device. It also adds helpers to
> > set or get the iommu device.
> > 
> > * mdev_device->iommu_device
> >   - This, if set, indicates that the mediated device can
> > be fully isolated and protected by the IOMMU by attaching
> > an iommu domain to this device. If unset, it indicates
> > vendor-defined isolation, which bypasses the IOMMU.
> > 
> > * mdev_set/get_iommu_device(dev, iommu_device)
> >   - Set or get the iommu device which represents this mdev
> > in the IOMMU's device scope. Drivers don't need to set the
> > iommu device if they use vendor-defined isolation.
> > 
> > Cc: Ashok Raj 
> > Cc: Jacob Pan 
> > Cc: Kevin Tian 
> > Cc: Liu Yi L 
> > Suggested-by: Kevin Tian 
> > Suggested-by: Alex Williamson 
> > Signed-off-by: Lu Baolu 
> > Reviewed-by: Jean-Philippe Brucker 
> > ---
> >  drivers/vfio/mdev/mdev_core.c| 18 ++
> >  drivers/vfio/mdev/mdev_private.h |  1 +
> >  include/linux/mdev.h | 14 ++
> >  3 files changed, 33 insertions(+)
> > 
> > diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c
> > index 0212f0ee8aea..9be58d392d2b 100644
> > --- a/drivers/vfio/mdev/mdev_core.c
> > +++ b/drivers/vfio/mdev/mdev_core.c
> > @@ -390,6 +390,24 @@ int mdev_device_remove(struct device *dev, bool 
> > force_remove)
> > return 0;
> >  }
> >  
> > +int mdev_set_iommu_device(struct device *dev, struct device *iommu_device)
> > +{
> > +   struct mdev_device *mdev = to_mdev_device(dev);
> > +
> > +   mdev->iommu_device = iommu_device;
> > +
> > +   return 0;
> > +}
> > +EXPORT_SYMBOL(mdev_set_iommu_device);  
> 
> As said before, please make all new mdev/vfio exports EXPORT_SYMBOL_GPL
> to fit the other exports in vfio.

Well...

$ grep EXPORT_SYMBOL drivers/vfio/mdev/*
drivers/vfio/mdev/mdev_core.c:EXPORT_SYMBOL(mdev_parent_dev);
drivers/vfio/mdev/mdev_core.c:EXPORT_SYMBOL(mdev_get_drvdata);
drivers/vfio/mdev/mdev_core.c:EXPORT_SYMBOL(mdev_set_drvdata);
drivers/vfio/mdev/mdev_core.c:EXPORT_SYMBOL(mdev_dev);
drivers/vfio/mdev/mdev_core.c:EXPORT_SYMBOL(mdev_from_dev);
drivers/vfio/mdev/mdev_core.c:EXPORT_SYMBOL(mdev_uuid);
drivers/vfio/mdev/mdev_core.c:EXPORT_SYMBOL(mdev_register_device);
drivers/vfio/mdev/mdev_core.c:EXPORT_SYMBOL(mdev_unregister_device);
drivers/vfio/mdev/mdev_driver.c:EXPORT_SYMBOL_GPL(mdev_bus_type);
drivers/vfio/mdev/mdev_driver.c:EXPORT_SYMBOL(mdev_register_driver);
drivers/vfio/mdev/mdev_driver.c:EXPORT_SYMBOL(mdev_unregister_driver);

For better or worse, the mdev interface does allow non-GPL vendor
drivers.  This export seems consistent with that, it's a simple
association allowing the vendor driver to define an IOMMU API backing
device for an mdev device.  I don't think this association implies
sufficient operational knowledge to require a GPL symbol and it's been
requested for use by one of those non-GPL mdev vendor drivers,
therefore I support this definition.  Thanks,

Alex


Re: [PATCH v7 7/9] vfio/mdev: Add iommu related member in mdev_device

2019-02-22 Thread Christoph Hellwig
On Fri, Feb 22, 2019 at 10:19:25AM +0800, Lu Baolu wrote:
> A parent device might create different types of mediated
> devices. For example, a mediated device could be created
> by the parent device with full isolation and protection
> provided by the IOMMU. One use case can be found on
> Intel platforms, where a mediated device is an assignable
> subset of a PCI device and the DMA requests on its behalf
> are all tagged with a PASID. Since the IOMMU supports
> PASID-granular translations (scalable mode in VT-d 3.0),
> such a mediated device can be individually protected and
> isolated by the IOMMU.
> 
> This patch adds a new member in the struct mdev_device to
> indicate that the mediated device represented by mdev could
> be isolated and protected by attaching a domain to a device
> represented by mdev->iommu_device. It also adds helpers to
> set or get the iommu device.
> 
> * mdev_device->iommu_device
>   - This, if set, indicates that the mediated device can
> be fully isolated and protected by the IOMMU by attaching
> an iommu domain to this device. If unset, it indicates
> vendor-defined isolation, which bypasses the IOMMU.
> 
> * mdev_set/get_iommu_device(dev, iommu_device)
>   - Set or get the iommu device which represents this mdev
> in the IOMMU's device scope. Drivers don't need to set the
> iommu device if they use vendor-defined isolation.
> 
> Cc: Ashok Raj 
> Cc: Jacob Pan 
> Cc: Kevin Tian 
> Cc: Liu Yi L 
> Suggested-by: Kevin Tian 
> Suggested-by: Alex Williamson 
> Signed-off-by: Lu Baolu 
> Reviewed-by: Jean-Philippe Brucker 
> ---
>  drivers/vfio/mdev/mdev_core.c| 18 ++
>  drivers/vfio/mdev/mdev_private.h |  1 +
>  include/linux/mdev.h | 14 ++
>  3 files changed, 33 insertions(+)
> 
> diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c
> index 0212f0ee8aea..9be58d392d2b 100644
> --- a/drivers/vfio/mdev/mdev_core.c
> +++ b/drivers/vfio/mdev/mdev_core.c
> @@ -390,6 +390,24 @@ int mdev_device_remove(struct device *dev, bool 
> force_remove)
>   return 0;
>  }
>  
> +int mdev_set_iommu_device(struct device *dev, struct device *iommu_device)
> +{
> + struct mdev_device *mdev = to_mdev_device(dev);
> +
> + mdev->iommu_device = iommu_device;
> +
> + return 0;
> +}
> +EXPORT_SYMBOL(mdev_set_iommu_device);

As said before, please make all new mdev/vfio exports EXPORT_SYMBOL_GPL
to fit the other exports in vfio.


Re: [PATCH 7/7] iommu/ipmmu-vmsa: Add suspend/resume support

2019-02-22 Thread Robin Murphy

Hi Geert,

On 20/02/2019 15:05, Geert Uytterhoeven wrote:

During PSCI system suspend, R-Car Gen3 SoCs are powered down, and all
IPMMU state is lost.  Hence after s2ram, devices wired behind an IPMMU,
and configured to use it, will see their DMA operations hang.

To fix this, restore all IPMMU contexts, and re-enable all active
micro-TLBs during system resume.

To avoid overhead on platforms not needing it, the resume code has a
build time dependency on sleep and PSCI support, and a runtime
dependency on PSCI.


Frankly, that aspect looks horribly hacky. It's not overly reasonable 
for random device drivers to be poking at PSCI internals, and while it 
might happen to correlate on some known set of current SoCs with current 
firmware, in general it's really too fragile to be accurate:


- Firmware is free to implement suspend callbacks in a way which doesn't 
actually lose power.
- Support for CPU_SUSPEND does not imply that SYSTEM_SUSPEND is even 
implemented, let alone how it might behave.
- The dev_pm_ops callbacks can also be used for hibernate, wherein it 
doesn't really matter what the firmware may or may not do if the user 
has pulled the plug and resumed us a week later.


Furthermore, I think any attempt to detect whether you need to resume or 
not is inherently fraught with danger - from testing the arm-smmu 
runtime PM ops, I've seen register state take a surprisingly long time 
to decay in a switched-off power domain, to the point where unless you 
check every bit of every register you can't necessarily be certain that 
they're really all still valid, and by that point you're doing far more 
work than just unconditionally reinitialising the whole state anyway. 
Upon resuming from hibernate, the state left by the cold-boot stage 
almost certainly *will* be valid, but it probably won't be the state you 
actually want.


Really, the whole idea smells of the premature optimisation demons 
anyway - is the resume overhead measurably significant?



Signed-off-by: Geert Uytterhoeven 
---
This patch takes a different approach than the BSP, which implements a
bulk save/restore of all registers during system suspend/resume.

  drivers/iommu/ipmmu-vmsa.c | 52 +-
  1 file changed, 51 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c
index 92a766dd8b459f0c..5d22139914e8f033 100644
--- a/drivers/iommu/ipmmu-vmsa.c
+++ b/drivers/iommu/ipmmu-vmsa.c
@@ -22,6 +22,7 @@
  #include 
  #include 
  #include 
+#include 
  #include 
  #include 
  #include 
@@ -36,7 +37,10 @@
  #define arm_iommu_detach_device(...)  do {} while (0)
  #endif
  
-#define IPMMU_CTX_MAX 8U

+#define IPMMU_CTX_MAX  8U
+#define IPMMU_CTX_INVALID  -1
+
+#define IPMMU_UTLB_MAX 48U
  
  struct ipmmu_features {

bool use_ns_alias_offset;
@@ -58,6 +62,7 @@ struct ipmmu_vmsa_device {
spinlock_t lock;/* Protects ctx and domains[] */
DECLARE_BITMAP(ctx, IPMMU_CTX_MAX);
struct ipmmu_vmsa_domain *domains[IPMMU_CTX_MAX];
+   s8 utlb_ctx[IPMMU_UTLB_MAX];
  
  	struct iommu_group *group;

struct dma_iommu_mapping *mapping;
@@ -335,6 +340,7 @@ static void ipmmu_utlb_enable(struct ipmmu_vmsa_domain 
*domain,
ipmmu_write(mmu, IMUCTR(utlb),
IMUCTR_TTSEL_MMU(domain->context_id) | IMUCTR_FLUSH |
IMUCTR_MMUEN);
+   mmu->utlb_ctx[utlb] = domain->context_id;
  }
  
  /*

@@ -346,6 +352,7 @@ static void ipmmu_utlb_disable(struct ipmmu_vmsa_domain 
*domain,
struct ipmmu_vmsa_device *mmu = domain->mmu;
  
  	ipmmu_write(mmu, IMUCTR(utlb), 0);

+   mmu->utlb_ctx[utlb] = IPMMU_CTX_INVALID;
  }
  
  static void ipmmu_tlb_flush_all(void *cookie)

@@ -1043,6 +1050,7 @@ static int ipmmu_probe(struct platform_device *pdev)
	spin_lock_init(&mmu->lock);
bitmap_zero(mmu->ctx, IPMMU_CTX_MAX);
mmu->features = of_device_get_match_data(>dev);
+   memset(mmu->utlb_ctx, IPMMU_CTX_INVALID, mmu->features->num_utlbs);
	dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(40));
  
  	/* Map I/O memory and request IRQ. */

@@ -1158,10 +1166,52 @@ static int ipmmu_remove(struct platform_device *pdev)
return 0;
  }
  
+#if defined(CONFIG_PM_SLEEP) && defined(CONFIG_ARM_PSCI_FW)

+static int ipmmu_resume_noirq(struct device *dev)
+{
+   struct ipmmu_vmsa_device *mmu = dev_get_drvdata(dev);
+   unsigned int i;
+
+   /* This is the best we can do to check for the presence of PSCI */
+   if (!psci_ops.cpu_suspend)
+   return 0;
+
+   /* Reset root MMU and restore contexts */
+   if (ipmmu_is_root(mmu)) {
+   ipmmu_device_reset(mmu);
+
+   for (i = 0; i < mmu->num_ctx; i++) {
+   if (!mmu->domains[i])
+   continue;
+
+   ipmmu_context_init(mmu->domains[i]);
+   }


Unless the hardware has some 

Re: [PATCHv7] x86/kdump: bugfix, make the behavior of crashkernel=X consistent with kaslr

2019-02-22 Thread Borislav Petkov
On Fri, Feb 22, 2019 at 09:42:41AM +0100, Joerg Roedel wrote:
> The current default of 256MB was found by experiments on a bigger
> number of machines, to create a reasonable default that is at least
> likely to be sufficient for an average machine.

Exactly, and this is what makes sense.

The code should try the requested reservation and if it fails, it should
try high allocation with default swiotlb size because we need to reserve
*some* range.

If that reservation succeeds, we should say something along the lines of

"... requested range failed, reserved  range instead."

And then in Documentation/admin-guide/kernel-parameters.txt above the
crashkernel= explanations, the allocation strategy of best effort should
be explained in short. That the kernel will try to allocate high if the
requested allocation didn't succeed and that the user can tweak the
allocation with the below options.

Bottom line is: the kernel should assist the user and try harder to
allocate *some* range for a crash kernel when there's no detailed
specification of what that range should be.

*If* the user adds ,low or ,high, then the kernel should try only the
specified range, because the assumption is that the user knows what she's
doing.

But if the user simply wants a range for a crash kernel without stating
where that range should be in particular, and its placement is a don't
care - as long as there is a range - then the kernel should simply try
high, etc.

Makes sense?

-- 
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.
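For reference, the existing crashkernel= forms from Documentation/admin-guide/kernel-parameters.txt that this strategy would interact with (the sizes below are illustrative examples, not recommendations):

```text
crashkernel=256M              # size only: the kernel picks the placement
crashkernel=512M,high         # reserve above 4G; typically combined with
                              # a low reservation for swiotlb/GFP_DMA32
crashkernel=512M,high crashkernel=128M,low
                              # explicit low reservation alongside ,high
crashkernel=128M@16M          # fixed placement: size[KMG]@offset[KMG]
```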


Re: [PATCH v7 0/7] Add virtio-iommu driver

2019-02-22 Thread Jean-Philippe Brucker
Hi Thiago,

On 21/02/2019 22:18, Thiago Jung Bauermann wrote:
> 
> Hello Jean-Philippe,
> 
> Jean-Philippe Brucker  writes:
>> Makes sense, though I think other virtio devices have been developed a
>> little more organically: device and driver code got upstreamed first,
>> and then the specification describing their interface got merged into
>> the standard. For example I believe that code for crypto, input and GPU
>> devices were upstreamed long before the specification was merged. Once
>> an implementation is upstream, the interface is expected to be
>> backward-compatible (all subsequent changes are introduced using feature
>> bits).
>>
>> So I've been working with this process in mind, also described by Jens
>> at KVM forum 2017 [3]:
>> (1) Reserve a device ID, and get that merged into virtio (ID 23 for
>> virtio-iommu was reserved last year)
>> (2) Open-source an implementation (this driver and Eric's device)
>> (3) Formalize and upstream the device specification
>>
>> But I get that some overlap between (2) and (3) would have been better.
>> So far the spec document has been reviewed mainly from the IOMMU point
>> of view, and might require more changes to be in line with the other
>> virtio devices -- hopefully just wording changes. I'll kick off step
>> (3), but I think the virtio folks are a bit busy with finalizing the 1.1
>> spec so I expect it to take a while.
> 
> I read v0.9 of the spec and have some minor comments, hope this is a
> good place to send them:

Thanks a lot, I'll fix them in my next posting. Note that I recently
sent v0.10 to virtio-comment, to request inclusion into the standard [1]
but your comments still apply to v0.10.

[1]
https://lists.oasis-open.org/archives/virtio-comment/201901/msg00016.html

> 1. In section 2.6.2, one reads
> 
> If the VIRTIO_IOMMU_F_INPUT_RANGE feature is offered and the range
> described by fields virt_start and virt_end doesn’t fit in the range
> described by input_range, the device MAY set status to VIRTIO_-
> IOMMU_S_RANGE and ignore the request.
> 
> Shouldn't it say "If the VIRTIO_IOMMU_F_INPUT_RANGE feature is
> negotiated" instead?

Yes, that seems clearer and more consistent with other devices, I'll
change it. In this case "offered" is equivalent to "negotiated", because
the driver SHOULD accept the feature or else the device may refuse to
set FEATURES_OK. A valid input_range field generally indicates that the
device is incapable of creating mappings outside this range, and it's
important that the driver acknowledges it.

> 2. There's a typo at the end of section 2.6.5:
> 
> The VIRTIO_IOMMU_MAP_F_MMIO flag is a memory type rather than a
> protection lag.
> 
> s/lag/flag/

Fixed in v0.10

> 3. In section 3.1.2.1.1, the viommu compatible field says "virtio,mmio".
> Shouldn't it say "virtio,mmio-iommu" instead, to be consistent with
> "virtio,pci-iommu"?

"virtio,mmio" already exists, and allows the virtio-mmio driver to pick
up any virtio device. The device type is then discovered while probing,
and doesn't need to be in the compatible string.

"virtio,pci-iommu" is something I introduced specifically for the
virtio-iommu, since it's the only virtio-pci device that requires a
device tree node - to describe the IOMMU topology earlier than the PCI
probe. If we want symmetry I'd rather replace "virtio,pci-iommu" with
"virtio,pci", but it wouldn't be used by other virtio device types. And
I have to admit I'm reluctant to change this binding now, given that it
has been reviewed (patch 2/7) and is ready to go.

> 4. There's a typo in section 3.3:
> 
> A host bridge may limit the input address space – transaction
> accessing some addresses won’t reach the physical IOMMU.
> 
> s/transaction/transactions/

I'll fix it, thanks

> I also have one last comment which you may freely ignore, considering
> it's clearly just personal opinion and also considering that the
> specification is mature at this point: it specifies memory ranges by
> specifying start and end addresses. My experience has been that this is
> error prone, leading to confusion and bugs regarding whether the end
> address is inclusive or exclusive. I tend to prefer expressing memory
> ranges by specifying a start address and a length, which eliminates
> ambiguity.

While the initial versions had start and length, I changed it because it
cannot express the whole 64-bit range. If the guest wants to do
unmap-all (and the input range is 64-bit), then it can send a single
UNMAP request with start=0, end=~0ULL, which wouldn't be possible with
start and length. Arguably a very rare use-case, but one I've tried to
implement at least twice with VFIO :) I'll see if I can make it more
obvious that end is inclusive, since the word doesn't appear at all in
the current draft.

Thanks,
Jean

Re: [01/32] net: pasemi: set a 64-bit DMA mask on the DMA device

2019-02-22 Thread Michael Ellerman
On Wed, 2019-02-13 at 07:01:02 UTC, Christoph Hellwig wrote:
> The pasemi driver never set a DMA mask, and given that the powerpc
> DMA mapping routines never check it this worked ok so far.  But the
> generic dma-direct code which I plan to switch on for powerpc checks
> the DMA mask and fails unsupported mapping requests, so we need to
> make sure the proper 64-bit mask is set.
> 
> Reported-by: Christian Zigotzky 
> Signed-off-by: Christoph Hellwig 
> Tested-by: Christian Zigotzky 

Series applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/74ebe3e733b791f37415b3a1b917ee50

cheers


Re: [PATCHv7] x86/kdump: bugfix, make the behavior of crashkernel=X consistent with kaslr

2019-02-22 Thread Joerg Roedel
On Fri, Feb 22, 2019 at 10:11:01AM +0800, Dave Young wrote:
> In case people have a lot of devices that need more swiotlb, they can
> manually set ,high together with ,low.

The option to specify the high and low values for the crashkernel is
important for certain machines. The point is that swiotlb already
allocates 64MB of low memory by default, but that memory is only used
for 32-bit DMA-mask devices that want to DMA into high memory. There are
drivers that just allocate GFP_DMA32 memory, which also ends up in the
low region (but not in swiotlb); that is why the previous default of 72MB
of low memory was not enough - it left only 8MB of GFP_DMA32 memory. The
current default of 256MB was found by experiments on a bigger number of
machines, to create a reasonable default that is at least likely to be
sufficient for an average machine.

There is no way today for the kernel to find an optimum value for the
amount of low memory required to successfully create a crash dump. It
depends on the amount of devices in the system and how the drivers
for them are written. The drivers have no way to report back their
requirements, and even if they had, at the time the allocation happens
no driver is loaded yet.

So it is up to the system administrator to find workable values for the
high and low memory requirements, even using experiments as a last
resort.

Regards,

Joerg


RE: [PATCH] iommu/vt-d: Handle hotplug devices' default identity mapping setting

2019-02-22 Thread James Dong via iommu
Baolu:

Reproduction depends on the device. HW passthrough PCIe devices with a
default identity map can have this issue. Make sure that messages like the
following appear in dmesg and that the mappings do not change after booting:
  [   10.167809] DMAR: Hardware identity mapping for device :30:00.0
  [   10.167823] DMAR: Hardware identity mapping for device :30:00.1

Devices for which the following checks return true can also be used for the experiment:
  > static int iommu_should_identity_map(struct device *dev, int startup)
  > {
  > if ((iommu_identity_mapping & IDENTMAP_AZALIA) && 
IS_AZALIA(pdev))
  > return 1;
  > if ((iommu_identity_mapping & IDENTMAP_GFX) && 
IS_GFX_DEVICE(pdev))
  > return 1;

Once they are up, remove one with the following command:
  echo 1 > /sys/bus/pci/devices/\:03\:00.1/remove

Then trigger the hotplug device rescanning:
  echo 1 > /sys/bus/pci/rescan

To name specific devices on the market, I would need to try some out.
Alternatively, if it is fine with you, forcing a PCIe NIC card to default
to hardware passthrough by modifying intel-iommu.c is another easy way to
reproduce this issue.

Best Regards,
James