Re: [PATCH v2] iommu/vt-d: Don't be too aggressive when clearing one context entry

2017-08-31 Thread Woodhouse, David via iommu
On Thu, 2017-08-31 at 10:58 +0200, Filippo Sironi wrote:
> Previously, we were invalidating context cache and IOTLB globally when
> clearing one context entry.  This is a tad too aggressive.
> Invalidate the context cache and IOTLB for the interested device only.
> 
> Signed-off-by: Filippo Sironi 
> Cc: David Woodhouse 
> Cc: David Woodhouse 
> Cc: Joerg Roedel 
> Cc: Jacob Pan 
> Cc: iommu@lists.linux-foundation.org
> Cc: linux-ker...@vger.kernel.org

Acked-by: David Woodhouse 

I'd actually quite like to see us tidy this up some more. If there are
aliases, we're doing the flush multiple times. It might be nicer to do
no flushing in domain_contact_clear_one(), which is invoked for each
alias, and just to update a (u16 *)did. If *did was empty, set it. If
it was set (i.e. if there are aliases), then set it to a magic value
like 0x. 

Then we can do the actual flush — either device-specific, or global —
just once in domain_context_clear().

We'd probably just move domain_context_clear_one() into its only caller
domain_context_clear_one_cb() while we're at it.

But in the meantime I'm more than happy with this patch which turns
multiple global flushes into multiple domain-specific flushes, and does
the right thing in the common case.

> ---

v2: Changed subject, killed clear_context_table(). Anything else?

>  drivers/iommu/intel-iommu.c | 42 --
>  1 file changed, 24 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 3e8636f1220e..1aa4ad7974b9 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -974,20 +974,6 @@ static int device_context_mapped(struct intel_iommu 
> *iommu, u8 bus, u8 devfn)
>   return ret;
>  }
>  
> -static void clear_context_table(struct intel_iommu *iommu, u8 bus, u8 devfn)
> -{
> - struct context_entry *context;
> - unsigned long flags;
> -
> - spin_lock_irqsave(>lock, flags);
> - context = iommu_context_addr(iommu, bus, devfn, 0);
> - if (context) {
> - context_clear_entry(context);
> - __iommu_flush_cache(iommu, context, sizeof(*context));
> - }
> - spin_unlock_irqrestore(>lock, flags);
> -}
> -
>  static void free_context_table(struct intel_iommu *iommu)
>  {
>   int i;
> @@ -2351,13 +2337,33 @@ static inline int domain_pfn_mapping(struct 
> dmar_domain *domain, unsigned long i
>  
>  static void domain_context_clear_one(struct intel_iommu *iommu, u8 bus, u8 
> devfn)
>  {
> + unsigned long flags;
> + struct context_entry *context;
> + u16 did_old;
> +
>   if (!iommu)
>   return;
>  
> - clear_context_table(iommu, bus, devfn);
> - iommu->flush.flush_context(iommu, 0, 0, 0,
> -    DMA_CCMD_GLOBAL_INVL);
> - iommu->flush.flush_iotlb(iommu, 0, 0, 0, DMA_TLB_GLOBAL_FLUSH);
> + spin_lock_irqsave(>lock, flags);
> + context = iommu_context_addr(iommu, bus, devfn, 0);
> + if (!context) {
> + spin_unlock_irqrestore(>lock, flags);
> + return;
> + }
> + did_old = context_domain_id(context);
> + context_clear_entry(context);
> + __iommu_flush_cache(iommu, context, sizeof(*context));
> + spin_unlock_irqrestore(>lock, flags);
> + iommu->flush.flush_context(iommu,
> +    did_old,
> +    (((u16)bus) << 8) | devfn,

Arguably we ought to be PCI_DEVID() there, but I feel bad pointing that
out when you're adding about the fifth instance of open-coding it... :)

-- 
dwmw2

smime.p7s
Description: S/MIME cryptographic signature



Amazon Web Services UK Limited. Registered in England and Wales with 
registration number 08650665 and which has its registered office at 60 Holborn 
Viaduct, London EC1A 2FD, United Kingdom.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 2/4] iommu: add a bitmap based dma address allocator

2015-11-25 Thread Woodhouse, David
On Wed, 2015-11-25 at 15:46 +0100, jroe...@suse.de wrote:
> On Tue, Nov 24, 2015 at 02:05:12PM -0800, Shaohua Li wrote:
> > The lib/iommu-common.c uses a bitmap and a lock. This implementation
> > actually uses a percpu_ida which completely avoids locking. It would be
> > possible to make lib/iommu-common.c use percpu_ida too if somebody wants
> > to do it, but I think this shouldn't be a blocker for these patches
> > giving it has huge performance gain.
> 
> It doesn't "completely avoids locking", the percpu_ida code uses a lock
> internally too. Also, what is the memory and device address space
> overhead per cpu?

A percpu lock doesn't bounce cachelines between CPUs very much, so from
that point of view it might as well not exist :)

-- 
  Sent with Evolution's ActiveSync support.

David WoodhouseOpen Source Technology Centre
david.woodho...@intel.com  Intel Corporation




smime.p7s
Description: S/MIME cryptographic signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [patch] iommu/vt-d: shift wrapping bug in prq_event_thread()

2015-10-16 Thread Woodhouse, David
On Thu, 2015-10-15 at 21:25 +0300, Dan Carpenter wrote:
> The "req->addr" variable is a bit field declared as "u64 addr:52;".
> The "address" variable is a u64.  We need to cast "req->addr" to a u64
> before the shift or the result is truncated to 52 bits.
> 
> Fixes: 0b9252a34858 ('iommu/vt-d: Implement page request handling')
> Signed-off-by: Dan Carpenter 

Applied; thanks.

> Also does this code work if PAGE_SHIFT is more than 12?  (I am a newbie
> so this is not rhetorical, I don't know the answer).

Er, no it doesn't. That should have been VTD_PAGE_SHIFT, not PAGE_SHIFT
— and then it *will* always be 12. Thanks for pointing it out.

-- 
David WoodhouseOpen Source Technology Centre
david.woodho...@intel.com  Intel Corporation



smime.p7s
Description: S/MIME cryptographic signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: iommu/vt-d: Implement page request handling

2015-10-15 Thread Woodhouse, David
On Thu, 2015-10-15 at 21:25 +0300, Dan Carpenter wrote:
> 
> The patch 0b9252a34858: "iommu/vt-d: Implement page request handling" 
> from Oct 8, 2015, leads to the following Smatch complaint:
> 
> drivers/iommu/intel-svm.c:452 prq_event_thread()
>  error: we previously assumed 'svm' could be null (see line 412)

Thanks. I'm going to fix it thus:

diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
index 817be76..b7e923a 100644
--- a/drivers/iommu/intel-svm.c
+++ b/drivers/iommu/intel-svm.c
@@ -510,7 +510,7 @@ static irqreturn_t prq_event_thread(int irq, void *d)
pr_err("%s: Page request for invalid PASID %d: 
%08llx %08llx\n",
   iommu->name, req->pasid, ((unsigned long 
long *)req)[0],
   ((unsigned long long *)req)[1]);
-   goto bad_req;
+   goto no_pasid;
}
}
 
@@ -552,7 +552,11 @@ static irqreturn_t prq_event_thread(int irq, void *d)
(req->wr_req << 1) | (req->exe_req);
sdev->ops->fault_cb(sdev->dev, req->pasid, req->addr, 
req->private, rwxp, result);
}
-
+   /* We get here in the error case where the PASID lookup failed,
+  and these can be NULL. Do not use them below this point! */
+   sdev = NULL;
+   svm = NULL;
+   no_pasid:
if (req->lpig) {
/* Page Group Response */
resp.low = QI_PGRP_PASID(req->pasid) |
@@ -562,7 +566,7 @@ static irqreturn_t prq_event_thread(int irq, void *d)
resp.high = QI_PGRP_IDX(req->prg_index) |
QI_PGRP_PRIV(req->private) | 
QI_PGRP_RESP_CODE(result);
 
-   qi_submit_sync(, svm->iommu);
+   qi_submit_sync(, iommu);
} else if (req->srr) {
/* Page Stream Response */
resp.low = QI_PSTRM_IDX(req->prg_index) |
@@ -571,7 +575,7 @@ static irqreturn_t prq_event_thread(int irq, void *d)
resp.high = QI_PSTRM_ADDR(address) | 
QI_PSTRM_DEVFN(req->devfn) |
QI_PSTRM_RESP_CODE(result);
 
-   qi_submit_sync(, svm->iommu);
+   qi_submit_sync(, iommu);
}
 
head = (head + sizeof(*req)) & PRQ_RING_MASK;
-- 
David WoodhouseOpen Source Technology Centre
david.woodho...@intel.com  Intel Corporation



smime.p7s
Description: S/MIME cryptographic signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH char-misc-next v2 04/22] iommu: Allow iova to be used without requiring IOMMU_SUPPORT

2015-10-05 Thread Woodhouse, David
On Tue, 2015-09-29 at 18:09 -0700, Ashutosh Dixit wrote:
> From: Sudeep Dutt 
> 
> iova is a library which can be built without IOMMU_SUPPORT
> 
> Signed-off-by: Sudeep Dutt 

The first three of these patches are in 4.3-rc4 already. Apologies for
the delay in pushing them out.

This one looks sane enough too, but perhaps in that case we should move
the code *out* of drivers/iommu/ and into lib/iova/ ?

-- 
David WoodhouseOpen Source Technology Centre
david.woodho...@intel.com  Intel Corporation



smime.p7s
Description: S/MIME cryptographic signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: Intel-iommu panic on kernel v4.1-rc1

2015-04-27 Thread Woodhouse, David
On Mon, 2015-04-27 at 15:27 +0800, sl4e...@gmail.com wrote:
 commit 18436afdc11a00ac881990b454cfb2eae81d6003 iommu/vt-d: Allow 
 RMRR on graphics devices too
 
 cause kernel panic with Z77+vt-d and kernel parameters: 
 intel_iommu=igfx_off iommu=pt
 the panic path is:
 dev_prepare_static_identity_mapping()
   domain_add_dev_info()
 dmar_insert_dev_info()
   find_domain()
 info = dev-archdata.iommu; 
 if (info) 
 -return info-domain;
 
 igfx_off cause igfx@0:2.0.archdata.iommu assigned to
  DUMMY_DEVICE_DOMAIN_INFO. 
 Above patch cause iommu_should_identity_map() return 1, and attempt 
 to setup identity mapping for igfx.

Can you try this please?

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 68d43be..2ffe589 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -696,6 +696,11 @@ static inline struct context_entry 
*iommu_context_addr(struct intel_iommu *iommu
return context[devfn];
 }
 
+static int iommu_dummy(struct device *dev)
+{
+   return dev-archdata.iommu == DUMMY_DEVICE_DOMAIN_INFO;
+}
+
 static struct intel_iommu *device_to_iommu(struct device *dev, u8 *bus, u8 
*devfn)
 {
struct dmar_drhd_unit *drhd = NULL;
@@ -705,6 +710,9 @@ static struct intel_iommu *device_to_iommu(struct device 
*dev, u8 *bus, u8 *devf
u16 segment = 0;
int i;
 
+   if (iommu_dummy(dev))
+   return NULL;
+
if (dev_is_pci(dev)) {
pdev = to_pci_dev(dev);
segment = pci_domain_nr(pdev-bus);
@@ -2969,11 +2977,6 @@ static inline struct dmar_domain 
*get_valid_domain_for_dev(struct device *dev)
return __get_valid_domain_for_dev(dev);
 }
 
-static int iommu_dummy(struct device *dev)
-{
-   return dev-archdata.iommu == DUMMY_DEVICE_DOMAIN_INFO;
-}
-
 /* Check if the dev needs to go through non-identity map and unmap process.*/
 static int iommu_no_mapping(struct device *dev)
 {

-- 
David WoodhouseOpen Source Technology Centre
david.woodho...@intel.com  Intel Corporation


smime.p7s
Description: S/MIME cryptographic signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: hpsa driver bug crack kernel down!

2014-04-16 Thread Woodhouse, David
On Wed, 2014-04-16 at 15:37 +0200, j...@8bytes.org wrote:
 Hey David,
 
 On Mon, Apr 14, 2014 at 05:03:51PM +, Woodhouse, David wrote:
  Jiang, if you can then let me have a copy with a signed-off-by I'll
  shepherd it upstream along with your other patch which is already in my
  iommu-2.6.git tree.
 
 What is the state of these fixes? I plan to send out a pull-request
 before easter and hoped to include these fixes as well.

I'm travelling and was going to do some final testing and send out a
pull request after I got home tomorrow. But since you ask...

Please pull from
git://git.infradead.org/iommu-2.6.git

David Woodhouse (1):
  iommu/vt-d: Fix get_domain_for_dev() handling of upstream PCIe bridges

Jiang Liu (2):
  iommu/vt-d: fix memory leakage caused by commit ea8ea46
  iommu/vt-d: fix bug in matching PCI devices with DRHD/RMRR descriptors

 drivers/iommu/dmar.c|  3 ++-
 drivers/iommu/intel-iommu.c | 10 +++---
 2 files changed, 9 insertions(+), 4 deletions(-)



-- 
David WoodhouseOpen Source Technology Centre
david.woodho...@intel.com  Intel Corporation


smime.p7s
Description: S/MIME cryptographic signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: hpsa driver bug crack kernel down!

2014-04-14 Thread Woodhouse, David
On Mon, 2014-04-14 at 09:47 -0700, Davidlohr Bueso wrote:
 On Mon, 2014-04-14 at 09:44 -0700, Davidlohr Bueso wrote:
  On Tue, 2014-04-15 at 00:19 +0800, Jiang Liu wrote:
   Hi Davidlohr,
 Thanks for providing the DMAR table. According to the DMAR
   table, one bug in the iommu driver fails to handle this entry:
   [1D2h 0466   1]  Device Scope Entry Type : 01
   [1D3h 0467   1] Entry Length : 0A
   [1D4h 0468   2] Reserved : 
   [1D6h 0470   1]   Enumeration ID : 00
   [1D7h 0471   1]   PCI Bus Number : 00
   [1D8h 0472   2] PCI Path : 1C,04
   [1DAh 0474   2] PCI Path : 00,02
   
 And the patch sent out by me should fix this bug. Could you please help
   to have a try?
  
  Sorry, I am unable to find any patches from you regarding this issue...
  I must be missing something. Could you please point me to the lkml link?
 
 Never mind, I got it internally. I'll let you know  as soon as I can
 test it later today.

Thanks.

Jiang, if you can then let me have a copy with a signed-off-by I'll
shepherd it upstream along with your other patch which is already in my
iommu-2.6.git tree.

-- 
  Sent with Evolution's ActiveSync support.

David WoodhouseOpen Source Technology Centre
david.woodho...@intel.com  Intel Corporation






smime.p7s
Description: S/MIME cryptographic signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 21/33] iommu/vt-d: Make get_domain_for_dev() take struct device

2014-04-14 Thread Woodhouse, David
On Mon, 2014-04-14 at 15:22 -0600, Alex Williamson wrote:
 
  + if (dev_tmp) {
  + if (pci_is_pcie(dev_tmp)) {
  + bridge_bus = dev_tmp-subordinate-number;
  + bridge_devfn = 0;
  + } else {
  + bridge_bus = dev_tmp-bus-number;
  + bridge_devfn = dev_tmp-devfn;
  + }
  + spin_lock_irqsave(device_domain_lock, flags);
  + info = dmar_search_domain_by_dev_info(segment, bus, 
  devfn);
 
 
 bus and devfn are uninitialized here, CID 1197747  1197746.  Thanks,

Oops. That should be using bridge_bus and bridge_devfn, shouldn't it?

Will fix; thanks.

-- 
David WoodhouseOpen Source Technology Centre
david.woodho...@intel.com  Intel Corporation


smime.p7s
Description: S/MIME cryptographic signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: hpsa driver bug crack kernel down!

2014-04-11 Thread Woodhouse, David
On Thu, 2014-04-10 at 09:19 -0700, Davidlohr Bueso wrote:
 Attaching a dmesg from one of the kernels that boots. It doesn't appear
 to have much of the related information... is there any debug config
 option I can enable that might give you more data?

I'd like the contents of /sys/firmware/acpi/tables/DMAR please. And
please could you also apply this patch to both the last-working and
first-failing kernels and show me the output in both cases?

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index dd576c0..d52ac03 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -683,6 +683,12 @@ static struct intel_iommu *device_to_iommu(int segment, u8 
bus, u8 devfn)
 out:
rcu_read_unlock();
 
+   if (iommu)
+   printk(Device %x:%02x:%02x.%d on IOMMU at %llx\n, segment, 
bus,
+  PCI_SLOT(devfn), PCI_FUNC(devfn), drhd-reg_base_addr);
+   else
+   printk(Device %x:%02x:%02x.%d on no IOMMU\n, segment, bus,
+  PCI_SLOT(devfn), PCI_FUNC(devfn));
return iommu;
 }
 


-- 
  Sent with Evolution's ActiveSync support.

David WoodhouseOpen Source Technology Centre
david.woodho...@intel.com  Intel Corporation






smime.p7s
Description: S/MIME cryptographic signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: hpsa driver bug crack kernel down!

2014-04-10 Thread Woodhouse, David
On Thu, 2014-04-10 at 09:15 +0200, Joerg Roedel wrote:
 [+ David, VT-d maintainer ]
 
 Jiang, David, can you please have a look into this issue?
 

DMAR:[fault reason 02] Present bit in context entry is clear
dmar: DRHD: handling fault status reg 602
dmar: DMAR:[DMA Read] Request device [02:00.0] fault addr 
7f61e000

That Present bit in context entry is clear fault means that we have
not set up *any* mappings for this PCI device… on this IOMMU.

  Yes, specifically (finally done bisecting):
  
  commit 2e45528930388658603ea24d49cf52867b928d3e
  Author: Jiang Liu jiang@linux.intel.com
  Date:   Wed Feb 19 14:07:36 2014 +0800
  
  iommu/vt-d: Unify the way to process DMAR device scope array

This commit is about how we decide which IOMMU a given PCI device is
attached to.

Thus, my first guess would be that we are quite happily setting up the
requested DMA maps on the *wrong* IOMMU, and then taking faults when the
device actually tries to do DMA.

However, I'm not 100% convinced of that. The fault address looks
suspiciously like a true physical address, not a virtual bus address of
the type that we'd normally allocate for a dma_map_* operation. Those
would start at 0xf000 and work downwards, typically.

Do you have 'iommu=pt' on the kernel command line? Can I see the full
dmesg as this system boots, and also a copy of the DMAR table?


We should also rate-limit DMA faults, which would avoid the lockup
failure mode. Bjorn, what should an IOMMU driver *do* when it detects
that a device is creating an endless stream of DMA faults and isn't
aborting the transaction?

I can set it to silent so that it just stops *reporting* the DMA faults
for that device... and I suppose I can re-enable them when I next see a
DMA mapping for it (although actually it'd be better to have a hook to
do that on FLR or something like that). But there must be a better
answer than that, surely? And I don't want to hack it up locally in
*one* specific IOMMU driver, any more than I have to.

On a POWER system with EEH, the kernel would end up isolating the
offending device completely, and subsequently resetting it...

-- 
David WoodhouseOpen Source Technology Centre
david.woodho...@intel.com  Intel Corporation


smime.p7s
Description: S/MIME cryptographic signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: hpsa driver bug crack kernel down!

2014-04-10 Thread Woodhouse, David
On Thu, 2014-04-10 at 09:14 -0600, Bjorn Helgaas wrote:
  Thus, my first guess would be that we are quite happily setting up the
  requested DMA maps on the *wrong* IOMMU, and then taking faults when the
  device actually tries to do DMA.
 
 I like the wrong IOMMU (or no IOMMU at all) theory.  If we didn't
 connect the device with an IOMMU at all, that would explain the device
 DMAing directly to a physical address, wouldn't it?

An unlikely failure mode. We're much more likely to see *wrong* IOMMU
than no IOMMU. And thus we'd still see the distinctive virtual addresses
just below 4GiB.

However, Rob's answer may solve that puzzle. If this is one of those
abominations where the device continues to do DMA to system memory even
after the OS is up and running and *thinks* it has control of the
hardware, then the offending address will be listed in an RMRR entry
(which tells the OS to set up a 1:1 mapping for access to certain memory
ranges for a given device). And will be inside an E820 reserved region.

A little odd that such an error would trigger only when we're actually
trying to initialise the device from the Linux driver, not as soon as we
enable the IOMMU. But all things are possible.

But the DMAR table and dmesg that I asked for would give us a bit more
information and hopefully let us stop speculating...

  We should also rate-limit DMA faults, which would avoid the lockup
  failure mode. Bjorn, what should an IOMMU driver *do* when it detects
  that a device is creating an endless stream of DMA faults and isn't
  aborting the transaction?
 
 You mentioned that POWER with EEH does something intelligent in this
 case, but I'm not familiar with that code.  We have AER support, which
 can result in resetting a device, but I think DMA faults are reported
 differently, and I don't think there's any nice existing way for PCI
 to deal with them.  Maybe there should be, though.

Quite frankly, I don't care how *you* deal with them, or even if you
can. All I want to know is how I tell you about the problem, because *I*
sure as hell don't want to be trying to deal with it in the IOMMU code.
That's a generic PCI layer thing. :)

-- 
David WoodhouseOpen Source Technology Centre
david.woodho...@intel.com  Intel Corporation


smime.p7s
Description: S/MIME cryptographic signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: hpsa driver bug crack kernel down!

2014-04-10 Thread Woodhouse, David
On Thu, 2014-04-10 at 09:19 -0700, Davidlohr Bueso wrote:
  dmar: DMAR:[DMA Read] Request device [02:00.0] fault addr 
  7f61e000
  
  That Present bit in context entry is clear fault means that we have
  not set up *any* mappings for this PCI device… on this IOMMU.
  
Yes, specifically (finally done bisecting):

commit 2e45528930388658603ea24d49cf52867b928d3e
Author: Jiang Liu jiang@linux.intel.com
Date:   Wed Feb 19 14:07:36 2014 +0800

iommu/vt-d: Unify the way to process DMAR device scope array
  
  This commit is about how we decide which IOMMU a given PCI device is
  attached to.
  
  Thus, my first guess would be that we are quite happily setting up the
  requested DMA maps on the *wrong* IOMMU, and then taking faults when the
  device actually tries to do DMA.
  
  However, I'm not 100% convinced of that. The fault address looks
  suspiciously like a true physical address, not a virtual bus address of
  the type that we'd normally allocate for a dma_map_* operation. Those
  would start at 0xf000 and work downwards, typically.
  
  Do you have 'iommu=pt' on the kernel command line? 
 
 No.
 
  Can I see the full
  dmesg as this system boots, and also a copy of the DMAR table?
 
 Attaching a dmesg from one of the kernels that boots. It doesn't appear
 to have much of the related information... 

It shows us that the address 0x7f61e000 is in an E820-reserved region,
and that there's and RMRR covering that region for an unspecified PCI
device, but that's going to be the hpsa.

So if isn't just a simple case of us assigning this device to the wrong
IOMMU, *perhaps* it's that we lose the RMRR when the driver takes
control of the device. RMRRs are generally expected to be a boot-time
thing, for things like legacy keyboard/mouse emulation via USB. Using
them while the system is *active* is... horrid. We've often not quite
handled that right.

-- 
David WoodhouseOpen Source Technology Centre
david.woodho...@intel.com  Intel Corporation


smime.p7s
Description: S/MIME cryptographic signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH] iommu/vt-d: Check for NULL pointer in dmar_acpi_dev_scope_init() array

2014-03-26 Thread Woodhouse, David
On Tue, 2014-03-25 at 20:30 +0100, Joerg Roedel wrote:
 
 When ir_dev_scope_init() is called via a rootfs initcall it
 will check for irq_remapping_enabled before it calls
 (indirectly) into dmar_acpi_dev_scope_init() which uses the
 dmar_tbl pointer without any checks.

Ew. If irq_remapping_enabled is not (any more?) Intel-specific, then
ir_dev_scope_init() ought to be checking something more appropriate.

But defensively fixing dmar_dev_scope_init() as you have done certainly
seems reasonable. On a slow path, it doesn't hurt to check for NULL.

Acked-by: David Woodhouse david.woodho...@intel.com


However, I note that dmar_dev_scope_init() already *does* effectively
check for the existence of Intel IOMMU hardware, because it checks the
dmar_drhd_units list. So the patch below would also make sense, and
should be sufficient to fix the problem on its own...



From: David Woodhouse david.woodho...@intel.com
Subject: iommu/vt-d: Only call dmar_acpi_dev_scope_init() if there are DRHD 
units

For arcane reasons, this code path gets exercised even on AMD systems
when IRQ remapping is enabled.

Signed-off-by: David Woodhouse david.woodho...@intel.com

diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
index 56e1c79..8e1a4ae 100644
--- a/drivers/iommu/dmar.c
+++ b/drivers/iommu/dmar.c
@@ -693,13 +693,13 @@ int __init dmar_dev_scope_init(void)
if (dmar_dev_scope_status != 1)
return dmar_dev_scope_status;
 
-   dmar_acpi_dev_scope_init();
-
if (list_empty(dmar_drhd_units)) {
dmar_dev_scope_status = -ENODEV;
} else {
dmar_dev_scope_status = 0;
 
+   dmar_acpi_dev_scope_init();
+
for_each_pci_dev(dev) {
if (dev-is_virtfn)
continue;



-- 
David WoodhouseOpen Source Technology Centre
david.woodho...@intel.com  Intel Corporation


smime.p7s
Description: S/MIME cryptographic signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 34/33] iommu/vt-d: Fix RCU annotations on device scope lists

2014-03-24 Thread Woodhouse, David
In commit d78e0ac577 (iommu/vt-d: Change scope lists to struct device,
bus, devfn) I converted the 'struct device *' arrays into an array of
'struct dmar_dev_scope', where:

struct dmar_dev_scope {
   struct device *dev;
   u8 bus;
   u8 devfn;
};

That commit contained changes such as this:

-extern void dmar_free_dev_scope(struct pci_dev __rcu ***devices, int *cnt);
+extern void dmar_free_dev_scope(struct dmar_dev_scope __rcu **devices, int 
*cnt);

This was wrong; it is the device pointer which needs to be protected
with RCU, and the above-cited change was moving the __rcu annotation so
that it actually referred to the enclosing structure, not the device
pointer itself.

Reported by the Intel 0-day build testing infrastructure. Thanks, Fengguang.

Signed-off-by: David Woodhouse david.woodho...@intel.com
---
 drivers/iommu/dmar.c|  8 
 drivers/iommu/intel-iommu.c |  4 ++--
 include/linux/dmar.h| 10 +-
 3 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
index cd05a4b..56e1c79 100644
--- a/drivers/iommu/dmar.c
+++ b/drivers/iommu/dmar.c
@@ -100,7 +100,7 @@ void *dmar_alloc_dev_scope(void *start, void *end, int *cnt)
return kcalloc(*cnt, sizeof(struct dmar_dev_scope), GFP_KERNEL);
 }
 
-void dmar_free_dev_scope(struct dmar_dev_scope __rcu **devices, int *cnt)
+void dmar_free_dev_scope(struct dmar_dev_scope **devices, int *cnt)
 {
int i;
struct device *tmp_dev;
@@ -191,7 +191,7 @@ static bool dmar_match_pci_path(struct dmar_pci_notify_info 
*info, int bus,
 /* Return:  0 if match found, 0 if no match found,  0 if error happens */
 int dmar_insert_dev_scope(struct dmar_pci_notify_info *info,
  void *start, void*end, u16 segment,
- struct dmar_dev_scope __rcu *devices,
+ struct dmar_dev_scope *devices,
  int devices_cnt)
 {
int i, level;
@@ -235,7 +235,7 @@ int dmar_insert_dev_scope(struct dmar_pci_notify_info *info,
 }
 
 int dmar_remove_dev_scope(struct dmar_pci_notify_info *info, u16 segment,
- struct dmar_dev_scope __rcu *devices, int count)
+ struct dmar_dev_scope *devices, int count)
 {
int index;
struct device *tmp;
@@ -565,7 +565,7 @@ parse_dmar_table(void)
return ret;
 }
 
-static int dmar_pci_device_match(struct dmar_dev_scope __rcu devices[],
+static int dmar_pci_device_match(struct dmar_dev_scope devices[],
 int cnt, struct pci_dev *dev)
 {
int index;
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 53996d9..6fbce01 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -381,14 +381,14 @@ struct dmar_rmrr_unit {
struct acpi_dmar_header *hdr;   /* ACPI header  */
u64 base_address;   /* reserved base address*/
u64 end_address;/* reserved end address */
-   struct dmar_dev_scope __rcu *devices;   /* target devices */
+   struct dmar_dev_scope *devices; /* target devices */
int devices_cnt;/* target device count */
 };
 
 struct dmar_atsr_unit {
struct list_head list;  /* list of ATSR units */
struct acpi_dmar_header *hdr;   /* ACPI header */
-   struct dmar_dev_scope __rcu *devices;   /* target devices */
+   struct dmar_dev_scope *devices; /* target devices */
int devices_cnt;/* target device count */
u8 include_all:1;   /* include all ports */
 };
diff --git a/include/linux/dmar.h b/include/linux/dmar.h
index 4c2caab..23c8db1 100644
--- a/include/linux/dmar.h
+++ b/include/linux/dmar.h
@@ -37,7 +37,7 @@ struct acpi_dmar_header;
 struct intel_iommu;
 
 struct dmar_dev_scope {
-   struct device *dev;
+   struct device __rcu *dev;
u8 bus;
u8 devfn;
 };
@@ -48,7 +48,7 @@ struct dmar_drhd_unit {
struct list_head list;  /* list of drhd units   */
struct  acpi_dmar_header *hdr;  /* ACPI header  */
u64 reg_base_addr;  /* register base address*/
-   struct  dmar_dev_scope __rcu *devices;/* target device array*/
+   struct  dmar_dev_scope *devices;/* target device array  */
int devices_cnt;/* target device count  */
u16 segment;/* PCI domain   */
u8  ignored:1;  /* ignore drhd  */
@@ -103,13 +103,13 @@ extern int dmar_dev_scope_init(void);
 extern int dmar_parse_dev_scope(void *start, void *end, int *cnt,
struct dmar_dev_scope **devices, u16 segment);
 extern void *dmar_alloc_dev_scope(void *start, void *end, int *cnt);
-extern void dmar_free_dev_scope(struct dmar_dev_scope __rcu **devices, int 
*cnt);
+extern void dmar_free_dev_scope(struct 

[PATCH] intel-iommu: Free old page tables before creating superpage

2012-12-19 Thread Woodhouse, David
The dma_pte_free_pagetable() function will only free a page table page
if it is asked to free the *entire* 2MiB range that it covers. So if a
page table page was used for one or more small mappings, it's likely to
end up still present in the page tables... but with no valid PTEs.

This was fine when we'd only be repopulating it with 4KiB PTEs anyway
but the same virtual address range can end up being reused for a
*large-page* mapping. And in that case were were trying to insert the
large page into the second-level page table, and getting a complaint
from the sanity check in __domain_mapping() because there was already a
corresponding entry. This was *relatively* harmless; it led to a memory
leak of the old page table page, but no other ill-effects.

Fix it by calling dma_pte_clear_range (hopefully redundant) and
dma_pte_free_pagetable() before setting up the new large page.

Signed-off-by: David Woodhouse david.woodho...@intel.com
Tested-by: Ravi Murty ravi.mu...@intel.com
Tested-by: Sudeep Dutt sudeep.d...@intel.com
Cc: sta...@kernel.org [3.0+]
---
Oops, should have chased this up before the 3.7 release. Oh well, it can
make it into 3.7.2.

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 0badfa4..9476c1b 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -1827,10 +1827,17 @@ static int __domain_mapping(struct dmar_domain *domain, 
unsigned long iov_pfn,
if (!pte)
return -ENOMEM;
/* It is large page*/
-   if (largepage_lvl  1)
+   if (largepage_lvl  1) {
pteval |= DMA_PTE_LARGE_PAGE;
-   else
+   /* Ensure that old small page tables are 
removed to make room
+  for superpage, if they exist. */
+   dma_pte_clear_range(domain, iov_pfn,
+   iov_pfn + 
lvl_to_nr_pages(largepage_lvl) - 1);
+   dma_pte_free_pagetable(domain, iov_pfn,
+  iov_pfn + 
lvl_to_nr_pages(largepage_lvl) - 1);
+   } else {
pteval = ~(uint64_t)DMA_PTE_LARGE_PAGE;
+   }
 
}
/* We don't need lock here, nobody else

-- 
   Sent with MeeGo's ActiveSync support.

David WoodhouseOpen Source Technology Centre
david.woodho...@intel.com  Intel Corporation


smime.p7s
Description: S/MIME cryptographic signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH] intel-iommu: Free old page tables before creating superpage

2012-12-04 Thread Woodhouse, David
The dma_pte_free_pagetable() function will only free a page table page
if it is asked to free the *entire* 2MiB range that it covers. So if a
page table page was used for one or more small mappings, it's likely to
end up still present in the page tables... but with no valid PTEs.

This was fine when we'd only be repopulating it with 4KiB PTEs anyway
but the same virtual address range can end up being reused for a
*large-page* mapping. And in that case were were trying to insert the
large page into the second-level page table, and getting a complaint
from the sanity check in __domain_mapping() because there was already a
corresponding entry. This was *relatively* harmless; it led to a memory
leak of the old page table page, but no other ill-effects.

Fix it by calling dma_pte_clear_range (hopefully redundant) and
dma_pte_free_pagetable() before setting up the new large page.

Signed-off-by: David Woodhouse david.woodho...@intel.com
Tested-by: Ravi Murty ravi.mu...@intel.com
Tested-by: Sudeep Dutt sudeep.d...@intel.com
Cc: sta...@kernel.org [3.0+]
---
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 0badfa4..9476c1b 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -1827,10 +1827,17 @@ static int __domain_mapping(struct dmar_domain *domain, 
unsigned long iov_pfn,
if (!pte)
return -ENOMEM;
/* It is large page*/
-   if (largepage_lvl  1)
+   if (largepage_lvl  1) {
pteval |= DMA_PTE_LARGE_PAGE;
-   else
+   /* Ensure that old small page tables are 
removed to make room
+  for superpage, if they exist. */
+   dma_pte_clear_range(domain, iov_pfn,
+   iov_pfn + 
lvl_to_nr_pages(largepage_lvl) - 1);
+   dma_pte_free_pagetable(domain, iov_pfn,
+  iov_pfn + 
lvl_to_nr_pages(largepage_lvl) - 1);
+   } else {
pteval = ~(uint64_t)DMA_PTE_LARGE_PAGE;
+   }
 
}
/* We don't need lock here, nobody else

-- 
   Sent with MeeGo's ActiveSync support.

David WoodhouseOpen Source Technology Centre
david.woodho...@intel.com  Intel Corporation





smime.p7s
Description: S/MIME cryptographic signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu