Re: hpsa driver bug crack kernel down!

2014-04-14 Thread Jiang Liu
Hi Davidlohr,
Thanks for the information!
According to the lspci output, device :02:00.2 is the HP ILO
controller and device :03:00.0 is the RAID controller. Both the ILO
and RAID controllers need to access the reserved memory range
[0x7f61e000 - 0x7f61] in physical mode.

According to the dmesg output, the BIOS has reserved this memory and
the IOMMU driver has set up a 1:1 mapping so that the ILO and RAID
controllers can access the range. The related log messages are:
BIOS-e820: [mem 0x7f61d000-0x8fff] reserved
IOMMU: Setting RMRR:
IOMMU: Setting identity map for device :03:00.0 [0x7f61e000 - 0x7f61]
IOMMU: Setting identity map for device :02:00.0 [0x7f61e000 - 0x7f61]
IOMMU: Setting identity map for device :02:00.2 [0x7f61e000 - 0x7f61]

In the screenshot, device :02:00.2 fails to access memory address
0x7f61e000. That indicates the IOMMU driver failed to set up the 1:1
mapping of the reserved memory range for the ILO controller. Could you
please check whether the failing kernel image prints a boot message
like "IOMMU: Setting identity map for device :02:00.2
[0x7f61e000 - 0x7f61]"?

It would be great if the boot messages could be saved when the kernel
fails to boot, so that we can get more information from the log.

BTW, I have double-checked the related code and still can't find a
reliable explanation for the regression :(

Thanks!
Gerry

On 2014/4/11 0:19, Davidlohr Bueso wrote:
 On Thu, 2014-04-10 at 08:46 +, Woodhouse, David wrote:
 On Thu, 2014-04-10 at 09:15 +0200, Joerg Roedel wrote:
 [+ David, VT-d maintainer ]

 Jiang, David, can you please have a look into this issue?


 DMAR:[fault reason 02] Present bit in context entry is clear
 dmar: DRHD: handling fault status reg 602
 dmar: DMAR:[DMA Read] Request device [02:00.0] fault addr 7f61e000

 That "Present bit in context entry is clear" fault means that we have
 not set up *any* mappings for this PCI device… on this IOMMU.

 Yes, specifically (finally done bisecting):

 commit 2e45528930388658603ea24d49cf52867b928d3e
 Author: Jiang Liu jiang@linux.intel.com
 Date:   Wed Feb 19 14:07:36 2014 +0800

 iommu/vt-d: Unify the way to process DMAR device scope array

 This commit is about how we decide which IOMMU a given PCI device is
 attached to.

 Thus, my first guess would be that we are quite happily setting up the
 requested DMA maps on the *wrong* IOMMU, and then taking faults when the
 device actually tries to do DMA.

 However, I'm not 100% convinced of that. The fault address looks
 suspiciously like a true physical address, not a virtual bus address of
 the type that we'd normally allocate for a dma_map_* operation. Those
 would start at 0xf000 and work downwards, typically.

 Do you have 'iommu=pt' on the kernel command line? 
 
 No.
 
 Can I see the full
 dmesg as this system boots, and also a copy of the DMAR table?
 
 Attaching a dmesg from one of the kernels that boots. It doesn't appear
 to have much of the related information... is there any debug config
 option I can enable that might give you more data?
 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: hpsa driver bug crack kernel down!

2014-04-14 Thread Jiang Liu
Hi all,
I think I found the root cause. It's a bug in device scope matching:
the variable 'level' should be decremented on every step when walking
up the PCI topology.
Could you please help to test the following patch?
Thanks!
Gerry

diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
index f445c10..1f8308c 100644
--- a/drivers/iommu/dmar.c
+++ b/drivers/iommu/dmar.c
@@ -152,7 +152,7 @@ dmar_alloc_pci_notify_info(struct pci_dev *dev, unsigned long event)
 	info->seg = pci_domain_nr(dev->bus);
 	info->level = level;
 	if (event == BUS_NOTIFY_ADD_DEVICE) {
-		for (tmp = dev, level--; tmp; tmp = tmp->bus->self) {
+		for (tmp = dev, level--; tmp; level--, tmp = tmp->bus->self) {
 			info->path[level].device = PCI_SLOT(tmp->devfn);
 			info->path[level].function = PCI_FUNC(tmp->devfn);
 			if (pci_is_root_bus(tmp->bus))
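
To make the effect of the missing decrement concrete, here is a minimal userspace sketch of the walk. It is not the kernel code: `struct demo_dev`, `struct demo_path`, the 0xff sentinel and both function names are made up for illustration, and the topology (a function 00.2 behind a bridge at 1c.4) is only assumed to resemble the failing machine.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct pci_dev: only what the walk needs. */
struct demo_dev {
	unsigned int devfn;              /* (slot << 3) | function */
	const struct demo_dev *upstream; /* parent bridge, NULL on the root bus */
};

/* Hypothetical stand-in for one entry of info->path[]. */
struct demo_path {
	unsigned int device;
	unsigned int function;
};

#define DEMO_SLOT(devfn) (((devfn) >> 3) & 0x1f)
#define DEMO_FUNC(devfn) ((devfn) & 0x07)

/* Loop shape before the fix: 'level' is decremented only once, so every
 * hop overwrites the same (last) slot and the earlier slots stay stale. */
static void demo_fill_path_buggy(const struct demo_dev *dev,
				 struct demo_path *path, int level)
{
	const struct demo_dev *tmp;

	for (tmp = dev, level--; tmp; tmp = tmp->upstream) {
		path[level].device = DEMO_SLOT(tmp->devfn);
		path[level].function = DEMO_FUNC(tmp->devfn);
	}
}

/* Loop shape with the fix: 'level' moves one slot towards the root
 * each time the walk moves one hop up the topology. */
static void demo_fill_path_fixed(const struct demo_dev *dev,
				 struct demo_path *path, int level)
{
	const struct demo_dev *tmp;

	for (tmp = dev, level--; tmp; level--, tmp = tmp->upstream) {
		path[level].device = DEMO_SLOT(tmp->devfn);
		path[level].function = DEMO_FUNC(tmp->devfn);
	}
}
```

With a two-level chain, the buggy loop leaves slot 0 untouched and ends with the bridge's numbers in slot 1, while the fixed loop stores the bridge in slot 0 and the endpoint in slot 1, which is the root-to-device order a DMAR PCI path is listed in.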



Re: [PATCH] hpsa: fix uninitialized trans_support in hpsa_put_ctlr_into_performant_mode()

2014-04-14 Thread James Bottomley
Your subject line is very tame.  It should be the one line summary of
why we apply the patch, so it should read something like

 hpsa: fix NULL deref in performant mode

On Thu, 2014-04-10 at 17:17 -0500, scame...@beardog.cce.hp.com wrote:
> Without this, you'll see a null pointer dereference in
> hpsa_enter_performant_mode().

The description should be more comprehensible.

I'm clear that the use before initialisation is a bug ... I'm less clear
on why it causes an oops.

> Signed-off-by: Stephen M. Cameron scame...@beardog.cce.hp.com
> ---
>  drivers/scsi/hpsa.c |    4 ++++
>  1 files changed, 4 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
> index 8cf4a0c..ef4dfdd 100644
> --- a/drivers/scsi/hpsa.c
> +++ b/drivers/scsi/hpsa.c
> @@ -7463,6 +7463,10 @@ static void hpsa_put_ctlr_into_performant_mode(struct ctlr_info *h)
>  	if (hpsa_simple_mode)
>  		return;
>  
> +	trans_support = readl(&(h->cfgtable->TransportSupport));
> +	if (!(trans_support & PERFORMANT_MODE))
> +		return;
> +
>  	/* Check for I/O accelerator mode support */
>  	if (trans_support & CFGTBL_Trans_io_accel1) {
>  		transMethod |= CFGTBL_Trans_io_accel1 |

Shouldn't you be moving this check from its previous location, rather
than adding a new one that makes the original obsolete?

James

---

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 8cf4a0c..9a6e4a2 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -7463,6 +7463,10 @@ static void hpsa_put_ctlr_into_performant_mode(struct ctlr_info *h)
 	if (hpsa_simple_mode)
 		return;
 
+	trans_support = readl(&(h->cfgtable->TransportSupport));
+	if (!(trans_support & PERFORMANT_MODE))
+		return;
+
 	/* Check for I/O accelerator mode support */
 	if (trans_support & CFGTBL_Trans_io_accel1) {
 		transMethod |= CFGTBL_Trans_io_accel1 |
@@ -7479,10 +7483,6 @@ static void hpsa_put_ctlr_into_performant_mode(struct ctlr_info *h)
 	}
 
 	/* TODO, check that this next line h->nreply_queues is correct */
-	trans_support = readl(&(h->cfgtable->TransportSupport));
-	if (!(trans_support & PERFORMANT_MODE))
-		return;
-
 	h->nreply_queues = h->msix_vector > 0 ? h->msix_vector : 1;
 	hpsa_get_max_perf_mode_cmds(h);
 	/* Performant mode ring buffer and supporting data structures */
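
The ordering this diff arrives at (read the register once, then test its bits) can be sketched outside the driver as below. This is only an illustration under stated assumptions: the `DEMO_*` flag values, `demo_readl()` and `demo_pick_mode()` are hypothetical and do not reflect the real hpsa definitions or register layout.

```c
#include <stdint.h>

/* Hypothetical bit values, for illustration only; the real flags are
 * defined in the driver headers and may differ. */
#define DEMO_PERFORMANT_MODE (1u << 2)
#define DEMO_IO_ACCEL1       (1u << 7)

enum demo_mode { DEMO_SIMPLE, DEMO_PERFORMANT, DEMO_PERFORMANT_ACCEL };

/* Userspace stand-in for readl(): a single volatile 32-bit load. */
static uint32_t demo_readl(const volatile uint32_t *reg)
{
	return *reg;
}

static enum demo_mode demo_pick_mode(const volatile uint32_t *transport_support,
				     int simple_mode)
{
	uint32_t trans_support;

	if (simple_mode)
		return DEMO_SIMPLE;

	/* Read the register before any test on it, so that every later
	 * check sees a defined value instead of stack garbage. */
	trans_support = demo_readl(transport_support);
	if (!(trans_support & DEMO_PERFORMANT_MODE))
		return DEMO_SIMPLE;

	/* Only now is it meaningful to look at the accelerator bit. */
	if (trans_support & DEMO_IO_ACCEL1)
		return DEMO_PERFORMANT_ACCEL;

	return DEMO_PERFORMANT;
}
```

Keeping a single read at the top, rather than a second copy further down, is the same point raised above about moving the check instead of duplicating it.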




Re: hpsa driver bug crack kernel down!

2014-04-14 Thread Davidlohr Bueso
Sorry for the delay, I've been having to take turns for this box.

On Fri, 2014-04-11 at 09:18 +, Woodhouse, David wrote:
 On Thu, 2014-04-10 at 09:19 -0700, Davidlohr Bueso wrote:
  Attaching a dmesg from one of the kernels that boots. It doesn't appear
  to have much of the related information... is there any debug config
  option I can enable that might give you more data?
 
 I'd like the contents of /sys/firmware/acpi/tables/DMAR please.

Attached is the disassembly of the raw output.

  And
 please could you also apply this patch to both the last-working and
 first-failing kernels and show me the output in both cases?

So I still cannot get around getting the info for the first failing
kernel, but below is for the last working. Thanks.

Device 0:03:00.0 on IOMMU at a800
Device 0:03:00.0 on IOMMU at a800
IOMMU: Setting identity map for device :02:00.0 [0x7f61e000 - 0x7f61]
Device 0:02:00.0 on IOMMU at a800
Device 0:02:00.0 on IOMMU at a800
IOMMU: Setting identity map for device :02:00.2 [0x7f61e000 - 0x7f61]
Device 0:02:00.2 on IOMMU at a800
Device 0:02:00.2 on IOMMU at a800
IOMMU: Setting identity map for device :00:1d.0 [0x7f7e7000 - 0x7f7ecfff]
Device 0:00:1d.0 on IOMMU at a800
Device 0:00:1d.0 on IOMMU at a800
IOMMU: Setting identity map for device :00:1d.1 [0x7f7e7000 - 0x7f7ecfff]
Device 0:00:1d.1 on IOMMU at a800
Device 0:00:1d.1 on IOMMU at a800
IOMMU: Setting identity map for device :00:1d.2 [0x7f7e7000 - 0x7f7ecfff]
Device 0:00:1d.2 on IOMMU at a800
Device 0:00:1d.2 on IOMMU at a800
IOMMU: Setting identity map for device :00:1d.3 [0x7f7e7000 - 0x7f7ecfff]
Device 0:00:1d.3 on IOMMU at a800
Device 0:00:1d.3 on IOMMU at a800
IOMMU: Setting identity map for device :02:00.0 [0x7f7e7000 - 0x7f7ecfff]
Device 0:02:00.0 on IOMMU at a800
IOMMU: Setting identity map for device :02:00.2 [0x7f7e7000 - 0x7f7ecfff]
Device 0:02:00.2 on IOMMU at a800
IOMMU: Setting identity map for device :02:00.4 [0x7f7e7000 - 0x7f7ecfff]
Device 0:02:00.4 on IOMMU at a800
Device 0:02:00.4 on IOMMU at a800
IOMMU: Setting identity map for device :00:1d.7 [0x7f7ee000 - 0x7f7e]
Device 0:00:1d.7 on IOMMU at a800
Device 0:00:1d.7 on IOMMU at a800
IOMMU: Prepare 0-16MiB unity mapping for LPC
IOMMU: Setting identity map for device :00:1f.0 [0x0 - 0xff]
Device 0:00:1f.0 on IOMMU at a800
Device 0:00:1f.0 on IOMMU at a800
PCI-DMA: Intel(R) Virtualization Technology for Directed I/O
Device 0:00:00.0 on IOMMU at a800
Device 0:00:01.0 on IOMMU at a800
Device 0:00:02.0 on IOMMU at a800
Device 0:00:03.0 on IOMMU at a800
Device 0:00:04.0 on IOMMU at a800
Device 0:00:05.0 on IOMMU at a800
Device 0:00:06.0 on IOMMU at a800
Device 0:00:07.0 on IOMMU at a800
Device 0:00:08.0 on IOMMU at a800
Device 0:00:09.0 on IOMMU at a800
Device 0:00:0a.0 on IOMMU at a800
Device 0:00:14.0 on IOMMU at a800
Device 0:00:1c.0 on IOMMU at a800
Device 0:00:1c.4 on IOMMU at a800
Device 0:00:1d.0 on IOMMU at a800
Device 0:00:1d.1 on IOMMU at a800
Device 0:00:1d.2 on IOMMU at a800
Device 0:00:1d.3 on IOMMU at a800
Device 0:00:1d.7 on IOMMU at a800
Device 0:00:1e.0 on IOMMU at a800
Device 0:00:1f.0 on IOMMU at a800
Device 0:04:00.0 on IOMMU at a800
Device 0:04:00.1 on IOMMU at a800
Device 0:04:00.2 on IOMMU at a800
Device 0:04:00.3 on IOMMU at a800
Device 0:03:00.0 on IOMMU at a800
Device 0:02:00.0 on IOMMU at a800
Device 0:02:00.2 on IOMMU at a800
Device 0:02:00.4 on IOMMU at a800
Device 0:01:03.0 on IOMMU at a800
Device 0:50:00.0 on IOMMU at ac00
Device 0:50:01.0 on IOMMU at ac00
Device 0:50:02.0 on IOMMU at ac00
Device 0:50:03.0 on IOMMU at ac00
Device 0:50:04.0 on IOMMU at ac00
Device 0:50:05.0 on IOMMU at ac00
Device 0:50:06.0 on IOMMU at ac00
Device 0:50:07.0 on IOMMU at ac00
Device 0:50:08.0 on IOMMU at ac00
Device 0:50:09.0 on IOMMU at ac00
Device 0:50:0a.0 on IOMMU at ac00
Device 0:50:14.0 on IOMMU at a800
Device 0:a0:00.0 on IOMMU at b000
Device 0:a0:01.0 on IOMMU at b000
Device 0:a0:02.0 on IOMMU at b000
Device 0:a0:03.0 on IOMMU at b000
Device 0:a0:04.0 on IOMMU at b000
Device 0:a0:05.0 on IOMMU at b000
Device 0:a0:06.0 on IOMMU at b000
Device 0:a0:07.0 on IOMMU at b000
Device 0:a0:08.0 on IOMMU at b000
Device 0:a0:09.0 on IOMMU at b000
Device 0:a0:0a.0 on IOMMU at b000
Device 0:a0:14.0 on IOMMU at a800
Device 0:7c:00.0 on IOMMU at a800
Device 0:7c:08.0 on IOMMU at a800
Device 0:82:00.0 on IOMMU at a800
Device 0:82:08.0 on IOMMU at a800

/*
 * Intel ACPI Component Architecture
 * AML Disassembler version 20140325-64 [Apr 11 2014]
 * Copyright (c) 2000 - 2014 Intel Corporation
 * 
 * Disassembly of DMAR.raw, Fri Apr 11 09:10:10 2014
 *
 * ACPI Data Table [DMAR]
 *
 * Format: 

Re: [PATCH] hpsa: fix uninitialized trans_support in hpsa_put_ctlr_into_performant_mode()

2014-04-14 Thread scameron
On Mon, Apr 14, 2014 at 08:45:16AM -0700, James Bottomley wrote:
 Shouldn't you be moving this check from its previous location, rather
 than adding a new one that makes the original obsolete?

Oh... I didn't notice that.   So that's what happened to that hunk.
Yes, that is what should be done.

I will resend the patch after fixing up this and the commit message.

-- steve


 


Re: hpsa driver bug crack kernel down!

2014-04-14 Thread Jiang Liu
Hi Davidlohr,
Thanks for providing the DMAR table. Based on the DMAR table, a bug
in the IOMMU driver causes it to mishandle this entry:
[1D2h 0466   1]  Device Scope Entry Type : 01
[1D3h 0467   1] Entry Length : 0A
[1D4h 0468   2] Reserved : 
[1D6h 0470   1]   Enumeration ID : 00
[1D7h 0471   1]   PCI Bus Number : 00
[1D8h 0472   2] PCI Path : 1C,04
[1DAh 0474   2] PCI Path : 00,02

The patch I sent out earlier should fix this bug. Could you please
give it a try?
Thanks!
Gerry
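
For readers unfamiliar with device scope entries, the sketch below shows how a two-hop PCI path such as the one above is meant to resolve to a device. It is a simplified userspace model, not the driver's matching code: all `demo_*` names are hypothetical, and the bridge table (00:1c.4 with secondary bus 02) is an assumption chosen to agree with the faulting device, not something read from the DMAR table. The depth calculation relies on the device scope layout of a 6-byte header followed by 2-byte path entries.

```c
#include <stddef.h>
#include <stdint.h>

struct demo_hop { uint8_t device, function; };
struct demo_bdf { uint8_t bus, device, function; };

/* Hypothetical bridge table: which secondary bus sits behind a bridge. */
struct demo_bridge { uint8_t bus, device, function, secondary_bus; };

/* Entry Length 0x0A minus the 6-byte header leaves two 2-byte hops. */
static size_t demo_scope_depth(uint8_t entry_length)
{
	return entry_length < 6 ? 0 : (size_t)(entry_length - 6) / 2;
}

/* Follow the path from the start bus; every hop except the last must be
 * a known bridge. Returns 0 on success, -1 if the path cannot be resolved. */
static int demo_resolve_scope(uint8_t start_bus,
			      const struct demo_hop *path, size_t depth,
			      const struct demo_bridge *bridges, size_t nbridges,
			      struct demo_bdf *out)
{
	uint8_t bus = start_bus;
	size_t i, j;

	if (depth == 0)
		return -1;

	for (i = 0; i + 1 < depth; i++) {
		int found = 0;

		for (j = 0; j < nbridges; j++) {
			if (bridges[j].bus == bus &&
			    bridges[j].device == path[i].device &&
			    bridges[j].function == path[i].function) {
				bus = bridges[j].secondary_bus;
				found = 1;
				break;
			}
		}
		if (!found)
			return -1;
	}

	out->bus = bus;
	out->device = path[depth - 1].device;
	out->function = path[depth - 1].function;
	return 0;
}
```

Under that assumed bridge table, start bus 00 with path (1C,04), (00,02) resolves to 02:00.2, so an entry of this shape only matches if both hops of the path are compared.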


Re: hpsa driver bug crack kernel down!

2014-04-14 Thread Davidlohr Bueso
On Tue, 2014-04-15 at 00:19 +0800, Jiang Liu wrote:
 Hi Davidlohr,
   Thanks for providing the DMAR table. According to the DMAR
 table, one bug in the iommu driver fails to handle this entry:
 [1D2h 0466   1]  Device Scope Entry Type : 01
 [1D3h 0467   1] Entry Length : 0A
 [1D4h 0468   2] Reserved : 
 [1D6h 0470   1]   Enumeration ID : 00
 [1D7h 0471   1]   PCI Bus Number : 00
 [1D8h 0472   2] PCI Path : 1C,04
 [1DAh 0474   2] PCI Path : 00,02
 
   And the patch sent out by me should fix this bug. Could you please help
 to have a try?

Sorry, I am unable to find any patches from you regarding this issue...
I must be missing something. Could you please point me to the lkml link?

Thanks.



Re: hpsa driver bug crack kernel down!

2014-04-14 Thread Davidlohr Bueso
On Mon, 2014-04-14 at 09:44 -0700, Davidlohr Bueso wrote:
 On Tue, 2014-04-15 at 00:19 +0800, Jiang Liu wrote:
  Hi Davidlohr,
  Thanks for providing the DMAR table. According to the DMAR
  table, one bug in the iommu driver fails to handle this entry:
  [1D2h 0466   1]  Device Scope Entry Type : 01
  [1D3h 0467   1] Entry Length : 0A
  [1D4h 0468   2] Reserved : 
  [1D6h 0470   1]   Enumeration ID : 00
  [1D7h 0471   1]   PCI Bus Number : 00
  [1D8h 0472   2] PCI Path : 1C,04
  [1DAh 0474   2] PCI Path : 00,02
  
  And the patch sent out by me should fix this bug. Could you please help
  to have a try?
 
 Sorry, I am unable to find any patches from you regarding this issue...
 I must be missing something. Could you please point me to the lkml link?

Never mind, I got it internally. I'll let you know  as soon as I can
test it later today.



Re: [PATCH 0/2] ARM SMMU fixes

2014-04-14 Thread Will Deacon
On Tue, Apr 08, 2014 at 02:57:43PM +0100, Marc Zyngier wrote:
 On 08/04/14 14:41, Laurent Pinchart wrote:
  I've obviously forgotten that Will was away for a month. CC'ing Marc 
  Zyngier.
  
  On Thursday 03 April 2014 01:52:55 Laurent Pinchart wrote:
  On Friday 28 February 2014 16:37:08 Laurent Pinchart wrote:
  Hello Will,
 
  I've studied your arm-smmu driver as a base to write a Renesas IOMMU
  driver and found two small issues. Here are patches to fix them. Please
  bear with me if my understanding was incorrect and the patches wrong :-)
 
  Laurent Pinchart (2):
iommu/arm-smmu: Replace list walk with platform driver data
iommu/arm-smmu: Return 0 on unmap failure
   
   drivers/iommu/arm-smmu.c | 17 +
   1 file changed, 5 insertions(+), 12 deletions(-)
 
  Do you plan to take these patches (or at least patch 2/2) in your tree ? I
  can send a pull request to Joerg if you give me your acked-by.
  
  Marc, would you like to handle this, or would you prefer to wait until Will 
  comes back ?
 
 Hi Laurent,
 
 Yup, I'll have a look and stash them in a temp tree. Given that Will
 will be back in about a week, he will have the final say.

I've already got the fix queued (Return 0 on unmap failure) and plan to
send it to Joerg this week. I think the other patch doesn't really add
anything to the driver :)

Will


Re: hpsa driver bug crack kernel down!

2014-04-14 Thread Woodhouse, David
On Mon, 2014-04-14 at 09:47 -0700, Davidlohr Bueso wrote:
 On Mon, 2014-04-14 at 09:44 -0700, Davidlohr Bueso wrote:
  On Tue, 2014-04-15 at 00:19 +0800, Jiang Liu wrote:
   Hi Davidlohr,
 Thanks for providing the DMAR table. According to the DMAR
   table, one bug in the iommu driver fails to handle this entry:
   [1D2h 0466   1]  Device Scope Entry Type : 01
   [1D3h 0467   1] Entry Length : 0A
   [1D4h 0468   2] Reserved : 
   [1D6h 0470   1]   Enumeration ID : 00
   [1D7h 0471   1]   PCI Bus Number : 00
   [1D8h 0472   2] PCI Path : 1C,04
   [1DAh 0474   2] PCI Path : 00,02
   
 And the patch sent out by me should fix this bug. Could you please help
   to have a try?
  
  Sorry, I am unable to find any patches from you regarding this issue...
  I must be missing something. Could you please point me to the lkml link?
 
 Never mind, I got it internally. I'll let you know  as soon as I can
 test it later today.

Thanks.

Jiang, if you can then let me have a copy with a signed-off-by I'll
shepherd it upstream along with your other patch which is already in my
iommu-2.6.git tree.

-- 
  Sent with Evolution's ActiveSync support.

David Woodhouse                            Open Source Technology Centre
david.woodho...@intel.com                  Intel Corporation







Re: hpsa driver bug crack kernel down!

2014-04-14 Thread Davidlohr Bueso
On Mon, 2014-04-14 at 16:57 +0800, Jiang Liu wrote:
 Hi all,
   I guess I found the root cause. It's a bug in matching
 device scope, variable 'level' should be decreased when walking up PCI
 topology.
   Could you please help to test following patch?
 Thanks!
 Gerry

Worked like a charm -- I no longer see all those DMAR messages and the
hpsa hard lockup is gone, thanks. Feel free to add my:

Reported-and-tested-by: Davidlohr Bueso davidl...@hp.com



Re: [PATCH 21/33] iommu/vt-d: Make get_domain_for_dev() take struct device

2014-04-14 Thread Alex Williamson
On Fri, 2014-03-21 at 17:19 +, David Woodhouse wrote:
> Signed-off-by: David Woodhouse david.woodho...@intel.com
> ---
>  drivers/iommu/intel-iommu.c | 75 ++---
>  1 file changed, 36 insertions(+), 39 deletions(-)
> 
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 741fb1d..05c5214 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -2207,52 +2207,51 @@ static struct dmar_domain *dmar_insert_dev_info(struct intel_iommu *iommu,
>  }
>  
>  /* domain is initialized */
> -static struct dmar_domain *get_domain_for_dev(struct pci_dev *pdev, int gaw)
> +static struct dmar_domain *get_domain_for_dev(struct device *dev, int gaw)
>  {
>  	struct dmar_domain *domain, *free = NULL;
>  	struct intel_iommu *iommu = NULL;
>  	struct device_domain_info *info;
> -	struct dmar_drhd_unit *drhd;
> -	struct pci_dev *dev_tmp;
> +	struct pci_dev *dev_tmp = NULL;
>  	unsigned long flags;
> -	int bus = 0, devfn = 0;
> -	int segment;
> +	u8 bus, devfn, bridge_bus, bridge_devfn;
>  
> -	domain = find_domain(&pdev->dev);
> +	domain = find_domain(dev);
>  	if (domain)
>  		return domain;
>  
> -	segment = pci_domain_nr(pdev->bus);
> +	if (dev_is_pci(dev)) {
> +		struct pci_dev *pdev = to_pci_dev(dev);
> +		u16 segment;
>  
> -	dev_tmp = pci_find_upstream_pcie_bridge(pdev);
> -	if (dev_tmp) {
> -		if (pci_is_pcie(dev_tmp)) {
> -			bus = dev_tmp->subordinate->number;
> -			devfn = 0;
> -		} else {
> -			bus = dev_tmp->bus->number;
> -			devfn = dev_tmp->devfn;
> -		}
> -		spin_lock_irqsave(&device_domain_lock, flags);
> -		info = dmar_search_domain_by_dev_info(segment, bus, devfn);
> -		if (info) {
> -			iommu = info->iommu;
> -			domain = info->domain;
> +		segment = pci_domain_nr(pdev->bus);
> +		dev_tmp = pci_find_upstream_pcie_bridge(pdev);
> +		if (dev_tmp) {
> +			if (pci_is_pcie(dev_tmp)) {
> +				bridge_bus = dev_tmp->subordinate->number;
> +				bridge_devfn = 0;
> +			} else {
> +				bridge_bus = dev_tmp->bus->number;
> +				bridge_devfn = dev_tmp->devfn;
> +			}
> +			spin_lock_irqsave(&device_domain_lock, flags);
> +			info = dmar_search_domain_by_dev_info(segment, bus, devfn);


bus and devfn are uninitialized here, CID 1197747 & 1197746.  Thanks,

Alex


> +			if (info) {
> +				iommu = info->iommu;
> +				domain = info->domain;
> +			}
> +			spin_unlock_irqrestore(&device_domain_lock, flags);
> +			/* pcie-pci bridge already has a domain, uses it */
> +			if (info)
> +				goto found_domain;
>  		}
> -		spin_unlock_irqrestore(&device_domain_lock, flags);
> -		if (info)
> -			goto found_domain;
>  	}
>  
> -	drhd = dmar_find_matched_drhd_unit(pdev);
> -	if (!drhd) {
> -		printk(KERN_ERR "IOMMU: can't find DMAR for device %s\n",
> -			pci_name(pdev));
> -		return NULL;
> -	}
> -	iommu = drhd->iommu;
> +	iommu = device_to_iommu(dev, &bus, &devfn);
> +	if (!iommu)
> +		goto error;
>  
> -	/* Allocate and intialize new domain for the device */
> +	/* Allocate and initialize new domain for the device */
>  	domain = alloc_domain(false);
>  	if (!domain)
>  		goto error;
> @@ -2266,15 +2265,14 @@ static struct dmar_domain *get_domain_for_dev(struct pci_dev *pdev, int gaw)
>  
>  	/* register pcie-to-pci device */
>  	if (dev_tmp) {
> -		domain = dmar_insert_dev_info(iommu, bus, devfn, NULL,
> -					      domain);
> +		domain = dmar_insert_dev_info(iommu, bridge_bus, bridge_devfn,
> +					      NULL, domain);
>  		if (!domain)
>  			goto error;
>  	}
>  
>  found_domain:
> -	domain = dmar_insert_dev_info(iommu, pdev->bus->number,
> -				      pdev->devfn, &pdev->dev, domain);
> +	domain = dmar_insert_dev_info(iommu, bus, devfn, dev, domain);
>  error:
>  	if (free != domain)
>  		domain_exit(free);
> @@ -2320,7 +2318,7 @@ static int iommu_prepare_identity_map(struct pci_dev *pdev,
>  	struct dmar_domain *domain;
>  	int ret;
>  
> -	domain = get_domain_for_dev(pdev, DEFAULT_DOMAIN_ADDRESS_WIDTH);
> +	domain = get_domain_for_dev(&pdev->dev, DEFAULT_DOMAIN_ADDRESS_WIDTH);
>  	if (!domain)
>  		return -ENOMEM;
>  
> @@ -2864,8 +2862,7 @@ static struct dmar_domain 

Re: [PATCH 21/33] iommu/vt-d: Make get_domain_for_dev() take struct device

2014-04-14 Thread Woodhouse, David
On Mon, 2014-04-14 at 15:22 -0600, Alex Williamson wrote:
 
> > +		if (dev_tmp) {
> > +			if (pci_is_pcie(dev_tmp)) {
> > +				bridge_bus = dev_tmp->subordinate->number;
> > +				bridge_devfn = 0;
> > +			} else {
> > +				bridge_bus = dev_tmp->bus->number;
> > +				bridge_devfn = dev_tmp->devfn;
> > +			}
> > +			spin_lock_irqsave(&device_domain_lock, flags);
> > +			info = dmar_search_domain_by_dev_info(segment, bus, devfn);
> 
> 
> bus and devfn are uninitialized here, CID 1197747 & 1197746.  Thanks,

Oops. That should be using bridge_bus and bridge_devfn, shouldn't it?

Will fix; thanks.
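
A small sketch of the intended lookup, using the bridge's own bus/devfn as the search key, might look like the following. Everything here is hypothetical scaffolding (`demo_info`, `demo_bridge`, a flat array instead of the driver's device list, no locking); it only illustrates which pair of values the search is supposed to receive.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flattened version of the per-device domain records. */
struct demo_info {
	uint16_t segment;
	uint8_t bus;
	uint8_t devfn;
	int domain_id;
};

/* Hypothetical description of the upstream bridge found for a device. */
struct demo_bridge {
	int is_pcie;
	uint8_t bus;             /* bus the bridge itself sits on */
	uint8_t devfn;           /* devfn of the bridge itself */
	uint8_t subordinate_bus; /* bus behind the bridge */
};

static const struct demo_info *demo_search(const struct demo_info *tbl, size_t n,
					   uint16_t segment, uint8_t bus,
					   uint8_t devfn)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (tbl[i].segment == segment && tbl[i].bus == bus &&
		    tbl[i].devfn == devfn)
			return &tbl[i];
	return NULL;
}

/* Returns the domain id already attached to the bridge, or -1 if none. */
static int demo_domain_for_bridge(const struct demo_info *tbl, size_t n,
				  uint16_t segment, const struct demo_bridge *br)
{
	uint8_t bridge_bus, bridge_devfn;
	const struct demo_info *info;

	if (br->is_pcie) {
		bridge_bus = br->subordinate_bus;
		bridge_devfn = 0;
	} else {
		bridge_bus = br->bus;
		bridge_devfn = br->devfn;
	}

	/* The key is the bridge pair computed just above, not the device's
	 * own bus/devfn, which have not been filled in at this point. */
	info = demo_search(tbl, n, segment, bridge_bus, bridge_devfn);
	return info ? info->domain_id : -1;
}
```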

-- 
David Woodhouse                            Open Source Technology Centre
david.woodho...@intel.com                  Intel Corporation



Re: [PATCH 21/33] iommu/vt-d: Make get_domain_for_dev() take struct device

2014-04-14 Thread Alex Williamson
On Mon, 2014-04-14 at 21:40 +, Woodhouse, David wrote:
> On Mon, 2014-04-14 at 15:22 -0600, Alex Williamson wrote:
> > 
> > > +		if (dev_tmp) {
> > > +			if (pci_is_pcie(dev_tmp)) {
> > > +				bridge_bus = dev_tmp->subordinate->number;
> > > +				bridge_devfn = 0;
> > > +			} else {
> > > +				bridge_bus = dev_tmp->bus->number;
> > > +				bridge_devfn = dev_tmp->devfn;
> > > +			}
> > > +			spin_lock_irqsave(&device_domain_lock, flags);
> > > +			info = dmar_search_domain_by_dev_info(segment, bus, devfn);
> > 
> > 
> > bus and devfn are uninitialized here, CID 1197747 & 1197746.  Thanks,
> 
> Oops. That should be using bridge_bus and bridge_devfn, shouldn't it?
> 
> Will fix; thanks.
 

Yep, I think it was supposed to be bridge_*.  Thanks,

Alex



[PATCH] iommu/vt-d: fix bug in matching PCI devices with DRHD/RMRR descriptors

2014-04-14 Thread Jiang Liu
Commit 59ce0515cdaf ("iommu/vt-d: Update DRHD/RMRR/ATSR device scope
caches when PCI hotplug happens") introduced a bug: it fails to match
PCI devices with DMAR device scope entries when the PCI path array in
the entry has more than one level.

For example, it fails to handle
[1D2h 0466   1]  Device Scope Entry Type : 01
[1D3h 0467   1] Entry Length : 0A
[1D4h 0468   2] Reserved : 
[1D6h 0470   1]   Enumeration ID : 00
[1D7h 0471   1]   PCI Bus Number : 00
[1D8h 0472   2] PCI Path : 1C,04
[1DAh 0474   2] PCI Path : 00,02

and causes DMA failures on HP DL980 such as:
DMAR:[fault reason 02] Present bit in context entry is clear
dmar: DRHD: handling fault status reg 602
dmar: DMAR:[DMA Read] Request device [02:00.2] fault addr 7f61e000

Reported-and-tested-by: Davidlohr Bueso davidl...@hp.com
Signed-off-by: Jiang Liu jiang@linux.intel.com
---
Hi David and Davidlohr,
I have made a minor syntax change to the patch; there should be no
functional change.
Thanks!
Gerry
---
 drivers/iommu/dmar.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
index f445c10df8df..39f8b717fe84 100644
--- a/drivers/iommu/dmar.c
+++ b/drivers/iommu/dmar.c
@@ -152,7 +152,8 @@ dmar_alloc_pci_notify_info(struct pci_dev *dev, unsigned long event)
 	info->seg = pci_domain_nr(dev->bus);
 	info->level = level;
 	if (event == BUS_NOTIFY_ADD_DEVICE) {
-		for (tmp = dev, level--; tmp; tmp = tmp->bus->self) {
+		for (tmp = dev; tmp; tmp = tmp->bus->self) {
+			level--;
 			info->path[level].device = PCI_SLOT(tmp->devfn);
 			info->path[level].function = PCI_FUNC(tmp->devfn);
 			if (pci_is_root_bus(tmp->bus))
-- 
1.7.10.4
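
As a sanity check on the "no functional change" note, the two spellings of the fixed loop can be compared in a userspace sketch. The types and function names below are hypothetical stand-ins, not kernel structures; the only claim is that both loops write the same slots of the path array for the same chain of devices.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the device chain and the path array. */
struct demo_node {
	unsigned int devfn;               /* (slot << 3) | function */
	const struct demo_node *upstream; /* NULL once the root bus is reached */
};

struct demo_hop {
	unsigned int device;
	unsigned int function;
};

/* Spelling from the first posting: decrement in the for-loop header. */
static void demo_walk_v1(const struct demo_node *dev, struct demo_hop *path,
			 int level)
{
	const struct demo_node *tmp;

	for (tmp = dev, level--; tmp; level--, tmp = tmp->upstream) {
		path[level].device = (tmp->devfn >> 3) & 0x1f;
		path[level].function = tmp->devfn & 0x07;
	}
}

/* Spelling from this patch: decrement as the first statement of the body. */
static void demo_walk_v2(const struct demo_node *dev, struct demo_hop *path,
			 int level)
{
	const struct demo_node *tmp;

	for (tmp = dev; tmp; tmp = tmp->upstream) {
		level--;
		path[level].device = (tmp->devfn >> 3) & 0x1f;
		path[level].function = tmp->devfn & 0x07;
	}
}

/* Returns 1 if the first n hops of both arrays are identical. */
static int demo_same_path(const struct demo_hop *a, const struct demo_hop *b,
			  int n)
{
	int i;

	for (i = 0; i < n; i++)
		if (a[i].device != b[i].device || a[i].function != b[i].function)
			return 0;
	return 1;
}
```

The two versions differ only in the value 'level' holds after the loop ends, which neither of them reads.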
