Re: hpsa driver bug crack kernel down!
Hi Davidlohr,

Thanks for the information!

According to the lspci output, device :02:00.2 is the HP ILO controller and device :03:00.0 is the RAID controller. Both the ILO and RAID controllers need to access the reserved memory range [0x7f61e000 - 0x7f61] in physical mode.

According to the dmesg output, the BIOS has reserved the memory and the IOMMU has set up a 1:1 mapping so that the ILO and RAID controllers can access this range. The related log messages are:

BIOS-e820: [mem 0x7f61d000-0x8fff] reserved
IOMMU: Setting RMRR:
IOMMU: Setting identity map for device :03:00.0 [0x7f61e000 - 0x7f61]
IOMMU: Setting identity map for device :02:00.0 [0x7f61e000 - 0x7f61]
IOMMU: Setting identity map for device :02:00.2 [0x7f61e000 - 0x7f61]

From the screenshot, device :02:00.2 fails to access memory address 0x7f61e000. That indicates the IOMMU driver failed to set up the 1:1 mapping of the reserved memory range for the ILO controller. So could you please check whether you can observe a boot message like

IOMMU: Setting identity map for device :02:00.2 [0x7f61e000 - 0x7f61]

with the failing kernel image? It would be great if the boot messages could be saved when the boot fails, so we can get more information from the log.

BTW, I have double-checked the related code and still can't find a reliable explanation for the regression :(

Thanks!
Gerry

On 2014/4/11 0:19, Davidlohr Bueso wrote:
On Thu, 2014-04-10 at 08:46 +, Woodhouse, David wrote:
On Thu, 2014-04-10 at 09:15 +0200, Joerg Roedel wrote:

[+ David, VT-d maintainer ]

Jiang, David, can you please have a look into this issue?

DMAR:[fault reason 02] Present bit in context entry is clear
dmar: DRHD: handling fault status reg 602
dmar: DMAR:[DMA Read] Request device [02:00.0] fault addr 7f61e000

That Present bit in context entry is clear fault means that we have not set up *any* mappings for this PCI device… on this IOMMU.

Yes, specifically (finally done bisecting):

commit 2e45528930388658603ea24d49cf52867b928d3e
Author: Jiang Liu jiang@linux.intel.com
Date: Wed Feb 19 14:07:36 2014 +0800

    iommu/vt-d: Unify the way to process DMAR device scope array

This commit is about how we decide which IOMMU a given PCI device is attached to. Thus, my first guess would be that we are quite happily setting up the requested DMA maps on the *wrong* IOMMU, and then taking faults when the device actually tries to do DMA.

However, I'm not 100% convinced of that. The fault address looks suspiciously like a true physical address, not a virtual bus address of the type that we'd normally allocate for a dma_map_* operation. Those would start at 0xf000 and work downwards, typically.

Do you have 'iommu=pt' on the kernel command line?

No.

Can I see the full dmesg as this system boots, and also a copy of the DMAR table?

Attaching a dmesg from one of the kernels that boots. It doesn't appear to have much of the related information... is there any debug config option I can enable that might give you more data?

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: hpsa driver bug crack kernel down!
Hi all,

I guess I found the root cause. It's a bug in matching the device scope: the variable 'level' should be decreased on every step when walking up the PCI topology. Could you please help to test the following patch?

Thanks!
Gerry

diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
index f445c10..1f8308c 100644
--- a/drivers/iommu/dmar.c
+++ b/drivers/iommu/dmar.c
@@ -152,7 +152,7 @@ dmar_alloc_pci_notify_info(struct pci_dev *dev, unsigned long event)
 	info->seg = pci_domain_nr(dev->bus);
 	info->level = level;
 	if (event == BUS_NOTIFY_ADD_DEVICE) {
-		for (tmp = dev, level--; tmp; tmp = tmp->bus->self) {
+		for (tmp = dev, level--; tmp; level--, tmp = tmp->bus->self) {
 			info->path[level].device = PCI_SLOT(tmp->devfn);
 			info->path[level].function = PCI_FUNC(tmp->devfn);
 			if (pci_is_root_bus(tmp->bus))
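The effect of the missing decrement can be reproduced outside the kernel. The following is a minimal user-space sketch, not kernel code: toy_dev, toy_path and the fill_path_* helpers are hypothetical stand-ins for struct pci_dev, the info->path[] array and the loop in dmar_alloc_pci_notify_info(). It only models the index handling described in the message above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pci_dev: slot/function plus a parent link. */
struct toy_dev {
	int slot;
	int func;
	const struct toy_dev *parent;	/* NULL once the root bus is reached */
};

/* Hypothetical stand-in for one info->path[] element. */
struct toy_path {
	int device;
	int function;
};

/* Number of devices between dev and the root bus, dev included. */
int toy_depth(const struct toy_dev *dev)
{
	int depth = 0;

	for (; dev; dev = dev->parent)
		depth++;
	return depth;
}

/* Buggy walk: level is decremented only once, so every hop hits the same slot. */
void fill_path_buggy(const struct toy_dev *dev, struct toy_path *path, int level)
{
	const struct toy_dev *tmp;

	assert(level == toy_depth(dev));
	for (tmp = dev, level--; tmp; tmp = tmp->parent) {
		path[level].device = tmp->slot;
		path[level].function = tmp->func;
	}
}

/* Fixed walk: level moves one element toward index 0 on every hop. */
void fill_path_fixed(const struct toy_dev *dev, struct toy_path *path, int level)
{
	const struct toy_dev *tmp;

	assert(level == toy_depth(dev));
	for (tmp = dev; tmp; tmp = tmp->parent) {
		level--;
		path[level].device = tmp->slot;
		path[level].function = tmp->func;
	}
}
```

With a two-level chain (a bridge at 1c.4 and a function 00.2 behind it), the buggy variant leaves path[0] unwritten and overwrites path[1] with the bridge, whereas the fixed variant records the bridge in path[0] and the end device in path[1].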
Re: [PATCH] hpsa: fix uninitialized trans_support in hpsa_put_ctlr_into_performant_mode()
Your subject line is very tame. It should be the one line summary of why we apply the patch, so it should read something like

hpsa: fix NULL deref in performant mode

On Thu, 2014-04-10 at 17:17 -0500, scame...@beardog.cce.hp.com wrote:

Without this, you'll see a null pointer dereference in hpsa_enter_performant_mode().

The description should be more comprehensible. I'm clear that the use before initialisation is a bug ... I'm less clear on why it causes an oops.

Signed-off-by: Stephen M. Cameron scame...@beardog.cce.hp.com
---
 drivers/scsi/hpsa.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 8cf4a0c..ef4dfdd 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -7463,6 +7463,10 @@ static void hpsa_put_ctlr_into_performant_mode(struct ctlr_info *h)
 	if (hpsa_simple_mode)
 		return;
 
+	trans_support = readl(&(h->cfgtable->TransportSupport));
+	if (!(trans_support & PERFORMANT_MODE))
+		return;
+
 	/* Check for I/O accelerator mode support */
 	if (trans_support & CFGTBL_Trans_io_accel1) {
 		transMethod |= CFGTBL_Trans_io_accel1 |

Shouldn't you be moving this check from its previous location, rather than adding a new one that makes the original obsolete?

James

---

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 8cf4a0c..9a6e4a2 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -7463,6 +7463,10 @@ static void hpsa_put_ctlr_into_performant_mode(struct ctlr_info *h)
 	if (hpsa_simple_mode)
 		return;
 
+	trans_support = readl(&(h->cfgtable->TransportSupport));
+	if (!(trans_support & PERFORMANT_MODE))
+		return;
+
 	/* Check for I/O accelerator mode support */
 	if (trans_support & CFGTBL_Trans_io_accel1) {
 		transMethod |= CFGTBL_Trans_io_accel1 |
@@ -7479,10 +7483,6 @@ static void hpsa_put_ctlr_into_performant_mode(struct ctlr_info *h)
 	}
 
 	/* TODO, check that this next line h->nreply_queues is correct */
-	trans_support = readl(&(h->cfgtable->TransportSupport));
-	if (!(trans_support & PERFORMANT_MODE))
-		return;
-
 	h->nreply_queues = h->msix_vector > 0 ? h->msix_vector : 1;
 	hpsa_get_max_perf_mode_cmds(h);
 	/* Performant mode ring buffer and supporting data structures */
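The ordering James asks for (read the register once, then run every test that depends on it) can be sketched in isolation. This is a user-space illustration under stated assumptions: toy_cfgtable, toy_pick_transport and the TOY_* flag values are hypothetical and do not come from hpsa.c; the real driver reads the value with readl() from the controller's config table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values, chosen only for this sketch. */
#define TOY_PERFORMANT_MODE	0x04u
#define TOY_IO_ACCEL1		0x80u

/* Hypothetical stand-in for the controller's config table register. */
struct toy_cfgtable {
	uint32_t transport_support;
};

/*
 * Returns the transport method bits to request, or 0 when performant
 * mode is skipped (simple mode forced, or not supported by the device).
 */
uint32_t toy_pick_transport(const struct toy_cfgtable *cfg, int simple_mode)
{
	uint32_t trans_support;
	uint32_t method = TOY_PERFORMANT_MODE;

	assert(cfg != NULL);
	if (simple_mode)
		return 0;

	/* Read the register once, before any test that depends on it. */
	trans_support = cfg->transport_support;
	if (!(trans_support & TOY_PERFORMANT_MODE))
		return 0;

	/* Only now is it safe to look at the optional accelerator bit. */
	if (trans_support & TOY_IO_ACCEL1)
		method |= TOY_IO_ACCEL1;

	return method;
}
```

Keeping a single read at the top, rather than a second copy further down, means there is only one place where the variable is assigned, which is the point of moving the hunk instead of duplicating it.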
Re: hpsa driver bug crack kernel down!
Sorry for the delay, I've been having to take turns for this box.

On Fri, 2014-04-11 at 09:18 +, Woodhouse, David wrote:
On Thu, 2014-04-10 at 09:19 -0700, Davidlohr Bueso wrote:

Attaching a dmesg from one of the kernels that boots. It doesn't appear to have much of the related information... is there any debug config option I can enable that might give you more data?

I'd like the contents of /sys/firmware/acpi/tables/DMAR please.

Attached is the disassembly of the raw output.

And please could you also apply this patch to both the last-working and first-failing kernels and show me the output in both cases?

So I still cannot get around getting the info for the first failing kernel, but below is for the last working.

Thanks.

Device 0:03:00.0 on IOMMU at a800
Device 0:03:00.0 on IOMMU at a800
IOMMU: Setting identity map for device :02:00.0 [0x7f61e000 - 0x7f61]
Device 0:02:00.0 on IOMMU at a800
Device 0:02:00.0 on IOMMU at a800
IOMMU: Setting identity map for device :02:00.2 [0x7f61e000 - 0x7f61]
Device 0:02:00.2 on IOMMU at a800
Device 0:02:00.2 on IOMMU at a800
IOMMU: Setting identity map for device :00:1d.0 [0x7f7e7000 - 0x7f7ecfff]
Device 0:00:1d.0 on IOMMU at a800
Device 0:00:1d.0 on IOMMU at a800
IOMMU: Setting identity map for device :00:1d.1 [0x7f7e7000 - 0x7f7ecfff]
Device 0:00:1d.1 on IOMMU at a800
Device 0:00:1d.1 on IOMMU at a800
IOMMU: Setting identity map for device :00:1d.2 [0x7f7e7000 - 0x7f7ecfff]
Device 0:00:1d.2 on IOMMU at a800
Device 0:00:1d.2 on IOMMU at a800
IOMMU: Setting identity map for device :00:1d.3 [0x7f7e7000 - 0x7f7ecfff]
Device 0:00:1d.3 on IOMMU at a800
Device 0:00:1d.3 on IOMMU at a800
IOMMU: Setting identity map for device :02:00.0 [0x7f7e7000 - 0x7f7ecfff]
Device 0:02:00.0 on IOMMU at a800
IOMMU: Setting identity map for device :02:00.2 [0x7f7e7000 - 0x7f7ecfff]
Device 0:02:00.2 on IOMMU at a800
IOMMU: Setting identity map for device :02:00.4 [0x7f7e7000 - 0x7f7ecfff]
Device 0:02:00.4 on IOMMU at a800
Device 0:02:00.4 on IOMMU at a800
IOMMU: Setting identity map for device :00:1d.7 [0x7f7ee000 - 0x7f7e]
Device 0:00:1d.7 on IOMMU at a800
Device 0:00:1d.7 on IOMMU at a800
IOMMU: Prepare 0-16MiB unity mapping for LPC
IOMMU: Setting identity map for device :00:1f.0 [0x0 - 0xff]
Device 0:00:1f.0 on IOMMU at a800
Device 0:00:1f.0 on IOMMU at a800
PCI-DMA: Intel(R) Virtualization Technology for Directed I/O
Device 0:00:00.0 on IOMMU at a800
Device 0:00:01.0 on IOMMU at a800
Device 0:00:02.0 on IOMMU at a800
Device 0:00:03.0 on IOMMU at a800
Device 0:00:04.0 on IOMMU at a800
Device 0:00:05.0 on IOMMU at a800
Device 0:00:06.0 on IOMMU at a800
Device 0:00:07.0 on IOMMU at a800
Device 0:00:08.0 on IOMMU at a800
Device 0:00:09.0 on IOMMU at a800
Device 0:00:0a.0 on IOMMU at a800
Device 0:00:14.0 on IOMMU at a800
Device 0:00:1c.0 on IOMMU at a800
Device 0:00:1c.4 on IOMMU at a800
Device 0:00:1d.0 on IOMMU at a800
Device 0:00:1d.1 on IOMMU at a800
Device 0:00:1d.2 on IOMMU at a800
Device 0:00:1d.3 on IOMMU at a800
Device 0:00:1d.7 on IOMMU at a800
Device 0:00:1e.0 on IOMMU at a800
Device 0:00:1f.0 on IOMMU at a800
Device 0:04:00.0 on IOMMU at a800
Device 0:04:00.1 on IOMMU at a800
Device 0:04:00.2 on IOMMU at a800
Device 0:04:00.3 on IOMMU at a800
Device 0:03:00.0 on IOMMU at a800
Device 0:02:00.0 on IOMMU at a800
Device 0:02:00.2 on IOMMU at a800
Device 0:02:00.4 on IOMMU at a800
Device 0:01:03.0 on IOMMU at a800
Device 0:50:00.0 on IOMMU at ac00
Device 0:50:01.0 on IOMMU at ac00
Device 0:50:02.0 on IOMMU at ac00
Device 0:50:03.0 on IOMMU at ac00
Device 0:50:04.0 on IOMMU at ac00
Device 0:50:05.0 on IOMMU at ac00
Device 0:50:06.0 on IOMMU at ac00
Device 0:50:07.0 on IOMMU at ac00
Device 0:50:08.0 on IOMMU at ac00
Device 0:50:09.0 on IOMMU at ac00
Device 0:50:0a.0 on IOMMU at ac00
Device 0:50:14.0 on IOMMU at a800
Device 0:a0:00.0 on IOMMU at b000
Device 0:a0:01.0 on IOMMU at b000
Device 0:a0:02.0 on IOMMU at b000
Device 0:a0:03.0 on IOMMU at b000
Device 0:a0:04.0 on IOMMU at b000
Device 0:a0:05.0 on IOMMU at b000
Device 0:a0:06.0 on IOMMU at b000
Device 0:a0:07.0 on IOMMU at b000
Device 0:a0:08.0 on IOMMU at b000
Device 0:a0:09.0 on IOMMU at b000
Device 0:a0:0a.0 on IOMMU at b000
Device 0:a0:14.0 on IOMMU at a800
Device 0:7c:00.0 on IOMMU at a800
Device 0:7c:08.0 on IOMMU at a800
Device 0:82:00.0 on IOMMU at a800
Device 0:82:08.0 on IOMMU at a800

/*
 * Intel ACPI Component Architecture
 * AML Disassembler version 20140325-64 [Apr 11 2014]
 * Copyright (c) 2000 - 2014 Intel Corporation
 *
 * Disassembly of DMAR.raw, Fri Apr 11 09:10:10 2014
 *
 * ACPI Data Table [DMAR]
 *
 * Format:
Re: [PATCH] hpsa: fix uninitialized trans_support in hpsa_put_ctlr_into_performant_mode()
On Mon, Apr 14, 2014 at 08:45:16AM -0700, James Bottomley wrote:

Your subject line is very tame. It should be the one line summary of why we apply the patch, so it should read something like

hpsa: fix NULL deref in performant mode

On Thu, 2014-04-10 at 17:17 -0500, scame...@beardog.cce.hp.com wrote:

Without this, you'll see a null pointer dereference in hpsa_enter_performant_mode().

The description should be more comprehensible. I'm clear that the use before initialisation is a bug ... I'm less clear on why it causes an oops.

Signed-off-by: Stephen M. Cameron scame...@beardog.cce.hp.com
---
 drivers/scsi/hpsa.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 8cf4a0c..ef4dfdd 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -7463,6 +7463,10 @@ static void hpsa_put_ctlr_into_performant_mode(struct ctlr_info *h)
 	if (hpsa_simple_mode)
 		return;
 
+	trans_support = readl(&(h->cfgtable->TransportSupport));
+	if (!(trans_support & PERFORMANT_MODE))
+		return;
+
 	/* Check for I/O accelerator mode support */
 	if (trans_support & CFGTBL_Trans_io_accel1) {
 		transMethod |= CFGTBL_Trans_io_accel1 |

Shouldn't you be moving this check from its previous location, rather than adding a new one that makes the original obsolete?

Oh... I didn't notice that. So that's what happened to that hunk. Yes, that is what should be done. I will resend the patch after fixing up this and the commit message.
-- steve
Re: hpsa driver bug crack kernel down!
Hi Davidlohr,

Thanks for providing the DMAR table. According to the DMAR table, one bug in the iommu driver fails to handle this entry:

[1D2h 0466 1] Device Scope Entry Type : 01
[1D3h 0467 1] Entry Length : 0A
[1D4h 0468 2] Reserved :
[1D6h 0470 1] Enumeration ID : 00
[1D7h 0471 1] PCI Bus Number : 00
[1D8h 0472 2] PCI Path : 1C,04
[1DAh 0474 2] PCI Path : 00,02

And the patch sent out by me should fix this bug. Could you please help to have a try?

Thanks!
Gerry
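To make the device scope entry quoted in this message concrete, here is a minimal sketch of how such an entry can be read. It relies on the layout visible in the disassembly (a 6-byte header at 1D2h to 1D7h followed by 2-byte path elements), so an Entry Length of 0A gives two path elements. The helper names are hypothetical, and toy_secondary_bus() hard-codes an ASSUMPTION: that the bridge at 00:1c.4 leads to bus 02 on the reported machine, which is inferred from the faulting device 02:00.2 rather than stated in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Header bytes: type, length, reserved (2), enumeration ID, start bus. */
#define SCOPE_HEADER_LEN 6

struct scope_path {
	uint8_t device;
	uint8_t function;
};

/* Number of (device, function) pairs that follow the 6-byte header. */
int scope_path_count(uint8_t entry_length)
{
	assert(entry_length >= SCOPE_HEADER_LEN);
	return (entry_length - SCOPE_HEADER_LEN) / 2;
}

/*
 * Hypothetical topology lookup: the bus number behind a bridge.
 * ASSUMPTION: 00:1c.4 is a bridge whose secondary bus is 0x02.
 */
int toy_secondary_bus(uint8_t bus, struct scope_path bridge)
{
	if (bus == 0x00 && bridge.device == 0x1c && bridge.function == 4)
		return 0x02;
	return -1;
}

/*
 * Follow every path element except the last through a bridge; the last
 * element names the target device on the bus reached that way.
 * Returns 0 and stores that bus in *bus_out, or -1 on an unknown bridge.
 */
int scope_resolve_bus(uint8_t start_bus, const struct scope_path *path,
		      int count, uint8_t *bus_out)
{
	int bus = start_bus;
	int i;

	for (i = 0; i < count - 1; i++) {
		bus = toy_secondary_bus((uint8_t)bus, path[i]);
		if (bus < 0)
			return -1;
	}
	*bus_out = (uint8_t)bus;
	return 0;
}
```

Under that assumption the path 1C,04 then 00,02 starting at bus 00 ends at bus 02, device 00, function 2, which is the kind of multi-level entry the fix is meant to handle.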
Re: hpsa driver bug crack kernel down!
On Tue, 2014-04-15 at 00:19 +0800, Jiang Liu wrote:

Hi Davidlohr,

Thanks for providing the DMAR table. According to the DMAR table, one bug in the iommu driver fails to handle this entry:

[1D2h 0466 1] Device Scope Entry Type : 01
[1D3h 0467 1] Entry Length : 0A
[1D4h 0468 2] Reserved :
[1D6h 0470 1] Enumeration ID : 00
[1D7h 0471 1] PCI Bus Number : 00
[1D8h 0472 2] PCI Path : 1C,04
[1DAh 0474 2] PCI Path : 00,02

And the patch sent out by me should fix this bug. Could you please help to have a try?

Sorry, I am unable to find any patches from you regarding this issue... I must be missing something. Could you please point me to the lkml link?

Thanks.
Re: hpsa driver bug crack kernel down!
On Mon, 2014-04-14 at 09:44 -0700, Davidlohr Bueso wrote:
On Tue, 2014-04-15 at 00:19 +0800, Jiang Liu wrote:

Hi Davidlohr,

Thanks for providing the DMAR table. According to the DMAR table, one bug in the iommu driver fails to handle this entry:

[1D2h 0466 1] Device Scope Entry Type : 01
[1D3h 0467 1] Entry Length : 0A
[1D4h 0468 2] Reserved :
[1D6h 0470 1] Enumeration ID : 00
[1D7h 0471 1] PCI Bus Number : 00
[1D8h 0472 2] PCI Path : 1C,04
[1DAh 0474 2] PCI Path : 00,02

And the patch sent out by me should fix this bug. Could you please help to have a try?

Sorry, I am unable to find any patches from you regarding this issue... I must be missing something. Could you please point me to the lkml link?

Never mind, I got it internally. I'll let you know as soon as I can test it later today.
Re: [PATCH 0/2] ARM SMMU fixes
On Tue, Apr 08, 2014 at 02:57:43PM +0100, Marc Zyngier wrote:
On 08/04/14 14:41, Laurent Pinchart wrote:

I've obviously forgotten that Will was away for a month. CC'ing Marc Zyngier.

On Thursday 03 April 2014 01:52:55 Laurent Pinchart wrote:
On Friday 28 February 2014 16:37:08 Laurent Pinchart wrote:

Hello Will,

I've studied your arm-smmu driver as a base to write a Renesas IOMMU driver and found two small issues. Here are patches to fix them. Please bear with me if my understanding was incorrect and the patches wrong :-)

Laurent Pinchart (2):
  iommu/arm-smmu: Replace list walk with platform driver data
  iommu/arm-smmu: Return 0 on unmap failure

 drivers/iommu/arm-smmu.c | 17 +++++------------
 1 file changed, 5 insertions(+), 12 deletions(-)

Do you plan to take these patches (or at least patch 2/2) in your tree ? I can send a pull request to Joerg if you give me your acked-by.

Marc, would you like to handle this, or would you prefer to wait until Will comes back ?

Hi Laurent,

Yup, I'll have a look and stash them in a temp tree. Given that Will will be back in about a week, he will have the final say.

I've already got the fix queued (Return 0 on unmap failure) and plan to send it to Joerg this week. I think the other patch doesn't really add anything to the driver :)

Will
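The thread does not show patch 2/2 itself, so the following is only a guess at the reasoning behind its title. Assuming the unmap callback reports the number of bytes unmapped as an unsigned size_t, a negative errno pushed through that return type would read as a very large success, and 0 is the natural failure value. All names here (toy_clear_mapping, toy_unmap, toy_unmap_wrong) are hypothetical and are not taken from arm-smmu.c.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical page-table helper: 0 on success, negative errno on failure. */
int toy_clear_mapping(unsigned long iova, size_t size)
{
	/* Reject empty ranges and addresses that are not 4 KiB aligned. */
	if (size == 0 || (iova & 0xfffUL))
		return -EINVAL;
	return 0;
}

/* Problematic pattern: the errno is converted to a huge unsigned byte count. */
size_t toy_unmap_wrong(unsigned long iova, size_t size)
{
	int ret = toy_clear_mapping(iova, size);

	return ret ? (size_t)ret : size;
}

/* Pattern named by the patch title: report failure as "0 bytes unmapped". */
size_t toy_unmap(unsigned long iova, size_t size)
{
	int ret = toy_clear_mapping(iova, size);

	return ret ? 0 : size;
}
```

A caller that compares the result with the requested size can then detect the failure without having to know about errno values.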
Re: hpsa driver bug crack kernel down!
On Mon, 2014-04-14 at 09:47 -0700, Davidlohr Bueso wrote:
On Mon, 2014-04-14 at 09:44 -0700, Davidlohr Bueso wrote:
On Tue, 2014-04-15 at 00:19 +0800, Jiang Liu wrote:

Hi Davidlohr,

Thanks for providing the DMAR table. According to the DMAR table, one bug in the iommu driver fails to handle this entry:

[1D2h 0466 1] Device Scope Entry Type : 01
[1D3h 0467 1] Entry Length : 0A
[1D4h 0468 2] Reserved :
[1D6h 0470 1] Enumeration ID : 00
[1D7h 0471 1] PCI Bus Number : 00
[1D8h 0472 2] PCI Path : 1C,04
[1DAh 0474 2] PCI Path : 00,02

And the patch sent out by me should fix this bug. Could you please help to have a try?

Sorry, I am unable to find any patches from you regarding this issue... I must be missing something. Could you please point me to the lkml link?

Never mind, I got it internally. I'll let you know as soon as I can test it later today.

Thanks. Jiang, if you can then let me have a copy with a signed-off-by I'll shepherd it upstream along with your other patch which is already in my iommu-2.6.git tree.

--
Sent with Evolution's ActiveSync support.

David Woodhouse
Open Source Technology Centre
david.woodho...@intel.com
Intel Corporation
Re: hpsa driver bug crack kernel down!
On Mon, 2014-04-14 at 16:57 +0800, Jiang Liu wrote:

Hi all,

I guess I found the root cause. It's a bug in matching device scope, variable 'level' should be decreased when walking up PCI topology. Could you please help to test following patch?

Thanks!
Gerry

Worked like a charm -- I no longer see all those DMAR messages and the hpsa hard lockup is gone, thanks. Feel free to add my:

Reported-and-tested-by: Davidlohr Bueso davidl...@hp.com
Re: [PATCH 21/33] iommu/vt-d: Make get_domain_for_dev() take struct device
On Fri, 2014-03-21 at 17:19 +, David Woodhouse wrote:

Signed-off-by: David Woodhouse david.woodho...@intel.com
---
 drivers/iommu/intel-iommu.c | 75 ++--
 1 file changed, 36 insertions(+), 39 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 741fb1d..05c5214 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -2207,52 +2207,51 @@ static struct dmar_domain *dmar_insert_dev_info(struct intel_iommu *iommu,
 }
 
 /* domain is initialized */
-static struct dmar_domain *get_domain_for_dev(struct pci_dev *pdev, int gaw)
+static struct dmar_domain *get_domain_for_dev(struct device *dev, int gaw)
 {
 	struct dmar_domain *domain, *free = NULL;
 	struct intel_iommu *iommu = NULL;
 	struct device_domain_info *info;
-	struct dmar_drhd_unit *drhd;
-	struct pci_dev *dev_tmp;
+	struct pci_dev *dev_tmp = NULL;
 	unsigned long flags;
-	int bus = 0, devfn = 0;
-	int segment;
+	u8 bus, devfn, bridge_bus, bridge_devfn;
 
-	domain = find_domain(&pdev->dev);
+	domain = find_domain(dev);
 	if (domain)
 		return domain;
 
-	segment = pci_domain_nr(pdev->bus);
+	if (dev_is_pci(dev)) {
+		struct pci_dev *pdev = to_pci_dev(dev);
+		u16 segment;
 
-	dev_tmp = pci_find_upstream_pcie_bridge(pdev);
-	if (dev_tmp) {
-		if (pci_is_pcie(dev_tmp)) {
-			bus = dev_tmp->subordinate->number;
-			devfn = 0;
-		} else {
-			bus = dev_tmp->bus->number;
-			devfn = dev_tmp->devfn;
-		}
-		spin_lock_irqsave(&device_domain_lock, flags);
-		info = dmar_search_domain_by_dev_info(segment, bus, devfn);
-		if (info) {
-			iommu = info->iommu;
-			domain = info->domain;
+		segment = pci_domain_nr(pdev->bus);
+		dev_tmp = pci_find_upstream_pcie_bridge(pdev);
+		if (dev_tmp) {
+			if (pci_is_pcie(dev_tmp)) {
+				bridge_bus = dev_tmp->subordinate->number;
+				bridge_devfn = 0;
+			} else {
+				bridge_bus = dev_tmp->bus->number;
+				bridge_devfn = dev_tmp->devfn;
+			}
+			spin_lock_irqsave(&device_domain_lock, flags);
+			info = dmar_search_domain_by_dev_info(segment, bus, devfn);

bus and devfn are uninitialized here, CID 1197747 & 1197746. Thanks,

Alex

+			if (info) {
+				iommu = info->iommu;
+				domain = info->domain;
+			}
+			spin_unlock_irqrestore(&device_domain_lock, flags);
+			/* pcie-pci bridge already has a domain, uses it */
+			if (info)
+				goto found_domain;
 		}
-		spin_unlock_irqrestore(&device_domain_lock, flags);
-		if (info)
-			goto found_domain;
 	}
 
-	drhd = dmar_find_matched_drhd_unit(pdev);
-	if (!drhd) {
-		printk(KERN_ERR "IOMMU: can't find DMAR for device %s\n",
-			pci_name(pdev));
-		return NULL;
-	}
-	iommu = drhd->iommu;
+	iommu = device_to_iommu(dev, &bus, &devfn);
+	if (!iommu)
+		goto error;
 
-	/* Allocate and intialize new domain for the device */
+	/* Allocate and initialize new domain for the device */
 	domain = alloc_domain(false);
 	if (!domain)
 		goto error;
@@ -2266,15 +2265,14 @@ static struct dmar_domain *get_domain_for_dev(struct pci_dev *pdev, int gaw)
 
 	/* register pcie-to-pci device */
 	if (dev_tmp) {
-		domain = dmar_insert_dev_info(iommu, bus, devfn, NULL,
-					      domain);
+		domain = dmar_insert_dev_info(iommu, bridge_bus, bridge_devfn,
+					      NULL, domain);
 		if (!domain)
 			goto error;
 	}
 
 found_domain:
-	domain = dmar_insert_dev_info(iommu, pdev->bus->number,
-				      pdev->devfn, &pdev->dev, domain);
+	domain = dmar_insert_dev_info(iommu, bus, devfn, dev, domain);
 error:
 	if (free != domain)
 		domain_exit(free);
@@ -2320,7 +2318,7 @@ static int iommu_prepare_identity_map(struct pci_dev *pdev,
 	struct dmar_domain *domain;
 	int ret;
 
-	domain = get_domain_for_dev(pdev, DEFAULT_DOMAIN_ADDRESS_WIDTH);
+	domain = get_domain_for_dev(&pdev->dev, DEFAULT_DOMAIN_ADDRESS_WIDTH);
 	if (!domain)
 		return -ENOMEM;
 
@@ -2864,8 +2862,7 @@ static struct dmar_domain
Re: [PATCH 21/33] iommu/vt-d: Make get_domain_for_dev() take struct device
On Mon, 2014-04-14 at 15:22 -0600, Alex Williamson wrote:

+		if (dev_tmp) {
+			if (pci_is_pcie(dev_tmp)) {
+				bridge_bus = dev_tmp->subordinate->number;
+				bridge_devfn = 0;
+			} else {
+				bridge_bus = dev_tmp->bus->number;
+				bridge_devfn = dev_tmp->devfn;
+			}
+			spin_lock_irqsave(&device_domain_lock, flags);
+			info = dmar_search_domain_by_dev_info(segment, bus, devfn);

bus and devfn are uninitialized here, CID 1197747 & 1197746. Thanks,

Oops. That should be using bridge_bus and bridge_devfn, shouldn't it? Will fix; thanks.

--
David Woodhouse
Open Source Technology Centre
david.woodho...@intel.com
Intel Corporation
Re: [PATCH 21/33] iommu/vt-d: Make get_domain_for_dev() take struct device
On Mon, 2014-04-14 at 21:40 +, Woodhouse, David wrote:
On Mon, 2014-04-14 at 15:22 -0600, Alex Williamson wrote:

+		if (dev_tmp) {
+			if (pci_is_pcie(dev_tmp)) {
+				bridge_bus = dev_tmp->subordinate->number;
+				bridge_devfn = 0;
+			} else {
+				bridge_bus = dev_tmp->bus->number;
+				bridge_devfn = dev_tmp->devfn;
+			}
+			spin_lock_irqsave(&device_domain_lock, flags);
+			info = dmar_search_domain_by_dev_info(segment, bus, devfn);

bus and devfn are uninitialized here, CID 1197747 & 1197746. Thanks,

Oops. That should be using bridge_bus and bridge_devfn, shouldn't it? Will fix; thanks.

Yep, I think it was supposed to be bridge_*. Thanks,

Alex
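The fix agreed on in this exchange is to search with the bridge's address, which has just been assigned, rather than with bus and devfn, which are only filled in later. A small user-space sketch of that corrected lookup follows. Everything in it is hypothetical (toy_info, toy_known, toy_search, toy_bridge_domain); it is not the intel-iommu code and only mirrors the shape of the if/else that selects bridge_bus and bridge_devfn.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of a device that already belongs to a domain. */
struct toy_info {
	uint8_t bus;
	uint8_t devfn;
	int domain_id;
};

/* Hypothetical table with a single known entry: bus 0x02, devfn 0. */
static const struct toy_info toy_known[] = {
	{ 0x02, 0x00, 7 },
};

/* Linear search by (bus, devfn); returns NULL when nothing matches. */
const struct toy_info *toy_search(uint8_t bus, uint8_t devfn)
{
	size_t i;

	for (i = 0; i < sizeof(toy_known) / sizeof(toy_known[0]); i++)
		if (toy_known[i].bus == bus && toy_known[i].devfn == devfn)
			return &toy_known[i];
	return NULL;
}

/*
 * Pick the address that represents the upstream bridge, then search with
 * exactly those values. Returns the domain id, or -1 if none is recorded.
 */
int toy_bridge_domain(int bridge_is_pcie, uint8_t subordinate_bus,
		      uint8_t bridge_parent_bus, uint8_t bridge_own_devfn)
{
	uint8_t bridge_bus, bridge_devfn;
	const struct toy_info *info;

	if (bridge_is_pcie) {
		bridge_bus = subordinate_bus;
		bridge_devfn = 0;
	} else {
		bridge_bus = bridge_parent_bus;
		bridge_devfn = bridge_own_devfn;
	}

	/* Both values are assigned on every path above before being read. */
	info = toy_search(bridge_bus, bridge_devfn);
	return info ? info->domain_id : -1;
}
```

Because both branches assign bridge_bus and bridge_devfn, the search never reads an indeterminate value, which is the class of defect the two Coverity IDs point at.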
[PATCH] iommu/vt-d: fix bug in matching PCI devices with DRHD/RMRR descriptors
Commit 59ce0515cdaf ("iommu/vt-d: Update DRHD/RMRR/ATSR device scope caches when PCI hotplug happens") introduces a bug that fails to match PCI devices with DMAR device scope entries if the PCI path array in the entry has more than one level.

For example, it fails to handle

[1D2h 0466 1] Device Scope Entry Type : 01
[1D3h 0467 1] Entry Length : 0A
[1D4h 0468 2] Reserved :
[1D6h 0470 1] Enumeration ID : 00
[1D7h 0471 1] PCI Bus Number : 00
[1D8h 0472 2] PCI Path : 1C,04
[1DAh 0474 2] PCI Path : 00,02

and causes DMA failures on an HP DL980 such as:

DMAR:[fault reason 02] Present bit in context entry is clear
dmar: DRHD: handling fault status reg 602
dmar: DMAR:[DMA Read] Request device [02:00.2] fault addr 7f61e000

Reported-and-tested-by: Davidlohr Bueso davidl...@hp.com
Signed-off-by: Jiang Liu jiang@linux.intel.com
---
Hi David and Davidlohr,

I have made a minor syntax change to the patch, but there should be no functional change.

Thanks!
Gerry
---
 drivers/iommu/dmar.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
index f445c10df8df..39f8b717fe84 100644
--- a/drivers/iommu/dmar.c
+++ b/drivers/iommu/dmar.c
@@ -152,7 +152,8 @@ dmar_alloc_pci_notify_info(struct pci_dev *dev, unsigned long event)
 	info->seg = pci_domain_nr(dev->bus);
 	info->level = level;
 	if (event == BUS_NOTIFY_ADD_DEVICE) {
-		for (tmp = dev, level--; tmp; tmp = tmp->bus->self) {
+		for (tmp = dev; tmp; tmp = tmp->bus->self) {
+			level--;
 			info->path[level].device = PCI_SLOT(tmp->devfn);
 			info->path[level].function = PCI_FUNC(tmp->devfn);
 			if (pci_is_root_bus(tmp->bus))
--
1.7.10.4